AI21 Labs | Standing on the Shoulders of Giant Frozen Language Models

2022-04-23 13:30:00 【Zhiyuan community】

Authors: Yoav Levine, Itay Dalmedigos, Ori Ram, et al.

Summary: Huge pretrained language models (LMs) have demonstrated surprisingly good zero-shot capabilities on a wide variety of tasks. This gives rise to the appealing vision of a single, versatile model with a wide range of functionality across different applications. However, the current leading techniques for using a "frozen" LM (that is, leaving its weights unchanged) still often underperform fine-tuning approaches, which modify those weights in a task-dependent way. Fine-tuning, in turn, suffers from forgetting and compromises versatility, which suggests a trade-off between performance and versatility. The main message of the paper is that current frozen-model techniques, such as prompt tuning, are only the tip of the iceberg: more powerful ways of using frozen LMs can match fine-tuning in challenging domains without sacrificing the versatility of the underlying model. To demonstrate this, the authors introduce three new methods for using frozen models: input-dependent prompt tuning, frozen readers, and recursive LMs. Each greatly improves on current frozen-model approaches, and some even outperform fine-tuning in domains that fine-tuning currently dominates. The computational cost of each method is higher than that of existing frozen-model methods, but still negligible compared with a single pass through a huge frozen LM. Each of these methods is a meaningful contribution in its own right. Please refer to the paper for details.
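To make the idea of "frozen weights plus an input-dependent prompt" more concrete, here is a toy sketch in plain Python. It is an illustration only and not the authors' architecture: the "frozen LM" is a hypothetical stand-in (mean pooling followed by a fixed linear readout), the prompt generator is a made-up four-parameter function, and finite differences replace backpropagation to keep the example dependency-free. All names (FROZEN_WEIGHTS, generate_prompt, train_step, and so on) are assumptions introduced for this example.

```python
# Toy illustration: only the small prompt generator is trained,
# while the "LM" weights stay frozen. Not the paper's actual method.

EMBED_DIM = 4
PROMPT_LEN = 2

# Frozen "LM" weights, stored as a tuple so they cannot be modified in place.
FROZEN_WEIGHTS = (0.5, -0.25, 0.75, 0.1)

# A hypothetical input: three token embeddings of size EMBED_DIM.
INPUTS = [
    [0.2, 0.1, -0.3, 0.4],
    [0.0, 0.5, 0.1, -0.2],
    [0.3, -0.1, 0.2, 0.1],
]


def mean_vector(embeddings):
    """Average a list of embedding vectors dimension by dimension."""
    n = len(embeddings)
    return [sum(vec[d] for vec in embeddings) / n for d in range(EMBED_DIM)]


def frozen_lm(weights, embeddings):
    """Stand-in for a frozen LM: mean-pool the sequence, then a fixed linear readout."""
    pooled = mean_vector(embeddings)
    return sum(w * p for w, p in zip(weights, pooled))


def generate_prompt(gen_params, input_embeddings):
    """Build PROMPT_LEN prompt vectors that depend on the input.

    Prompt vector k is scale_k * mean(input) + bias_k, with
    (scale_k, bias_k) = gen_params[2k], gen_params[2k + 1].
    """
    mean = mean_vector(input_embeddings)
    prompt = []
    for k in range(PROMPT_LEN):
        scale, bias = gen_params[2 * k], gen_params[2 * k + 1]
        prompt.append([scale * m + bias for m in mean])
    return prompt


def loss(gen_params, weights, input_embeddings, target):
    """Squared error of the frozen LM's output on [prompt; input]."""
    prompt = generate_prompt(gen_params, input_embeddings)
    prediction = frozen_lm(weights, prompt + input_embeddings)
    return (prediction - target) ** 2


def train_step(gen_params, weights, input_embeddings, target, lr=1.0, eps=1e-5):
    """One gradient-descent step on the generator parameters only.

    Gradients come from central finite differences (a simplification);
    the frozen weights are read but never updated.
    """
    grads = []
    for i in range(len(gen_params)):
        up = list(gen_params)
        down = list(gen_params)
        up[i] += eps
        down[i] -= eps
        diff = loss(up, weights, input_embeddings, target) - loss(
            down, weights, input_embeddings, target
        )
        grads.append(diff / (2 * eps))
    return [p - lr * g for p, g in zip(gen_params, grads)]


if __name__ == "__main__":
    params = [0.0] * (2 * PROMPT_LEN)
    print("loss before:", loss(params, FROZEN_WEIGHTS, INPUTS, 1.0))
    for _ in range(20):
        params = train_step(params, FROZEN_WEIGHTS, INPUTS, 1.0)
    print("loss after: ", loss(params, FROZEN_WEIGHTS, INPUTS, 1.0))
```

The point of the sketch is the division of labor: the only trainable state is the handful of generator parameters, so the same frozen model could in principle be shared across tasks, each with its own small generator.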