AI21 Labs | Standing on the Shoulders of Giant Frozen Language Models
2022-04-23 13:26:00 [智源社区 (BAAI Community)]
Authors: Yoav Levine, Itay Dalmedigos, Ori Ram, et al.
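Summary: Huge pretrained language models (LMs) have demonstrated surprisingly good zero-shot capabilities on a wide variety of tasks. This gives rise to the appealing vision of a single, versatile model with a wide range of functionality across different applications. However, the current leading techniques for leveraging a "frozen" LM (that is, leaving its weights untouched) still often underperform fine-tuning approaches, which modify those weights in a task-dependent way. Fine-tuning, in turn, suffers from forgetting and compromises versatility, which suggests a tradeoff between performance and versatility. The paper's main message is that current frozen-model techniques such as prompt tuning are only the tip of the iceberg, and that more powerful methods for leveraging frozen LMs can match fine-tuning in challenging domains without sacrificing the underlying model's versatility. To demonstrate this, the authors introduce three new methods for leveraging frozen models: input-dependent prompt tuning, frozen readers, and recursive LMs. Each of them substantially improves on current frozen-model approaches, and some even outperform fine-tuning in domains that fine-tuning currently dominates. Each method costs more to compute than existing frozen-model approaches, but the overhead is still negligible relative to a single pass through a huge frozen LM. Each of these methods is a meaningful contribution in its own right. See the paper for details.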




Paper download: https://arxiv.org/pdf/2204.10019
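Copyright notice
This article was created by [智源社区 (BAAI Community)]. Please include the original link when reposting. Thank you.
https://hub.baai.ac.cn/views/16619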