AI21 Labs | Standing on the Shoulders of Giant Frozen Language Models
2022-04-23 13:26:00 [BAAI Community (智源社区)]
Authors: Yoav Levine, Itay Dalmedigos, Ori Ram, et al.
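Summary: Huge pretrained language models (LMs) have demonstrated surprisingly good zero-shot capabilities on a wide variety of tasks. This gives rise to the appealing vision of a single, versatile model with a wide range of functionalities across disparate applications. However, the current leading techniques for leveraging a "frozen" LM (that is, leaving its weights untouched) still often underperform fine-tuning approaches, which modify those weights in a task-dependent way. Fine-tuning approaches, in turn, suffer from forgetfulness and compromise versatility, which suggests a tradeoff between performance and versatility. The main message of the paper is that current frozen-model techniques such as prompt tuning are only the tip of the iceberg, and that more powerful methods for leveraging frozen LMs can do just as well as fine-tuning in challenging domains without sacrificing the underlying model's versatility. To demonstrate this, the authors introduce three novel methods for leveraging frozen models: input-dependent prompt tuning, frozen readers, and recursive LMs, each of which vastly improves on current frozen-model approaches. Indeed, some of the methods even outperform fine-tuning approaches in domains currently dominated by fine-tuning. The computational cost of each method is higher than that of existing frozen-model methods, but it is still negligible relative to a single pass through a huge frozen LM. Each of these methods constitutes a meaningful contribution in its own right. See the paper for details.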
Paper download: https://arxiv.org/pdf/2204.10019
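As a rough illustration of what leaving a model "frozen" means, the toy sketch below trains only a small input-dependent prompt generator while the model's own weights stay untouched. It is not the paper's implementation: `FROZEN_WEIGHTS`, `frozen_lm`, and `train_prompt_generator` are hypothetical names, the "LM" is just a two-weight linear function, and the prompt generator is assumed to be a linear map of the input purely to keep the example small.

```python
# Toy stand-in for a frozen LM: a fixed linear function of (prompt, input).
# These weights are never updated anywhere in this sketch.
FROZEN_WEIGHTS = {"w_prompt": 2.0, "w_input": 0.5}


def frozen_lm(prompt, x):
    """Output of the frozen model for a scalar prompt and a scalar input."""
    return FROZEN_WEIGHTS["w_prompt"] * prompt + FROZEN_WEIGHTS["w_input"] * x


def train_prompt_generator(data, steps=2000, lr=0.01):
    """Fit prompt(x) = a * x + b by gradient descent on mean squared error.

    Only a and b (the hypothetical prompt generator) are trained; the
    gradient flows through the frozen model but never changes its weights.
    """
    a, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for x, y in data:
            prompt = a * x + b                 # input-dependent prompt
            err = frozen_lm(prompt, x) - y     # prediction error
            # d(err^2)/d(prompt) = 2 * err * w_prompt
            g = 2.0 * err * FROZEN_WEIGHTS["w_prompt"]
            grad_a += g * x
            grad_b += g
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


if __name__ == "__main__":
    # Made-up task: the desired output is 3x + 1.
    data = [(x, 3.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
    a, b = train_prompt_generator(data)
    print("prompt generator:", a, b)
    print("frozen weights unchanged:", FROZEN_WEIGHTS)
```

In this toy setting the task is solved entirely by the learned prompt, so the same frozen function could be reused for another task with a different prompt generator, which is the versatility argument in miniature.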
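Copyright notice
This article was created by [BAAI Community (智源社区)]. Please include the original link when reposting. Thank you.
https://hub.baai.ac.cn/views/16619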