AI21 Labs | Standing on the Shoulders of Giant Frozen Language Models
2022-04-23 13:30:00 【Zhiyuan community】
Authors: Yoav Levine, Itay Dalmedigos, Ori Ram, et al.
Summary: Huge pretrained language models (LMs) have demonstrated surprisingly good zero-shot capabilities on a wide variety of tasks. This gives rise to the appealing vision of a single, versatile model that serves a wide range of functions across different applications. However, the current leading techniques for using a "frozen" LM (that is, one whose weights are left unchanged) still often underperform fine-tuning approaches, which modify those weights in a task-dependent way. Fine-tuned models, in turn, suffer from forgetting and lose versatility, which suggests a trade-off between performance and versatility. The main message of the paper is that current frozen-model techniques such as prompt tuning are only the tip of the iceberg: more powerful ways of using frozen LMs can match fine-tuning in challenging domains without sacrificing the versatility of the underlying model. To demonstrate this, the authors introduce three new methods for using frozen models: input-dependent prompt tuning, frozen readers, and recursive LMs. Each greatly improves on current frozen-model approaches, and some even outperform fine-tuning in domains that fine-tuning currently dominates. The computational cost of each method is higher than that of existing frozen-model methods, but still negligible compared with a single pass through a huge frozen LM. Each method is a meaningful contribution in its own right. See the paper for details.
Paper download: https://arxiv.org/pdf/2204.10019
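To make the frozen-model idea in the summary concrete, here is a minimal toy sketch of plain prompt tuning, the baseline the paper builds on. It is not the authors' code and does not implement their three methods. The "LM" is a hypothetical fixed linear scorer (`FROZEN_W`, `frozen_lm`, and `train_prompt` are illustrative names, not from the paper); the only trained quantity is a soft-prompt vector that is fed to the model alongside the input.

```python
import random

random.seed(0)
DIM = 4

# Stand-in for pretrained weights (an assumption for illustration).
# These values are never updated anywhere below: the model is "frozen".
FROZEN_W = [0.5, -1.0, 0.25, 2.0]


def frozen_lm(vectors):
    """Mean-pool the input vectors and score them with the frozen weights."""
    n = len(vectors)
    pooled = [sum(v[i] for v in vectors) / n for i in range(DIM)]
    return sum(w * p for w, p in zip(FROZEN_W, pooled))


def train_prompt(data, steps=200, lr=0.05):
    """Learn one soft-prompt vector by gradient descent on squared error."""
    prompt = [0.0] * DIM
    for _ in range(steps):
        for x, y in data:
            err = frozen_lm([prompt, x]) - y
            # Gradient of err**2 w.r.t. prompt[i] is 2 * err * FROZEN_W[i] / 2,
            # because the prompt is one of two mean-pooled vectors.
            prompt = [p - lr * err * w for p, w in zip(prompt, FROZEN_W)]
    return prompt


def mse(data, prompt):
    """Mean squared error of the frozen model when given this prompt."""
    return sum((frozen_lm([prompt, x]) - y) ** 2 for x, y in data) / len(data)


# Toy task: the target is the frozen model's own score shifted by a constant,
# which a single prompt vector can express.
zero_prompt = [0.0] * DIM
inputs = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(8)]
data = [(x, frozen_lm([zero_prompt, x]) + 3.0) for x in inputs]

weights_before = list(FROZEN_W)
learned_prompt = train_prompt(data)

print("error without prompt:", mse(data, zero_prompt))
print("error with learned prompt:", mse(data, learned_prompt))
print("weights unchanged:", FROZEN_W == weights_before)
```

In this sketch a single prompt is shared by all inputs. The input-dependent variant described in the summary would instead produce the prompt from each input; how that is done is left to the paper.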