AI21 Labs | Standing on the Shoulders of Giant Frozen Language Models
2022-04-23 13:30:00 【Zhiyuan community】
Authors: Yoav Levine, Itay Dalmedigos, Ori Ram, et al.
Summary: Huge pretrained language models (LMs) have demonstrated surprisingly good zero-shot capabilities on a wide variety of tasks. This gives rise to the appealing vision of a single, versatile model with a wide range of functionalities across different applications. However, the current leading techniques for using a "frozen" LM (that is, leaving its weights unchanged) still often underperform fine-tuning approaches, which modify those weights in a task-dependent way. Fine-tuned models, in turn, suffer from forgetting and lose versatility, which suggests a trade-off between performance and versatility. The main message of the paper is that current frozen-model techniques, such as prompt tuning, are only the tip of the iceberg: more powerful ways of using frozen LMs can match fine-tuning in challenging domains without sacrificing the versatility of the underlying model. To demonstrate this, the authors introduce three new methods for using frozen models: input-dependent prompt tuning, frozen readers, and recursive LMs. Each of them substantially improves on current frozen-model approaches, and some even outperform fine-tuning in domains that fine-tuning currently dominates. The computational cost of each method is higher than that of existing frozen-model methods, but still negligible compared with a single pass through a huge frozen LM. Each of these methods is a meaningful contribution in its own right. Please refer to the paper for details.
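
To make the idea of input-dependent prompt tuning more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the scalar "frozen model", the linear prompt generator, the constants, and all function names are hypothetical stand-ins chosen for illustration. The only point it shows is that the model's weights stay fixed while a small generator, which produces a prompt from each input, is the only part that is trained.

```python
# Toy illustration only (hypothetical names and values, not the paper's code).
# A scalar "frozen model" combines a prompt and an input with fixed weights.
FROZEN_W_PROMPT = 2.0  # frozen weight on the prompt; never updated
FROZEN_W_INPUT = 0.5   # frozen weight on the input; never updated


def frozen_model(prompt, x):
    """Stand-in for a frozen LM: its weights are constants."""
    return FROZEN_W_PROMPT * prompt + FROZEN_W_INPUT * x


def prompt_generator(params, x):
    """Small trainable module: maps the input x to a prompt value."""
    a, b = params
    return a * x + b


def mean_squared_error(params, data):
    """Average squared error of the frozen model fed with generated prompts."""
    total = 0.0
    for x, y in data:
        pred = frozen_model(prompt_generator(params, x), x)
        total += (pred - y) ** 2
    return total / len(data)


def train_prompt_generator(data, steps=500, lr=0.05):
    """Gradient descent on the generator parameters (a, b) only."""
    a, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for x, y in data:
            err = frozen_model(prompt_generator((a, b), x), x) - y
            # Chain rule through the frozen model: d(pred)/d(prompt) = FROZEN_W_PROMPT
            grad_a += 2.0 * err * FROZEN_W_PROMPT * x / n
            grad_b += 2.0 * err * FROZEN_W_PROMPT / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Hypothetical "task": targets follow y = 3x + 1.
DATA = [(x, 3.0 * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]

if __name__ == "__main__":
    before = mean_squared_error((0.0, 0.0), DATA)
    params = train_prompt_generator(DATA)
    after = mean_squared_error(params, DATA)
    print("loss before:", before, "loss after:", after)
    print("learned generator parameters:", params)
```

In this sketch the task is learned entirely through the generated prompt; the two frozen constants are read but never written, which mirrors the property that the underlying model stays reusable for other tasks.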




Paper download: https://arxiv.org/pdf/2204.10019
Copyright notice
This article was created by [Zhiyuan community]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204231326414398.html