Study notes on deep learning (6)
2022-04-21 12:00:00 【Bai Yanling】
This post is mainly about self-supervised learning.
Introduction
Here are some examples of self-supervised learning models and their sizes:

ELMo: 94M parameters
BERT: 340M parameters
GPT-2: 1542M parameters
Megatron: 8B parameters
GPT-3: 175B parameters
Switch Transformer: 1.6T parameters
…
Definition of self-supervised learning

BERT
BERT takes a sequence of vectors as input and outputs another sequence of vectors of the same length.


BERT has to learn that the masked position belongs to the same class as the true character, "湾" ("bay").
BERT and the linear layer on top of it are trained together.
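The masking objective above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not BERT's actual implementation: the names `mask_tokens`, `softmax`, and `cross_entropy` are hypothetical, the default 15% masking rate is taken as a parameter, and the logits stand in for whatever the linear layer would output at a masked position.

```python
import math
import random

MASK = "[MASK]"  # special token that replaces a hidden input token


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Hide each token with probability mask_prob.

    Returns (masked_tokens, labels). labels[i] is the original token at a
    masked position and None elsewhere, so the loss is only computed where
    the model had to guess.
    """
    rng = rng or random.Random(0)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


def softmax(logits):
    """Turn the linear layer's scores into a distribution over the vocabulary."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Classification loss at one masked position."""
    return -math.log(softmax(logits)[target_index])
```

Because the labels come from the input text itself, no human annotation is needed; minimizing `cross_entropy` over the masked positions is what updates both the encoder and the linear layer.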

However, Next Sentence Prediction does not seem to help much.
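For reference, the input to Next Sentence Prediction is a pair of sentences packed into one sequence, and the model classifies from it whether the second sentence follows the first. A minimal sketch of that packing is below; `build_nsp_input` is a hypothetical helper, and whitespace splitting is an assumption standing in for BERT's real subword tokenizer.

```python
def build_nsp_input(first, second):
    """Pack two sentences as: [CLS] first [SEP] second [SEP].

    Returns (tokens, segment_ids). Segment id 0 marks [CLS], the first
    sentence, and its [SEP]; segment id 1 marks the second sentence and
    the final [SEP].
    """
    a = first.split()   # assumption: whitespace tokens instead of subwords
    b = second.split()
    tokens = ["[CLS]"] + a + ["[SEP]"] + b + ["[SEP]"]
    segment_ids = [0] * (len(a) + 2) + [1] * (len(b) + 1)
    return tokens, segment_ids
```

The yes/no prediction is read from the output vector at the `[CLS]` position.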

BERT can also be used for many downstream tasks, which are covered later.
GLUE

To get the most out of NLU (natural language understanding) tasks, institutions including New York University and the University of Washington created a multi-task benchmark and analysis platform for natural language understanding: GLUE (General Language Understanding Evaluation). Its nine tasks cover natural language inference, textual entailment, sentiment analysis, semantic similarity, and more. Well-known models such as BERT, XLNet, RoBERTa, ERNIE, and T5 are all evaluated on this benchmark.

Case
Pre-training means producing a BERT that has learned to fill in the blanks.

Taken as a whole, BERT is semi-supervised: the downstream tasks need labeled data, while the self-supervised pre-training stage uses unlabeled data.


Case 2 differs from Case 1 in that a set of parameters has already been initialized during pre-training.
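One way to picture "already initialized" parameters: when fine-tuning, the encoder starts from the pre-trained weights, while the small task-specific layer on top starts from random values. The sketch below only illustrates that split. `init_finetune_params` and the parameter names are hypothetical, weights are plain Python lists, and the 0.02 standard deviation is an assumed value.

```python
import random


def init_finetune_params(pretrained_encoder, hidden_size, num_classes, seed=0):
    """Build the starting parameters for a downstream classifier.

    pretrained_encoder: dict mapping parameter names to lists of floats,
    as produced by pre-training. These are copied unchanged.
    The classification head has no pre-trained values, so it is drawn
    from a small random Gaussian (std 0.02 is an assumption).
    """
    rng = random.Random(seed)
    head = [
        [rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
        for _ in range(num_classes)
    ]
    return {
        "encoder": {name: list(w) for name, w in pretrained_encoder.items()},
        "head": head,
        "head_bias": [0.0] * num_classes,
    }
```

Training from scratch would instead draw the `encoder` entries at random as well, which is the comparison usually made against fine-tuning.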






Other related


Why does BERT work
















GPT series




Other
Self-supervised learning can be applied not only to text but also to images and speech.





self-supervised

Copyright notice
This article was written by [Bai Yanling]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204211151347863.html