HuggingFace
2022-04-23 10:48:00 【qq1033930618】
1. Official website
huggingface.co
2. Model download
Install the transformers package in your conda environment:
conda install -n <conda environment name> transformers
The model is downloaded automatically. The string in quotation marks is the model name.
from transformers import BertTokenizer, BertModel
model = BertModel.from_pretrained('bert-base-chinese', output_hidden_states = True,)
tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
Automatic download location
/home/<user name>/.cache/huggingface/transformers
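The location above is a default and can be moved with environment variables. The following is a minimal sketch of that lookup, assuming the order used by transformers releases from around the time of writing (TRANSFORMERS_CACHE first, then HF_HOME, then XDG_CACHE_HOME). The function name default_transformers_cache is hypothetical and is not part of the library; newer releases may use a different directory layout.

```python
import os


def default_transformers_cache(env=None):
    """Return the folder where models would be cached (illustrative only).

    This re-implements the lookup as an assumption about older
    transformers releases; it is not the library's own function.
    """
    env = os.environ if env is None else env
    # An explicit TRANSFORMERS_CACHE takes priority over everything else.
    if env.get("TRANSFORMERS_CACHE"):
        return env["TRANSFORMERS_CACHE"]
    # Otherwise the cache lives under HF_HOME, which itself defaults to
    # $XDG_CACHE_HOME/huggingface, i.e. ~/.cache/huggingface.
    xdg_cache = env.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    hf_home = env.get("HF_HOME") or os.path.join(xdg_cache, "huggingface")
    return os.path.join(hf_home, "transformers")


# With no overrides this prints the default location for the current user.
print(default_transformers_cache({}))
```

Setting TRANSFORMERS_CACHE before starting Python is a convenient way to keep large model files on a different disk.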
Manual download
Search for the model name in the search box at the top of the page.
Open the Files and versions tab, to the right of Model card, and download the files.
Pass the local path where the model is saved:
model = BertModel.from_pretrained('./model', output_hidden_states = True,)
tokenizer = BertTokenizer.from_pretrained('./model/vocab.txt')
Note: BertModel.from_pretrained takes the path of the folder,
while BertTokenizer.from_pretrained takes the path to vocab.txt, not tokenizer.json.
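Before pointing from_pretrained at a manually downloaded folder, it can help to check that the expected files are present. This is a minimal sketch using only the standard library. The file names are assumptions based on a typical BERT checkpoint (config.json, vocab.txt, and one weight file), and missing_model_files is a hypothetical helper, not a transformers function.

```python
from pathlib import Path

# Assumed file names for a manually downloaded BERT checkpoint.
REQUIRED_FILES = ("config.json", "vocab.txt")
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")


def missing_model_files(model_dir):
    """List the files that appear to be missing from a local model folder."""
    folder = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (folder / name).is_file()]
    # One weight file is enough; either format is accepted in this sketch.
    if not any((folder / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing


# An empty list means the folder looks complete.
print(missing_model_files("./model"))
```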
Speed up the download
model = BertModel.from_pretrained('bert-base-chinese', mirror='tuna')
3. Pipeline
Use the model directly
from transformers import pipeline
classifier = pipeline("sentiment-analysis")  # sentiment analysis model
classifier("We are very happy to show you the Transformers library.")
# Returns a list containing one dictionary with the keys "label" and "score".
# Several inputs can be passed as a list.
results = classifier(["We are very happy to show you the Transformers library.", "We hope you don't hate it."])
# Returns a list with one dictionary per input.
for result in results:
    print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
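Because the pipeline returns plain dictionaries, post-processing needs no library calls. The sketch below flags low-confidence predictions. The sample values are made up to match the shape described above (they are not real model output), and the summarize helper and its 0.9 threshold are assumptions chosen for illustration.

```python
def summarize(results, threshold=0.9):
    """Format pipeline-style results and flag scores below the threshold."""
    lines = []
    for result in results:
        confidence = "confident" if result["score"] >= threshold else "uncertain"
        lines.append(f"{result['label']} ({result['score']:.4f}, {confidence})")
    return lines


# Made-up values in the same shape the classifier returns.
sample_results = [
    {"label": "POSITIVE", "score": 0.9998},
    {"label": "NEGATIVE", "score": 0.5309},
]

for line in summarize(sample_results):
    print(line)
```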
Load from a dataset
pip install datasets
# Specify both the task and the model (speech recognition here). If only the task is given, the pipeline falls back to a default model for that task.
speech_recognizer = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)  # device=0 runs on the first GPU
# `dataset` is assumed to be an audio dataset with a "file" column, already loaded with the datasets library.
files = dataset["file"]
speech_recognizer(files[:4])
4. Tokenizer
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
# Load the model and its tokenizer
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
# Run the classifier
classifier("Nous sommes très heureux de vous présenter la bibliothèque Transformers.")
5. AutoClass
An AutoClass automatically retrieves the architecture of a pretrained model from its name or path. AutoTokenizer is one such class.
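The idea of picking the architecture from the checkpoint can be pictured as a lookup keyed on the "model_type" field of the model's config.json. The sketch below is a toy illustration of that idea and not the library's actual implementation. TOKENIZER_REGISTRY and resolve_tokenizer_name are hypothetical names, and the registry holds class names as strings only.

```python
import json

# Hypothetical registry: model_type -> tokenizer class name (strings only).
TOKENIZER_REGISTRY = {
    "bert": "BertTokenizer",
    "gpt2": "GPT2Tokenizer",
}


def resolve_tokenizer_name(config_json):
    """Pick a tokenizer class name from the "model_type" field of a config."""
    model_type = json.loads(config_json)["model_type"]
    try:
        return TOKENIZER_REGISTRY[model_type]
    except KeyError:
        raise ValueError(
            f"No tokenizer registered for model type {model_type!r}"
        ) from None


# A config that declares a BERT model resolves to the BERT tokenizer.
print(resolve_tokenizer_name('{"model_type": "bert"}'))
```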
6. AutoTokenizer
The tokenizer splits text into tokens and converts them into a form the model can understand.
from transformers import AutoTokenizer
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoding = tokenizer("We are very happy to show you the Transformers library.")
print(encoding)
{
'input_ids': [101, 11312, 10320, 12495, 19308, 10114, 11391, 10855, 10103, 100, 58263, 13299, 119, 102],
'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
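The three lists in this output always have the same length, and attention_mask marks which positions hold real tokens. The sketch below shows by hand what padding to a fixed length would do to such an encoding. It assumes a pad token id of 0, as in BERT vocabularies, and pad_encoding is a hypothetical helper written for illustration; in practice the tokenizer handles padding itself.

```python
def pad_encoding(encoding, max_length, pad_token_id=0):
    """Pad an encoding dictionary on the right up to max_length."""
    n_pad = max_length - len(encoding["input_ids"])
    if n_pad < 0:
        raise ValueError("encoding is longer than max_length")
    return {
        # Padding positions get the pad token id (assumed to be 0 here).
        "input_ids": encoding["input_ids"] + [pad_token_id] * n_pad,
        "token_type_ids": encoding["token_type_ids"] + [0] * n_pad,
        # 1 marks a real token, 0 marks padding the model should ignore.
        "attention_mask": encoding["attention_mask"] + [0] * n_pad,
    }


# The encoding printed above, copied by hand.
encoding = {
    "input_ids": [101, 11312, 10320, 12495, 19308, 10114, 11391, 10855, 10103, 100, 58263, 13299, 119, 102],
    "token_type_ids": [0] * 14,
    "attention_mask": [1] * 14,
}

padded = pad_encoding(encoding, max_length=16)
print(padded["attention_mask"])
```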
Copyright notice
This article was written by [qq1033930618]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230618497078.html