当前位置:网站首页>Deploying sbert model based on torchserve < semantic similarity task >
Deploying sbert model based on torchserve < semantic similarity task >
2022-04-23 02:29:00 【Weiyaner】
List of articles
This is a question about how to use TorchServe Deploy pre trained HuggingFace Sentence transformers Model guide .
Mission : semantic similarity ( Return to )
Algorithm model :Sentence BERT
Pre training model :TinyBERT_L6_ch
frame :torch 1.11
System environment :Ubuntu 20.04 ( Cloud server CPU)
1 Get the model
The channel of the model comes from 2 aspect
- Get the pre training model directly
stay huggingface transformers Official website , You can download the relevant pre training model . - Fine tuned model on task dataset
After fine tuning the local dataset , Save better models , Get the model and configuration file .
from sentence_transformers import SentenceTransformer
smodel = SentenceTransformer(model_dir_path)
smodel.save('model_name')
obtain :

Mainly use the three files inside , Namely pytorch_model.bin,config.json,vocab.txt.
2 install torchserve
TorchServe from Java Realization , Therefore, the latest version of OpenJDK To run the .
apt install openjdk-11-jdk
If win Environmental Science , Refer to here for manual installation :https://blog.csdn.net/qq_41873673/article/details/108027074
Next , In order to ensure a clean environment , by TorchServe Create a new Conda Environment and activate .
conda create -n torchserve
source activate torchserve
Next install TorchServe The dependencies of .
conda install sentencepiece torch-model-archiver psutil pytorch torchserve torchvision torchtext
If you want to use GPU example , Additional software packages are required .
conda install cudatoolkit=10.1
Now the dependencies have been installed , Can be cloned TorchServe The repository .
git clone https://github.com/pytorch/serve.git
cd serve
Environment setting is complete , Next, continue to model encapsulation .
3 Encapsulate models and interfaces
Before encapsulating the model and interface, you need to prepare the model and interface files
3.1 Prepare the model
stay serve Create a new folder in the directory , Used to save the files related to the above three models .
mkdir Transformer_model/TinyBERT_L6_ch
mv pytorch_model.bin vocab.txt config.json Transformer_model/TinyBERT_L6_ch/
3.2 Prepare interface documents
Handler It can be TorchServe The name of the built-in processor or py Path to file , To handle customized TorchServe Reasoning logic . It is mainly divided into three parts , Namely preprocess,inference and postprocess, Represents the processing of data , Model reasoning and post-processing .
TorchServe The library supports the following handlers :image_classifier, object_detector, text_classifier, image_segmenter.
There is no supporting interface for semantic similarity tasks , Need to write based on personal data .
handler.py
import json
import zipfile
from json import JSONEncoder
import numpy as np
import os
class NumpyArrayEncoder(JSONEncoder):
def default(self, obj):
if isinstance(obj, np.ndarray):
return obj.tolist()
return JSONEncoder.default(self, obj)
class SentenceTransformerHandler(object):
def __init__(self):
super(SentenceTransformerHandler, self).__init__()
self.initialized = False
self.embedder = None
def initialize(self, context):
properties = context.system_properties
model_dir = properties.get("model_dir")
self.embedder = SentenceTransformer(model_dir)
self.initialized = True
def preprocess(self, data):
##
inputs = data[0].get("data")
if inputs is None:
inputs = data[0].get("body")
inputs = inputs.decode('utf-8')
inputs = json.loads(inputs)
sentences= inputs['queries']
return sentences
def inference(self, data):
query_embeddings = self.embedder.encode(data)
return query_embeddings
def postprocess(self, data):
return [json.dumps(data,cls=NumpyArrayEncoder)]
And then through run_handler.py call , As an interface
run_handler.py
from handler import SentenceTransformerHandler
_service = SentenceTransformerHandler()
def handle(data, context):
""" Entry point for SentenceTransformerHandler handler """
try:
if not _service.initialized:
print('ENTERING INITIALIZATION')
if data is None:
return None
data = _service.preprocess(data)
data = _service.inference(data)
data = _service.postprocess(data)
return data
except Exception as e:
raise Exception("Unable to process input data. " + str(e))
3.3 encapsulation
stay TinyBERT_L4_ch Under the folder , Execute the encapsulation command
Three model files
1. Model parameter file (torch) pytorch_model.bin
2. Model configuration file config.json
3. Thesaurus vocab.txt chinese 21128, english 30522
Two interface files :
1. handler.py Define interface functions
2. run_handler.py Call the interface class defined above , As handler Pass in
Encapsulate models and interfaces
torch-model-archiver --model-name sbert_tiny4ch \
--version 1.0 \
--serialized-file pytorch_model.bin \
--export-path /root/weiyan/serve/model_store \
--handler run_handler.py \
--extra-files "handler.py,config.json,vocab.txt" \
--runtime python3 -f
Encapsulate the command
torch-model-archiver --model-name sbert_tiny4ch --version 1.0 --serialized-file pytorch_model.bin --export-path /root/weiyan/serve/model_store --handler run_handler.py --extra-files "handler.py,config.json,vocab.txt" --runtime python3 -f
stay model-store Folder gets a sbert_tiny4ch.mar file .
4 Deployment model
4.1 start-up torchserve
stay serve Under the table of contents , Create a new folder , Store the configuration file config.properties, Record about the model path and file name .
mkdir Configs
vi sbert_tiny4_config.properties
sbert_tiny4_config.properties
model_store=/root/weiyan/serve/model_store
load_models=sbert_tiny4ch.mar
start-up torchserve
torchserve --ncs --start --ts-config sbert_tiny4_config.properties
Then print out a series of model information

If there is no error in the middle , Then the model deployment is successful .
4.2 Model reasoning
Use the relevant commands to view the model deployment
1.ping Connection status
curl http://localhost:8080/ping
{
"status": "Healthy"
}
2. Model loading
curl http://localhost:8081/models
{
"models": [
{
"modelName": "sbert_tiny4ch",
"modelUrl": "sbert_tiny4ch.mar"
}
]
}
4.3 Semantic similarity reasoning
because SBERT about querys the embeddings expression , So the output of the model is also embeddings, The similarity between the two can be calculated in the above handler.py In file , You can also get the reasoning result, that is embeddings after , Operate again .
Calculate the similarity after selecting here
sbert_tiny4ch.py
import requests
import json
from scipy import spatial
sentences = [' I go skiing '," I'm going skiing "]
data = {
'data':json.dumps({
'queries':sentences})}
print(data)
response = requests.post('http://localhost:8080/predictions/sbert_tiny4ch',data = data)
print(response)
if response.status_code==200:
vectors = response.json()
sim = 1 - (spatial.distance.cosine(vectors[0], vectors[1]))
print(' The similarity is :',sim)
Execute the script
python sbert_tiny4ch.py
Return the similarity result
{
'data': '{"queries": ["\\u4eca\\u5929\\u53bb\\u6ed1\\u96ea", "\\u6211\\u8981\\u53bb\\u6ed1\\u51b0"]}'}
<Response [200]>
The similarity is : 0.9452057554006862
Relevant error reports and solutions
Query results 404
Report errors
{
"code": 404,
"type": "ResourceNotFoundException",
"message": "Requested resource is not found, please refer to API document."
}
reason :
Wrong port used , There are three ports 8080,8081,8082, It's really not good. Try it all .
commonly ,:8080/predictions.
see models Use 8081.
ping Use 8080
Query results 503
prediction error , Generally, there is an error when encapsulating the model and interface , My experience is handler.py Among them process Wrong function , Cause problems in processing data , Naturally, there is no way to return the result of reasoning .
Generally, the models are tested locally before they are deployed online , So make sure there are no errors in the model file itself .
see logs
Running torchserve Under the directory of , Will generate a logs Folder , It automatically saves some logs, among model.logs Save the operation information of the model , Including the above errors Believe in information , You can go back to the log to see , What exactly is wrong with the model .
/tmp/models
Save the latest model file in the folder , That is to say

版权声明
本文为[Weiyaner]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/04/202204230228265193.html
边栏推荐
- Go language ⌈ mutex and state coordination ⌋
- 都是做全屋智能的,Aqara和HomeKit到底有什么不同?
- RT_ Thread ask and answer
- Network jitter tool clumsy
- [nk]牛客月赛48 D
- C语言中*与&的用法与区别 以及关键字static和volatile 的含义
- Fast and robust multi person 3D pose estimation from multiple views
- 【无标题】
- 下载正版Origin Pro 2022 教程 及 如何 激 活
- 从开源爱好者到 Apache 董事,一共分几步?
猜你喜欢
Rich intelligent auxiliary functions and exposure of Sihao X6 security configuration: it will be pre sold on April 23

89 logistic回歸用戶畫像用戶響應度預測

005_ redis_ Set set

010_StringRedisTemplate

一、序列模型-sequence model

013_基于Session实现短信验证码登录流程分析

小程序 canvas 画布半圆环

So library dependency

【无标题】

SQL server2019 cannot download the required files, which may indicate that the version of the installer is no longer supported. What should I do
随机推荐
JSP page nesting
Campus transfer second-hand market source code
Explain JS prototype and prototype chain in detail
Execute external SQL script in MySQL workbench and report error
PTA: 点赞狂魔
Consider defining a bean of type 'com netflix. discovery. AbstractDiscoveryClientOptionalArgs‘
一个国产图像分割项目重磅开源!
002_ Redis_ Common operation commands of string type
Rich intelligent auxiliary functions and exposure of Sihao X6 security configuration: it will be pre sold on April 23
010_StringRedisTemplate
How to recognize products from the perspective of Dialectics
A domestic image segmentation project is heavy and open source!
解决 注册谷歌邮箱 gmail 手机号无法用于验证
On LAN
005_ redis_ Set set
Fast and robust multi person 3D pose estimation from multiple views
Consider defining a bean of type ‘com.netflix.discovery.AbstractDiscoveryClientOptionalArgs‘
If 404 page is like this | daily anecdotes
RT_Thread自问自答
Go language ⌈ mutex and state coordination ⌋