Machine learning model too slow? Speed it up with Intel(R) Extension for Scikit-learn
2022-08-08 10:03:00 【ShowMeAI】

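Author: 韩信子@ShowMeAI
Machine Learning in Practice series: http://www.showmeai.tech/tutorials/41
Article URL: http://www.showmeai.tech/article-detail/295
Notice: All rights reserved. To repost, please contact the platform and the author and credit the source.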
When we apply machine learning models, we care not only about the final results but also about performance. That performance depends not only on how we build the model (which features we use, how complex the model is) but also on the hardware it runs on.
In this article, ShowMeAI introduces Intel's acceleration extension for the Scikit-Learn machine learning library, which can substantially speed up both model training and prediction.
In the case study in this article, the run that used the Intel extension took only about 1/5 of the time of the original approach, while completing the same task with the same results.

Scikit-Learn (SKlearn), a machine learning library
Scikit-Learn (Sklearn) is one of the most useful and powerful machine learning libraries for Python. Through a Python interface it provides a range of effective tools for machine learning and statistical modeling, including classification, regression, clustering, and dimensionality reduction.

For a quick start with SKLearn, we also recommend the [ShowMeAI](http://www.showmeai.tech/) articles and cheat sheet listed at the end of this article.
Intel Extension for Scikit-learn
Scikit-Learn is a large library, but its performance is not always optimal. Some ML algorithms can take hours to run, which is a high cost in time.
Intel Extension for Scikit-learn builds on the modeling workflow you already know: by adding a few lines of code you can significantly improve performance. It is also open source.

scikit-learn-intelex speedups
Algorithms optimized by scikit-learn-intelex can run 1 to 3 orders of magnitude faster. The actual gain depends on the dataset and the algorithm used.
- Intel Extension for Scikit-Learn provides optimized implementations of many Scikit-Learn algorithms. These implementations are consistent with the originals and produce the same end results.
- Even if you use an algorithm or parameter that the extension does not currently support, the package automatically falls back to the original Scikit-Learn, so everything stays seamless: most code keeps its original form and does not need to be rewritten.

Intel Extension for Scikit-learn can accelerate the Scikit-Learn algorithms it supports. For the full list, see the official website.
Installation & configuration
Intel Extension for Scikit-Learn supports Linux, Windows, and Mac on the x86 architecture. It can be downloaded from PyPI or Anaconda Cloud:
Install from PyPI
Just run the pip command on the command line:
pip install scikit-learn-intelex
Install from Anaconda
◉ From Conda-Forge:
conda install scikit-learn-intelex -c conda-forge
◉ From the Intel channel:
conda install scikit-learn-intelex -c intel
◉ From the default channel:
conda install scikit-learn-intelex
Install from a container
Note that pulling the image requires proper access to a DockerHub account.
You can pull the latest Docker container with Intel Extension for Scikit-Learn installed using the following command:
docker pull intel/intel-optimized-ml:scikit-learn
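Whichever installation route is used, it can help to confirm that the package is visible to the Python interpreter you intend to run. The following is a minimal sketch using only the standard library. The helper name `installed_version` is hypothetical (introduced here for illustration), and it assumes the installed distribution is registered under the same name used in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "scikit-learn-intelex") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("scikit-learn-intelex is not installed in this environment")
    else:
        print(f"scikit-learn-intelex {version} is installed")
```

Running this inside the same virtual environment, conda environment, or container as your application avoids checking the wrong interpreter.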
Using Intel Extension for Scikit-Learn:
Patching with patch_sklearn
◉ Patching
lets you keep using your stock Scikit-Learn installation. Add a patch_sklearn()
call at the beginning of your code, as shown below:
############### Insert the patch here ##########################
from sklearnex import patch_sklearn
patch_sklearn()
##################################################################
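If the same script also has to run on machines where the extension is not installed, the patch call can be guarded. The sketch below is one possible approach, not part of the extension itself: the helper name `enable_acceleration_if_available` is hypothetical, and the only sklearnex call it uses is `patch_sklearn()` as shown in this article.

```python
import importlib
import importlib.util


def enable_acceleration_if_available() -> bool:
    """Patch scikit-learn when sklearnex is importable; otherwise do nothing.

    Returns True if the patch was applied, False if stock scikit-learn is kept.
    """
    if importlib.util.find_spec("sklearnex") is None:
        return False  # extension not installed: keep stock scikit-learn
    sklearnex = importlib.import_module("sklearnex")
    sklearnex.patch_sklearn()  # the same call used in the article
    return True


# Call this before any `from sklearn ... import ...` statement.
accelerated = enable_acceleration_if_available()
print("Intel extension enabled:", accelerated)
```

The return value makes it easy to log which implementation a given run used, which matters when comparing timings across machines.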
◉ After patching, every algorithm imported from sklearn by subsequent import statements
resolves to the optimized version first:
# Import metrics for the classification report
from sklearn import metrics
# Importing the sklearn optimised version of LogisticRegression
from sklearn.linear_model import LogisticRegression
# Creating the model object and fitting it on the training data set
# (X_train_sm, y_train_sm, X_test and y_test are assumed to be prepared beforehand)
logmodel = LogisticRegression()
logmodel.fit(X_train_sm, y_train_sm)
# Predicting the target variable
predicted = logmodel.predict(X_test)
# Classification report
report = metrics.classification_report(y_test, predicted)
Note: the import order matters. Remember to apply the patch before importing Scikit-Learn!
If you no longer want to use the patch, you can remove the Intel patch at any time by calling:
sklearnex.unpatch_sklearn()
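Since patching and unpatching come as a pair, one way to keep them balanced is to wrap them in a context manager. This is a sketch under stated assumptions rather than an API of the extension: the name `patched` is hypothetical, and the patch and unpatch functions are passed in as arguments so the block itself runs without sklearnex installed.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def patched(patch: Callable[[], None], unpatch: Callable[[], None]) -> Iterator[None]:
    """Apply `patch` on entry and always call `unpatch` on exit."""
    patch()
    try:
        yield
    finally:
        unpatch()  # runs even if the body raises


# Intended usage with the extension installed (not executed here):
#
#     from sklearnex import patch_sklearn, unpatch_sklearn
#     with patched(patch_sklearn, unpatch_sklearn):
#         from sklearn.svm import SVC  # import inside the block, after patching
#         ...
```

A name bound by an earlier `from sklearn ... import ...` keeps pointing at whatever object it was bound to, so importing the estimator inside the block is the safer habit, in line with the import-order note above.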
Other alternatives
There are several other ways to enable the Intel Extension for Scikit-Learn optimizations:
◉ Run the original Scikit-Learn application code through the extension directly from the command line:
python -m sklearnex my_application.py
◉ Patch only the algorithms you specify:
# Patch a single algorithm
from sklearnex import patch_sklearn
patch_sklearn("SVC")
# Patch several algorithms
patch_sklearn(["SVC", "PCA"])
# Remove the patch for a specific algorithm
from sklearnex import unpatch_sklearn
unpatch_sklearn("SVC")
View the list of optimized algorithms
# List the algorithms that can be accelerated
from sklearnex import get_patch_names
get_patch_names()
>>
['pca','kmeans','dbscan', 'distances','linear','ridge','elasticnet','lasso',
'logistic','log_reg','knn_classifier','nearest_neighbors',
'knn_regressor', 'random_forest_classifier','random_forest_regressor',
'train_test_split', 'fin_check','roc_auc_score', 'tsne', 'logisticregression',
'kneighborsclassifier', 'nearestneighbors','kneighborsregressor',
'randomrorestclassifier', 'randomforestregressor','svr', 'svc', 'nusvr',
'nusvc','set_config', 'get_config','config_context']
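The list above can be used to predict which estimators in a project will be accelerated and which will fall back to stock Scikit-Learn. The sketch below hard-codes a subset of the printed names so that it runs without the extension. The helper name `split_by_support` is hypothetical, and the case-insensitive comparison is an assumption based on the article passing "SVC" to patch_sklearn while the list shows 'svc'.

```python
# Subset of the names printed by get_patch_names() in the listing above.
PATCH_NAMES = {
    "pca", "kmeans", "dbscan", "ridge", "lasso", "logistic",
    "knn_classifier", "random_forest_classifier", "tsne", "svr", "svc",
}


def split_by_support(requested, patch_names):
    """Partition names into (accelerated, fallback), ignoring case."""
    known = {name.lower() for name in patch_names}
    accelerated = [name for name in requested if name.lower() in known]
    fallback = [name for name in requested if name.lower() not in known]
    return accelerated, fallback


# "GaussianNB" does not appear in the list printed above.
accelerated, fallback = split_by_support(["SVC", "PCA", "GaussianNB"], PATCH_NAMES)
print("accelerated:", accelerated)
print("stock scikit-learn:", fallback)
```

With the extension installed, the hard-coded set can be replaced by the real return value of `get_patch_names()`.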
Global patch
If you want to patch globally so that all Scikit-Learn applications are optimized without any further changes, you only need to run this on the command line:
python -m sklearnex.glob patch_sklearn
Case study
We use a case study to show how much faster Intel Extension for Scikit-Learn is than the stock version. The business scenario used here is credit card fraud detection.

The data can be downloaded from Kaggle, or directly from ShowMeAI's Baidu Netdisk.
Dataset download (Baidu Netdisk): reply『实战』to the WeChat official account『ShowMeAI研究中心』, or click here, to get the『creditcard credit card fraud scenario dataset』for article [8], Intel's acceleration patch for Scikit-Learn
ShowMeAI official GitHub: https://github.com/ShowMeAI-Hub
Stock version with the patch removed
Because of how our machine is configured, the Intel optimizations are in effect globally. To measure the speed of stock sklearn for comparison, we call unpatch_sklearn() to remove the patch. Training and testing the logistic regression model took 35.5 seconds.
################ Remove the patch here ############################
from sklearnex import unpatch_sklearn
unpatch_sklearn()
##########################################################
# Imports (done after unpatching so that the stock implementation is used)
import time
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
# Start time
start_time = time.time()
# Initialize the model and fit it on the training set
logmodel = LogisticRegression()
logmodel.fit(X_train_sm, y_train_sm)
# Predict on the test set
predicted = logmodel.predict(X_test)
# Elapsed time covers both fit and predict
unpatched_time = time.time() - start_time
print("Time to calculate \033[1m logmodel.fit and logmodel.predict in Unpatched scikit-learn {:4.1f}\033[0m seconds".format(unpatched_time))
# Evaluation
report = metrics.classification_report(y_test, predicted)
print(f"Classification report for Logistic Regression with SMOTE:\n{report}\n")
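To run the same measurement for both the stock and the patched version without duplicating the timing code, the fit-and-predict step can be wrapped in a small helper. This is a sketch, not the author's benchmark: `time_fit_predict` and `MajorityClassModel` are hypothetical names, the stand-in model exists only so the block runs without scikit-learn, and `time.perf_counter` is swapped in for `time.time` because it is a monotonic clock intended for measuring intervals.

```python
import time


def time_fit_predict(model, X_train, y_train, X_test):
    """Fit `model`, predict on `X_test`, and return (predictions, elapsed seconds)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    elapsed = time.perf_counter() - start
    return predictions, elapsed


class MajorityClassModel:
    """Tiny stand-in estimator so the sketch runs without scikit-learn."""

    def fit(self, X, y):
        # Remember the most frequent label in the training targets.
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


predictions, elapsed = time_fit_predict(
    MajorityClassModel(), [[0], [1], [2]], [0, 0, 1], [[3], [4]]
)
print(predictions, f"{elapsed:.6f} s")
```

With the article's data, the call would look like `time_fit_predict(LogisticRegression(), X_train_sm, y_train_sm, X_test)`, run once after `unpatch_sklearn()` and once after `patch_sklearn()`, importing `LogisticRegression` again each time.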
[Screenshot: run output of the unpatched version]

Accelerated version with the patch applied
Next we test the Intel-optimized Logistic Regression algorithm. The code calls patch_sklearn() to add the patch, and the final execution time is 7.1 seconds. As you can see, it took only 20% of the time of the original version while achieving exactly the same results!
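The relationship between the two reported timings can be checked with a short calculation. The sketch below uses only the 35.5 s and 7.1 s figures reported in this article; the function name `speedup` is hypothetical.

```python
def speedup(baseline_seconds: float, accelerated_seconds: float):
    """Return (speedup factor, fraction of the baseline time)."""
    if baseline_seconds <= 0 or accelerated_seconds <= 0:
        raise ValueError("timings must be positive")
    factor = baseline_seconds / accelerated_seconds
    fraction = accelerated_seconds / baseline_seconds
    return factor, fraction


# Timings reported in the article: unpatched 35.5 s, patched 7.1 s.
factor, fraction = speedup(35.5, 7.1)
print(f"{factor:.1f}x faster, {fraction:.0%} of the original time")
# prints: 5.0x faster, 20% of the original time
```

A single pair of runs is only indicative. Repeating each measurement several times and comparing medians gives a steadier estimate.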

Summary
This article introduced Intel's acceleration extension for Scikit-Learn. Its features include:
- Optimizing the performance of common ML algorithms
- Reducing ML training and inference time
- Providing a seamless experience (just add two lines of code to enable the speedup)

References
- Dataset download (Baidu Netdisk): reply『实战』to the WeChat official account『ShowMeAI研究中心』, or click here, to get the『creditcard credit card fraud scenario dataset』for article [8], Intel's acceleration patch for Scikit-Learn
- ShowMeAI official GitHub: https://github.com/ShowMeAI-Hub
- SKLearn official site: https://scikit-learn.org/stable/
- SKLearn GitHub: https://github.com/scikit-learn/scikit-learn
- scikit-learn-intelex: https://github.com/intel/scikit-learn-intelex
- Machine Learning in Practice | Getting started with SKLearn and simple use cases: http://www.showmeai.tech/article-detail/202
- Machine Learning in Practice | The most complete SKLearn application guide: http://www.showmeai.tech/article-detail/203
- AI vertical-domain tool library cheat sheets | Scikit-Learn cheat sheet: http://www.showmeai.tech/article-detail/108