unhandled system error, NCCL version 2.7.8
2022-04-23 06:12:00 【wujpbb7】
A DDP-based PyTorch training program runs fine on the host,
but running it inside a Docker container fails with the error "unhandled system error, NCCL version 2.7.8".
Solution:
Prefix the launch command python -m torch.distributed.launch --nproc_per_node=4 ... with NCCL_DEBUG=INFO.
The NCCL log then shows:
s215:623:649 [3] include/shm.h:48 NCCL WARN Error while creating shared memory segment nccl-shm-send-404da1ec128dc62d-0-3-2 (size 4104)
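The warning shows that NCCL failed to create a shared memory segment. Starting the container with --ipc=host, which makes the container share the host's IPC namespace (including shared memory), fixes the problem.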
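To confirm that shared memory is the bottleneck, it can help to check how much space /dev/shm offers inside the container before and after adding --ipc=host. The following is a minimal sketch, not part of the original post: the helper name shm_report is hypothetical, and it assumes a Linux container where shared memory is mounted at /dev/shm (Docker gives a container 64 MB there by default unless told otherwise).

```python
import os
import shutil


def shm_report(path="/dev/shm"):
    """Return total and free space of a mount in MiB, or None if it is missing.

    shm_report is a hypothetical helper for illustration; /dev/shm is the
    usual shared-memory mount on Linux, which is an assumption here.
    """
    if not os.path.exists(path):
        return None
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return {
        "path": path,
        "total_mib": usage.total // mib,
        "free_mib": usage.free // mib,
    }


if __name__ == "__main__":
    report = shm_report()
    if report is None:
        print("/dev/shm not found")
    else:
        print(
            f"{report['path']}: {report['total_mib']} MiB total, "
            f"{report['free_mib']} MiB free"
        )
```

A very small total inside the container, compared with the value reported on the host, is consistent with the shared memory error above.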
Copyright notice
This article was written by [wujpbb7]. Please include a link to the original when reposting. Thank you.
https://blog.csdn.net/blueblood7/article/details/122969027