unhandled system error, NCCL version 2.7.8
2022-04-23 06:12:00 【wujpbb7】
A DDP-based PyTorch training program runs without problems on the host machine,
but when it is run inside a Docker container it fails with the error "unhandled system error, NCCL version 2.7.8".
Solution:
Prefix the launch command python -m torch.distributed.launch --nproc_per_node=4 ... with NCCL_DEBUG=INFO so that NCCL prints the underlying cause.
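As a minimal sketch of that step: a variable assignment written in front of a command applies only to that one command, so NCCL debug logging does not leak into the rest of the shell session. The helper name with_nccl_debug and the script name train.py are assumptions for illustration and do not come from the original post.

```shell
# Hypothetical helper (not from the original post): run any command with
# NCCL_DEBUG=INFO set only in that command's environment.
with_nccl_debug() {
    NCCL_DEBUG=INFO "$@"
}

# Example usage; train.py is a placeholder for the actual training script:
#   with_nccl_debug python -m torch.distributed.launch --nproc_per_node=4 train.py
```

Typing NCCL_DEBUG=INFO directly in front of the launch command has the same effect; the wrapper only saves retyping it.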
The debug output then includes:
s215:623:649 [3] include/shm.h:48 NCCL WARN Error while creating shared memory segment nccl-shm-send-404da1ec128dc62d-0-3-2 (size 4104)
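NCCL could not create its shared memory segment inside the container. Starting the container with --ipc=host, which lets it share the host's IPC namespace, fixes the problem.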
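The fix can be sketched as follows. A likely reason the segment cannot be created is that Docker gives a container its own small /dev/shm by default (64 MB), whereas --ipc=host makes the container use the host's. The helper name run_with_host_ipc and the image name my-training-image are placeholders, not taken from the original post.

```shell
# Hypothetical helper: start a container that shares the host's IPC namespace
# (and therefore the host's /dev/shm); all other arguments go to docker run.
run_with_host_ipc() {
    docker run --ipc=host "$@"
}

# Example usage; the image name is a placeholder:
#   run_with_host_ipc --gpus all -it my-training-image bash
# Inside the container, the shared-memory mount can be inspected with:
#   df -h /dev/shm
```

Note that --ipc=host reduces isolation between the container and the host, so it is best suited to trusted training environments.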
Copyright notice
This article was written by [wujpbb7]. Please include a link to the original when reposting. Thank you.
https://blog.csdn.net/blueblood7/article/details/122969027