AssertionError: Invalid device id and RuntimeError: CUDA error: invalid device ordinal
2022-04-23 20:48:00 【NuerNuer】
I ran into these two errors while running torch on multiple GPUs in parallel.
## Problem 1: AssertionError: Invalid device id
The cause, explained with code:
import ...
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"
model = model(...)
torch.cuda.set_device(2)
model = torch.nn.DataParallel(model, device_ids=[2,3])
Cause of the error: os.environ["CUDA_VISIBLE_DEVICES"] = "2,3" makes only physical cards 2 and 3 visible and renumbers them as device:0 and device:1. Inside the process there is no longer a device 2, so set_device(2) fails with Invalid device id.
Solution: Method 1: delete the os.environ line. Method 2: if some cards are occupied and you have to use os.environ to restrict the available devices, use the remapped numbers, for example set_device(0) and device_ids=[0,1].
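The renumbering described above can be sketched in plain Python, without a GPU. The helper name remap_device_ids is hypothetical and not part of torch; it only imitates the renumbering, and it assumes CUDA_VISIBLE_DEVICES holds plain integer indices rather than GPU UUIDs.

```python
def remap_device_ids(cuda_visible_devices, physical_ids):
    """Translate physical GPU numbers into the indices the process sees.

    Hypothetical helper for illustration only: it assumes the variable
    contains comma-separated integers (no UUIDs).
    """
    visible = [int(x) for x in cuda_visible_devices.split(",") if x.strip()]
    logical = []
    for pid in physical_ids:
        if pid not in visible:
            # A hidden card has no index at all inside the process.
            raise ValueError(
                f"GPU {pid} is hidden by CUDA_VISIBLE_DEVICES={cuda_visible_devices!r}"
            )
        # The position in the visible list becomes the new device number.
        logical.append(visible.index(pid))
    return logical


print(remap_device_ids("2,3", [2, 3]))  # [0, 1]
```

With "2,3" visible, physical card 2 becomes index 0 and card 3 becomes index 1, which is why set_device(0) works and set_device(2) does not.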
## Problem 2: RuntimeError: CUDA error: invalid device ordinal
The cause, explained with code:
import ...
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"
model = model(...)
model = torch.nn.DataParallel(model, device_ids=[2,3])
Cause of the error: same as above. os.environ["CUDA_VISIBLE_DEVICES"] = "2,3" renumbers the original device:2 and device:3 as device:0 and device:1, so torch.nn.DataParallel fails when it is given device_ids=[2,3].
Solution: Method 1: delete the os.environ line. Method 2: if some cards are occupied and you have to use os.environ to restrict the available devices, use the remapped numbers, for example:
model = torch.nn.DataParallel(model, device_ids=[0,1])
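One way to avoid hard-coding the remapped numbers is to derive them from the environment variable. This is a minimal sketch, not the original author's code: visible_device_ids and to_data_parallel are hypothetical names, and the helper assumes CUDA_VISIBLE_DEVICES is set to a comma-separated list (it returns an empty list when the variable is unset).

```python
import os

# Set this before the first CUDA call; doing it before importing torch is safest.
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"


def visible_device_ids():
    """Return the logical indices 0..n-1 for the cards listed in the variable.

    Hypothetical helper: returns [] when CUDA_VISIBLE_DEVICES is unset or empty.
    """
    raw = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    entries = [e for e in raw.split(",") if e.strip()]
    return list(range(len(entries)))


def to_data_parallel(model):
    """Wrap a torch.nn.Module with the remapped ids (needs torch and the GPUs)."""
    import torch  # imported lazily so visible_device_ids() runs without torch

    device_ids = visible_device_ids()
    torch.cuda.set_device(device_ids[0])
    # DataParallel expects the module to live on device_ids[0].
    return torch.nn.DataParallel(model.cuda(device_ids[0]), device_ids=device_ids)


print(visible_device_ids())  # [0, 1]
```

Calling to_data_parallel(model) then uses device_ids=[0, 1], matching the remapped numbers regardless of which physical cards were selected.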
Recommended articles on a simple implementation of multi-GPU parallelism:
https://muzhan.blog.csdn.net/article/details/109318226
https://www.codeleading.com/article/2345206500
https://blog.csdn.net/weixin_34233421/article/details/91396978
Copyright notice
This article was written by [NuerNuer]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/04/202204210545522657.html