AssertionError: Invalid device id and RuntimeError: CUDA error: invalid device ordinal
2022-04-23 20:48:00 【NuerNuer】
I ran into these two problems while running torch in parallel on multiple GPUs.
## Problem 1: AssertionError: Invalid device id
Cause, explained with the following code:
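import ...
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"  # only physical GPUs 2 and 3 stay visible
model = model(...)
torch.cuda.set_device(2)  # physical ID; only indices 0 and 1 exist after remapping
model = torch.nn.DataParallel(model, device_ids=[2,3])  # physical IDs again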
Why the error occurs: os.environ["CUDA_VISIBLE_DEVICES"] = "2,3" makes only the original device:2 and device:3 visible to the process and renumbers them as device:0 and device:1. From PyTorch's point of view there is no longer a device 2, so the set_device call fails with Invalid device id.
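The renumbering can be illustrated without a GPU. The helper below is a minimal sketch: the name logical_index_map is hypothetical, and it assumes CUDA_VISIBLE_DEVICES holds a comma-separated list of integer IDs (the variable can also hold GPU UUIDs, which this sketch does not handle).

```python
from typing import Dict


def logical_index_map(visible_devices: str) -> Dict[int, int]:
    """Map each physical GPU ID in a CUDA_VISIBLE_DEVICES string to the
    logical index the process sees (position in the list)."""
    # Assumption: entries are integer IDs separated by commas.
    physical_ids = [int(tok) for tok in visible_devices.split(",") if tok.strip()]
    return {phys: logical for logical, phys in enumerate(physical_ids)}


print(logical_index_map("2,3"))  # {2: 0, 3: 1}
```

With "2,3" the only valid indices inside the process are 0 and 1, which is why passing 2 or 3 to torch is rejected.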
Solution: Method 1: remove the os.environ line. If some cards are occupied and you have to use os.environ to restrict the available devices, use Method 2: refer to the devices by their remapped numbers, for example set_device(0).
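Method 2 might look like the sketch below. It assumes the environment variable is set before CUDA is initialized (setting it before importing torch is the safest place), and it guards the call so the snippet also runs on a machine where no GPU ends up visible.

```python
import os

# Must take effect before CUDA is initialized, so set it before importing torch.
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"

import torch

visible = torch.cuda.device_count()  # number of GPUs this process can see
if visible > 0:
    torch.cuda.set_device(0)  # logical index 0, i.e. the first visible GPU
    current = torch.cuda.current_device()
else:
    current = None  # no GPU visible on this machine

print("visible GPUs:", visible, "current device:", current)
```

Checking torch.cuda.device_count() first is a cheap way to confirm which indices are valid before calling set_device.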
## Problem 2: RuntimeError: CUDA error: invalid device ordinal
Cause, explained with the following code:
import ...
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"  # only physical GPUs 2 and 3 stay visible
model = model(...)
model = torch.nn.DataParallel(model, device_ids=[2,3])  # physical IDs; only 0 and 1 exist
Why the error occurs: same as above. os.environ["CUDA_VISIBLE_DEVICES"] = "2,3" renumbers the original device:2 and device:3 as device:0 and device:1, so torch.nn.DataParallel fails when it is given device_ids=[2,3].
Solution: Method 1: remove the os.environ line. If some cards are occupied and you have to use os.environ to restrict the available devices, use Method 2: refer to the devices by their remapped numbers, for example:
model = torch.nn.DataParallel(model, device_ids=[0,1])
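One way to avoid hard-coding indices at all is to derive device_ids from what the process can actually see. This is only a sketch: nn.Linear(8, 4) is a hypothetical stand-in for the real model, and the wrapping is skipped when no GPU is visible.

```python
import os

# Set before importing torch so the restriction applies when CUDA initializes.
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"

import torch
import torch.nn as nn

model = nn.Linear(8, 4)  # hypothetical stand-in for the real model

# Logical indices only: [0, 1] when two GPUs are visible, [] when none are.
device_ids = list(range(torch.cuda.device_count()))

if device_ids:
    model = model.cuda(device_ids[0])  # parameters must live on device_ids[0]
    model = nn.DataParallel(model, device_ids=device_ids)

print("device_ids used:", device_ids)
```

Because the list is built from torch.cuda.device_count(), it stays valid whichever physical cards CUDA_VISIBLE_DEVICES selects.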
Recommended reading on a simple multi-GPU parallel implementation:
https://muzhan.blog.csdn.net/article/details/109318226
https://www.codeleading.com/article/2345206500
https://blog.csdn.net/weixin_34233421/article/details/91396978
Copyright notice
This article was written by [NuerNuer]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/04/202204210545522657.html








