Description
Please ask your question
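Training completes normally on a single GPU. With the same configuration, launching on a single machine with two GPUs using
python -u -m paddle.distributed.launch --gpus "0,1" run_pretrain.py ./config/qwen/pretrain_argument_0p5b.json
fails with the following errors: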
OSError: [Errno 101] Network is unreachable
urllib3.exceptions.NewConnectionError: HTTPSConnection(host='bj.bcebos.com', port=443): Failed to establish a new connection: [Errno 101] Network is unreachable
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Qwen/Qwen2.5-0.5B/added_tokens.json (Caused by NewConnectionError("HTTPSConnection(host='bj.bcebos.com', port=443): Failed to establish a new connection: [Errno 101] Network is unreachable"))
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Qwen/Qwen2.5-0.5B/added_tokens.json (Caused by NewConnectionError("HTTPSConnection(host='bj.bcebos.com', port=443): Failed to establish a new connection: [Errno 101] Network is unreachable"))
OSError: Can't load the model for 'Qwen/Qwen2.5-0.5B'. If you were trying to load it from 'BOS', make sure you don't have a local directory with the same name. Otherwise, make sure 'Qwen/Qwen2.5-0.5B' is the correct path to a directory containing one of the ['added_tokens.json']
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Qwen/Qwen2.5-0.5B/added_tokens.json (Caused by NewConnectionError("HTTPSConnection(host='bj.bcebos.com', port=443): Failed to establish a new connection: [Errno 101] Network is unreachable"))
OSError: Can't load the model for 'Qwen/Qwen2.5-0.5B'. If you were trying to load it from 'BOS', make sure you don't have a local directory with the same name. Otherwise, make sure 'Qwen/Qwen2.5-0.5B' is the correct path to a directory containing one of the ['added_tokens.json']
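Since every error in the log traces back to a failed connection to bj.bcebos.com on port 443, a quick reachability check run from the same shell and environment as the launch command may help narrow things down. The following is only a minimal sketch: `can_reach` is a hypothetical helper written for this check, not part of PaddleNLP or Paddle, and it tests plain TCP connectivity only (no TLS handshake, no proxy handling).

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical diagnostic helper; it does not reproduce how the
    training script downloads tokenizer files.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers errno 101 (Network is unreachable) as well as
        # DNS failures, timeouts, and refused connections.
        return False
```

For example, `print(can_reach("bj.bcebos.com", 443))` uses the host and port taken from the traceback above. If it prints `False` in the environment where the two-GPU job is launched, the failure is likely at the network level rather than in the training configuration itself.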