submitted 1 year ago* (last edited 1 year ago) by kernelPanic@lemmy.ml to c/machinelearning@lemmy.ml

When I train my PyTorch Lightning model on two GPUs in JupyterLab with strategy="ddp_notebook", only two CPU cores are used, and both sit at 100% usage. How can I overcome this CPU bottleneck?

Edit: I profiled the run with PyTorchProfiler, and the cause turned out to be the old SSDs used on the server.
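
For anyone hitting something similar, here is a rough sketch of a quick storage sanity check that can be run alongside the profiler. It is not what I actually ran: `measure_read_throughput` is a hypothetical helper, it uses only the standard library, and the temporary file merely stands in for a real dataset shard. Keep in mind that a file that was just written or recently read is usually served from the OS page cache, so point it at a dataset file that has not been touched recently if you want a number that reflects the disk.

```python
import os
import tempfile
import time


def measure_read_throughput(path, chunk_size=1024 * 1024):
    """Read a file sequentially and return the observed throughput in MB/s.

    Hypothetical helper for illustration; not part of PyTorch or Lightning.
    """
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very small files.
    return total_bytes / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Stand-in for a real dataset file (assumption): 8 MiB of random bytes.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
        sample_path = tmp.name
    try:
        print(f"{measure_read_throughput(sample_path):.1f} MB/s")
    finally:
        os.remove(sample_path)
```

If the figure for a cold dataset file is far below what the GPUs consume per second during training, the input pipeline is probably limited by storage rather than by the model code.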

[-] troye888@lemmy.one 3 points 1 year ago

Yup, this. If you would like more help, we need the code, or at least a minimal reproducible example.

this post was submitted on 16 Aug 2023