Your current environment information
Environment:
accelerate==1.0.0rc1
aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aiosignal==1.3.1
aiosqlite==0.20.0
annotated-types==0.7.0
anyio==4.4.0
appdirs==1.4.4
asgiref==3.8.1
async-timeout==4.0.3
attrs==24.2.0
bentoml==1.3.5
cattrs==23.1.2
certifi==2024.8.30
charset-normalizer==3.3.2
circus==0.18.0
click==8.1.7
click-option-group==0.5.6
cloudpickle==3.0.0
DeepCache==0.1.1
deepmerge==2.0
Deprecated==1.2.14
diffusers @ git+https://github.com/huggingface/diffusers@95a7832879a3ca7debd3f7a4ee05b08ddd19a8a7
exceptiongroup==1.2.2
filelock==3.16.0
frozenlist==1.4.1
fs==2.4.16
fsspec==2024.9.0
h11==0.14.0
httpcore==1.0.5
httpx==0.27.2
httpx-ws==0.6.0
huggingface-hub==0.24.7
idna==3.9
importlib-metadata==6.11.0
inflection==0.5.1
inquirerpy==0.3.4
Jinja2==3.1.4
markdown-it-py==3.0.0
MarkupSafe==2.1.5
mdurl==0.1.2
mpmath==1.3.0
multidict==6.1.0
networkx==3.3
numpy==1.24.1
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-ml-py==11.525.150
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.6.68
nvidia-nvtx-cu12==12.1.105
omegaconf==2.4.0.dev3
onediff==1.2.1.dev23
onediffx==1.2.1.dev23
oneflow==0.9.1.dev20240913+cu121
onefx==0.0.3
opentelemetry-api==1.20.0
opentelemetry-instrumentation==0.41b0
opentelemetry-instrumentation-aiohttp-client==0.41b0
opentelemetry-instrumentation-asgi==0.41b0
opentelemetry-sdk==1.20.0
opentelemetry-semantic-conventions==0.41b0
opentelemetry-util-http==0.41b0
packaging==24.1
pathspec==0.12.1
pfzy==0.3.4
pillow==10.4.0
pip-requirements-parser==32.0.1
prometheus_client==0.20.0
prompt_toolkit==3.0.47
protobuf==5.28.1
psutil==6.0.0
pydantic==2.9.2
pydantic_core==2.23.4
Pygments==2.18.0
pyparsing==3.1.4
python-dateutil==2.9.0.post0
python-json-logger==2.0.7
python-multipart==0.0.9
PyYAML==6.0.2
pyzmq==26.2.0
regex==2024.9.11
requests==2.32.3
rich==13.8.1
safetensors==0.4.5
schema==0.7.7
sentencepiece==0.2.0
simple-di==0.1.5
six==1.16.0
sniffio==1.3.1
starlette==0.38.5
sympy==1.13.2
tokenizers==0.13.3
tomli==2.0.1
tomli_w==1.0.0
torch==2.3.0
tornado==6.4.1
tqdm==4.66.5
transformers==4.27.1
triton==2.3.0
typing_extensions==4.12.2
urllib3==2.2.3
uv==0.4.12
uvicorn==0.30.6
watchfiles==0.24.0
wcwidth==0.2.13
wrapt==1.16.0
wsproto==1.2.0
yarl==1.11.1
zipp==3.20.2
🐛 Describe the bug
I'm attempting to implement the acceleration method described in this example: https://github.com/siliconflow/onediff/tree/7c325253d4e280e470613be43fa3e582a476923e/onediff_diffusers_extensions/examples/kolors
When I specify a particular device for loading the model, compiling, and running inference, some additional memory is always allocated on GPU0.
To troubleshoot, I modified the code to explicitly set device=cuda:1. However, during inference, GPU0 still shows roughly 266MB of memory in use.
Steps to reproduce:
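A minimal sketch of what I'm running (the model ID, dtype, and prompt here are illustrative placeholders; the actual script follows the linked Kolors example):

```python
import torch
from diffusers import KolorsPipeline
from onediffx import compile_pipe

device = torch.device("cuda:1")

# Load the pipeline directly onto the target GPU.
pipe = KolorsPipeline.from_pretrained(
    "Kwai-Kolors/Kolors-diffusers",
    torch_dtype=torch.float16,
    variant="fp16",
).to(device)

# Compile with onediffx (backend left at its default).
pipe = compile_pipe(pipe)

# Run inference. All tensors should live on cuda:1,
# yet nvidia-smi reports ~266MB allocated on GPU0.
image = pipe(
    prompt="A ladder leaning against a wall",
    num_inference_steps=25,
    guidance_scale=5.0,
).images[0]
image.save("out.png")
```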
Expected behavior:
All operations and memory usage should be confined to the specified GPU (cuda:1 in this case).
Actual behavior:
GPU0 consistently shows roughly 266MB of memory in use, despite all operations being directed to cuda:1.
Questions:
1. Is this 266MB memory usage on GPU0 expected behavior?
2. If not, what could be causing this persistent memory allocation on GPU0?
3. Are there any known workarounds or solutions to ensure all operations and memory usage are isolated to the specified GPU? (A possible mitigation is sketched below.)
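For reference, the only mitigation I'm aware of (untested in this setup) is to hide the other GPUs before any CUDA-related import, which sidesteps device selection rather than explaining the extra allocation:

```python
import os

# Restrict visibility to physical GPU 1 *before* torch/oneflow are imported,
# so any implicit "cuda:0" initialization lands on the intended card.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch  # noqa: E402

device = torch.device("cuda:0")  # maps to physical GPU 1 after remapping
```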
I would greatly appreciate any insights or solutions the repository maintainers could provide to address this issue. Thank you for your time and assistance.