Runs always get stuck on INFO 01-12 20:17:39 [base_pipeline.py:343] Scheduler found, paralleling scheduler... #431

kazakovaanastasia commented Jan 12, 2025

I followed the instructions and installed everything as follows:

Terminal:

mlspace environments create --env name_test --python 3.10 --cuda 12.4
conda activate name_test

Code (going through the plan):

!pip install xfuser # Basic installation
!pip install "xfuser[diffusers,flash-attn]" # With both diffusers and flash attention
!git clone https://github.com/xdit-project/xDiT.git
cd xDiT
!pip install -e .
# Or optionally, with diffusers
!pip install -e ".[diffusers,flash-attn]"
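
For reference, a minimal sanity check (not xDiT-specific, just my own sketch) that can be run in the same environment to confirm PyTorch sees CUDA 12.4 and all 8 GPUs:

import torch

# Report the PyTorch build, its CUDA toolkit version, and the number of visible GPUs.
print("torch:", torch.__version__)
print("cuda:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
print("gpu count:", torch.cuda.device_count())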

When I run the scripts:

!bash examples/run_cogvideo.sh  # with MODEL_ID in the bash script changed to THUDM/CogVideoX1.5-5B

or

!torchrun --nproc_per_node=8 \
examples/pixartalpha_example.py \
--model PixArt-alpha/PixArt-XL-2-1024-MS \
--pipefusion_parallel_degree 2 \
--ulysses_degree 2 \
--num_inference_steps 20 \
--warmup_steps 0 \
--prompt "A cute dog" \
--use_cfg_parallel

(Passing --model models/PixArt-XL-2-1024-MS instead gives an error; see the note after the commands below.)
or (with --model PixArt-alpha/PixArt-Sigma-XL-2-1024-MS):

!torchrun --nproc_per_node=8 \
examples/pixartsigma_example.py \
--model PixArt-alpha/PixArt-Sigma-XL-2-1024-MS \
--pipefusion_parallel_degree 2 \
--ulysses_degree 2 \
--num_inference_steps 20 \
--warmup_steps 0 \
--prompt "A cute dog" \
--use_cfg_parallel
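
On the local-path variant that errors out: a minimal sketch (my assumption, not from the xDiT docs) of pre-downloading the checkpoint into a local folder that can then be passed via --model; the models/PixArt-XL-2-1024-MS directory name is only an example:

from huggingface_hub import snapshot_download

# Download the full repo snapshot into a local folder that --model can point at.
snapshot_download(
    repo_id="PixArt-alpha/PixArt-XL-2-1024-MS",
    local_dir="models/PixArt-XL-2-1024-MS",
)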

My run gets stuck at this line:

DEBUG 01-12 20:17:29 [parallel_state.py:179] world_size=-1 rank=-1 local_rank=-1 distributed_init_method=env:// backend=nccl
WARNING 01-12 20:17:29 [args.py:344] Distributed environment is not initialized. Initializing...
...
INFO 01-12 20:17:39 [base_model.py:83] [RANK 6] Wrapping transformer_blocks.41.attn1 in model class CogVideoXTransformer3DModel with xFuserAttentionWrapper
INFO 01-12 20:17:39 [base_pipeline.py:343] Scheduler found, paralleling scheduler...
INFO 01-12 20:17:39 [base_pipeline.py:343] Scheduler found, paralleling scheduler...
INFO 01-12 20:17:39 [base_pipeline.py:343] Scheduler found, paralleling scheduler...
INFO 01-12 20:17:39 [base_pipeline.py:343] Scheduler found, paralleling scheduler...

I waited between about 30 minutes and an hour with different scripts, and each run was still stuck at INFO 01-12 20:17:39 [base_pipeline.py:343] Scheduler found, paralleling scheduler...
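
In case it helps narrow things down, here is a minimal multi-GPU sanity test (my own sketch, independent of xDiT) that could be launched with the same torchrun setup to check whether a plain NCCL collective also hangs on this machine; nccl_check.py is just a placeholder filename:

# nccl_check.py - launch with: torchrun --nproc_per_node=8 nccl_check.py
import os
import torch
import torch.distributed as dist

def main():
    # torchrun sets LOCAL_RANK for each spawned process.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")

    # One all_reduce across all ranks; if this hangs, the problem is in the
    # NCCL/distributed setup rather than in xDiT itself.
    t = torch.ones(1, device="cuda") * dist.get_rank()
    dist.all_reduce(t)
    print(f"rank {dist.get_rank()}/{dist.get_world_size()} all_reduce ok: {t.item()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

Running it with NCCL_DEBUG=INFO set should also surface any transport or topology warnings, if that is useful.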
