[FEAT] RAS Runtime Optimization #2664

Open
isgallagher opened this issue Feb 18, 2025 · 0 comments

Feature Request: Implement Region-Adaptive Sampling (RAS) in Stable Diffusion Web UI

Overview

I'd like to request the implementation of Region-Adaptive Sampling (RAS) in Stable Diffusion Web UI. RAS is an inference optimization technique recently published by Microsoft that speeds up generation for diffusion models without retraining them.

Description

Region-Adaptive Sampling is a novel sampling strategy that introduces regional variability in sampling steps. Unlike conventional methods that uniformly process all image regions, RAS dynamically adjusts sampling ratios based on regional attention and noise metrics. This approach prioritizes computational resources for intricate regions while reusing previous outputs for less complex areas, achieving faster inference with minimal loss in image quality.
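
To make the idea concrete, here is a minimal sketch of one region-adaptive step in plain Python. It is not the RAS implementation: the function names, the `denoise` callback, and the convention that a higher score means a higher-priority region are all assumptions made for illustration.

```python
def select_active_regions(scores, sample_ratio):
    """Pick the indices of the regions to recompute this step.

    scores: one importance score per region (higher = more important;
            this orientation is an assumption of the sketch).
    sample_ratio: fraction of regions to recompute, in (0, 1].
    """
    keep = max(1, int(sample_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


def adaptive_step(cached, scores, sample_ratio, denoise):
    """Recompute only the selected regions and reuse cached outputs elsewhere.

    denoise: hypothetical callback that returns a fresh output for region i.
    """
    active = select_active_regions(scores, sample_ratio)
    merged = list(cached)  # start from the previous step's outputs
    for i in active:
        merged[i] = denoise(i)  # overwrite only the important regions
    return merged, active


cached = ["a", "b", "c", "d"]
scores = [0.1, 0.9, 0.2, 0.7]
merged, active = adaptive_step(cached, scores, 0.5, lambda i: f"new{i}")
print(merged)  # -> ['a', 'new1', 'c', 'new3']
```

With a sample ratio of 0.5, half of the regions are recomputed and the rest keep their cached output, which is where the time saving would come from.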

Benefits

  1. Faster Generation: RAS optimizes the sampling process by focusing computational resources where they're most needed
  2. Maintained Quality: Preserves high-quality results in complex regions while economizing in simpler areas
  3. Training-Free: Works with existing checkpoints without any retraining required
  4. Flexible Parameters: Offers tuning options to balance throughput vs quality based on user preference

Implementation Details

The implementation would require:

  1. Integration of RAS sampling logic into the SD Web UI sampling process
  2. Addition of RAS-specific parameters to the UI:
    • Sample ratio: Controls the average proportion of tokens updated per step
    • Metric selection: Allows choosing between "l2norm" and "std" for identifying important regions
    • High ratio: Controls balance between main subject and background sampling
    • Starvation scale: Prevents excessive dropping of the same regions
    • Scheduler start/end steps: Defines the range where RAS is applied
    • Error reset steps: Allows periodic dense sampling to reset accumulated errors
    • Flash Attention toggle: Option to use Flash Attention for additional speedup when available
    • Index fusion toggle: Enables kernel fusion for higher generation speed (requires PyCuda)
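
As a rough illustration of how these options could fit together, the sketch below groups them in a settings object and shows how the step range, the error reset steps, the metric choice, and the starvation scale might be applied. Everything here is hypothetical: the class and function names, the default values (placeholders, not RAS defaults), the exclusive end step, and the additive starvation bonus are assumptions, not the RAS API.

```python
from dataclasses import dataclass, field
from math import sqrt
from statistics import pstdev


@dataclass
class RASSettings:
    """Hypothetical settings object; defaults are placeholders, not RAS defaults."""
    sample_ratio: float = 0.5          # average fraction of tokens updated per step
    metric: str = "std"                # "l2norm" or "std"
    high_ratio: float = 1.0            # subject vs. background balance
    starvation_scale: float = 0.1      # boost for tokens that keep getting dropped
    scheduler_start_step: int = 4      # first step where RAS may apply
    scheduler_end_step: int = 28       # RAS stops before this step (assumed exclusive)
    error_reset_steps: list = field(default_factory=list)  # forced dense steps
    enable_flash_attention: bool = False
    enable_index_fusion: bool = False  # would require PyCuda

    def __post_init__(self):
        if not 0.0 < self.sample_ratio <= 1.0:
            raise ValueError("sample_ratio must be in (0, 1]")
        if self.metric not in ("l2norm", "std"):
            raise ValueError("metric must be 'l2norm' or 'std'")
        if self.scheduler_start_step > self.scheduler_end_step:
            raise ValueError("scheduler_start_step must not exceed scheduler_end_step")


def is_sparse_step(step, settings):
    """True when only a subset of tokens would be updated at this step."""
    in_window = settings.scheduler_start_step <= step < settings.scheduler_end_step
    return in_window and step not in settings.error_reset_steps


def token_score(vector, metric, steps_since_update=0, starvation_scale=0.0):
    """Score one token; the starvation term is an assumed additive bonus."""
    if metric == "l2norm":
        base = sqrt(sum(x * x for x in vector))
    else:
        base = pstdev(vector)
    return base + starvation_scale * steps_since_update


settings = RASSettings(error_reset_steps=[12, 22])
print([step for step in range(30) if not is_sparse_step(step, settings)])
```

The printed list contains the steps that would run densely under these placeholder values: everything outside the scheduler window plus the error reset steps.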

Example Implementation Approach

The implementation could be added as an extension or integrated directly into the codebase by wrapping the existing sampling process, similar to the examples in the RAS documentation:

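from ras.utils import ras_manager
from ras.utils.Stable_Diffusion_3.update_pipeline_sd3 import update_sd3_pipeline

# `pipeline` is the already-constructed Stable Diffusion 3 pipeline and
# `args` holds the parsed RAS options; both are assumed to exist here.

# Where the pipeline is created, apply the RAS update to it:
pipeline = update_sd3_pipeline(pipeline)

# Set RAS parameters
ras_manager.MANAGER.set_parameters(args)

# Continue with normal inference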

User Interface

I suggest adding a "Region-Adaptive Sampling" section to the generation settings with:

  • A checkbox to enable/disable RAS
  • Sliders for continuous parameters (sample ratio, high ratio, starvation scale)
  • Text input for error reset steps
  • Dropdown for metric selection
  • Numeric inputs for scheduler start/end steps
  • Checkboxes for flash attention and index fusion options
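
Since the error reset steps would arrive as free text, the UI needs to validate them before handing them to the sampler. The helper below is a small sketch of that parsing step; the function name, the comma-separated input format, and the `total_steps` bound are assumptions rather than anything defined by RAS or the Web UI.

```python
def parse_error_reset_steps(text, total_steps):
    """Turn a text field such as "12, 22" into a sorted list of unique step indices.

    Raises ValueError for non-numeric entries or steps outside [0, total_steps).
    """
    steps = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty input and trailing commas
        value = int(part)  # int() raises ValueError on non-numeric text
        if not 0 <= value < total_steps:
            raise ValueError(f"step {value} is outside 0..{total_steps - 1}")
        steps.add(value)
    return sorted(steps)


print(parse_error_reset_steps("22, 12, 12", total_steps=28))  # -> [12, 22]
```

Rejecting bad input at this point lets the UI show an error next to the field instead of failing midway through generation.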

Additional Notes

This feature would be particularly valuable for users with limited GPU resources who want faster generation times without sacrificing quality on important image regions. It would also benefit power users who often generate large batches of images.


Side note: This feature request was written by Claude; some details may be inaccurate.
