Batch Processing Feature #40

Open
Blaizzy opened this issue Jun 11, 2024 · 6 comments
Labels: good first issue (Good for newcomers)

Comments

Blaizzy (Owner) commented Jun 11, 2024

Overview

The goal is to add support for efficient batch processing of inputs to the MLX-VLM library. This will allow users to process multiple images and text prompts simultaneously to generate corresponding outputs in a single batch, improving performance.

Use cases:

  1. Generating captions for a large dataset of images.
  2. Localizing objects or regions in a batch of images based on textual descriptions.
  3. Classifying a large number of images into predefined categories, considering accompanying text information.
  4. Answering questions based on a batch of images (single and multiple question prompts).
  5. Video processing.

Note: Tag @Blaizzy for code reviews and questions.

Requirements

Support batched inputs:

  • Accept a batch of images as input, provided as a list or array of image objects.
  • Accept a batch of text prompts as input, provided as a list or array of strings.
  • Accept a single text prompt as input, provided as a string.
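As a sketch of how the three accepted input shapes could be normalized, the helper below broadcasts a single string prompt across all images and validates one-to-one pairing otherwise. The name `normalize_batch` and the use of plain strings as stand-ins for image objects are illustrative assumptions, not part of the library's API.

```python
from typing import List, Sequence, Tuple, Union

def normalize_batch(
    images: Sequence, prompts: Union[str, Sequence[str]]
) -> List[Tuple]:
    """Pair each image with a prompt.

    A single string prompt is broadcast to every image; a list of
    prompts must match the number of images one-to-one.
    """
    images = list(images)
    if isinstance(prompts, str):
        prompts = [prompts] * len(images)
    else:
        prompts = list(prompts)
    if len(prompts) != len(images):
        raise ValueError(
            f"got {len(images)} images but {len(prompts)} prompts"
        )
    return list(zip(images, prompts))
```

Normalizing early like this lets the rest of the pipeline assume one uniform shape: a list of (image, prompt) pairs.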

Perform batch processing:

  • Process the batch of images and text prompts together (optionally asynchronously) using the MLX-VLM model.
  • Utilize parallel processing or GPU acceleration to optimize batch processing performance.
  • Ensure that the processing of one input in the batch does not affect the processing of other inputs.
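One common technique for running variable-length prompts through a decoder in a single pass is to left-pad the tokenized sequences so the final (generation) position lines up across the batch. The sketch below is generic, operating on plain lists of token ids rather than the library's actual tensors; `pad_id=0` is an assumed placeholder value.

```python
from typing import List

def left_pad(token_batch: List[List[int]], pad_id: int = 0) -> List[List[int]]:
    """Left-pad a batch of token-id sequences to equal length.

    Left padding (rather than right padding) keeps the last real token
    of every row in the same column, which is where the next-token
    prediction for each sequence is read.
    """
    width = max(len(t) for t in token_batch)
    return [[pad_id] * (width - len(t)) + t for t in token_batch]
```

In a real implementation the padded positions would also be masked out of attention so one input cannot influence another.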

Generate batched outputs:

  • Return the generated outputs for each input in the batch.
  • Maintain the order of the outputs corresponding to the order of the inputs.
  • Support different output formats such as text, embeddings, or visual representations based on the specific task.

Error handling:

  • Handle errors gracefully during batch processing.
  • Provide informative error messages for invalid inputs or processing failures.
  • Continue processing the remaining inputs in the batch if an error occurs for a specific input.
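The error-isolation requirement above can be sketched as a wrapper that records a per-item failure and moves on, keeping results aligned with inputs by position. `process_one` is a placeholder for the real per-input generation call, not an MLX-VLM function.

```python
from typing import Callable, Dict, List, Sequence

def process_batch(inputs: Sequence, process_one: Callable) -> List[Dict]:
    """Process every input, isolating failures per item.

    Each result dict carries either the output or an informative error
    message naming the failing input's position, so the output list
    always has the same length and order as the input list.
    """
    results = []
    for i, item in enumerate(inputs):
        try:
            results.append({"ok": True, "output": process_one(item)})
        except Exception as exc:  # record and continue with the rest
            results.append({"ok": False, "error": f"input {i}: {exc}"})
    return results
```

Callers can then filter or retry only the failed positions without re-running the whole batch.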

API design:

  • Provide a clear and intuitive API for users to perform batch processing.
  • Allow users to specify the maximum batch size supported by their system.
  • Provide options to control the batch processing behavior, such as enabling/disabling parallel processing.
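One possible shape for such an API is a small config object plus a chunking helper that respects the user's batch-size cap. The names `BatchConfig` and `chunk` are hypothetical, chosen here for illustration; they are not part of MLX-VLM.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class BatchConfig:
    max_batch_size: int = 8   # cap chosen by the user for their hardware
    parallel: bool = True     # enable/disable parallel processing
    max_tokens: int = 256     # generation length limit per input

def chunk(items: List, size: int) -> List[List]:
    """Split a list of inputs into sub-batches no larger than `size`."""
    if size < 1:
        raise ValueError("batch size must be >= 1")
    return [items[i : i + size] for i in range(0, len(items), size)]
```

A caller would then feed each sub-batch to the model in turn, e.g. `for sub in chunk(all_inputs, config.max_batch_size): ...`, keeping memory use bounded regardless of total input count.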

Documentation and examples:

  • Update the library documentation to include information about the batch processing feature.
  • Provide code examples demonstrating how to use the batch processing API effectively.
  • Include performance benchmarks and guidelines for optimal batch sizes based on system resources.

Implementation

  • Modify the existing input handling logic to accept batches of images and text prompts.
  • Implement batch processing functionality using parallel processing techniques or GPU acceleration libraries.
  • Optimize memory usage and performance for efficient batch processing.
  • Update the output generation logic to handle batched outputs and maintain the correct order.
  • Implement error handling mechanisms to gracefully handle and report errors during batch processing.
  • Design and expose a user-friendly API for performing batch processing.
  • Write unit tests to verify the correctness and performance of the batch processing implementation.
  • Update the library documentation and provide code examples for using the batch processing feature.

Testing

  • Prepare a comprehensive test suite to validate the batch processing functionality.
  • Test with different batch sizes and input variations to ensure robustness.
  • Verify that the generated outputs match the expected results for each input in the batch.
  • Measure the performance improvement gained by batch processing compared to individual processing.
  • Conduct error handling tests to ensure graceful handling of invalid inputs and processing failures.
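The test points above could take roughly this form, here using a trivial stand-in `fake_generate` in place of the real model call (the real suite would exercise the actual batch API once it exists):

```python
import unittest

def fake_generate(batch):
    """Stand-in for the model: one caption per input, in order."""
    return [f"caption for {x}" for x in batch]

class TestBatchProcessing(unittest.TestCase):
    def test_order_preserved(self):
        outs = fake_generate(["a", "b", "c"])
        self.assertEqual(
            outs, ["caption for a", "caption for b", "caption for c"]
        )

    def test_output_length_matches_input(self):
        self.assertEqual(len(fake_generate(["x"] * 5)), 5)

    def test_empty_batch(self):
        self.assertEqual(fake_generate([]), [])

if __name__ == "__main__":
    unittest.main()
```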

Delivery

  • Integrate the batch processing feature into the existing MLX-VLM library codebase.
  • Ensure backward compatibility with previous versions of the library.
  • Provide release notes highlighting the new batch processing capability and any breaking changes.
  • Update the library version number following semantic versioning conventions.
  • Publish the updated library package to the relevant package repositories or distribution channels.

By implementing this batch processing feature, MLX-VLM will provide users with the ability to efficiently process multiple inputs simultaneously, improving performance and usability of the library for various vision-language tasks.

Blaizzy added the good first issue label on Jun 11, 2024
eDeveloperOZ commented:

Will take it for implementation! Hope to meet the standards :)

Blaizzy (Owner, Author) commented Jul 3, 2024

Here are some extra details:
@willccbb

willccbb commented:
Sorry, just saw this -- will take a swing when #53 is merged.

Blaizzy (Owner, Author) commented Jul 26, 2024

@willccbb done ✅

#53 is merged

Benjoyo commented Nov 15, 2024

Hey @willccbb, any update on this? It would be super helpful to have.

Blaizzy (Owner, Author) commented Nov 16, 2024

@willccbb doesn't have the bandwidth.

This feature is now open and back in backlog.

Development

No branches or pull requests

4 participants