feat: Enhance streaming API timeout handling with mathematical modeling #485
Open
qizwiz wants to merge 28 commits into QwenLM:main from qizwiz:ci-test-branch
+1,627 −106
Conversation
This commit addresses GitHub issue QwenLM#239 by implementing a comprehensive mathematical model for predicting and preventing streaming API timeouts. Key changes include:

- Created StreamingTimeoutModel with adaptive timeout calculations based on request characteristics
- Enhanced OpenAIContentGenerator with improved timeout handling and error messaging
- Added CLI options for configuring timeout and retry behavior
- Added configuration recommendations based on request analysis
- Included comprehensive tests for the new timeout model
- Added documentation explaining the modeling approach
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
This commit adds a new Model-Context Protocol (MCP) server for timeout analysis that provides tools for analyzing and predicting streaming API timeouts based on mathematical modeling. Key changes include:

- Created timeout-analysis-server.ts with MCP tools for timeout analysis and configuration suggestions
- Added tests for the new MCP server
- Updated core index.ts to export the new server
- Updated CLI configuration to automatically include the timeout analysis MCP server
- Added documentation for the new MCP server
- Marked @modelcontextprotocol/sdk as external in esbuild config to avoid bundling issues
Fixed the path configuration for the timeout analysis MCP server to point to the correct location in the built distribution files.
Removing the MCP server changes as they are not part of the solution for the PR. The MCP server is a tool for self-improvement, not part of the timeout fix.
Removing all MCP server-related changes as they are not part of the PR solution.
This PR fixes the streaming API timeout issue that occurs after 64 seconds by improving timeout handling and error messaging. Changes include:

- Enhanced OpenAIContentGenerator timeout error handling
- Better error messages with specific troubleshooting guidance
- Improved timeout detection and reporting
- Added configuration recommendations

Fixes QwenLM#239
This PR addresses GitHub issue QwenLM#239 by implementing a comprehensive mathematical modeling approach to understand and solve the streaming API timeout issue that occurs after 64 seconds. Key changes include:

- Created StreamingTimeoutModel with adaptive timeout calculations based on request characteristics
- Enhanced OpenAIContentGenerator with improved timeout handling and error messaging
- Added CLI options for configuring timeout and retry behavior (--openai-timeout, --openai-max-retries)
- Added configuration recommendations based on request analysis
- Included comprehensive tests for the new timeout model
- Added documentation explaining the modeling approach

The solution transforms a frustrating timeout issue into an opportunity for intelligent, adaptive system behavior that improves the user experience for large and complex requests.

Fixes QwenLM#239
Hi @tanzhenxin, could anyone review this patch? This issue has been bothering me for a long time.
Labels
- status/need-information: More information is needed to resolve this issue.
- type/bug: Something isn't working as expected.
PR: feat: Enhance streaming API timeout handling with mathematical modeling
Overview
This PR addresses GitHub issue #239 by implementing a comprehensive mathematical modeling approach to understand and solve the streaming API timeout issue that occurs after 64 seconds.
Problem
The streaming API setup was timing out after 64 seconds, causing user frustration and limiting the tool's effectiveness for large requests. The error message provided generic troubleshooting tips but didn't offer specific solutions based on the request characteristics.
Solution
We've implemented a comprehensive mathematical modeling approach to understand and solve this timeout issue:
1. Mathematical Modeling
We created a `StreamingTimeoutModel` that calculates expected streaming request times from the characteristics of each request. This allows us to predict when timeouts will occur and recommend appropriate solutions.
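Since the model's code is not shown in this description, here is a minimal sketch of what it could look like; the `RequestCharacteristics` fields, the linear time estimate, and the constants are illustrative assumptions, not the PR's actual implementation:

```typescript
// Illustrative request characteristics; the real model's inputs may differ.
interface RequestCharacteristics {
  promptTokens: number; // size of the input prompt
  expectedOutputTokens: number; // anticipated completion length
  tokensPerSecond: number; // observed provider throughput
}

// Hypothetical sketch of StreamingTimeoutModel: estimate how long a
// streaming request should take, so a timeout can be chosen per request
// instead of using a fixed 64-second cutoff.
class StreamingTimeoutModel {
  // Assumed fixed overhead for connection setup and time-to-first-token.
  private static readonly BASE_LATENCY_MS = 2_000;

  estimateStreamingTimeMs(req: RequestCharacteristics): number {
    // Linear model: generation time grows with the expected output length.
    const generationMs = (req.expectedOutputTokens / req.tokensPerSecond) * 1_000;
    return StreamingTimeoutModel.BASE_LATENCY_MS + generationMs;
  }
}
```

A linear estimate is the simplest plausible choice; the actual model may weight additional factors.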
2. Adaptive Timeout Calculation
Instead of fixed timeouts, we now calculate adaptive timeouts based on request characteristics, as sketched below.
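Building on the sketch above, an adaptive timeout might apply a safety factor to the estimate and clamp the result; the factor and bounds here are assumed values for illustration only:

```typescript
// Hypothetical adaptive timeout: scale the estimate by a safety margin and
// clamp it, so small requests still fail fast while large ones get room.
function adaptiveTimeoutMs(
  model: StreamingTimeoutModel,
  req: RequestCharacteristics,
  safetyFactor = 2.0, // assumed margin over the raw estimate
  minTimeoutMs = 30_000, // floor: never time out in under 30s
  maxTimeoutMs = 300_000, // ceiling: never wait longer than 5 minutes
): number {
  const estimated = model.estimateStreamingTimeMs(req);
  return Math.min(maxTimeoutMs, Math.max(minTimeoutMs, estimated * safetyFactor));
}

// Example: this request's ~43s estimate yields an ~86s timeout instead of
// the fixed 64s cutoff that previously caused failures.
const timeoutMs = adaptiveTimeoutMs(new StreamingTimeoutModel(), {
  promptTokens: 8_000,
  expectedOutputTokens: 2_048,
  tokensPerSecond: 50,
});
```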
3. Enhanced Error Messaging
When timeouts occur, we now provide more specific troubleshooting guidance based on the request characteristics.
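As an illustration of what characteristic-aware guidance could look like (the thresholds and message wording are assumptions, reusing the hypothetical `RequestCharacteristics` shape from above; only the two CLI flags come from this PR):

```typescript
// Hypothetical error-message builder: tailor the troubleshooting tips to
// the request that actually timed out instead of printing generic advice.
function timeoutErrorMessage(req: RequestCharacteristics, timeoutMs: number): string {
  const lines = [`Streaming request timed out after ${Math.round(timeoutMs / 1_000)}s.`];
  if (req.expectedOutputTokens > 1_024) {
    // Long completions are a common timeout cause (assumed threshold).
    lines.push('- This request expects a long completion; try raising --openai-timeout or lowering max_tokens in samplingParams.');
  }
  if (req.promptTokens > 4_000) {
    lines.push('- Large prompt detected; consider splitting the request into smaller pieces.');
  }
  lines.push('- Transient failures can be retried automatically via --openai-max-retries.');
  return lines.join('\n');
}
```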
4. CLI Configuration Options
New CLI options allow users to configure timeout behavior:
- `--openai-timeout`: Set the API timeout in milliseconds
- `--openai-max-retries`: Set the maximum number of retry attempts

5. Configuration Recommendations
The system now provides configuration recommendations based on analysis of current settings.
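One plausible shape for such a recommendation, reusing the hypothetical model above (the threshold logic and message text are illustrative):

```typescript
// Hypothetical recommendation pass: compare the configured timeout with the
// model's estimate and suggest a concrete setting when it falls short.
function recommendConfig(
  model: StreamingTimeoutModel,
  req: RequestCharacteristics,
  configuredTimeoutMs: number,
): string | undefined {
  const estimatedMs = model.estimateStreamingTimeMs(req);
  if (configuredTimeoutMs >= estimatedMs) {
    return undefined; // current settings already leave enough headroom
  }
  // Suggest roughly twice the estimate, rounded up to a whole second.
  const suggestedMs = Math.ceil((estimatedMs * 2) / 1_000) * 1_000;
  return (
    `Configured timeout (${configuredTimeoutMs}ms) is below the estimated ` +
    `streaming time (${Math.round(estimatedMs)}ms); consider --openai-timeout ${suggestedMs}.`
  );
}
```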
Technical Implementation
Core Changes
- Added `--openai-timeout` and `--openai-max-retries` configuration options

Files Modified
- `packages/core/src/models/streamingTimeoutModel.ts` - New mathematical model
- `packages/core/src/models/streamingTimeoutModel.test.ts` - Tests for the model
- `packages/core/src/models/streamingTimeoutModel.verification.test.ts` - Formal verification tests
- `packages/core/src/core/openaiContentGenerator.ts` - Enhanced timeout handling
- `packages/cli/src/config/config.ts` - Added CLI options

Usage Examples
CLI Usage
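For example (assuming the CLI entry point is `qwen`; the two flags are the ones this PR adds):

```bash
# Raise the API timeout to 120s and allow up to 3 retries for large requests.
qwen --openai-timeout 120000 --openai-max-retries 3
```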
Configuration File
{ "contentGenerator": { "timeout": 120000, "maxRetries": 3, "samplingParams": { "temperature": 0.7, "max_tokens": 2048 } } }Testing
All tests pass, including the new tests for the streaming timeout model.
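A sketch of what one of those tests might look like, assuming the hypothetical `estimateStreamingTimeMs` API from the sketches above and the repository's vitest setup:

```typescript
import { describe, expect, it } from 'vitest';

describe('StreamingTimeoutModel', () => {
  it('scales the time estimate with the expected output length', () => {
    const model = new StreamingTimeoutModel();
    const short = model.estimateStreamingTimeMs({
      promptTokens: 100,
      expectedOutputTokens: 128,
      tokensPerSecond: 50,
    });
    const long = model.estimateStreamingTimeMs({
      promptTokens: 100,
      expectedOutputTokens: 2_048,
      tokensPerSecond: 50,
    });
    expect(long).toBeGreaterThan(short);
  });
});
```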
Future Improvements
This solution transforms a frustrating timeout issue into an opportunity for intelligent, adaptive system behavior that improves the user experience for large and complex requests.
Fixes #239