-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storyboarding agent #90
base: main
Are you sure you want to change the base?
Conversation
LLM default timeout to 120s
WalkthroughThe changes introduce a new Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant ChatHandler
participant StoryboardingAgent
participant VideoDBTool
participant LLM
participant AudioService
participant VideoDatabase
User->>ChatHandler: Request storyboard generation
ChatHandler->>StoryboardingAgent: Run storyboard generation
StoryboardingAgent->>LLM: Generate voiceover scripts
LLM-->>StoryboardingAgent: Return scripts
StoryboardingAgent->>VideoDBTool: Generate images
VideoDBTool-->>StoryboardingAgent: Return images
StoryboardingAgent->>AudioService: Generate audio
AudioService-->>StoryboardingAgent: Return audio
StoryboardingAgent->>VideoDatabase: Combine assets
VideoDatabase-->>StoryboardingAgent: Return video URL
StoryboardingAgent-->>ChatHandler: Return video URL
ChatHandler-->>User: Provide video URL
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Outside diff range and nitpick comments (7)
backend/director/llm/base.py (1)
45-45
: Consider making timeout configurable per operation typeWhile increasing the timeout to 120s might be necessary for storyboarding, this change affects all LLM operations since it's in the base configuration. Consider:
- Making timeout configurable per operation type or agent
- Using a reasonable default timeout (e.g., 30s) and allowing specific agents to override it
- Adding a separate configuration for long-running operations like storyboarding
This would provide better flexibility and prevent potential resource issues from unnecessarily long timeouts in simpler operations.
class BaseLLMConfig(BaseSettings): # ... other fields ... - timeout: int = 120 + timeout: int = 30 + operation_timeouts: Dict[str, int] = {"storyboarding": 120} + + def get_timeout(self, operation_type: str = None) -> int: + """Get timeout for specific operation type or default timeout.""" + return self.operation_timeouts.get(operation_type, self.timeout)backend/director/handler.py (1)
LLM timeout configuration needs error handling improvement
The default timeout of 120 seconds is properly configured in
backend/director/llm/base.py
and propagated to LLM implementations. However, the chat handler's error handling needs improvement:
- The current generic
except Exception
block inhandler.py
doesn't specifically handle timeout scenarios- Add specific handling for timeout exceptions before the generic catch-all to provide better error messages and potentially implement retry logic
Example locations that need attention:
backend/director/handler.py
: Add specific timeout exception handling in the chat method's try-except block- Consider catching
openai.error.Timeout
and other LLM-specific timeout exceptions🔗 Analysis chain
Line range hint
89-106
: Verify LLM timeout configuration.The PR objective mentions setting default LLM timeout to 120 seconds. Let's verify this configuration is properly propagated to the StoryboardingAgent:
The chat method should handle potential timeout scenarios gracefully. Consider adding specific error handling for timeout exceptions in the global try-catch block.
Let me verify the error handling for timeout scenarios by checking the exception handling in the codebase.
Let me check the exception handling in the chat handler specifically.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify LLM timeout configuration # Look for timeout configuration in base agent or LLM wrapper rg -A 5 "timeout.*=.*120|DEFAULT_TIMEOUT" # Check for any existing timeout configurations rg "timeout" --type pyLength of output: 734
Script:
#!/bin/bash # Check for timeout-related exception handling rg -B 2 -A 5 "except.*Timeout|except.*timeout" --type py # Look for general exception handling in the chat method ast-grep --pattern 'try { $$$ } except $_ as $_: { $$$ }'Length of output: 122
Script:
#!/bin/bash # Look for exception handling in handler.py rg -B 5 -A 10 "try:" backend/director/handler.py # Check for any error handling related to LLM calls rg -B 2 -A 5 "except" backend/director/handler.pyLength of output: 1055
backend/director/agents/storyboarding.py (5)
96-96
: ReplaceUsing
print(steps)
with a logging statement.Apply this diff:
-print(steps) +logger.debug(steps)
130-130
: ReplaceUsing
print(steps)
with a logging statement.Apply this diff:
-print(steps) +logger.debug(steps)
144-146
: Parameterize thevoice_id
in audio generationThe
voice_id
is hardcoded, which limits flexibility and reusability. Consider making thevoice_id
configurable or passing it as a parameter to support different voice options.Apply this diff:
- config={ - "voice_id": "IsEXLHzSvLH9UMB6SLHj", - }, + config={ + "voice_id": self.voice_id, + },And initialize
self.voice_id
in the__init__
method or pass it as an argument.
162-162
: ReplaceUsing
print(steps)
with a logging statement.Apply this diff:
-print(steps) +logger.debug(steps)
284-286
: Replace hardcoded base video URL with a configurable parameterThe base video URL is hardcoded, which may limit flexibility. Consider making the base video URL a configurable parameter or passing it as an argument to the method.
Apply this diff:
- base_video = self.videodb_tool.upload( - "https://www.youtube.com/watch?v=4dW1ybhA5bM" - ) + base_video_url = kwargs.get("base_video_url", "default_url") + base_video = self.videodb_tool.upload(base_video_url)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
backend/director/agents/storyboarding.py
(1 hunks)backend/director/handler.py
(2 hunks)backend/director/llm/base.py
(1 hunks)backend/director/tools/videodb_tool.py
(2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.0)
backend/director/agents/storyboarding.py
147-147: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
219-219: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
261-261: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
290-290: Loop control variable index
not used within loop body
Rename unused index
to _index
(B007)
336-336: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
🔇 Additional comments (5)
backend/director/llm/base.py (1)
45-45
: Verify the timeout requirements
The 120-second timeout seems unusually high for API calls. Let's verify:
- If this timeout value is based on actual measurements
- If there are any underlying performance issues that could be addressed
- The impact on system resources if multiple requests timeout at 120s
✅ Verification successful
120-second timeout is appropriate for LLM API calls
The timeout value of 120 seconds is actually reasonable in this context because:
- It's used for LLM API calls (OpenAI, Anthropic, VideoDB) which can take longer than typical web API calls, especially for:
- Complex completions with large context windows
- Streaming responses
- Models that require more processing time
- The implementation includes proper error handling across all LLM clients
- The timeout is defined in the base class and consistently used across different LLM implementations
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for timeout-related patterns and configurations
# Look for timeout-related error handling or retry logic
echo "Checking for timeout handling patterns..."
rg -A 3 "timeout|ConnectionError|RequestException"
# Look for any existing timeout configurations
echo "Checking for other timeout configurations..."
rg -A 2 "timeout.*=|TIMEOUT"
# Look for comments or documentation about timeouts
echo "Checking for timeout-related documentation..."
rg -A 2 "timeout" -g "*.md"
Length of output: 1863
Script:
#!/bin/bash
# Let's check for any LLM-related error handling and actual API calls
# to understand the context of this timeout better
# Check for LLM API call implementations and error handling
echo "Checking LLM API call patterns..."
ast-grep --pattern 'async def $FUNC($$$) {
$$$
response = $$$
$$$
}'
# Look for any retry mechanisms or error handling specifically in LLM files
echo "Checking retry mechanisms..."
rg -A 5 "retry|backoff|sleep|wait" backend/director/llm/
# Check for any configuration files that might explain the timeout value
echo "Looking for config files..."
fd -e yaml -e json -e toml -e ini . backend/director/ --exec rg -l "timeout"
Length of output: 427
Script:
#!/bin/bash
# Let's examine the actual LLM API calls to understand the timeout usage
# Look for completion/chat completion implementations
echo "Checking completion implementations..."
rg -A 10 "async def (chat_completion|completion)" backend/director/llm/
# Check for any HTTP client configurations or actual API calls
echo "Checking HTTP client usage..."
rg -A 5 "(aiohttp|httpx|requests|client\.)" backend/director/llm/
# Look for any comments or docstrings explaining the timeout
echo "Checking for timeout documentation in code..."
rg -B 2 -A 1 "timeout.*=.*120" backend/director/llm/
Length of output: 1925
backend/director/handler.py (3)
25-25
: LGTM! Clean import addition.
The import statement follows the established pattern and is placed logically with other agent imports.
64-64
: LGTM! Agent registration follows existing pattern.
The StoryboardingAgent is properly registered in the agents list.
Line range hint 36-65
: Verify agent initialization and error handling.
The new StoryboardingAgent will be initialized along with other agents in the chat method. Let's verify its integration:
Consider implementing the following best practices:
- Ensure proper timeout handling in the StoryboardingAgent's run/execute methods
- Add appropriate error handling for storyboard generation failures
- Consider implementing a retry mechanism for transient failures
✅ Verification successful
StoryboardingAgent implementation follows proper error handling and base agent patterns
The StoryboardingAgent properly extends BaseAgent and implements comprehensive error handling:
- Follows the base agent pattern with proper initialization and run method implementation
- Implements thorough error handling with try-catch blocks at multiple levels:
- Main run method has a top-level try-catch that handles any unexpected errors
- Individual steps (voiceover scripts, image generation, audio generation, asset combination) have their own error handling
- Each error case properly updates the output message status and returns appropriate AgentResponse
- Includes proper status updates through OutputMessage for progress tracking
- Validates required environment variables before proceeding
- Returns standardized AgentResponse objects with appropriate status and messages
The implementation aligns well with other agents in the codebase like VideoGenerationAgent and ImageGenerationAgent, following the same patterns for error handling and status updates.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify StoryboardingAgent implementation and error handling
# Check if StoryboardingAgent properly extends BaseAgent
ast-grep --pattern 'class StoryboardingAgent($_) {
$$$
}'
# Check for proper error handling in the agent
rg -A 5 "class StoryboardingAgent" | rg -A 10 "def run|def execute"
# Look for any existing error handling patterns in other agents
rg -l "try.*except" --type py
Length of output: 163
Script:
#!/bin/bash
# Let's try a different approach to verify the StoryboardingAgent implementation
# First, let's find where StoryboardingAgent is defined
fd -e py | xargs rg "class StoryboardingAgent"
# Check for imports of StoryboardingAgent
rg "from.*StoryboardingAgent|import.*StoryboardingAgent"
# Look for BaseAgent pattern to understand the expected structure
fd -e py | xargs rg "class.*BaseAgent"
# Check for common error handling patterns in agents
fd -e py | xargs rg "class.*Agent.*:.*\n.*def (run|execute).*\n.*try:"
Length of output: 2284
Script:
#!/bin/bash
# Let's check the StoryboardingAgent implementation and error handling patterns
# Check the StoryboardingAgent implementation
cat backend/director/agents/storyboarding.py
# Check the BaseAgent implementation to understand the required structure
cat backend/director/agents/base.py
# Let's look at a few other agents for error handling patterns
cat backend/director/agents/video_generation.py
cat backend/director/agents/image_generation.py
Length of output: 26906
backend/director/tools/videodb_tool.py (1)
77-83
: Verify integration with StoryboardingAgent.
Let's verify how this method is used by the StoryboardingAgent to ensure the returned fields are sufficient for its needs.
✅ Verification successful
Let me gather more context about the ImageAsset usage to ensure we have a complete picture.
Integration with StoryboardingAgent is verified and working correctly
The minimal fields returned by get_image()
are sufficient for the StoryboardingAgent's needs. The agent only uses the id
field from the returned image object to create an ImageAsset
, which is exactly what's being provided by the current implementation.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for usage of get_image in StoryboardingAgent
ast-grep --pattern 'get_image($_)'
# Search for any direct references to VideoDBTool and image handling
rg -A 5 'VideoDBTool.*image|get_image'
Length of output: 1575
Script:
#!/bin/bash
# Search for ImageAsset class definition and usage
ast-grep --pattern 'class ImageAsset'
# Search for additional context around ImageAsset instantiation
rg -B 5 -A 10 'ImageAsset\('
Length of output: 2613
def get_image(self, image_id): | ||
"""Get a image by ID.""" | ||
image = self.collection.get_image(image_id) | ||
return { | ||
"id": image.id, | ||
"name": image.name, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider enhancing the image metadata and error handling.
The implementation could be improved in several ways:
- Add error handling for invalid image_id and uninitialized collection
- Return additional useful metadata fields (url, collection_id) for consistency with other getter methods
Consider applying this enhancement:
def get_image(self, image_id):
"""Get a image by ID."""
+ if not self.collection:
+ raise ValueError("Collection not initialized")
image = self.collection.get_image(image_id)
return {
"id": image.id,
"name": image.name,
+ "collection_id": image.collection_id,
+ "url": image.url
}
Committable suggestion skipped: line range outside the PR's diff.
return None | ||
image_url = flux_output[0].url | ||
return image_url | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove unused variable e
in exception handling
The variable e
is assigned but never used in the except
block. You can remove as e
to clean up the code.
Apply this diff:
- except Exception as e:
+ except Exception:
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
except Exception as e: | |
except Exception: |
🧰 Tools
🪛 Ruff (0.8.0)
261-261: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
# Parse the response | ||
step_data = json.loads(llm_response.content)["steps"] | ||
return step_data | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove unused variable e
in exception handling
The variable e
is assigned but never used in the except
block. You can remove as e
to clean up the code.
Apply this diff:
- except Exception as e:
+ except Exception:
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
except Exception as e: | |
except Exception: |
🧰 Tools
🪛 Ruff (0.8.0)
219-219: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
"voice_id": "IsEXLHzSvLH9UMB6SLHj", | ||
}, | ||
) | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove unused variable e
in exception handling
The variable e
is assigned but never used in the except
block. You can remove as e
to clean up the code.
Apply this diff:
- except Exception as e:
+ except Exception:
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
except Exception as e: | |
except Exception: |
🧰 Tools
🪛 Ruff (0.8.0)
147-147: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
# Generate the video | ||
video_url = timeline.generate_stream() | ||
return video_url | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove unused variable e
in exception handling
The variable e
is assigned but never used in the except
block. You can remove as e
to clean up the code.
Apply this diff:
- except Exception as e:
+ except Exception:
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
except Exception as e: | |
except Exception: |
🧰 Tools
🪛 Ruff (0.8.0)
336-336: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
base_video_id = base_video["id"] | ||
|
||
seeker = 0 | ||
for index, step in enumerate(steps): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Unused loop variable index
The loop variable index
is not used within the loop body. You can remove it to clean up the code.
Apply this diff:
- for index, step in enumerate(steps):
+ for step in steps:
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for index, step in enumerate(steps): | |
for step in steps: |
🧰 Tools
🪛 Ruff (0.8.0)
290-290: Loop control variable index
not used within loop body
Rename unused index
to _index
(B007)
consistent_part = "Create a stippling black ballpoint pen illustration of a Nigerian woman with a tight afro, living in her minimalist New York apartment. Keep the illustration simple with minimal elements. Generate less than 600 characters" | ||
variable_part = f"This illustration is a part of a storyboard to explain the user journey of an app built for {app_description}. This image will portray the '{step['step']}' stage in the app. Step description: {step['step_description']}. This illustration is meant for professional storyboarding, so understand the requirements accordingly and create a suitable illustration with the woman as a central character in the frame, but include other supporting props that can indicate that she's in the '{step['step']}' step in the user flow." | ||
prompt = f"{consistent_part}\n- {variable_part}" | ||
return prompt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Make image prompt generation more dynamic
The consistent_part
of the prompt includes hardcoded details about a specific character and setting, which may not be appropriate for all applications. Consider making the prompt more dynamic to account for different user scenarios and app descriptions.
For example, you could incorporate character details as parameters or generate them based on the app_description
.
- consistent_part = "Create a stippling black ballpoint pen illustration of a Nigerian woman with a tight afro, living in her minimalist New York apartment. Keep the illustration simple with minimal elements. Generate less than 600 characters"
+ consistent_part = f"Create a stippling black ballpoint pen illustration relevant to {app_description}. Keep the illustration simple with minimal elements."
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (2)
backend/director/agents/storyboarding.py (2)
144-145
: Move hardcoded voice_id to configurationThe voice_id should be configurable rather than hardcoded to allow for different voices without code changes.
+ VOICE_ID = os.getenv("ELEVENLABS_VOICE_ID", "IsEXLHzSvLH9UMB6SLHj") def run(self, ...): # ... self.audio_gen_tool.text_to_speech( text=step["voiceover_script"], save_at=voiceover_path, config={ - "voice_id": "IsEXLHzSvLH9UMB6SLHj", + "voice_id": VOICE_ID, }, )
197-241
: Consider making prompt templates configurableThe prompt structure is currently hardcoded. Consider moving it to a template file or configuration to allow for easier modifications and potential A/B testing of different prompt structures.
+ VOICEOVER_PROMPT_TEMPLATE = """ + Generate a structured response for an app that {app_description}. Here are the steps involved in the user journey: + {steps_prompt} + Return a response in JSON format, with key 'steps', and value being a list of dicts, where each dict has two keys 'step_description' and 'voiceover_script', like this: + {json_format} + """ def prompt_voiceover_scripts(self, steps, app_description): - prompt = f"Generate a structured response..." + steps_prompt = self._generate_steps_prompt(steps) + return VOICEOVER_PROMPT_TEMPLATE.format( + app_description=app_description, + steps_prompt=steps_prompt, + json_format=self._get_json_format_example() + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
backend/director/agents/storyboarding.py
(1 hunks)
🔇 Additional comments (1)
backend/director/agents/storyboarding.py (1)
1-37
: LGTM! Clean initialization with proper inheritance
The imports and class initialization are well-structured with appropriate base class inheritance and initialization of required attributes.
|
||
# Step 2: Generate images using Flux | ||
self.output_message.actions.append("Generating images...") | ||
self.output_message.push_update() | ||
|
||
for index, step in enumerate(steps): | ||
image_url = self.generate_image_flux(step, app_description) | ||
if not image_url: | ||
video_content.status = MsgStatus.error | ||
video_content.status_message = ( | ||
f"Error in generating image for step {index+1}." | ||
) | ||
self.output_message.publish() | ||
error_message = f"Agent failed with error in generating image for step {index+1}." | ||
return AgentResponse( | ||
status=AgentStatus.ERROR, message=error_message | ||
) | ||
|
||
# Download webp image and convert to PNG | ||
response = requests.get(image_url) | ||
if response.status_code == 200: | ||
image = Image.open(io.BytesIO(response.content)) | ||
os.makedirs(DOWNLOADS_PATH, exist_ok=True) | ||
png_path = f"{DOWNLOADS_PATH}/{str(uuid.uuid4())}.png" | ||
image.save(png_path, "PNG") | ||
media = self.videodb_tool.upload( | ||
png_path, source_type="file_path", media_type="image" | ||
) | ||
steps[index]["image_id"] = media["id"] | ||
else: | ||
logger.error("Failed to download image") | ||
return None | ||
|
||
print(steps) | ||
|
||
# Step 3: Generate voiceover audio using ElevenLabs | ||
self.output_message.actions.append("Generating voiceover audio...") | ||
self.output_message.push_update() | ||
|
||
for index, step in enumerate(steps): | ||
try: | ||
os.makedirs(DOWNLOADS_PATH, exist_ok=True) | ||
voiceover_path = f"{DOWNLOADS_PATH}/{str(uuid.uuid4())}.mp3" | ||
self.audio_gen_tool.text_to_speech( | ||
text=step["voiceover_script"], | ||
save_at=voiceover_path, | ||
config={ | ||
"voice_id": "IsEXLHzSvLH9UMB6SLHj", | ||
}, | ||
) | ||
except Exception as e: | ||
video_content.status = MsgStatus.error | ||
video_content.status_message = ( | ||
f"Error in generating audio for step {step['step']}." | ||
) | ||
self.output_message.publish() | ||
error_message = f"Agent failed with error in generating audio for step {index+1}. Error: {str(e)}" | ||
return AgentResponse( | ||
status=AgentStatus.ERROR, message=error_message | ||
) | ||
media = self.videodb_tool.upload( | ||
voiceover_path, source_type="file_path", media_type="audio" | ||
) | ||
steps[index]["audio_id"] = media["id"] | ||
|
||
print(steps) | ||
|
||
# # Step 4: Combine assets using VideoDB | ||
self.output_message.actions.append( | ||
"Combining assets into storyboard video..." | ||
) | ||
self.output_message.push_update() | ||
|
||
video_url = self.combine_assets(steps) | ||
if not video_url: | ||
video_content.status = MsgStatus.error | ||
video_content.status_message = "Error in combining assets into video." | ||
self.output_message.publish() | ||
error_message = ( | ||
"Agent failed with error in combining assets into video." | ||
) | ||
return AgentResponse(status=AgentStatus.ERROR, message=error_message) | ||
|
||
video_content.video = VideoData(stream_url=video_url) | ||
video_content.status = MsgStatus.success | ||
video_content.status_message = "Here is your generated storyboard video." | ||
self.output_message.publish() | ||
except Exception as e: | ||
logger.exception(f"Error in {self.agent_name}") | ||
video_content.status = MsgStatus.error | ||
video_content.status_message = "Error in generating storyboard." | ||
self.output_message.publish() | ||
error_message = f"Agent failed with error {e}" | ||
return AgentResponse(status=AgentStatus.ERROR, message=error_message) | ||
return AgentResponse( | ||
status=AgentStatus.SUCCESS, | ||
message=f"Agent {self.agent_name} completed successfully.", | ||
data={"video_url": video_url}, | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider breaking down the run method into smaller, focused methods
The method is handling multiple responsibilities (script generation, image generation, audio generation, and video creation). Consider extracting these into separate private methods for better maintainability and testing.
Example refactor:
def run(self, collection_id: str, app_description: str, raw_steps: List[str], *args, **kwargs) -> AgentResponse:
- # Current implementation
+ try:
+ self._initialize_tools(collection_id)
+ steps = self._prepare_steps(raw_steps)
+
+ if not self._generate_scripts(steps, app_description):
+ return self._error_response("Error in generating voiceover scripts.")
+
+ if not self._generate_images(steps, app_description):
+ return self._error_response("Error in generating images.")
+
+ if not self._generate_audio(steps):
+ return self._error_response("Error in generating audio.")
+
+ video_url = self._combine_assets(steps)
+ if not video_url:
+ return self._error_response("Error in combining assets.")
+
+ return self._success_response(video_url)
+ except Exception as e:
+ return self._handle_exception(e)
Committable suggestion skipped: line range outside the PR's diff.
def combine_assets(self, steps): | ||
""" | ||
Combines images and audio into a storyboard video using VideoDB. | ||
:param steps: List of dicts containing image and audio URLs. | ||
:return: URL of the generated video. | ||
""" | ||
# Initialize VideoDB client | ||
timeline = self.videodb_tool.get_and_set_timeline() | ||
# Upload base video | ||
base_video = self.videodb_tool.upload( | ||
"https://www.youtube.com/watch?v=4dW1ybhA5bM" | ||
) | ||
base_video_id = base_video["id"] | ||
|
||
seeker = 0 | ||
for step in steps: | ||
# Upload image and audio | ||
audio = self.videodb_tool.get_audio(step["audio_id"]) | ||
image = self.videodb_tool.get_image(step["image_id"]) | ||
|
||
audio_duration = float(audio["length"]) | ||
|
||
image_asset = ImageAsset( | ||
asset_id=image["id"], | ||
duration=audio_duration, | ||
x="(main_w-overlay_w)/2", | ||
y="(main_h-overlay_h)/2", | ||
height="w=iw/3", | ||
width="h=ih/3", | ||
) | ||
audio_asset = AudioAsset(asset_id=audio["id"], disable_other_tracks=True) | ||
text_asset = TextAsset( | ||
step["step"], | ||
duration=audio_duration, | ||
style=TextStyle( | ||
x="(w-text_w)/2", | ||
y="(h-text_h*2)", | ||
font="League Spartan", | ||
fontsize="(h/20)", | ||
fontcolor="Snow", | ||
boxcolor="OrangeRed", | ||
boxborderw=10, | ||
), | ||
) | ||
|
||
# Add assets to timeline | ||
timeline.add_overlay(seeker, image_asset) | ||
timeline.add_overlay(seeker, audio_asset) | ||
timeline.add_overlay(seeker, text_asset) | ||
|
||
seeker += audio_duration | ||
|
||
# Add base video to timeline | ||
base_vid_asset = VideoAsset(base_video_id, end=seeker) | ||
timeline.add_inline(base_vid_asset) | ||
|
||
# Generate the video | ||
video_url = timeline.generate_stream() | ||
return video_url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple improvements needed in asset combination
- The base video URL is hardcoded
- Text styling contains magic numbers
- Missing validation for asset existence
+ BASE_VIDEO_URL = os.getenv("STORYBOARD_BASE_VIDEO_URL", "https://www.youtube.com/watch?v=4dW1ybhA5bM")
+ TEXT_STYLE_CONFIG = {
+ "x": "(w-text_w)/2",
+ "y": "(h-text_h*2)",
+ "font": "League Spartan",
+ "fontsize": "(h/20)",
+ "fontcolor": "Snow",
+ "boxcolor": "OrangeRed",
+ "boxborderw": 10,
+ }
def combine_assets(self, steps):
+ def validate_asset(asset, asset_type):
+ if not asset:
+ raise ValueError(f"Missing {asset_type} asset")
+ return asset
timeline = self.videodb_tool.get_and_set_timeline()
base_video = self.videodb_tool.upload(
- "https://www.youtube.com/watch?v=4dW1ybhA5bM"
+ BASE_VIDEO_URL
)
for step in steps:
- audio = self.videodb_tool.get_audio(step["audio_id"])
- image = self.videodb_tool.get_image(step["image_id"])
+ audio = validate_asset(self.videodb_tool.get_audio(step["audio_id"]), "audio")
+ image = validate_asset(self.videodb_tool.get_image(step["image_id"]), "image")
text_asset = TextAsset(
step["step"],
duration=audio_duration,
- style=TextStyle(
- x="(w-text_w)/2",
- y="(h-text_h*2)",
- font="League Spartan",
- fontsize="(h/20)",
- fontcolor="Snow",
- boxcolor="OrangeRed",
- boxborderw=10,
- ),
+ style=TextStyle(**TEXT_STYLE_CONFIG),
)
Committable suggestion skipped: line range outside the PR's diff.
def generate_image_flux(self, step, app_description): | ||
""" | ||
Generates an image using Flux. | ||
:param step: Dict containing step information. | ||
:param app_description: Description of the app. | ||
:return: URL of the generated image. | ||
""" | ||
prompt = self.prompt_image_generation(step, app_description) | ||
flux_output = flux_dev(prompt) | ||
if not flux_output: | ||
logger.error("Error in generating image with Flux") | ||
return None | ||
image_url = flux_output[0].url | ||
return image_url | ||
|
||
def prompt_image_generation(self, step, app_description): | ||
""" | ||
Generates the prompt for image generation. | ||
""" | ||
consistent_part = "Create a stippling black ballpoint pen illustration of a Nigerian woman with a tight afro, living in her minimalist New York apartment. Keep the illustration simple with minimal elements. Generate less than 600 characters" | ||
variable_part = f"This illustration is a part of a storyboard to explain the user journey of an app built for {app_description}. This image will portray the '{step['step']}' stage in the app. Step description: {step['step_description']}. This illustration is meant for professional storyboarding, so understand the requirements accordingly and create a suitable illustration with the woman as a central character in the frame, but include other supporting props that can indicate that she's in the '{step['step']}' step in the user flow." | ||
prompt = f"{consistent_part}\n- {variable_part}" | ||
return prompt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make character description configurable and improve error handling
- The character description is hardcoded, limiting the versatility of the agent.
- The flux_dev call lacks proper error handling.
+ CHARACTER_DESCRIPTION = os.getenv(
+ "STORYBOARD_CHARACTER_DESCRIPTION",
+ "Create a stippling black ballpoint pen illustration of a Nigerian woman with a tight afro, living in her minimalist New York apartment."
+ )
def generate_image_flux(self, step, app_description):
prompt = self.prompt_image_generation(step, app_description)
- flux_output = flux_dev(prompt)
- if not flux_output:
+ try:
+ flux_output = flux_dev(prompt)
+ if not flux_output:
+ raise ValueError("Empty response from Flux")
+ return flux_output[0].url
+ except Exception as e:
logger.error("Error in generating image with Flux")
+ logger.exception(e)
return None
- image_url = flux_output[0].url
- return image_url
def prompt_image_generation(self, step, app_description):
- consistent_part = "Create a stippling black ballpoint pen illustration..."
+ consistent_part = f"{CHARACTER_DESCRIPTION} Keep the illustration simple with minimal elements."
Committable suggestion skipped: line range outside the PR's diff.
Summary by CodeRabbit
New Features
StoryboardingAgent
, enabling users to generate storyboard videos using generative AI models.Improvements
Documentation
get_audio
method to include a descriptive docstring for clarity.