-
Notifications
You must be signed in to change notification settings - Fork 195
Add some more options to play_w_env.py #126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Claude finished @joschu's task —— View job Code Review Complete ✓I've reviewed the changes to Summary of ChangesThis PR adds two new optional parameters to customize the interactive play experience:
Positive Aspects ✓
Issues & Suggestions1. Minor Formatting Inconsistency (tinker_cookbook/rl/play_w_env.py:61)action_str = await get_async_input(colored("Your action:", "yellow"), multiline=self.multiline)The prompt changed from Suggestion: Consider restoring the trailing space for single-line mode: prompt_text = "Your action:" if self.multiline else "Your action: "
action_str = await get_async_input(colored(prompt_text, "yellow"), multiline=self.multiline)2. Observation Display Control (tinker_cookbook/rl/play_w_env.py:54-59)The
Question: Was the decision to hide observations by default intentional? For debugging purposes (which is the stated goal in the docstring), showing observations might be more helpful as the default behavior. Consider whether 3. Multiline Input Edge CaseThe multiline logic is solid, but there's a subtle behavior worth noting: if line == "" and prev_line == "":
if lines and lines[-1] == "":
lines.pop()
breakThis correctly removes the first blank line before terminating. However, 4. Documentation GapThe module docstring at the top mentions the example, but doesn't document the new parameters. Consider adding a brief note: """
To help you debug your environment, you can use the play_env function to play as the policy by typing in your responses in an environment interactively.
Options:
- multiline=True: Enable multi-line input (terminate with two blank lines)
- show_observation=False: Display the full observation at each step
We include an example of playing the Twenty Questions environment in the main function.
...
"""Alignment with Repository ConventionsBased on CLAUDE.md guidelines:
Testing RecommendationsSince this is a utility for manual testing, automated tests might be challenging. However, consider:
Final RecommendationApprove with minor suggestions. The changes are well-implemented and provide useful flexibility. The only concerns are:
These are optional improvements—the code is functional as-is and ready to merge if the defaults align with your intended use cases. |
|
@claude please make the following changes (based on your review):
|
|
Claude finished @joschu's task —— View job Changes Implemented ✓All requested changes have been completed and pushed to the branch:
Summary of Changes1. Prompt spacing fix (line 64-65) prompt_text = "Your action:" if self.multiline else "Your action: "
action_str = await get_async_input(colored(prompt_text, "yellow"), multiline=self.multiline)Now single-line mode includes a trailing space for better readability. 2.
This maintains the previous behavior of showing observations by default. 3. Documentation added (lines 4-5) Options:
- multiline=True: Enable multi-line input mode (terminate with two blank lines)As requested, only documented the | |
…tion default, add multiline docs - Add trailing space to prompt in single-line mode for better UX - Change show_observation default from False to True - Add documentation for multiline parameter in module docstring Co-authored-by: John Schulman <[email protected]>
No description provided.