Support Dict observation spaces #1065
Comments
I have an environment with a variable-length observation: a text sequence. The batch mechanism doesn't seem to be compatible with observations whose length varies. How can I deal with this?
I suggest either wrapping your environment so that it doesn't have a dict space as observations (this should always be possible), or working on a PR to solve this issue (might not be easy though). Not sure what you mean by variable observation. If it's a text sequence as an array, Batch can handle it, but you will need a custom Agent for processing text.
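To illustrate the wrapping suggestion above, here is a minimal sketch of turning a dict observation into a single flat array. It assumes every value in the dict is numeric and array-like; the function name `flatten_dict_obs` and the example keys are hypothetical and not part of tianshou. Gymnasium also ships a `FlattenObservation` wrapper that does this at the env level.

```python
import numpy as np


def flatten_dict_obs(obs: dict) -> np.ndarray:
    """Concatenate the values of a dict observation into one flat float32 vector."""
    # Sort the keys so the layout of the vector is identical at every step.
    parts = [np.asarray(obs[key], dtype=np.float32).ravel() for key in sorted(obs)]
    return np.concatenate(parts)


# Hypothetical observation with two entries of different shapes.
obs = {"position": np.array([1.0, 2.0]), "image": np.zeros((2, 2))}
flat = flatten_dict_obs(obs)  # layout: 4 "image" values, then 2 "position" values
```

In a real setup this function would be called from the `observation` method of an observation wrapper, and the wrapped env would declare a matching flat `Box` observation space.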
My environment is a web browser. The returned observation is the text sequence shown on the web page, so its length changes at every step. The batch mechanism raises errors when handling this kind of observation. I don't think padding and truncating should be necessary to post-process the observations.
I see. It's not really related to this issue, which is about Dict interfaces. Your environment violates the gym/Gymnasium API, where an env is assumed to have a fixed numerical observation space. In any case, for your model training you do need to process the sequences into arrays of the same length, right? I suggest you wrap your environment with a Wrapper that turns it into a gym-like env. Supporting non-gym envs is outside the scope of tianshou for now, though we might come back to it in the distant future for better support of RLHF.
Thanks for your advice. I checked the tianshou.data module, and figuring out how to replace the Batch class seems to need a lot of time. I will try padding and masking the input to solve it.
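For anyone landing here with the same problem, a minimal sketch of the pad-and-mask approach mentioned above. It assumes the text has already been tokenized into integer ids; `pad_and_mask`, `max_len` and `pad_id` are hypothetical names, not tianshou API. Ids and mask are stacked into one fixed-shape array so the observation stays a plain array rather than a dict.

```python
import numpy as np


def pad_and_mask(token_ids, max_len, pad_id=0):
    """Return an int64 array of shape (2, max_len): row 0 holds ids, row 1 the mask."""
    ids = np.full(max_len, pad_id, dtype=np.int64)
    mask = np.zeros(max_len, dtype=np.int64)
    # Truncate sequences longer than max_len, pad shorter ones with pad_id.
    n = min(len(token_ids), max_len)
    ids[:n] = token_ids[:n]
    mask[:n] = 1  # 1 marks a real token, 0 marks padding
    return np.stack([ids, mask])


short = pad_and_mask([5, 6, 7], max_len=5)       # padded on the right
long = pad_and_mask([1, 2, 3, 4], max_len=2)     # truncated to two tokens
```

Every observation then has the same shape, so it can be stored in a buffer and batched like any other array; the model reads the mask row to ignore padded positions.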
Yes, Batch is pretty fundamental in tianshou and used everywhere ^^
Maybe also action spaces.
I'm not sure what the status of the current support is, and I can't estimate the complexity.
It's probably not a priority, but if an external contributor wants to look into it, we could review this. The solution should come with proper documentation.
Looking at how other projects support complex action/observation spaces might be a good start.
Related issues: #1064