
Install & Update Package Dependencies #512

Closed
wants to merge 15 commits

Conversation

AI-Ahmed
Contributor

Install a specific version of the SciPy library (scipy==1.1.0) to fix the error in Colab. Also, update the Gym library downloaded on Colab and install gym with the Atari & ROM licence extras after downloading xvfb.

Install specific version of SciPy library to fix the error in Colab. Also, installed Gym library for Atari & ROM licence after downloading xvfb
@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



Download and install the video game dependencies, `ROM`, and install `xvfb` correctly. Downgrade the `gym` environment for monitoring the game.
Add the video lecture of Pavel Shvechikov (yandexdataschool#510)
week06_policy_based/a2c-optional.ipynb (resolved)
week08_pomdp/practice_pytorch.ipynb (outdated, resolved)
week08_pomdp/practice_pytorch.ipynb (outdated, resolved)
Update the hash of the imported scripts `atari_util.py` and `env_pool.py`.
Contributor Author

@AI-Ahmed AI-Ahmed left a comment

Updated the hash and fixed the typo in the gym version.

AI-Ahmed and others added 3 commits November 25, 2022 17:28
Added a new monitoring script and reward scaling to get better training performance and reduce gradient explosion.
Contributor Author

@AI-Ahmed AI-Ahmed left a comment

Thanks for reviewing the notebook, but I can't see where you committed your changes, since the file is not rendering and the title doesn't say. Would you mind telling me the simplified diff you made, please?

image

Contributor Author

@AI-Ahmed AI-Ahmed left a comment

Also, I see you deleted lots of things!

Collaborator

@dniku dniku left a comment

Please take a look at my suggestions.

" os.system('wget https://raw.githubusercontent.com/yandexdataschool/Practical_RL/master/xvfb -O ../xvfb')\n",
" # Download and install Video game dependencies\n",
" os.system('wget -q https://raw.githubusercontent.com/yandexdataschool/Practical_RL/master/setup_colab.sh -O- | bash')\n",
" os.system('touch .setup_complete')\n",
Collaborator

This file is meant to be an indicator that the entire setup block does not need to be re-run. Please put its creation at the bottom of this block.
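For illustration, a minimal sketch of the ordering being requested, reusing commands from the quoted diff (the rest of the setup block is abbreviated):

import os

# Skip the whole setup block if it has already completed once.
if not os.path.exists('.setup_complete'):
    # ... download xvfb, the helper scripts, and the game dependencies here ...
    os.system('wget -q https://raw.githubusercontent.com/yandexdataschool/Practical_RL/master/setup_colab.sh -O- | bash')

    # Create the marker file last, so an interrupted setup is retried on the next run.
    os.system('touch .setup_complete')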

Contributor Author

It is a bit weird where the file is placed, because this is what I have in my repo!!

image

Maybe this is an old version...

"\n",
" !wget https://raw.githubusercontent.com/yandexdataschool/Practical_RL/f1d7764b276cadb7365b1fdb6f6dd3fbd4e7bd8d/week08_pomdp/atari_util.py\n",
" !wget https://raw.githubusercontent.com/yandexdataschool/Practical_RL/f1d7764b276cadb7365b1fdb6f6dd3fbd4e7bd8d/week08_pomdp/env_pool.py\n",
" # Setup the attari driver for video games\n",
Collaborator

Please remove this comment.

Contributor Author

Which comment are you referring to?

" !wget https://raw.githubusercontent.com/yandexdataschool/Practical_RL/f1d7764b276cadb7365b1fdb6f6dd3fbd4e7bd8d/week08_pomdp/env_pool.py\n",
" # Setup the attari driver for video games\n",
" !wget -q https://raw.githubusercontent.com/yandexdataschool/Practical_RL/master/setup_colab.sh -O- | bash\n",
" !touch .setup_complete\n",
Collaborator

Please put the creation of this file at the bottom of this block.

Contributor Author

I checked the shell file and didn't see any problem with having it in its current place!

Comment on lines +27 to +29
"# Restart the Runtime\n",
"print(\"Restarting the Runtime...\")\n",
"os.kill(os.getpid(), 9)"
Collaborator

I very much doubt that this is necessary. Something like this should only be required if you modify the libraries that you import — but here, instead, you just install more libraries. It also prevents the user from using the "Restart & Run All" feature in Jupyter and Colab.

Please remove this.

Contributor Author

You are correct! We are modifying the Gym package by adding more extras, such as [atari,accept-rom-license], so we have to restart the kernel. I tested it without the kernel-killing step, and it showed me an error in the gym environment.
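For context, a minimal sketch of the pattern described above, assuming it runs in a Colab cell; the exact gym version pin used in the PR is not repeated here:

import os

# Install Gym with the Atari and ROM-licence extras mentioned above.
os.system('pip install -q "gym[atari,accept-rom-license]"')

# Restart the Colab runtime so the newly installed extras are picked up
# (the os.kill(os.getpid(), 9) pattern quoted earlier in this thread).
print("Restarting the Runtime...")
os.kill(os.getpid(), 9)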

@@ -25,9 +42,11 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
Collaborator

`os` has been imported previously.

Contributor Author

Yes! But after the kernel restart, I need to import it again!

" break"
"rollout_len = 10\n",
"# Change to higher number of steps after you ensure you get progress\n",
"STEPS = int(5e+4) #int(15e+3)\n",
Collaborator

5e+4 and 15e+3 are a lot harder to read than 50_000 and 15_000. Please fix.

Comment on lines +765 to +767
" if i % 500 == 0:\n",
" rollout_len = min(40, rollout_len + 1)\n",
" print(f\"\\nNumber of Interactions per steps: {rollout_len}\")"
Collaborator

It's worth pointing out explicitly (in a Markdown text block) that there is this rollout length schedule and the rationale for it.

Contributor Author

I think you guys have already done that!? And that's why I added it!!

image
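For context, the quoted schedule grows the rollout length by one step every 500 iterations, capping it at 40. Below is a self-contained sketch with one plausible rationale spelled out in comments (variable names follow the quoted diff; the training step itself is omitted):

rollout_len = 10  # start with short rollouts
STEPS = 50_000

for i in range(STEPS):
    # ... collect a rollout of length rollout_len and update the agent here ...

    # One plausible rationale: short rollouts keep early updates cheap and low-variance,
    # while gradually longer rollouts let the recurrent agent exploit longer histories.
    if i % 500 == 0:
        rollout_len = min(40, rollout_len + 1)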

" print(\"\\nYour agent has just passed the minimum homework threshold with score: \", rewards_history[-1])\n",
" break\n",
" elif 8000 <= rewards_history[-1] <= 8499:\n",
" print(\"\\nYour agent has just get 'Red' Belt with score: \", rewards_history[-1])\n",
Collaborator

has just got

" elif 8000 <= rewards_history[-1] <= 8499:\n",
" print(\"\\nYour agent has just get 'Red' Belt with score: \", rewards_history[-1])\n",
" elif 8500 <= rewards_history[-1] <= 9999:\n",
" print(\"\\nYour agent has just get 'Red-Black' Belt with score: \", rewards_history[-1])\n",
Collaborator

has just got

"metadata": {},
"outputs": [],
"source": [
"torch.save(agent, \"LOCATION TO SAVE YOUR AGENT\")"
Collaborator

  1. Please extract the checkpoint path into a separate variable such as `checkpoint_path`.
  2. This is not a good way to save PyTorch model weights. Please use `torch.save(agent.state_dict(), checkpoint_path)` instead.
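A minimal sketch of what these two suggestions amount to; SimpleRecurrentAgent, obs_shape, and n_actions are the names used elsewhere in this notebook, and the checkpoint path is illustrative:

import torch

checkpoint_path = "pomdp_agent_state_dict.pt"  # illustrative location

# Save only the weights rather than pickling the whole module.
torch.save(agent.state_dict(), checkpoint_path)

# To restore: rebuild the model, then load the saved weights into it.
agent = SimpleRecurrentAgent(obs_shape, n_actions)
agent.load_state_dict(torch.load(checkpoint_path))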

Contributor Author

For sure!

Contributor Author

Done!

image

@dniku
Collaborator

dniku commented Dec 4, 2022

My commit is basically a reformatting of the week08_pomdp/practice_pytorch.ipynb notebook that enables me to see the actual diff in the "Files changed" tab. Here is the exact script that I used to do that:

#!/usr/bin/env python3

import argparse
import subprocess
import sys
from pathlib import Path

import nbformat


def upgrade_notebook_version(path, new_version=4):
    # Documentation: https://nbformat.readthedocs.io/en/latest/api.html
    with path.open('r') as fp:
        nb = nbformat.read(fp, new_version)

    with path.open('w') as fp:
        nbformat.write(nb, fp)


def cleanup_fields(path, clear_outputs, notebook_full_metadata=False):
    jq_cmd = [
        # Remove execution count from inputs
        '(.cells[] | select(has("execution_count")) | .execution_count) = null',
        # Remove execution count from outputs
        '(.cells[] | select(has("outputs")) | .outputs[] | select(has("execution_count")) | .execution_count) = null',
        # Remove cell metadata
        '.cells[].metadata = {}',
    ]

    # Standardize notebook metadata
    if notebook_full_metadata:
        jq_cmd.append(
            '.metadata = {' +
                '"kernelspec": {' +
                    '"display_name": "Python 3", ' +
                    '"language": "python", ' +
                    '"name": "python3"' +
                '}, ' +
                '"language_info": {' +
                    '"codemirror_mode": {"name": "ipython", "version": 3}, ' +
                    '"file_extension": ".py", ' +
                    '"mimetype": "text/x-python", ' +
                    '"name": "python", ' +
                    '"nbconvert_exporter": "python", ' +
                    '"pygments_lexer": "ipython3"' +
                '}' +
            '}'
        )
    else:
        jq_cmd.append(
            '.metadata = {"language_info": {"name": "python", "pygments_lexer": "ipython3"}}'
        )


    if clear_outputs:
        jq_cmd.append(
            '(.cells[] | select(has("outputs")) | .outputs) = []'
        )

    cmd = [
        'jq',
        '--indent', '1',
        ' | '.join(jq_cmd),
        str(path),
    ]

    formatted = subprocess.check_output(cmd, encoding='utf8')

    with path.open('w') as fp:
        fp.write(formatted)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('path', type=Path)
    parser.add_argument('--clear-outputs', action='store_true', help='Clear outputs of all cells')
    parser.add_argument('--no-cleanup', action='store_true', help='Do not cleanup JSON at all')
    args = parser.parse_args()

    upgrade_notebook_version(args.path)
    if not args.no_cleanup:
        cleanup_fields(args.path, args.clear_outputs)
    else:
        assert not args.clear_outputs


if __name__ == '__main__':
    main()

and the command was `pretty-ipynb --clear-outputs week08_pomdp/practice_pytorch.ipynb`.

Contributor Author

@AI-Ahmed AI-Ahmed left a comment

Man!!! That was a long discussion for a file! You can ask me to give you access to the file on Colab to test it out instead!! 😂

@@ -193,11 +193,11 @@
" # Apply the whole neural net for one step here.\n",
" # See docs on self.rnn(...).\n",
" # The recurrent cell should take the last feedforward dense layer as input.\n",
" <YOUR CODE>\n",
" # <YOUR CODE>\n",
Contributor Author

I will adjust this as well.

@@ -246,8 +246,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Let's play!\n",
"Let's build a function that measures agent's average reward."
Contributor Author

Weirdly, I have them like this:

image

Comment on lines +270 to +271
" if record:\n",
" if record ==\"Statistic\":\n",
Contributor Author

I have already added a type annotation for that!

image

@@ -278,7 +294,7 @@
" if done:\n",
" break\n",
"\n",
" game_rewards.append(total_reward)\n",
" game_rewards.append(total_reward / reward_scale)\n",
Contributor Author

The whole logic would be illogical if this were parameterised, since we need the reward scaling.
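For context, here is a minimal sketch of the reward-scaling idea, assuming (as this PR appears to do) that rewards are multiplied by a reward_scale factor during training, so evaluation divides it back out to report the raw game score. The constant and numbers below are made up for illustration:

reward_scale = 0.01  # illustrative value, not the PR's actual constant

# Training side: shrink raw environment rewards to keep returns and gradients moderate.
raw_rewards = [100.0, 200.0, 0.0]
scaled_rewards = [r * reward_scale for r in raw_rewards]

# Evaluation side (the quoted line above): accumulate the scaled rewards, then divide
# the scaling back out so the reported number matches the raw game score.
total_reward = sum(scaled_rewards)
game_rewards = []
game_rewards.append(total_reward / reward_scale)
print(game_rewards)  # -> [300.0], the raw game score recovered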

@@ -304,7 +322,7 @@
"source": [
"# Show video. This may not work in some setups. If it doesn't\n",
"# work for you, you can download the videos and view them locally.\n",
"\n",
"import sys\n",
Contributor Author

Weirdly, I don't have that!!!

image

Do you think I have to resubmit the 13 commits that are ahead of the main repository because of these problems?

image

"# adding lambda λ to make compromise between bias and variance.\n",
"lamda = 0.92 # best value according to (Schulman et al., 2018) if you are going to work with GAE\n",
"ENT_COEF = 0.01\n",
"GRADIENT_COEF = 0.5"
Contributor Author

USED!!!

image

image
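For context, the λ in the quoted cell is the GAE (generalized advantage estimation) parameter that trades bias against variance in the advantage targets. Below is a generic, self-contained sketch of GAE for a single finished episode; it is only an illustration, not the notebook's actual implementation:

def compute_gae(rewards, values, gamma=0.99, lam=0.92):
    """Generalized advantage estimation for one finished episode."""
    advantages = []
    gae = 0.0
    next_value = 0.0  # no bootstrap value after the terminal step
    for reward, value in zip(reversed(rewards), reversed(values)):
        delta = reward + gamma * next_value - value   # one-step TD error
        gae = delta + gamma * lam * gae               # lambda-discounted sum of TD errors
        advantages.append(gae)
        next_value = value
    return advantages[::-1]

# Tiny usage example with made-up numbers:
print(compute_gae([1.0, 0.0, 2.0], [0.5, 0.4, 0.3]))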

"set_seed()\n",
"rewards_history, grad_norm_history, entropy_history, entropy_loss_history, loss_history, Jhat, v_loss = [],[],[],[],[],[],[]\n",
"agent = SimpleRecurrentAgent(obs_shape, n_actions)\n",
"opt = torch.optim.Adam(agent.parameters(), lr=1e-5)\n"
Contributor Author

What??? It is?!!!

image


@AI-Ahmed
Contributor Author

Sorry, @dniku, I had not noticed the script you wrote in the comment above (#512). I want to point out that I added the notebook with the rendered output so that students can see the results, understand the requirements, and understand the task. As shown in week_05_explore, the notebook there had rendered outputs, which helped me understand the problem and the expected output.

Deleting the output results wouldn't be helpful, in my opinion, since the notebook itself is a bit hard for someone new to the topic – it took me eight weeks to figure out week_06 and week_08 alone. Hence, it is better to keep the results in the notebook.

@AI-Ahmed AI-Ahmed requested a review from dniku December 19, 2022 13:30
Contributor Author

@AI-Ahmed AI-Ahmed left a comment

I am sorry I did not notify you that I had seen that, too!

@AI-Ahmed
Contributor Author

Hello @dniku,
I hope you are doing well. It has been four months, and we haven't checked on this. Please, if you have time, let me know if there's anything else you need me to modify.

Regards,

@dniku
Collaborator

dniku commented Apr 2, 2023

@AI-Ahmed

I have checked both week06_policy_based/a2c-optional.ipynb and week08_pomdp/practice_pytorch.ipynb in Colab — in particular, I tried to verify that both assignments are solvable end-to-end. That turned out to be not quite true, because some dependencies have been updated since we originally wrote those notebooks. To fix those problems, I have filed two PRs:

With regards to your PR, I'm afraid I will have to close it. Despite my comments, there are still many changes that I cannot merge, and it takes quite a lot of effort to go through review rounds. I would nevertheless like to thank you for your effort, and for drawing my attention to the fact that we had multiple problems with the assignments. I have incorporated some of your changes into the PRs I linked above, like the use of the RecordVideo wrapper. Thanks — and if you feel like submitting any further PRs, please feel free to do so in the future.

P.S.: my implementation for week08_pomdp/practice_pytorch.ipynb trains without reward scaling, although it sometimes requires more than 15k iterations to cross the 10k reward threshold.

@dniku dniku closed this Apr 2, 2023
@AI-Ahmed
Contributor Author

AI-Ahmed commented Apr 2, 2023

First of all, @dniku, thank you for mentioning me again. I agree; new things are happening in the community and are being produced every day. Therefore, I am still open regarding my submission; I participated in this to improve what was written 2-4 years ago. Our goal is to have the most up-to-date version that follows what is happening in the community.

I appreciate you guys working hard on this, and I have learnt from your repos, so the least I can do to show my appreciation is to participate in this great work.

Thank you again for opening two PRs for this issue.
