diff --git a/README.md b/README.md
index 0c3bf6711..e56787de8 100644
--- a/README.md
+++ b/README.md
@@ -32,6 +32,21 @@ In this project you will build such a pipeline.
 - [Cleaning up](#cleaning-up)
 
 ## Preliminary steps
+
+### Supported Operating Systems
+
+This project is compatible with the following operating systems:
+
+- **Ubuntu 22.04** (Jammy Jellyfish) - both Ubuntu installation and WSL (Windows Subsystem for Linux)
+- **Ubuntu 24.04** - both Ubuntu installation and WSL (Windows Subsystem for Linux)
+- **macOS** - compatible with recent macOS versions
+
+Please ensure you are using one of the supported OS versions to avoid compatibility issues.
+
+### Python Requirement
+
+This project requires **Python 3.10**. Please ensure that you have Python 3.10 installed and set as the default version in your environment to avoid any runtime issues.
+
 ### Fork the Starter kit
 Go to [https://github.com/udacity/build-ml-pipeline-for-short-term-rental-prices.git](https://github.com/udacity/build-ml-pipeline-for-short-term-rental-prices.git)
 and click on `Fork` in the upper right corner. This will create a fork in your Github account, i.e., a copy of the
@@ -177,27 +192,6 @@ You can see the parameters that they require by looking into their `MLproject` f
 - `get_data`: downloads the data. [MLproject](https://github.com/udacity/build-ml-pipeline-for-short-term-rental-prices/blob/main/components/get_data/MLproject)
 - `train_val_test_split`: segrgate the data (splits the data) [MLproject](https://github.com/udacity/build-ml-pipeline-for-short-term-rental-prices/blob/main/components/train_val_test_split/MLproject)
 
-## In case of errors
-When you make an error writing your `conda.yml` file, you might end up with an environment for the pipeline or one
-of the components that is corrupted. Most of the time `mlflow` realizes that and creates a new one every time you try
-to fix the problem. However, sometimes this does not happen, especially if the problem was in the `pip` dependencies.
-In that case, you might want to clean up all conda environments created by `mlflow` and try again. In order to do so,
-you can get a list of the environments you are about to remove by executing:
-
-```
-> conda info --envs | grep mlflow | cut -f1 -d" "
-```
-
-If you are ok with that list, execute this command to clean them up:
-
-**_NOTE_**: this will remove *ALL* the environments with a name starting with `mlflow`. Use at your own risk
-
-```
-> for e in $(conda info --envs | grep mlflow | cut -f1 -d" "); do conda uninstall --name $e --all -y;done
-```
-
-This will iterate over all the environments created by `mlflow` and remove them.
-
 
 ## Instructions
 
@@ -567,6 +561,40 @@ Then commit your change, make a new release (for example ``1.0.1``) and retry (o
 ``-v 1.0.1`` when calling mlflow this time). Now the run should succeed and voit la',
 you have trained your new model on the new data.
 
+## In case of errors
+
+### Environments
+When you make an error writing your `conda.yml` file, you might end up with an environment for the pipeline or one
+of the components that is corrupted. Most of the time `mlflow` realizes that and creates a new one every time you try
+to fix the problem. However, sometimes this does not happen, especially if the problem was in the `pip` dependencies.
+In that case, you might want to clean up all conda environments created by `mlflow` and try again. In order to do so,
+you can get a list of the environments you are about to remove by executing:
+
+```
+> conda info --envs | grep mlflow | cut -f1 -d" "
+```
+
+If you are ok with that list, execute this command to clean them up:
+
+**_NOTE_**: this will remove *ALL* the environments with a name starting with `mlflow`. Use at your own risk
+
+```
+> for e in $(conda info --envs | grep mlflow | cut -f1 -d" "); do conda uninstall --name $e --all -y;done
+```
+
+This will iterate over all the environments created by `mlflow` and remove them.
+
+### MLflow & Wandb
+
+If you see any error while running the command:
+
+```
+> mlflow run .
+```
+
+Please make sure all steps use **the same** Python version and that you have **conda installed**. Additionally, the *mlflow* and *wandb* packages are crucial and should be pinned to the same versions in every step.
+
+
 
 ## License
 [License](LICENSE.txt)
diff --git a/components/get_data/conda.yml b/components/get_data/conda.yml
index c9711150c..a0affb0b2 100644
--- a/components/get_data/conda.yml
+++ b/components/get_data/conda.yml
@@ -3,6 +3,7 @@ channels:
   - conda-forge
   - defaults
 dependencies:
+  - python=3.10.0
   - pip=23.3.1
   - requests=2.24.0
   - pyarrow
diff --git a/cookie-mlflow-step/{{cookiecutter.step_name}}/conda.yml b/cookie-mlflow-step/{{cookiecutter.step_name}}/conda.yml
index 34630a915..e9cefe7ca 100644
--- a/cookie-mlflow-step/{{cookiecutter.step_name}}/conda.yml
+++ b/cookie-mlflow-step/{{cookiecutter.step_name}}/conda.yml
@@ -4,6 +4,7 @@ channels:
   - defaults
 dependencies:
   - pip=23.3.1
+  - python=3.10
   - pip:
       - mlflow==2.8.1
       - wandb==0.16.0
diff --git a/src/train_random_forest/conda.yml b/src/train_random_forest/conda.yml
index 7a78e3729..466fa9e22 100644
--- a/src/train_random_forest/conda.yml
+++ b/src/train_random_forest/conda.yml
@@ -3,7 +3,7 @@ channels:
   - conda-forge
   - defaults
 dependencies:
-  - python=3.10
+  - python=3.10.0
   - hydra-core=1.3.2
   - matplotlib=3.8.2
   - pandas=2.1.3
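
The `MLflow & Wandb` note in the README change asks that every step use the same Python, *mlflow*, and *wandb* versions. One way this could be checked before running `mlflow run .` is sketched below. It is only an illustration under stated assumptions: the helper names `read_pins` and `find_mismatches` are hypothetical and not part of this project, the script reads each `conda.yml` with a simple regular expression rather than a YAML parser, and it compares version strings textually, so `python=3.10` and `python=3.10.0` (both of which appear in this change) would be reported as different pins.

```python
import re
from collections import defaultdict
from pathlib import Path

# Packages whose pins are expected to agree across steps (assumption based on the README note).
TRACKED = ("python", "mlflow", "wandb")

# Matches conda-style "- python=3.10" and pip-style "- mlflow==2.8.1" list entries.
PIN_RE = re.compile(r"^\s*-\s*([A-Za-z0-9_.\-]+)\s*={1,2}\s*([^\s#]+)")


def read_pins(text):
    """Return {package: version} for the tracked packages pinned in one conda.yml."""
    pins = {}
    for line in text.splitlines():
        match = PIN_RE.match(line)
        if match and match.group(1).lower() in TRACKED:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def find_mismatches(root="."):
    """Map each tracked package pinned to more than one version to {version: [files]}."""
    seen = defaultdict(lambda: defaultdict(list))
    for path in sorted(Path(root).rglob("conda.yml")):
        for package, version in read_pins(path.read_text()).items():
            seen[package][version].append(str(path))
    # Keep only packages that appear with at least two distinct version strings.
    return {pkg: dict(versions) for pkg, versions in seen.items() if len(versions) > 1}
```

Calling `find_mismatches(".")` from the repository root returns an empty dictionary when all tracked pins agree. Otherwise each entry lists the conflicting version strings and the files that declare them. Unpinned or range-constrained entries (for example `python>=3.10`) are ignored by this sketch.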