
Commit 33f3213

make release-tag: Merge branch 'master' into stable

2 parents 96d194f + 0a36d64

28 files changed: +929 -479 lines

.github/ISSUE_TEMPLATE.md

Lines changed: 15 additions & 0 deletions

````diff
@@ -0,0 +1,15 @@
+* ATM version:
+* Python version:
+* Operating System:
+
+### Description
+
+Describe what you were trying to get done.
+Tell us what happened, what went wrong, and what you expected to happen.
+
+### What I Did
+
+```
+Paste the command(s) you ran and the output.
+If there was a crash, please include the traceback here.
+```
````

CLI.md

Lines changed: 97 additions & 19 deletions
````diff
@@ -3,36 +3,34 @@
 **ATM** provides a simple command line client that will allow you to run ATM directly
 from your terminal by simply passing it the path to a CSV file.

-In this example, we will use the default values that are provided in the code, which will use
-the `pollution.csv` that is being generated with the demo datasets by ATM.
+## Quickstart

-## 1. Generate the demo data
+In this example, we will use the default values provided in the code in order to generate
+classifiers.

-**ATM** command line allows you to generate the demo data that we will be using through this steps
-by running the following command:
+### 1. Get the demo data

-```bash
-atm get_demos
-```
+The first step to run **ATM** is to obtain the demo datasets that will be used during
+the rest of the tutorial.

-A print on your console with the generated demo datasets will appear:
+For this demo we will be using the pollution CSV from the
+[demos bucket](https://atm-data.s3.amazonaws.com/index.html), which you can download from
+[here](https://atm-data.s3.amazonaws.com/pollution_1.csv).

-```bash
-Generating file demos/iris.csv
-Generating file demos/pollution.csv
-Generating file demos/pitchfork_genres.csv
-```

-## 2. Create a dataset and generate it's dataruns
+### 2. Create a dataset and generate its dataruns

-Once you have generated the demo datasets, now it's time to create a `dataset` object inside the
+Once you have obtained your demo dataset, it's time to create a `dataset` object inside the
 database. Our command line also triggers the generation of `datarun` objects for this dataset in
 order to automate this process as much as possible:

 ```bash
-atm enter_data
+atm enter_data --train-path path/to/pollution_1.csv
 ```

+Bear in mind that the `--train-path` argument can be a local path, a URL pointing to the CSV
+file, or a complete S3 bucket path.
+
 If you run this command, you will create a dataset with the default values, which is using the
 `pollution_1.csv` dataset from the demo datasets.

````
````diff
@@ -44,7 +42,7 @@ method dt has 2 hyperpartitions
 method knn has 24 hyperpartitions
 Dataruns created. Summary:
 Dataset ID: 1
-Training data: demos/pollution_1.csv
+Training data: path/to/pollution_1.csv
 Test data: None
 Datarun ID: 1
 Hyperpartition selection strategy: uniform
````
````diff
@@ -58,7 +56,7 @@ For more information about the arguments that this command line accepts, please
 atm enter_data --help
 ```

-## 3. Start a worker
+### 3. Start a worker

 **ATM** requires a worker to process the dataruns that are not completed and stored inside the
 database. This worker process will keep running until there are no dataruns `pending`.
````
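A minimal sketch of starting such a worker, assuming the default SQLite setup described later in this diff (the `--sql-config` variant appears at the end of the file):

```bash
# Picks up pending dataruns from the default ./atm.db database and
# keeps processing until none are left pending.
atm worker
```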
````diff
@@ -105,3 +103,83 @@ This command as well offers more information about the arguments that this command
 ```
 atm worker --help
 ```
+
+## Command Line Arguments
+
+You can specify each argument individually on the command line. The names of the
+variables are the same as those described [here](https://hdi-project.github.io/ATM/configuring_atm.html#arguments).
+SQL configuration variables must be prefixed with `sql-`, and AWS configuration variables
+must be prefixed with `aws-`.
+
+### Using command line arguments
+
+Using command line arguments is convenient for quick experiments, or for cases where you
+need to change just a couple of values from the default configuration. For example:
+
+```bash
+atm enter_data --train-path ./data/my-custom-data.csv \
+               --test-path ./data/my-custom-test-data.csv \
+               --selector bestkvel
+```
+
+You can also use a mixture of config files and command line arguments; any command line
+arguments you specify will override the values found in config files.
+
+### Using YAML configuration files
+
+Saving the configuration as YAML files is an easy way to store complicated setups
+or share them with team members.
+
+You should start with the templates provided by the `atm make_config` command:
+
+```bash
+atm make_config
+```
+
+This will generate a folder called `config/templates` in your current working directory
+containing 5 files, which you will need to copy over to the `config` folder and edit
+according to your needs:
+
+```bash
+cp config/templates/*.yaml config/
+vim config/*.yaml
+```
+
+`run.yaml` contains all the settings for a single dataset and datarun. Specify the `train_path`
+to point to your own dataset.
+
+`sql.yaml` contains the settings for the ModelHub SQL database. The default configuration will
+connect to (and create if necessary) a SQLite database at `./atm.db`, relative to the directory
+from which `enter_data.py` is run. If you are using a MySQL database, you will need to change
+the file to something like this:
+
+```yaml
+dialect: mysql
+database: atm
+username: username
+password: password
+host: localhost
+port: 3306
+query:
+```
+
+`aws.yaml` should contain the settings for running ATM in the cloud. This is not necessary
+for local operation.
+
+Once your YAML files have been updated, run the datarun creation command and pass it the paths
+to your new config files:
+
+```bash
+atm enter_data --sql-config config/sql.yaml \
+               --aws-config config/aws.yaml \
+               --run-config config/run.yaml
+```
+
+It's important that the SQL configuration used by the worker matches the configuration you
+passed to `enter_data` -- otherwise, the worker will be looking in the wrong ModelHub
+database for its datarun!
+
+```bash
+atm worker --sql-config config/sql.yaml \
+           --aws-config config/aws.yaml
+```
````
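For contrast with the MySQL example in the new text, a `sql.yaml` for the default local setup might look like the sketch below; `dialect: sqlite` and the empty values are assumptions inferred from the documented `./atm.db` default, not keys confirmed by this commit:

```yaml
# Hypothetical sql.yaml mirroring the documented SQLite default.
dialect: sqlite
database: atm.db    # created at ./atm.db if missing
username:
password:
host:
port:
query:
```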

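And since command line arguments override config file values, a saved `run.yaml` can be reused on a different dataset without editing it; the CSV path below is a placeholder:

```bash
# --train-path on the command line overrides train_path from run.yaml.
atm enter_data --run-config config/run.yaml \
               --train-path ./data/my-other-data.csv
```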