Commit 17cd4b0

make release-tag: Merge branch 'master' into stable
2 parents c33efd0 + c9de05f commit 17cd4b0

66 files changed, with 1826 additions and 1876 deletions.

.circleci/config.yml

Lines changed: 0 additions & 2 deletions
@@ -6,6 +6,4 @@ jobs:
  - image: themattrix/tox
  steps:
  - checkout
- - run: apt-get -qq update
- - run: apt-get -qq -y install libmysqlclient-dev
  - run: tox

AUTHORS.rst

Lines changed: 4 additions & 11 deletions
@@ -1,22 +1,15 @@
- =======
  Credits
  =======

- Contributors
- ------------
-
  * Bennett Cyphers <[email protected]>
  * Thomas Swearingen <[email protected]>
- * Kalyan Veeramachaneni <[email protected]>
- * Laura Gustafson <[email protected]>
  * Carles Sala <[email protected]>
- * Micah Smith <[email protected]>
  * Plamen Valentinov <[email protected]>
+ * Kalyan Veeramachaneni <[email protected]>
+ * Micah Smith <[email protected]>
+ * Laura Gustafson <[email protected]>
  * Kiran Karra <[email protected]>
- * swearin3 <[email protected]>
  * Max Kanter <[email protected]>
- * cclauss <[email protected]>
  * Alfredo Cuesta-Infante <[email protected]>
- * wheuuchuu <[email protected]>
- * Matteo Hoch <[email protected]>
  * Favio André Vázquez <[email protected]>
+ * Matteo Hoch <[email protected]>

CLI.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
# Command Line Interface

**ATM** provides a simple command line client that lets you run ATM directly
from your terminal by passing it the path to a CSV file.

This example uses the default values provided in the code, which point to
the `pollution.csv` file that ATM generates along with the other demo datasets.

## 1. Generate the demo data

The **ATM** command line lets you generate the demo data used throughout these steps
by running the following command:

```bash
atm get_demos
```

A list of the generated demo datasets will be printed to your console:

```bash
Generating file demos/iris.csv
Generating file demos/pollution.csv
Generating file demos/pitchfork_genres.csv
```

## 2. Create a dataset and generate its dataruns

Once you have generated the demo datasets, it's time to create a `dataset` object inside the
database. The command line also triggers the generation of `datarun` objects for this dataset in
order to automate the process as much as possible:

```bash
atm enter_data
```

If you run this command, you will create a dataset with the default values, which use the
`pollution_1.csv` dataset from the demo datasets.

Output similar to the following should be printed:

```bash
method logreg has 6 hyperpartitions
method dt has 2 hyperpartitions
method knn has 24 hyperpartitions
Dataruns created. Summary:
Dataset ID: 1
Training data: demos/pollution_1.csv
Test data: None
Datarun ID: 1
Hyperpartition selection strategy: uniform
Parameter tuning strategy: uniform
Budget: 100 (classifier)
```

For more information about the arguments that this command accepts, please run:

```bash
atm enter_data --help
```

## 3. Start a worker

**ATM** requires a worker to process the dataruns that are stored in the database and not yet
completed. This worker process will keep running until there are no `pending` dataruns left.

In order to launch such a process, execute:

```bash
atm worker
```

This will start a process that builds classifiers, tests them, and saves them to the `./models/`
directory. The output should show which hyperparameters are being tested and the performance of
each classifier (the "judgment metric"), plus the best overall performance so far.

Output similar to the following will appear repeatedly on your console while the `worker` is
processing the datarun:

```bash
Classifier type: classify_logreg
Params chosen:
C = 8904.06127554
_scale = True
fit_intercept = False
penalty = l2
tol = 4.60893080631
dual = True
class_weight = auto

Judgment metric (f1): 0.536 +- 0.067
Best so far (classifier 21): 0.716 +- 0.035
```
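
As a minimal sketch (not part of ATM itself), the score lines in the output above could be extracted with a regular expression. The names `SCORE_RE` and `parse_scores` are hypothetical, and the pattern assumes the lines keep exactly the format shown in the sample:

```python
import re

# Matches lines such as "Judgment metric (f1): 0.536 +- 0.067".
# Assumption: the worker keeps printing scores in exactly this format.
SCORE_RE = re.compile(
    r"^(Judgment metric|Best so far) \(([^)]*)\): "
    r"(\d+(?:\.\d+)?) \+- (\d+(?:\.\d+)?)$"
)


def parse_scores(output):
    """Return (label, detail, mean, std) tuples for every score line found."""
    scores = []
    for line in output.splitlines():
        match = SCORE_RE.match(line.strip())
        if match:
            label, detail, mean, std = match.groups()
            scores.append((label, detail, float(mean), float(std)))
    return scores


sample_output = """\
class_weight = auto

Judgment metric (f1): 0.536 +- 0.067
Best so far (classifier 21): 0.716 +- 0.035
"""

scores = parse_scores(sample_output)
```

Lines that do not match the pattern, such as the hyperparameter lines, are simply skipped.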

Occasionally, a worker will encounter an error in the process of building and testing a
classifier. When this happens, the worker will print error data to the console, log the error in
the database, and move on to the next classifier.

You can break out of the worker with <kbd>Ctrl</kbd>+<kbd>c</kbd> and restart it with the same
command; it will pick up right where it left off. You can also run the command simultaneously in
different terminals to parallelize the work -- all workers will refer to the same ModelHub
database. When all 100 classifiers in your budget have been built, all workers will exit gracefully.
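
The parallel setup described above could also be scripted instead of opening several terminals. This is only a sketch: it assumes the `atm` executable is on your `PATH`, and `build_worker_commands` and `run_workers` are hypothetical helper names, not part of ATM:

```python
import subprocess


def build_worker_commands(n_workers):
    """Return one ``atm worker`` command line per requested worker."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return [["atm", "worker"] for _ in range(n_workers)]


def run_workers(n_workers):
    """Start the workers side by side and wait for all of them to exit."""
    processes = [
        subprocess.Popen(command)
        for command in build_worker_commands(n_workers)
    ]
    # Each worker exits on its own once no dataruns are pending.
    return [process.wait() for process in processes]
```

Calling `run_workers(2)` would then behave like running `atm worker` in two terminals at once, with both processes working against the same database.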

This command also provides more information about the arguments that it accepts:

```bash
atm worker --help
```

CONTRIBUTING.rst

Lines changed: 197 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,197 @@
1+
.. highlight:: shell
2+
3+
============
4+
Contributing
5+
============
6+
7+
Contributions are welcome, and they are greatly appreciated! Every little bit
8+
helps, and credit will always be given.
9+
10+
You can contribute in many ways:
11+
12+
Types of Contributions
13+
----------------------
14+
15+
Report Bugs
16+
~~~~~~~~~~~
17+
18+
Report bugs at https://github.com/HDI-Project/ATM/issues.
19+
20+
If you are reporting a bug, please include:
21+
22+
* Your operating system name and version.
23+
* Any details about your local setup that might be helpful in troubleshooting.
24+
* Detailed steps to reproduce the bug.
25+
26+
Fix Bugs
27+
~~~~~~~~
28+
29+
Look through the GitHub issues for bugs. Anything tagged with "bug" and "help
30+
wanted" is open to whoever wants to implement it.
31+
32+
Implement Features
33+
~~~~~~~~~~~~~~~~~~
34+
35+
Look through the GitHub issues for features. Anything tagged with "enhancement"
36+
and "help wanted" is open to whoever wants to implement it.
37+
38+
Write Documentation
39+
~~~~~~~~~~~~~~~~~~~
40+
41+
ATM could always use more documentation, whether as part of the
42+
official ATM docs, in docstrings, or even on the web in blog posts,
43+
articles, and such.
44+
45+
Submit Feedback
46+
~~~~~~~~~~~~~~~
47+
48+
The best way to send feedback is to file an issue at https://github.com/HDI-Project/ATM/issues.
49+
50+
If you are proposing a feature:
51+
52+
* Explain in detail how it would work.
53+
* Keep the scope as narrow as possible, to make it easier to implement.
54+
* Remember that this is a volunteer-driven project, and that contributions
55+
are welcome :)
56+
Get Started!
------------

Ready to contribute? Here's how to set up `ATM` for local development.

1. Fork the `ATM` repo on GitHub.
2. Clone your fork locally::

    $ git clone git@github.com:your_name_here/ATM.git

3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed,
   this is how you set up your fork for local development::

    $ mkvirtualenv ATM
    $ cd ATM/
    $ make install-develop

4. Create a branch for local development::

    $ git checkout -b name-of-your-bugfix-or-feature

   Now you can make your changes locally.

5. While working on your changes, make sure to cover all your developments with the required
   unit tests, and that none of the old tests fail as a consequence of your changes.
   For this, make sure to run the test suite and check the code coverage::

    $ make test       # Run the tests
    $ make coverage   # Get the coverage report

6. When you're done making changes, check that your changes pass flake8 and the
   tests, including testing other Python versions with tox::

    $ make lint       # Check code styling
    $ make test-all   # Execute tests on all python versions

7. Also make sure to include the necessary documentation in the code as docstrings following
   the `google docstring`_ style.
   If you want to see what your documentation will look like when it is published, you can
   generate and view the docs with this command::

    $ make viewdocs

8. Commit your changes and push your branch to GitHub::

    $ git add .
    $ git commit -m "Your detailed description of your changes."
    $ git push origin name-of-your-bugfix-or-feature

9. Submit a pull request through the GitHub website.

.. _google docstring: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html

Pull Request Guidelines
-----------------------

Before you submit a pull request, check that it meets these guidelines:

1. It resolves an open GitHub Issue and contains its reference in the title or
   the comment. If there is no associated issue, feel free to create one.
2. Whenever possible, it resolves only **one** issue. If your PR resolves more than
   one issue, try to split it into more than one pull request.
3. The pull request should include unit tests that cover all the changed code.
4. If the pull request adds functionality, the docs should be updated. Put
   your new functionality into a function with a docstring, and add the
   feature to the list in README.rst.
5. The pull request should work for Python 2.7, 3.4, 3.5 and 3.6. Check
   https://travis-ci.org/HDI-Project/ATM/pull_requests
   and make sure that all the checks pass.

Unit Testing Guidelines
-----------------------

All the Unit Tests should comply with the following requirements:

1. Unit Tests should be based only on the ``unittest`` and ``pytest`` modules.

2. The tests that cover a module called ``atm/path/to/a_module.py`` should be
   implemented in a separate module called ``tests/atm/path/to/test_a_module.py``.
   Note that the module name has the ``test_`` prefix and is located in a path similar
   to the one of the tested module, just inside the ``tests`` folder.

3. Each method of the tested module should have at least one associated test method, and
   each test method should cover only **one** use case or scenario.

4. Test case methods should start with the ``test_`` prefix and have descriptive names
   that indicate which scenario they cover.
   Names such as ``test_some_method_input_none``, ``test_some_method_value_error`` or
   ``test_some_method_timeout`` are right, but names like ``test_some_method_1``,
   ``some_method`` or ``test_error`` are not.

5. Each test should validate only what the code of the method being tested does, and not
   cover the behavior of any third party package or tool being used, which is assumed to
   work properly as long as it is passed the right values.

6. Any third party tool that may have any kind of random behavior, such as some Machine
   Learning models, databases or Web APIs, will be mocked using the ``mock`` library, and
   the only thing that will be tested is that our code passes the right values to them.

7. Unit tests should not use anything from outside the test and the code being tested. This
   includes not reading from or writing to any filesystem or database, which will be properly
   mocked.
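
As an illustration of the guidelines above, here is a minimal sketch of a test module. The function ``count_rows`` and its ``reader`` argument are hypothetical and exist only for this example, and the sketch uses ``unittest.mock`` from the Python 3 standard library, which is an assumption, in place of the standalone ``mock`` package:

```python
import unittest
from unittest import mock


def count_rows(path, reader):
    """Hypothetical function under test: count the rows ``reader`` returns for ``path``."""
    if path is None:
        raise ValueError("path must not be None")
    return len(reader(path))


class TestCountRows(unittest.TestCase):

    def test_count_rows_path_none(self):
        # One scenario per test: a missing path raises and the reader is never used.
        reader = mock.Mock()
        with self.assertRaises(ValueError):
            count_rows(None, reader)
        reader.assert_not_called()

    def test_count_rows_passes_path_to_reader(self):
        # The reader is mocked, so no file is read. We only check that our code
        # passes the right value to it and uses what it returns.
        reader = mock.Mock(return_value=[1, 2, 3])
        result = count_rows("demos/pollution.csv", reader)
        reader.assert_called_once_with("demos/pollution.csv")
        self.assertEqual(result, 3)
```

Both test names start with ``test_`` and describe the scenario they cover, and neither test touches the filesystem.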

Tips
----

To run a subset of tests::

    $ pytest tests.test_atm

Release Workflow
----------------

The process of releasing a new version involves several steps combining both ``git`` and
``bumpversion`` which, briefly, are:

1. Merge what is in ``master`` branch into ``stable`` branch.
2. Update the version in ``setup.cfg``, ``atm/__init__.py`` and ``HISTORY.md`` files.
3. Create a new git tag pointing at the corresponding commit in ``stable`` branch.
4. Merge the new commit from ``stable`` into ``master``.
5. Update the version in ``setup.cfg`` and ``atm/__init__.py``
   to open the next development iteration.

.. note:: Before starting the process, make sure that ``HISTORY.md`` has been updated with a new
          entry that explains the changes that will be included in the new version.
          Normally this is just a list of the Pull Requests that have been merged to master
          since the last release.

Once this is done, run one of the following commands:

1. If you are releasing a patch version::

    make release

2. If you are releasing a minor version::

    make release-minor

3. If you are releasing a major version::

    make release-major
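
To make the difference between the three release types concrete, the sketch below shows how a version string changes for each of them. It assumes plain ``MAJOR.MINOR.PATCH`` version numbers, and ``bump_version`` is a hypothetical helper written for this example. It is not how ``bumpversion`` or the ``make`` targets are implemented, and it ignores any development suffixes the project may use:

```python
def bump_version(version, part):
    """Return ``version`` with the given part incremented and lower parts reset."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return "{}.0.0".format(major + 1)
    if part == "minor":
        return "{}.{}.0".format(major, minor + 1)
    if part == "patch":
        return "{}.{}.{}".format(major, minor, patch + 1)
    raise ValueError("part must be 'major', 'minor' or 'patch'")


patch_release = bump_version("1.2.3", "patch")
minor_release = bump_version("1.2.3", "minor")
major_release = bump_version("1.2.3", "major")
```

Starting from a hypothetical ``1.2.3``, a patch release gives ``1.2.4``, a minor release gives ``1.3.0`` and a major release gives ``2.0.0``.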
