adding documentation

steamraven · Nov 16, 2020 · c42f973
1 parent e66eb0b
Showing 9 changed files with 895 additions and 160 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -5,107 +5,83 @@ for developers everywhere! This document serves as a guide to help you quickly
 gain familiarity with the repository, and start your development environment so
 that you can quickly hit the ground running.
 
-## Layout
+## 1. Learn the Overall Layout of the Code
 
-```
-/detect_secrets               # This is where the main code lives
-    /core                     # Powers the detect-secrets engine
-    /plugins                  # All plugins live here, modularized.
-        /common               # Common logic shared between plugins
-    main.py                   # Entrypoint for console use
-    pre_commit_hook.py        # Entrypoint for pre-commit hook
-
-/test_data                    # Sample files used for testing purposes
-/testing                      # Common logic used in test cases
-/tests                        # Mirrors detect_secrets layout for all tests
-```
+Be sure to read through the [overview of `detect-secrets`' design](/docs/design.md) before
+starting to work on it! This will give you a better idea of the different components of the
+system, and how they interact to find secrets.
 
-## Building Your Development Environment
+## 2. Building Your Development Environment
 
 There are several ways to spin up your virtual environment:
 
+**Casual Python Developers:**
+
 ```bash
-virtualenv --python=python3 venv
+python3 -m venv venv
 source venv/bin/activate
 pip install -r requirements-dev.txt
 ```
 
-or
+**Regular Python Developers:**
 
 ```bash
-python3 -m venv venv
+virtualenv --python=python3 venv
 source venv/bin/activate
 pip install -r requirements-dev.txt
 ```
 
+> **Developer Note**: The main difference between this method and the former one (using Python's
+  built-in virtual environment) is that Python's `venv` module pins the `pip` version. However,
+  this doesn't matter much if this is the only repository you're working on, since
+  `detect-secrets` doesn't ship with many dependency requirements.
+
 or
 
 ```bash
 tox -e venv
 source venv/bin/activate
 ```
 
-Whichever way you choose, you can check to see whether you're successful by
-executing:
-
-```bash
-PYTHONPATH=`pwd` python detect_secrets/main.py --version
-```
-
-## Writing a Plugin
-
-There are many examples of existing plugins to reference, under
-`detect_secrets/plugins`. However, this is the overall workflow:
-
-1. Write your tests
-
-   Before you write your plugin, you should **know what it intends to do**:
-   what it should catch, and arguably more importantly, what it should
-   avoid. Formalize these examples in tests!
-
-   For a basic example, see `tests/plugins/basic_auth_test.py`.
-
-2. Write your plugin
+> **Developer Note**: The benefit of this is that `tox` sets up a common development environment
+  for you. The downside is that you'll need to install `tox` first -- and if you already had it,
+  you probably wouldn't be reading this section :)
 
-   All plugins MUST inherit from `detect_secrets.plugins.base.BasePlugin`.
-   See that class' docstrings for more detailed information.
 
-   Depending on the complexity of your plugin, you may be able to inherit
-   from `detect_secrets.plugins.base.RegexBasedDetector` instead. This is
-   useful if you want to merely customize a new regex rule. Check out
-   `detect_secrets/plugins/basic_auth.py` for a good example of this.
+Whichever way you choose, you can check to see whether you're successful by executing:
 
-   Be sure to write comments about **why** your particular regex was crafted
-   as it is!
-
-3. Update documentation
-
-   Be sure to add your changes to the `README.md` and `CHANGELOG.md` so that
-   it will be easier for maintainers to bump the version and for other
-   downstream consumers to get the latest information about plugins available.
+```bash
+python -m detect_secrets --version
+```
 
-### Tips
+## 3. Run tests
 
-- There should be a total of three modified files in a minimal new plugin: the
-  plugin file, it's corresponding test, and an updated README.
-- If your plugin uses customizable options (e.g. entropy limit in `HighEntropyStrings`)
-  be sure to add default options to the plugin's `default_options`.
+Tests should succeed on master. Any code additions you contribute will also need testing,
+so it's good to run the tests first to make sure you have a working copy. Don't worry -- the tests
+don't take long!
 
-## Running Tests
+```bash
+$ time python -m pytest tests
+...
+real    0m10.113s
+user    0m6.848s
+sys     0m2.486s
+```
 
 ### Running the Entire Test Suite
 
-You can run the test suite in the interpreter of your choice (in this example,
-`py35`) by doing:
+You can run the test suite in the interpreter of your choice (in this example, `py36`) by doing:
 
 ```bash
-tox -e py35
+tox -e py36
 ```
 
+This will also run the code through our series of coverage tests, `mypy` rules and other linting
+checks to enforce a consistent coding style.
+
 For a list of supported interpreters, check out `envlist` in `tox.ini`.
 
-If you wanted to run **all** interpreters (might take a while), you can also
-just run:
+If you want to run against **all** interpreters (this might take a while), you can also just run:
 
 ```bash
 make test
@@ -125,113 +101,35 @@ levels. Here are a couple of examples:
 - Running a single test class
 
   ```bash
-  pytest tests/core/baseline_test.py::TestInitializeBaseline
+  pytest tests/core/baseline_test.py::TestCreate
   ```
 
 - Running a single test function, inside test class
 
   ```bash
-  pytest tests/core/baseline_test.py::TestInitializeBaseline::test_basic_usage
+  pytest tests/core/baseline_test.py::TestCreate::test_basic_usage
   ```
 
 - Running a single root level test function
 
   ```bash
-  pytest tests/plugins/base_test.py::test_fails_if_no_secret_type_defined
+  pytest tests/core/baseline_test.py::test_upgrade_succeeds
   ```
 
-## Technical Details
-
-### PotentialSecret
-
-This lives at the very heart of the engine, and represents a line being flagged
-for its potential to be a secret.
-
-Since the detect-secrets engine is heuristics-based, it requires a human to read
-its output at some point to determine false/true positives. Therefore, its
-representation is tailored to support **high readability**. Its attributes
-represent values that you would want to know (and keep track of) for
-each potential secret, including:
-
-1. What is it?
-2. How was it found?
-3. Where is it found?
-4. Is it a true/false positive?
-
-We can see that the JSON dump clearly shows this.
-
-```
-{
-    "type": "Base64 High Entropy String",
-    "filename": "test_data/config.yaml",
-    "line_number": 5,
-    "hashed_secret": "bc9160bc0ff062e1b2d21d2e59f6ebaba104f051",
-    "is_secret": false
-}
-```
-
-However, since it is designed for easy reading, we didn't want the baseline to
-be the single file that contained all the secrets in a given repository.
-Therefore, we mask the secret by hashing it with three core attributes:
-
-1. The actual secret
-2. The filepath where it was found
-3. How the engine determined it was a secret
-
-Any potential secret that has **all three values the same is equal**.
-
-This means that the engine will flag the following cases as separate occurrences
-to investigate:
-
-* Same secret value, but present in different files
-* Same secret value, caught by multiple plugins
-
-Furthermore, this will **not** flag on every single usage of a given secret in a
-given file, to minimize noise.
-
-**Important Note:** The line number does not play a part in the identification
-of a potential secret because code is expected to move around through continuous
-iteration. However, through the `audit` tool, these line numbers are leveraged
-to quickly identify the secret that was identified by a given plugin.
-
-### SecretsCollection
-
-A collection of `PotentialSecrets` are stored in a `SecretsCollection`. This
-contains a list of all the secrets in a given repository, as well as any other
-details needed to recreate it.
-
-A formatted dump of a `SecretsCollection` is used as the baseline file.
-
-In this way, the overall baseline logic is simple:
-
-1. Scan the repository to create a collection of known secrets.
-2. Check every new secret against this collection of known secrets.
-3. If you previously didn't know about it, alert off it.
-
-With this in mind, this class exposes three types of methods:
-
-##### 1. Creating
-
-We need to create a `SecretsCollection` object from a formatted baseline output,
-so that we can compare new secrets against it. This means that the baseline
-**must** include all information needed to initialize a `SecretsCollection`,
-such as:
+Generally speaking, we use test classes to group a series of related test cases together (e.g.
+`TestCreate` tests the `detect_secrets.core.baseline.create` functionality), but root test
+functions otherwise. If you're writing tests for your plugins, you should probably just use
+root test functions.
 
-* Secrets found,
-* Files to exclude,
-* Plugin configurations,
-* Version of detect-secrets used
+## 4. Make Your Change
 
-##### 2. Adding
+Want to contribute a new plugin? Check out more details here:
+[Writing Your Own Plugin](/docs/plugins.md#Writing%20Your%20Own%20Plugin)
 
-Once we have a collection of secrets, we can add secrets to it via various
-methods of scanning strings. The various methods of scanning strings (e.g.
-`scan_file`, `scan_diff`) should handle iterating through all plugins, and
-adding results found to the collection.
+What about contributing better false positive filters? Check out more details here:
+[Writing Your Own Filter](/docs/filters.md#Writing%20Your%20Own%20Filter)
 
-##### 3. Outputting
+## 5. Deploying Changes
 
-We need to be able to create a baseline from a SecretsCollection, so that it
-can be used for future comparisons. In the same spirit as the `PotentialSecret`
-object, it is designed for **high readability**, and may contain other metadata
-that aids human analysis of the generated output (e.g. `generated_at` time).
+Check out [more detailed upgrade instructions here](/docs/upgrades.md), and how to write
+backwards-compatible changes using the built-in upgrade infrastructure.
diff --git a/detect_secrets/__main__.py b/detect_secrets/__main__.py
@@ -0,0 +1,7 @@
+import sys
+
+from .main import main
+
+
+if __name__ == '__main__':
+    sys.exit(main())
diff --git a/detect_secrets/main.py b/detect_secrets/main.py
@@ -111,7 +111,3 @@ def handle_audit_action(args: argparse.Namespace) -> None:
                 audit.audit_baseline(args.filename[0])
     except InvalidBaselineError:
         pass
-
-
-if __name__ == '__main__':
-    sys.exit(main())
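
The move of the `if __name__ == '__main__'` guard from `main.py` into a new `__main__.py` is what makes `python -m detect_secrets --version` work without setting `PYTHONPATH`. The sketch below illustrates the mechanism on a throwaway package; the package name `demo_pkg` and its `main()` body are hypothetical stand-ins, not code from this repository.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_demo_package():
    """Build a throwaway package with a __main__.py and run it via `python -m`.

    Returns a (return_code, stdout) tuple from the child interpreter.
    """
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")

        # Stand-in for detect_secrets/main.py: exposes main(), has no __main__ guard.
        (pkg / "main.py").write_text(
            "def main():\n"
            "    print('demo 1.0')\n"
            "    return 0\n"
        )

        # Same shape as the __main__.py added in this commit.
        (pkg / "__main__.py").write_text(
            "import sys\n"
            "\n"
            "from .main import main\n"
            "\n"
            "\n"
            "if __name__ == '__main__':\n"
            "    sys.exit(main())\n"
        )

        # `python -m demo_pkg` imports the package, then executes demo_pkg/__main__.py,
        # so the relative import `from .main import main` resolves correctly.
        result = subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )
        return result.returncode, result.stdout


if __name__ == "__main__":
    code, out = run_demo_package()
    print(code, out.strip())
```

Because the module is run as part of its package, relative imports resolve without path manipulation, which is presumably why the documented check no longer needs the `PYTHONPATH` prefix.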