Initial commit
mmarkakis committed Jul 31, 2024
0 parents commit 3a6dbc0
Showing 289 changed files with 90,709 additions and 0 deletions.
10 changes: 10 additions & 0 deletions .gitignore
@@ -0,0 +1,10 @@
dataset_files/*/*
evaluation/repro*
webapp/log_results

.env
.ipynb_checkpoints/
.vscode/

**/__pycache__/
**/scratch.*
20 changes: 20 additions & 0 deletions Pipfile
@@ -0,0 +1,20 @@
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
streamlit = "*"
streamlit-extras = "*"
pandas = "*"
tqdm = "*"
pydot = "*"
dowhy = "*"
ipython = "*"
pygraphviz = "*"

[dev-packages]
ipykernel = "*"

[requires]
python_version = "3.10"
2,208 changes: 2,208 additions & 0 deletions Pipfile.lock

Large diffs are not rendered by default.

20 changes: 20 additions & 0 deletions README.md
@@ -0,0 +1,20 @@
# LOGos

Utilizing system logs to perform causal analysis.

### Demo

You can find a quick demo of the LOGos API at [demo.ipynb](demo.ipynb).

### Documentation

To view the documentation, run `mkdocs serve` from the root of this repo and open the URL it prints (by default `http://127.0.0.1:8000/`).

You might need to install the following packages:
`pip install mkdocs-material mkdocs-gen-files mkdocs-literate-nav markdown_include pymdown-extensions markdown mkdocs-pymdownx Pygments mkdocs-jupyter mkdocstrings-python mkdocstrings mdx_include`

### OpenAI integration

To use the LLM-powered capabilities of LOGos, please add a `.env` file to the root of this repo and define `OPENAI_API_KEY` appropriately.
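
As a rough illustration of how such a key could be picked up at runtime, here is a minimal sketch that assumes the `.env` file contains plain `KEY=value` lines. The helpers `load_env_file` and `get_openai_key` are hypothetical and not part of LOGos; the project itself may read the file differently, for example through a dotenv library.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=value lines (hypothetical helper, not part of LOGos)."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text().splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_openai_key(path=".env"):
    """Prefer an already-exported variable, then fall back to the .env file."""
    return os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY")
```

Calling `get_openai_key()` from the repo root returns `None` when the key is defined in neither place, which makes a missing key easy to detect before any LLM-backed call is attempted.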


29 changes: 29 additions & 0 deletions dataset_files/README.md
@@ -0,0 +1,29 @@
## `dataset_files/`

This directory holds the following types of files, for each of the datasets `x` used in our evaluation:
- `x/datasets_raw/`: The raw logs.
- `x/datasets/`: The cached byproducts of processing the dataset with LOGos, in pickled form.
- `x/evaluation/`: The outputs produced by our experiment runners when processing the dataset in question.

Some of these files are large, which is why we have hosted them on S3 instead of distributing them
inside this repository. If you would like to access any of these datasets, please email us at markakis[at]mit[dot]edu.

Once you have been granted access, you can download the `PostgreSQL` dataset by running:
```sh
aws s3 sync s3://logos-dataset-postgresql postgresql/
```

Once you have been granted access, you can download the `XYZ` dataset by running:
```sh
aws s3 sync s3://logos-dataset-xyz xyz/
```

Once you have been granted access, you can download the datasets for the scaling microexperiments by running:
```sh
aws s3 sync s3://logos-dataset-scaling scaling/
```

The `Proprietary` dataset is not publicly available for privacy reasons. If you have an extremely compelling reason to need it, please explain it in your request and we may review it on a case-by-case basis. If you have been granted access, you can download the `Proprietary` dataset by running:
```sh
aws s3 sync s3://logos-dataset-proprietary proprietary/
```
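
The download commands above all follow one pattern: bucket `logos-dataset-<name>` synced into a local directory `<name>/`. A minimal sketch that generates them is shown here, assuming it is run from inside `dataset_files/` and that the AWS CLI is installed and configured. The helper `sync_command` and the `DATASETS` list are illustrative names, not part of this repository.

```python
import shlex

# Dataset names taken from the commands above.
DATASETS = ["postgresql", "xyz", "scaling", "proprietary"]


def sync_command(name):
    """Build the `aws s3 sync` argument list for one dataset (hypothetical helper)."""
    return ["aws", "s3", "sync", f"s3://logos-dataset-{name}", f"{name}/"]


if __name__ == "__main__":
    for name in DATASETS:
        # Print only; pass the list to subprocess.run(..., check=True) to execute it.
        print(shlex.join(sync_command(name)))
```

Printing the commands first makes it easy to check which buckets will be contacted before any data is transferred.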
4 changes: 4 additions & 0 deletions dataset_files/pull.sh
@@ -0,0 +1,4 @@
aws s3 sync s3://logos-dataset-postgresql postgresql/ --delete --exclude "repro_evaluation/*"
aws s3 sync s3://logos-dataset-proprietary proprietary/ --delete --exclude "repro_evaluation/*"
aws s3 sync s3://logos-dataset-xyz xyz/ --delete --exclude "repro_evaluation/*"
aws s3 sync s3://logos-dataset-scaling scaling/ --delete
5 changes: 5 additions & 0 deletions dataset_files/push.sh
@@ -0,0 +1,5 @@
# For internal use only - you won't have permissions for this.
aws s3 sync postgresql/ s3://logos-dataset-postgresql --delete --exclude "datasets_raw/*" --exclude "repro_evaluation/*"
aws s3 sync proprietary/ s3://logos-dataset-proprietary --delete --exclude "datasets_raw/*" --exclude "repro_evaluation/*"
aws s3 sync xyz/ s3://logos-dataset-xyz --delete --exclude "datasets_raw/*" --exclude "repro_evaluation/*"
aws s3 sync scaling/ s3://logos-dataset-scaling --delete