Contributions to this registry are welcome. We pledge to follow the Contributor Covenant Code of Conduct.
You can contribute `FileType` and `Extractor` entries to this registry by creating pull requests (PRs), which should include YAML files under `./data/filetypes` and `./data/extractors`, respectively. These YAML files must follow the schemas at datatractor/schemas (see the online documentation for more details).
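As a rough illustration of the directory convention described above, the following shell sketch classifies a path by the registry directory it sits in. The function name `classify_entry` and the example file names are hypothetical and not part of the registry tooling; only the two directories and the `.yml` extension come from this guide.

```shell
# Hypothetical helper (not part of the registry tooling):
# print which kind of registry entry a path would be, based on its directory.
classify_entry() {
  case "$1" in
    data/filetypes/*.yml|./data/filetypes/*.yml)   echo "filetype" ;;
    data/extractors/*.yml|./data/extractors/*.yml) echo "extractor" ;;
    *)                                             echo "other" ;;
  esac
}

# Illustrative file names only.
classify_entry ./data/filetypes/example-type.yml
classify_entry ./data/extractors/example-extractor.yml
```

Running the helper over the list of files changed in your branch is one way to spot a definition that ended up outside the two expected directories.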
An example of the process can be seen in this pull request (in the previous iteration of this repository). The workflow steps are as follows:
- One pull request (PR) per `Extractor` package, please.
  - You can include multiple versions of one `Extractor` in separate `.yml` files.
  - Do not include multiple unrelated `Extractors` in a single PR.
- Feel free to include any missing `FileTypes` supported by the `Extractor` in the same PR.
- Please add the appropriate labels (`Extractor` and `FileType`).
- If you are adding new `FileTypes` in this PR, add example files of those `FileTypes` using `git-lfs` into `yard/data/lfs`. Do not commit example files into the repository directly. For example, to add a new file called `xmpl.tpe` (corresponding to the `FileType` with ID `example-type`) you would:

      mkdir yard/data/lfs/example-type          # create the dir using FileType ID
      cp xmpl.tpe yard/data/lfs/example-type    # copy the example file
      git lfs install                           # setup git-lfs, only necessary if you haven't used git-lfs before
      git add yard/data/lfs/example-type/*      # track files ...
      git lfs ls-files                          # check that your new example files are tracked before committing
      git commit
      git push
- Your PR will have to pass the tests run by our continuous integration (CI) set-up. The CI checks the following three things:
  - `lint`, making sure the `yml` files you added are properly formatted,
  - `validate`, which validates the `yml` files against their schemas, and
  - `build`, which makes sure the new registry website can be built.

  In practice, it is mainly the `lint` and `validate` actions you will have to pay attention to. If anything is unclear or you cannot find the error, ping one of the Registry Maintainers.
  - You can also validate your definitions locally using the package; see the Usage section of the documentation.
  - You should check locally that your `yml` definitions can be used to install the `Extractor` and extract any example files of the supported `FileTypes`, using the datatractor package; see the Installation and Usage sections.

  At the moment, the above CI does not check whether the extraction of the example files actually works. See this issue for further details.
- If you are a first-time contributor to the project, we will need to pre-approve your PR so that the CI can be run. Ping one of the Registry Maintainers.
- Once the CI passes, and all issues that are raised during the review are addressed, your PR can be merged.
- Once your PR is merged into the Registry, your new definitions should be available at the Registry Website. Cheers!
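
The git-lfs step in the workflow above can be double-checked before committing. The sketch below uses `git check-attr` to report any file under a directory that is not routed through an LFS filter. It assumes that the repository's `.gitattributes` assigns `filter=lfs` to the contents of `yard/data/lfs`, which is not stated in this guide; the function name `check_lfs_tracked` is illustrative. It complements `git lfs ls-files` rather than replacing it.

```shell
# Sketch: report example files that are NOT covered by a git-lfs filter.
# Assumption: .gitattributes assigns filter=lfs to files under yard/data/lfs.
# The function name and the default path are illustrative only.
check_lfs_tracked() {
  dir="${1:-yard/data/lfs}"
  status=0
  # Read the file list from a here-document so the loop runs in the
  # current shell and the status variable survives the loop.
  while IFS= read -r f; do
    [ -n "$f" ] || continue                      # skip the empty line of an empty listing
    attr=$(git check-attr filter -- "$f")        # prints "<path>: filter: <value>"
    case "$attr" in
      *": filter: lfs") ;;                       # routed through git-lfs
      *) echo "not LFS-tracked: $f" >&2; status=1 ;;
    esac
  done <<EOF
$(find "$dir" -type f)
EOF
  return "$status"
}
```

Run from the repository root, `check_lfs_tracked yard/data/lfs/example-type` returns a non-zero status and lists the offending paths on standard error if any example file would be committed as a regular Git object.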