SC2DatasetPreparator
This repository contains tools which can be used to create a StarCraft II dataset. The following steps are suggested:
- Obtain a number of replays to process. This can be a replaypack or your own replay folder.
- Download the latest version of SC2InfoExtractorGo, or build it from source.
- Optional: Using src/directory_flattener.py, flatten the directory structure and save the old directory tree to a mapping: {"replayUniqueHash": "whereItWasInOldStructure"}. This is required in order to properly use the SC2InfoExtractorGo.
- Optional: Use the map downloader src/sc2_map_downloader.py to download maps that were used in the replays that you obtained.
- Optional: Use the SC2MapLocaleExtractor to obtain the mapping of {"foreign_map_name": "english_map_name"}, which is required for the SC2InfoExtractorGo to translate the map names.
- Perform replaypack processing using src/sc2_replaypack_processor.py with the SC2InfoExtractorGo in PATH, or next to the script.
- Optional: Using src/file_renamer.py, rename the files that were generated during replaypack processing.
- Using src/file_packager.py, create .zip archives containing the datasets and the supplementary files.
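The main steps above can be chained from a small driver script. The sketch below only builds the command lines from the flags documented in the Usage section; the function name build_pipeline_commands and the src_dir and processing_dir parameters are assumptions made for illustration and are not part of the repository. It also assumes that the scripts accept paths relative to the current working directory.

```python
import sys
from pathlib import Path


def build_pipeline_commands(
    src_dir: Path, processing_dir: Path, n_processes: int = 4
) -> list[list[str]]:
    """Build command lines for the main steps, in the suggested order.

    Hypothetical helper: it only assembles arguments from the documented
    flags and does not execute anything.
    """
    replays = str(processing_dir / "directory_flattener" / "input")
    flattened = str(processing_dir / "directory_flattener" / "output")
    processed = str(processing_dir / "sc2_replaypack_processor" / "output")

    def script(name: str) -> list[str]:
        # Run each tool with the interpreter that runs this driver.
        return [sys.executable, str(src_dir / name)]

    return [
        script("directory_flattener.py")
        + ["--input_path", replays, "--output_path", flattened],
        script("sc2_replaypack_processor.py")
        + ["--input_dir", flattened, "--output_dir", processed,
           "--n_processes", str(n_processes)],
        script("file_renamer.py") + ["--input_dir", processed],
        script("file_packager.py") + ["--input_dir", processed],
    ]


if __name__ == "__main__":
    for command in build_pipeline_commands(Path("src"), Path("processing")):
        print(" ".join(command))
        # To execute a step instead of printing it:
        # subprocess.run(command, check=True)
```

Printing the commands first makes it easy to check the paths before any replay is touched.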
Customization
In order to specify different processing flags for https://github.com/Kaszanas/SC2InfoExtractorGo please modify the src/sc2_replaypack_processor.py file directly.
Usage
Before using this software please install Python >= 3.10 and the dependencies listed in requirements.txt.
Please keep in mind that the src/directory_flattener.py contains default flag values (listed in the help output that follows) and can be customized with the following command line flags:
usage: directory_flattener.py [-h] [--input_path INPUT_PATH] [--output_path OUTPUT_PATH]
[--file_extension FILE_EXTENSION]
Directory restructuring tool used in order to flatten the structure, map the old structure to a separate
file, and for later processing with other tools. Created primarily to define StarCraft 2 (SC2) datasets.
options:
-h, --help show this help message and exit
--input_path INPUT_PATH (default = ../processing/directory_flattener/input)
Please provide input path to the dataset that is going to be processed.
--output_path OUTPUT_PATH (default = ../processing/directory_flattener/output)
Please provide output path where sc2 map files will be downloaded.
--file_extension FILE_EXTENSION (default = .SC2Replay)
Please provide a file extension for files that will be moved and renamed.
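To make the flattening step more concrete, here is a minimal sketch of the idea, not the actual implementation of src/directory_flattener.py. It assumes that the unique name is a random UUID hex string and that the mapping is written to a file called processed_mapping.json in the output directory; the function name flatten_directory is hypothetical. Unlike the tool, which describes moving files, the sketch copies them so that the input is left untouched.

```python
import json
import shutil
import uuid
from pathlib import Path


def flatten_directory(
    input_path: Path, output_path: Path, file_extension: str = ".SC2Replay"
) -> dict[str, str]:
    """Copy matching files into one flat directory and record their origin."""
    output_path.mkdir(parents=True, exist_ok=True)
    mapping: dict[str, str] = {}

    # sorted() collects the full listing before any file is copied.
    for source in sorted(input_path.rglob(f"*{file_extension}")):
        if not source.is_file():
            continue
        # Assumption: a random UUID stands in for the "replayUniqueHash".
        unique_name = uuid.uuid4().hex
        shutil.copy2(source, output_path / f"{unique_name}{file_extension}")
        # Store where the file lived in the old structure.
        mapping[unique_name] = source.relative_to(input_path).as_posix()

    # Assumed file name for the saved mapping.
    mapping_file = output_path / "processed_mapping.json"
    with mapping_file.open("w", encoding="utf-8") as handle:
        json.dump(mapping, handle, indent=4)
    return mapping
```

The returned dictionary has the same shape as the mapping described in the steps: a unique name pointing to the old relative path.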
Please keep in mind that the src/sc2_map_downloader.py contains default flag values (listed in the help output that follows) and can be customized with the following command line flags:
usage: sc2_map_downloader.py [-h] [--input_path INPUT_PATH] [--output_path OUTPUT_PATH]
Tool for downloading StarCraft 2 (SC2) maps based on the data that is available within .SC2Replay file.
options:
-h, --help show this help message and exit
--input_path INPUT_PATH (default = ../processing/directory_flattener/output)
Please provide input path to the dataset that is going to be processed.
--output_path OUTPUT_PATH (default = ../processing/sc2_map_downloader/output)
Please provide output path where sc2 map files will be downloaded.
Please keep in mind that the src/sc2_replaypack_processor.py contains default flag values and can be customized with the following command line flags:
usage: sc2_replaypack_processor.py [-h] [--input_dir INPUT_DIR] [--output_dir OUTPUT_DIR]
[--n_processes N_PROCESSES]
Tool used for processing StarCraft 2 (SC2) datasets. with https://github.com/Kaszanas/SC2InfoExtractorGo
options:
-h, --help show this help message and exit
--input_dir INPUT_DIR (default = ../processing/directory_flattener/output)
Please provide input path to the directory containing the dataset that is going to be processed.
--output_dir OUTPUT_DIR (default = ../processing/sc2_replaypack_processor/output)
Please provide an output directory for the resulting files.
--n_processes N_PROCESSES (default = 4)
Please provide the number of processes to be spawned for the dataset processing.
Please keep in mind that the src/file_renamer.py contains default flag values and can be customized with the following command line flags:
usage: file_renamer.py [-h] [--input_dir INPUT_DIR]
Tool used for processing StarCraft 2 (SC2) datasets with https://github.com/Kaszanas/SC2InfoExtractorGo
options:
-h, --help show this help message and exit
--input_dir INPUT_DIR (default = ../processing/sc2_replaypack_processor/output)
Please provide input path to the directory containing the dataset that is going to be processed.
Please keep in mind that the src/file_packager.py contains default flag values and can be customized with the following command line flags:
usage: file_packager.py [-h] [--input_dir INPUT_DIR]
Tool used for processing StarCraft 2 (SC2) datasets. with https://github.com/Kaszanas/SC2InfoExtractorGo
options:
-h, --help show this help message and exit
--input_dir INPUT_DIR (default = ../processing/sc2_replaypack_processor/output)
Please provide input path to the directory containing the dataset that is going to be processed by packaging into .zip archives.
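As an illustration of the packaging step, the sketch below creates one .zip archive per subdirectory of the input directory using the standard zipfile module. This is an assumption about the layout (one subdirectory per processed replaypack) and not the actual code of src/file_packager.py; the names package_directory and package_all are hypothetical.

```python
import zipfile
from pathlib import Path


def package_directory(directory: Path) -> Path:
    """Write <directory>.zip next to the directory and return its path."""
    archive_path = directory.parent / f"{directory.name}.zip"
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file_path in sorted(directory.rglob("*")):
            if file_path.is_file():
                # Store paths relative to the packaged directory.
                archive.write(
                    file_path, arcname=file_path.relative_to(directory).as_posix()
                )
    return archive_path


def package_all(input_dir: Path) -> list[Path]:
    """Package every direct subdirectory of input_dir into its own archive."""
    return [
        package_directory(child)
        for child in sorted(input_dir.iterdir())
        if child.is_dir()
    ]
```

Calling package_all(Path("../processing/sc2_replaypack_processor/output")) would mirror the default --input_dir shown above.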
Please keep in mind that the src/json_merger.py contains default flag values and can be customized with the following command line flags:
usage: json_merger.py [-h] [--json_one JSON_ONE] [--json_two JSON_TWO] [--output_filepath OUTPUT_FILEPATH]
Tool used for merging two .json files. Created in order to merge two mappings created by
https://github.com/Kaszanas/SC2MapLocaleExtractor
options:
-h, --help show this help message and exit
--json_one JSON_ONE (default = ../processing/json_merger/json1.json)
Please provide the path to the first .json file that is going to be merged.
--json_two JSON_TWO (default = ../processing/json_merger/json2.json)
Please provide the path to the second .json file that is going to be merged.
--output_filepath OUTPUT_FILEPATH (default = ../processing/json_merger/merged.json)
Please provide output path where sc2 map files will be downloaded.
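Merging two {"foreign_map_name": "english_map_name"} mappings can be pictured as a dictionary union. The following is a minimal sketch under the assumption that both files hold a flat JSON object and that entries from the second file win on key collisions; how src/json_merger.py actually resolves conflicts is not stated here. The function name merge_json_mappings is hypothetical, while its parameters follow the documented flags.

```python
import json
from pathlib import Path


def merge_json_mappings(
    json_one: Path, json_two: Path, output_filepath: Path
) -> dict[str, str]:
    """Merge two flat JSON objects and write the result to output_filepath."""
    with json_one.open(encoding="utf-8") as handle:
        first = json.load(handle)
    with json_two.open(encoding="utf-8") as handle:
        second = json.load(handle)

    # Assumption: on duplicate keys the value from the second file is kept.
    merged = {**first, **second}

    with output_filepath.open("w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps foreign map names human-readable.
        json.dump(merged, handle, ensure_ascii=False, indent=4)
    return merged
```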
Please keep in mind that the src/processed_mapping_copier.py contains default flag values and can be customized with the following command line flags:
usage: processed_mapping_copier.py [-h] [--input_path INPUT_PATH] [--output_path OUTPUT_PATH]
Tool for copying the processed_mapping.json files that are required to define the StarCraft 2 (SC2) dataset.
options:
-h, --help show this help message and exit
--input_path INPUT_PATH (default = ../processing/directory_flattener/output)
Please provide input path to the flattened replaypacks that contain
procesed_mapping.json files.
--output_path OUTPUT_PATH (default = ../processing/sc2_replaypack_processor/output)
Please provide output path where processed_mapping.json will be copied.
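The copying step can be sketched as follows. The sketch assumes that every flattened replaypack is a direct subdirectory of the input path, that each one contains a processed_mapping.json file, and that the copy should land in a subdirectory of the same name under the output path. These layout details and the function name copy_processed_mappings are assumptions, not a description of src/processed_mapping_copier.py.

```python
import shutil
from pathlib import Path


def copy_processed_mappings(input_path: Path, output_path: Path) -> list[Path]:
    """Copy <replaypack>/processed_mapping.json into the matching output folder."""
    copied: list[Path] = []
    for mapping_file in sorted(input_path.glob("*/processed_mapping.json")):
        # Assumption: output folders are named after the replaypack directory.
        target_dir = output_path / mapping_file.parent.name
        target_dir.mkdir(parents=True, exist_ok=True)
        destination = shutil.copy2(mapping_file, target_dir / mapping_file.name)
        copied.append(Path(destination))
    return copied
```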