Commit 8d7cf574 authored by Daniel Guderian

Merge branch 'concatenate_scripts' of git.km3net.de:ml/OrcaSong into concatenate_scripts

parents a4cfe2b0 96fd0003
1 merge request: !14 revive make_data_split
@@ -8,11 +8,11 @@ Orcasong comes with some tools to further process data.
 Make_data_split
 ---------------
-Create datasets for different tasks (like classification or regression) from the files resulting from OrcaSong, based on the run_id. This is particularly helpful for a run-by-run data analysis or to generate equally large datasets per class. A toml config is used, in which the directories and ranges of runs to be considered can be specified, as well as the subdivision into training and validation sets. As output, a list in txt format with the filepaths belonging to one set is created that can be passed to the concatenate for creating one single file out of the many.
+Create datasets for different tasks (like classification or regression) from the files resulting from OrcaSong, based on the run_id. This is particularly helpful for a run-by-run data analysis or to generate equally large datasets per class. A toml config is used, in which the directories and ranges of runs to be considered can be specified, as well as the subdivision into training and validation sets. Detailed descriptions for the options available can be found in the example config in the subfolder make_data_split_configs. As output, a list in txt format with the filepaths belonging to one set is created that can be passed to the concatenate for creating one single file out of the many.
 Can be used via the commandline::
-    orcasong/tools/make_data_split.py CONFIG.toml
+    orcasong/tools/make_data_split.py config.toml
 .. _concatenate:
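
The run_id-based split described in the changed paragraph can be illustrated with a minimal Python sketch. This is not the code of make_data_split.py and does not use its config options. The file naming pattern (``run_<number>`` in the filename) and the helper names ``run_id_of``, ``split_by_run_id`` and ``write_list`` are assumptions made purely for illustration.

```python
import re
from pathlib import Path


def run_id_of(path):
    """Extract the run id from a filename.

    Assumes a hypothetical naming scheme such as 'sample_run_42.h5'.
    """
    match = re.search(r"run_(\d+)", Path(path).name)
    if match is None:
        raise ValueError(f"no run id found in {path!r}")
    return int(match.group(1))


def split_by_run_id(filepaths, run_range, train_fraction):
    """Select files whose run id lies in run_range (inclusive) and
    split them, ordered by run id, into training and validation lists."""
    first_run, last_run = run_range
    selected = sorted(
        (p for p in filepaths if first_run <= run_id_of(p) <= last_run),
        key=run_id_of,
    )
    n_train = int(len(selected) * train_fraction)
    return selected[:n_train], selected[n_train:]


def write_list(filepaths, out_file):
    """Write one filepath per line to a plain txt list file."""
    Path(out_file).write_text("".join(f"{p}\n" for p in filepaths))
```

For example, calling ``split_by_run_id(files, (3, 8), 0.5)`` on ten files with run ids 1 to 10 keeps runs 3 to 8 and puts the first half in the training list and the rest in the validation list. Each list could then be written with ``write_list`` and handed to a concatenation step.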