Re-think sim_type

Created by: RasmusOrsoe

Is your feature request related to a problem? Please describe. In I3TruthExtractor we rely on the variable sim_type to modify the extractor's behavior. The variable is inferred by the extractor in a very crude way:

def _find_data_type(self, mc: bool, input_file: str) -> str:
        """Determine the data type.

        Args:
            mc: Whether `input_file` is Monte Carlo simulation.
            input_file: Path to I3-file.

        Returns:
            The simulation/data type.
        """
        # @TODO: Rewrite to automatically infer `mc` from `input_file`?
        if not mc:
            sim_type = "data"
        else:
            sim_type = "NuGen"
        if "muon" in input_file:
            sim_type = "muongun"
        if "corsika" in input_file:
            sim_type = "corsika"
        if "genie" in input_file or "nu" in input_file.lower():
            sim_type = "genie"
        if "noise" in input_file:
            sim_type = "noise"
        if "L2" in input_file:  # not robust
            sim_type = "dbang"
        if sim_type == "lol":
            self.info("SIM TYPE NOT FOUND!")
        return sim_type

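A more elegant solution is needed. The checks match substrings anywhere in the path, and later checks silently override earlier ones: any path containing "nu" (including "nugen") ends up as "genie", and any path containing "L2" ends up as "dbang". The final check against "lol" can never trigger, so an unrecognised file is never reported.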

Describe the solution you'd like We should try to come up with a way to either remove the need for the sim_type variable or infer it more robustly.
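One possible direction, as a rough sketch only: let the caller pass sim_type explicitly when it is known, and otherwise fall back to an ordered table of patterns where the first match wins and an unrecognised file raises an error instead of getting a silent default. The function name find_sim_type, the rule table _SIM_TYPE_RULES, the optional sim_type argument, and the specific patterns are all hypothetical and not part of the current extractor.

```python
import re
from typing import List, Optional, Pattern, Tuple

# Hypothetical rule table: (label, pattern), checked in order, first match wins.
# The patterns are illustrative guesses, not an agreed naming convention.
_SIM_TYPE_RULES: List[Tuple[str, Pattern[str]]] = [
    ("muongun", re.compile(r"muongun", re.IGNORECASE)),
    ("corsika", re.compile(r"corsika", re.IGNORECASE)),
    ("genie", re.compile(r"genie", re.IGNORECASE)),
    ("NuGen", re.compile(r"nugen", re.IGNORECASE)),
    ("noise", re.compile(r"noise", re.IGNORECASE)),
]


def find_sim_type(
    mc: bool, input_file: str, sim_type: Optional[str] = None
) -> str:
    """Return the simulation/data type for `input_file`.

    Real data always maps to "data". For simulation, an explicit
    `sim_type` from the caller takes precedence over any guess made
    from the file path.
    """
    if not mc:
        return "data"
    if sim_type is not None:
        return sim_type
    for label, pattern in _SIM_TYPE_RULES:
        if pattern.search(input_file):
            return label
    # Fail loudly rather than returning a misleading default.
    raise ValueError(f"Could not infer sim_type from {input_file!r}")
```

With this sketch, a path such as "L2_corsika_0001.i3" resolves to "corsika" rather than being overridden by the "L2" check, and a type that cannot be recognised from the path (for example "dbang") has to be passed in explicitly. Whether raising is preferable to logging a warning and returning a placeholder is a design choice left open here.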

Additional context To my knowledge, i3 files often don't contain identifying markers that we can use for this.