I just started looking at km3io and I'm also confused about the distinction between different types of files. As far as I understood, the purpose of km3io is to read files that comply with the km3net dataformat. So I would expect a generic reader. But in the documentation it says that there's an online reader, an offline reader, and additionally a special class to read gSeaGen files. On a first test I opened and did a brief inspection of a gSeaGen file using the offline reader (and not a special class for gSeaGen files) and it worked well.
About the distinction between offline and online events, would it be possible to use names that coincide with the names given in km3net dataformat? There we have JDAQEvent objects, which correspond to events produced at the trigger level (both by the data taking and by JTriggerEfficiency), and Evt objects, which correspond to reconstructed events, as well as events produced at the event generation level and light simulation level. Perhaps an alternative to the above could be a file reader that gives access to each tree type that can be found in a km3net .root file:
    import km3io
    f = km3io.open("/path/to/file.root")
    f.evt                # and so on for Evt objects
    f.daqevent           # and so on for JDAQEvent objects
    f.jdaqsummaryslices  # for summary slices
    f.jdaqtimeslicel1    # for L1 timeslices
    ...etc
We are still not v1.0.0 so everything can be changed. The main idea is of course to make things easy to access and I agree that the user should not deal too much with internal formats. On the other hand, km3io is still a low level library, which means that it's more targeted to users who "know what they are doing".
Anyways, a new class which is hidden behind the future km3io.open() function (my approach as a first step of consolidation) could be used for that, so that we can keep the current readers and just wrap them behind a general purpose utility class.
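As a rough illustration of the wrapping idea, here is a minimal sketch of a general purpose class that keeps the existing readers and only dispatches to them. Everything in it is hypothetical: `KM3NeTFile`, `open_file` (a stand-in for the future `km3io.open()`), and the two stub classes, which merely take the place of the current offline and online readers so that the example runs without a ROOT file.

```python
class StubOfflineReader:
    """Hypothetical stand-in for the existing offline (Evt) reader."""

    def __init__(self, path):
        self.path = path


class StubOnlineReader:
    """Hypothetical stand-in for the existing online (JDAQEvent) reader."""

    def __init__(self, path):
        self.path = path


class KM3NeTFile:
    """Facade that hides the specific readers behind attribute access."""

    # attribute name -> reader class that handles it (assumed mapping)
    _READERS = {
        "evt": StubOfflineReader,
        "daqevent": StubOnlineReader,
    }

    def __init__(self, path):
        self.path = path
        self._cache = {}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_") or name not in self._READERS:
            raise AttributeError(name)
        # Create the underlying reader lazily and reuse it afterwards.
        if name not in self._cache:
            self._cache[name] = self._READERS[name](self.path)
        return self._cache[name]


def open_file(path):
    """Hypothetical counterpart of the proposed km3io.open()."""
    return KM3NeTFile(path)


f = open_file("/path/to/file.root")
print(type(f.evt).__name__)
print(type(f.daqevent).__name__)
```

With this layout the specific readers stay untouched, and removing them later would only mean changing the mapping inside the wrapper.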
I think what you propose is a good idea. And if possible, I would add the following methods:
f.print which shows the available trees in the file
f.has(evt) which returns true if the file has an Evt tree
f.has(daqevent) which returns true if the file has a JDAQEvent tree
...etc
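The two helper methods could look roughly like the sketch below. It is only an illustration: the class name `TreeInventory` is made up, the ROOT tree names in `ALIASES` are assumptions that would need to be checked against km3net-dataformat, and the list of trees is passed in by hand instead of being read from a file. `__str__` plays the role of `f.print`, so that `print(f)` shows the available trees.

```python
class TreeInventory:
    """Keeps track of which known trees are present in a file."""

    # short name -> ROOT tree name (assumed names, for illustration only)
    ALIASES = {
        "evt": "E",
        "daqevent": "KM3NET_EVENT",
        "jdaqsummaryslices": "KM3NET_SUMMARYSLICE",
        "jdaqtimeslicel1": "KM3NET_TIMESLICE_L1",
    }

    def __init__(self, tree_names):
        # Drop a ROOT cycle suffix such as ";1" if it is present.
        self._trees = {name.split(";")[0] for name in tree_names}

    def has(self, alias):
        """Return True if the tree behind the short name is in the file."""
        return self.ALIASES.get(alias) in self._trees

    def available(self):
        """Return the sorted short names of all trees found in the file."""
        return sorted(a for a, t in self.ALIASES.items() if t in self._trees)

    def __str__(self):
        return "Available trees: " + ", ".join(self.available())


# Hand-written tree list standing in for the keys of a real ROOT file.
inventory = TreeInventory(["E;1", "KM3NET_SUMMARYSLICE;1"])
print(inventory)
print(inventory.has("evt"), inventory.has("daqevent"))
```

Since the check only looks at which trees are present, a file would be treated the same way regardless of which software produced it.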
Once things get consolidated in future versions, would you remove the specific readers and replace them by a generic one? (I mean, the wrapper is a first approach, but it will eventually disappear, right?)
On the other hand, km3io is still a low level library, which means that it's more targeted to users who "know what they are doing".
Precisely for this reason, I think that the distinction between online and offline is not appropriate and can lead to confusion. The documentation says that an online file is a file produced by Jpp, and that any other software usually produces offline files. From this, it is not clear what exactly defines an online or an offline file. I usually associate online with the real time data taking (DataQueue-JDataFilter-JDataWriter) and offline with anything else. So Jpp can also produce offline files (for example with the JGandalf chain). And someone could also go crazy and create an offline file which copies all the JDAQEvents and JDAQSummaryslices from a daq file, and adds a tree with Evt objects corresponding to the reconstructed JDAQEvents. From the point of view of the file reader, all files should be the same regardless of their origin.
Btw. you wrote that a gSeaGen ROOT file works just fine with the OfflineReader, that's true since you probably took a ROOT file which was produced using the km3net-dataformat lib (spitting out an offline ROOT format). The gSeaGen reader was implemented by @jschumann and is meant to read the gSeaGen specific ROOT format which is not related to our offline ROOT format.
aaaah, ok! Is gSeaGen still producing that kind of format? My files are formatted according to km3net-dataformat.
Now I'm confused: in the README it says
This software provides a set of Python classes to read KM3NeT ROOT files
I interpreted KM3NeT ROOT files as files containing data compliant with the KM3NeT-dataformat. Is that the case, or does KM3NeT ROOT files refer to any ROOT file produced by any piece of software developed within KM3NeT? So not only KM3NeT dataformat? I would restrict it to the first case; I'd say that restricting ourselves to the km3net-dataformat will make things easier in the long term.
In fact, km3io is meant to provide access to any kind of ROOT files which are "relevant", including for example that special gSeaGen format. I'd like to keep this as is, since all readers share quite a few features and splitting this project into more subprojects is, I think, overkill ;)
I'd be more strict and impose that files that are not compliant with km3net-dataformat, cannot be read by km3net software. If a file is relevant, then it should be formatted according to standards. I think the overkill is to read whatever people consider relevant instead of a single standard format (in which all the relevant files for km3net should eventually be written).
I don't know the details about this special gSeaGen format. Do these files have the same data as the km3net-formatted files produced by gSeaGen, but just in a different format? Or is the content of these files different from the standard output?
If in any case km3io remains as a reader for files other than km3net-dataformat files, I would then not have a single unified reader. I would have a reader for km3net-dataformat compliant files, and other readers for other formats.
I just talked to Johannes and he confirmed that this gSeaGen format is a legacy (pre-offline-format) version. So we will keep it as a legacy reader for backwards compatibility and then focus on the official formats.