h5extract v2 (!218) · Merge requests · km3py / km3pipe

Stefan Reck requested to merge h5extract2 into master Aug 31, 2021

I decided to re-write the h5extract script by using km3io/awkward directly instead of a pipeline for the sake of speed.

The speed-up is quite nice, and I also like that all the column names are now collected in a single big dictionary, instead of being scattered all over the place (would be even nicer if we didn't have to rename some of the columns).

Unfortunately, this means that we have to emulate the output of the hdf5sink so that the result can be read back in with a hdf5pump. So right now, I save either an XX_indices dataset or a separate group_id column. I think it works, but I still have 2 questions:
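
To illustrate the two bookkeeping layouts mentioned above, here is a minimal, dependency-free sketch. It is not the actual h5extract code: the function names are hypothetical, and the field names `index` and `n_items` are an assumption about what the XX_indices dataset holds (start offset and length per event). The real script works on awkward arrays rather than plain lists.

```python
def flatten_with_bookkeeping(per_event):
    """Flatten jagged per-event data and build both bookkeeping layouts.

    Returns (flat, index, n_items, group_id), where
    - index[i] is the start offset of event i in flat (assumed field name),
    - n_items[i] is the number of entries of event i (assumed field name),
    - group_id[j] is the event number that flat[j] belongs to.
    """
    flat, index, n_items, group_id = [], [], [], []
    offset = 0
    for event_number, items in enumerate(per_event):
        index.append(offset)
        n_items.append(len(items))
        flat.extend(items)
        # One group_id entry per flattened item.
        group_id.extend([event_number] * len(items))
        offset += len(items)
    return flat, index, n_items, group_id


def read_event(flat, index, n_items, i):
    """Recover the entries of event i from the flat data and its indices."""
    return flat[index[i]:index[i] + n_items[i]]


# Three events with 2, 0 and 3 hits respectively.
hit_times = [[1.5, 2.5], [], [3.0, 4.0, 5.0]]
flat, index, n_items, group_id = flatten_with_bookkeeping(hit_times)
restored = [read_event(flat, index, n_items, i) for i in range(len(hit_times))]
```

Both layouts carry the same information. The indices variant allows slicing out a single event directly, while the group_id column keeps the event number next to each row, at the cost of not recording empty events.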

If I read it with an h5pump, I get this message:

ERROR ++ km3pipe.io.hdf5: Could not determine HDF5 format version: 'mcv6.1.mupage_10G.sirene.jterbr00007273.jorcarec.aanet.50.root.h5'. You may encounter unexpected errors! Good luck...

I appreciate the good luck, but I still wonder what this means.

In order to be able to read the file, I need to add a dataset "/group_info", which seemingly serves no purpose. Why is it necessary?

For reference, I developed the script using this orca6 mupage aanet file: /in2p3/km3net/mc/atm_muon/KM3NeT_00000049/v6.1/reco/mcv6.1.mupage_10G.sirene.jterbr00007273.jorcarec.50.root

Edited Sep 01, 2021 by Stefan Reck
