
Unintended casting

I was testing reading two files one after the other and extracting some values when I got this huge error:

2020-06-23 17:06:47 CRITICAL ++ km3pipe.dataclasses: dtype mismatch! Matching field names but differing field types, no chance to reorder.
dtype of data:   (numpy.record, [('Erange_max', '<i8'), ('Erange_min', '<i8'), ('JENERGY_CHI2', '<f8'), ('JENERGY_ENERGY', '<f8'), ('JENERGY_MUON_RANGE_METRES', '<f8'), ('JENERGY_NDF', '<f8'), ('JENERGY_NOISE_LIKELIHOOD', '<f8'), ('JENERGY_NUMBER_OF_HITS', '<f8'), ('JGANDALF_BETA0_RAD', '<f8'), ('JGANDALF_BETA1_RAD', '<f8'), ('JGANDALF_CHI2', '<f8'), ('JGANDALF_LAMBDA', '<f8'), ('JGANDALF_NUMBER_OF_HITS', '<f8'), ('JGANDALF_NUMBER_OF_ITERATIONS', '<f8'), ('JSTART_LENGTH_METRES', '<f8'), ('JSTART_NPE_MIP', '<f8'), ('JSTART_NPE_MIP_TOTAL', '<f8'), ('JVETO_NPE', '<f8'), ('JVETO_NUMBER_OF_HITS', '<f8'), ('W2LIST_GSEAGEN_BX', '<f8'), ('W2LIST_GSEAGEN_BY', '<f8'), ('W2LIST_GSEAGEN_CC', '<f8'), ('W2LIST_GSEAGEN_COLUMN_DEPTH', '<f8'), ('W2LIST_GSEAGEN_EG', '<f8'), ('W2LIST_GSEAGEN_ICHAN', '<f8'), ('W2LIST_GSEAGEN_PS', '<f8'), ('W2LIST_GSEAGEN_P_EARTH', '<f8'), ('W2LIST_GSEAGEN_P_SCALE', '<f8'), ('W2LIST_GSEAGEN_WATER_INT_LEN', '<f8'), ('W2LIST_GSEAGEN_XSEC_MEAN', '<f8'), ('dir_x', '<f8'), ('dir_y', '<f8'), ('dir_z', '<f8'), ('energy', '<f8'), ('gandalf_best_chi2_red', '<f8'), ('gandalf_dir_x', '<f8'), ('gandalf_dir_y', '<f8'), ('gandalf_dir_z', '<f8'), ('gandalf_energy', '<f8'), ('gandalf_is_good', '<f8'), ('gandalf_likelihood', '<f8'), ('gandalf_pos_x', '<f8'), ('gandalf_pos_y', '<f8'), ('gandalf_pos_z', '<f8'), ('is_cc', '<i8'), ('is_neutrino', '<i8'), ('jsh_dir_x', '<f8'), ('jsh_dir_y', '<f8'), ('jsh_dir_z', '<f8'), ('jsh_energy', '<f8'), ('jsh_is_good', '<f8'), ('jsh_likelihood', '<f8'), ('jsh_pos_x', '<f8'), ('jsh_pos_y', '<f8'), ('jsh_pos_z', '<f8'), ('livetime_sec', '<f8'), ('mc_id', '<i4'), ('n_events_gen', '<f8'), ('pos_x', '<f8'), ('pos_y', '<f8'), ('pos_z', '<f8'), ('run_from_header', '<i8'), ('run_id', '<i4'), ('type', '<i4'), ('w1', '<f8'), ('w2', '<f8'), ('w3', '<f8'), ('weight_one_year', '<i8'), ('group_id', '<i8')])
requested dtype: [('Erange_max', '<i8'), ('Erange_min', '<i8'), ('JENERGY_CHI2', '<f8'), ('JENERGY_ENERGY', '<f8'), ('JENERGY_MUON_RANGE_METRES', '<f8'), ('JENERGY_NDF', '<f8'), ('JENERGY_NOISE_LIKELIHOOD', '<f8'), ('JENERGY_NUMBER_OF_HITS', '<f8'), ('JGANDALF_BETA0_RAD', '<f8'), ('JGANDALF_BETA1_RAD', '<f8'), ('JGANDALF_CHI2', '<f8'), ('JGANDALF_LAMBDA', '<f8'), ('JGANDALF_NUMBER_OF_HITS', '<f8'), ('JGANDALF_NUMBER_OF_ITERATIONS', '<f8'), ('JSTART_LENGTH_METRES', '<f8'), ('JSTART_NPE_MIP', '<f8'), ('JSTART_NPE_MIP_TOTAL', '<f8'), ('JVETO_NPE', '<f8'), ('JVETO_NUMBER_OF_HITS', '<f8'), ('W2LIST_GSEAGEN_BX', '<f8'), ('W2LIST_GSEAGEN_BY', '<f8'), ('W2LIST_GSEAGEN_CC', '<f8'), ('W2LIST_GSEAGEN_COLUMN_DEPTH', '<f8'), ('W2LIST_GSEAGEN_EG', '<f8'), ('W2LIST_GSEAGEN_ICHAN', '<f8'), ('W2LIST_GSEAGEN_PS', '<f8'), ('W2LIST_GSEAGEN_P_EARTH', '<f8'), ('W2LIST_GSEAGEN_P_SCALE', '<f8'), ('W2LIST_GSEAGEN_WATER_INT_LEN', '<f8'), ('W2LIST_GSEAGEN_XSEC_MEAN', '<f8'), ('dir_x', '<f8'), ('dir_y', '<f8'), ('dir_z', '<f8'), ('energy', '<f8'), ('gandalf_best_chi2_red', '<f8'), ('gandalf_dir_x', '<f8'), ('gandalf_dir_y', '<f8'), ('gandalf_dir_z', '<f8'), ('gandalf_energy', '<f8'), ('gandalf_is_good', '<f8'), ('gandalf_likelihood', '<f8'), ('gandalf_pos_x', '<f8'), ('gandalf_pos_y', '<f8'), ('gandalf_pos_z', '<f8'), ('is_cc', '<i8'), ('is_neutrino', '<i8'), ('jsh_dir_x', '<f8'), ('jsh_dir_y', '<f8'), ('jsh_dir_z', '<f8'), ('jsh_energy', '<f8'), ('jsh_is_good', '<f8'), ('jsh_likelihood', '<f8'), ('jsh_pos_x', '<f8'), ('jsh_pos_y', '<f8'), ('jsh_pos_z', '<f8'), ('livetime_sec', '<f8'), ('mc_id', '<i4'), ('n_events_gen', '<i8'), ('pos_x', '<f8'), ('pos_y', '<f8'), ('pos_z', '<f8'), ('run_from_header', '<i8'), ('run_id', '<i4'), ('type', '<i4'), ('w1', '<f8'), ('w2', '<f8'), ('w3', '<f8'), ('weight_one_year', '<i8'), ('group_id', '<i8')]
2020-06-23 17:06:47 CRITICAL ++ km3pipe.io.hdf5.HDF5Sink.HDF5Sink: Cannot write a table to '/summary' since its dtype is different compared to the previous table with the same HDF5 location, which was used to fix the dtype of the HDF5 compund type.
Traceback (most recent call last):
  File "pipe.py", line 44, in <module>
    pipe.drain()
  File "/project/antares/public_student_software/venvs/km3pipe-v9-alpha14/lib/python3.7/site-packages/thepipe/core.py", line 423, in drain
    return self._drain(cycles)
  File "/project/antares/public_student_software/venvs/km3pipe-v9-alpha14/lib/python3.7/site-packages/thepipe/core.py", line 372, in _drain
    new_blob = module(blob_to_send)
  File "/project/antares/public_student_software/venvs/km3pipe-v9-alpha14/lib/python3.7/site-packages/thepipe/core.py", line 178, in __call__
    return self.process(*args, **kwargs)
  File "/project/antares/public_student_software/venvs/km3pipe-v9-alpha14/lib/python3.7/site-packages/km3pipe/io/hdf5.py", line 487, in process
    data = self._process_entry(key, entry)
  File "/project/antares/public_student_software/venvs/km3pipe-v9-alpha14/lib/python3.7/site-packages/km3pipe/io/hdf5.py", line 474, in _process_entry
    self._write_table(entry.h5loc, entry, title=title)
  File "/project/antares/public_student_software/venvs/km3pipe-v9-alpha14/lib/python3.7/site-packages/km3pipe/io/hdf5.py", line 396, in _write_table
    arr = Table(arr, dtype=tab.dtype)
  File "/project/antares/public_student_software/venvs/km3pipe-v9-alpha14/lib/python3.7/site-packages/km3pipe/dataclasses.py", line 175, in __new__
    raise ValueError("dtype mismatch")
ValueError: dtype mismatch
Closing remaining open files:test2.h5...done

which basically boils down to:

dtype of data: ('n_events_gen', '<f8')
requested dtype: ('n_events_gen', '<i8')
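
Spotting the one differing field in a dump this long is tedious by eye. Below is a minimal sketch of how the comparison could be automated, assuming the two dtype descriptions are pasted in as the lists of `(name, type)` tuples printed in the log. The helper name `differing_fields` is hypothetical and not part of km3pipe, and the lists are shortened here to three fields for illustration.

```python
def differing_fields(data_descr, requested_descr):
    """Return (name, data_type, requested_type) for every field that
    appears in both descriptions but with a different type string."""
    requested = dict(requested_descr)
    return [
        (name, dtype, requested[name])
        for name, dtype in data_descr
        if name in requested and requested[name] != dtype
    ]


# Shortened excerpts of the two descriptions from the log above.
data = [("mc_id", "<i4"), ("n_events_gen", "<f8"), ("pos_x", "<f8")]
requested = [("mc_id", "<i4"), ("n_events_gen", "<i8"), ("pos_x", "<f8")]

mismatches = differing_fields(data, requested)
print(mismatches)
```

With the full lists from the log, this should report `n_events_gen` as the only mismatch.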

In the original GSG files the values are:

file1: genvol [...] 8.00E+05

file2: genvol [...] 7.00E+06

But when opening the aanet files with km3io, I get:

f.header.genvol.numberOfEvents
800000

for file1 and

f.header.genvol.numberOfEvents
7000000.0

for file2.

So indeed for one of the files this value is cast to float.

I solved this by casting in my own km3pipe Module:

blob.summary["n_events_gen"] = int(self.n_events_gen)

But I believe this casting to float is not supposed to happen since both files were made with the same versions of GSG, JPP, aanet, etc.
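
A plain `int(...)` as in the workaround above silently truncates if the header value ever has a fractional part. A slightly more defensive variant is sketched below. The helper `to_int_exact` is a hypothetical name, not part of km3pipe or km3io, and the sketch assumes the header value arrives as an int, a float, or a numeric string.

```python
def to_int_exact(value):
    """Convert an integral number to int, refusing lossy conversions."""
    if isinstance(value, int):
        # Already an integer: return it unchanged (no float round trip).
        return value
    as_float = float(value)
    if not as_float.is_integer():
        raise ValueError(f"{value!r} is not an integral number")
    return int(as_float)


# Both header variants seen in this issue end up as the same type.
print(to_int_exact(800000))      # int input from file1
print(to_int_exact(7000000.0))   # float input from file2
```

In the module, the assignment would then read `blob.summary["n_events_gen"] = to_int_exact(self.n_events_gen)`, so both files give an integer field and an unexpected non-integral value raises an error instead of being truncated.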

Edited by Lodewijk Nauta