
Draft: Fine tuning of data types and layout

Tamas Gal requested to merge prototype-implementation into main

This is the merge request to discuss the next iteration of tuning data types and layout of the output file.

I added two columns to each rates matrix which hold the UTC seconds and nanoseconds of the beginning of each summaryslice time interval. The intervals are counted from the timestamp of the very first summaryslice in the file, which is now treated as the run start.

As mentioned today in the OH meeting, the nanoseconds now always have a constant value, since the resolution of the time interval is in seconds. I don't think it makes much sense to put the nanoseconds in the file, but they also do not hurt since they will be compressed away anyway. It just looks a bit weird, and I also don't think we need sub-second precision for a time interval that is typically on the order of several dozens or hundreds of seconds 😉

Two questions:

  1. do we want sub-second resolution in the time interval parameter?
  2. should we get rid of the utc_ns column in the rates matrices?

This is how the output file is currently generated and what it looks like (accumulating data over 600 s time intervals):

~/Dev/pmt-rates-evolution main* 14s
❯ juliap scripts/extract_rates.jl 600 KM3NeT_00000133_00014728.root
Loading libraries...
Opening /Volumes/Ultraspeed/Data/Obelix/KM3NeT_00000133_00014728.root
Retrieving detector description for detector 133 from the database
Extracting summary data with a time interval of 600 seconds
Progress: 100%|████████████████████████████████████████| Time: 0:00:53
Finalising the output file: KM3NeT_00000133_00014728.pmtrates.h5

and the contents:

julia> using HDF5, DataFrames

julia> f = h5open("KM3NeT_00000133_00014728.pmtrates.h5");

julia> f["808467545/max_rate"][:] |> DataFrame
18×34 DataFrame
 Row  utc_s       utc_ns     duty_cycle  ch0             ch1            ch2   
      Int32       Int32      Float32     Float32         Float32        Float 
─────┼──────────────────────────────────────────────────────────────────────────
   1  1676246415  900000000    0.955      55987.2        39368.4         3629 
   2  1676247015  900000000    1.0        10160.4         6586.84         754
   3  1676247615  900000000    1.0        53034.6        15254.0         3256
   4  1676248215  900000000    1.0        17946.1        11953.7         1261
   5  1676248815  900000000    1.0         7749.35        6586.84         695 
   6  1676249415  900000000    1.0       136875.0        73406.7         9117
   7  1676250015  900000000    1.0            3.16979e5  20549.2        14846
   8  1676250615  900000000    1.0        18945.3        14449.5         1444
   9  1676251215  900000000    1.0            2.0e6          2.0e6            
  10  1676251815  900000000    1.0            2.0e6      16999.7
  11  1676252415  900000000    1.0        15254.0        11020.6         1163
  12  1676253015  900000000    1.0        86362.3        13687.5         3532
  13  1676253615  900000000    1.0        18438.9         8873.38        1654 
  14  1676254215  900000000    1.0        10439.4        16103.2         1072
  15  1676254815  900000000    1.0        40449.4        15254.0         3256
  16  1676255415  900000000    1.0        43873.9        42701.4         4387
  17  1676256015  900000000    1.0       774935.0            4.38739e5        
  18  1676256615  900000000    0.854833   46316.5        34381.4         3831
                                                              29 columns omitted

julia> f["runinfo"][:] |> DataFrame
1×6 DataFrame
 Row  run    utc_s       utc_ns     time_interval  idx    n_rows 
      Int32  Int32       Int32      Int32          Int64  Int64  
─────┼────────────────────────────────────────────────────────────
   1  14728  1676246415  900000000            600      1      18
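
To illustrate the two questions above, here is a minimal sketch of how the per-row timestamps relate to the `runinfo` entry. It assumes that every interval starts a whole multiple of `time_interval` seconds after the run start, which is what the `utc_s` column in the printout suggests. The variable names are only illustrative and the values are copied by hand from the output above, so this is not the code used in `extract_rates.jl`.

```julia
using Dates

# Values copied from the single `runinfo` row shown above.
run_start_s   = 1676246415   # utc_s of the first summaryslice (run start)
run_start_ns  = 900000000    # utc_ns of the first summaryslice
time_interval = 600          # seconds
n_rows        = 18

# Assumption: the i-th interval starts (i - 1) * time_interval seconds
# after the run start.
utc_s = [run_start_s + (i - 1) * time_interval for i in 1:n_rows]

# With a whole-second interval the nanosecond part never changes,
# so the column is just the run start value repeated.
utc_ns = fill(run_start_ns, n_rows)

# Human-readable start of the first interval (whole seconds only).
first_start = unix2datetime(utc_s[1])

println(utc_s[end])    # start of the last interval in seconds
println(first_start)
```

Under that assumption both time columns could be rebuilt from `runinfo` alone, so keeping `utc_ns` in the rates matrices adds no information as long as the interval has second resolution.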

Let me ping @mdejong @vpestel @vkulikovskiy @laphecetche

Edited by Tamas Gal
