diff --git a/README.rst b/README.rst index ebc5ad03690ba2303e024cc5004795cd933e3bf8..3c21a6ef48632f4d15a3f1cdf0828b30688cf362 100644 --- a/README.rst +++ b/README.rst @@ -14,8 +14,7 @@ The km3io Python package :target: https://km3py.pages.km3net.de/km3io This software provides a set of Python classes to read KM3NeT ROOT files -without having ROOT, Jpp or aanet installed. It only depends on Python 3.5+ and -the amazing uproot package and gives you access to the data via numpy arrays. +without having ROOT, Jpp or aanet installed. It only depends on Python 3.5+ and the amazing `uproot <https://github.com/scikit-hep/uproot>`__ package and gives you access to the data via numpy arrays. It's very easy to use and according to the `uproot <https://github.com/scikit-hep/uproot>`__ benchmarks, it is able to outperform the ROOT I/O performance. @@ -59,6 +58,17 @@ Tutorial * `Offline files reader <#offline-file-reader>`__ + * `reading events data <#reading-events-data>`__ + + * `reading hits data <#reading-hits-data>`__ + + * `reading tracks data <#reading-tracks-data>`__ + + * `reading mc hits data <#reading-mc-hits-data>`__ + + * `reading mc tracks data <#reading-mc-tracks-data>`__ + + Introduction ------------ @@ -67,7 +77,7 @@ Most of km3net data is stored in root files. These root files are either created km3io is a Python package that provides a set of classes (``DaqReader`` and ``OfflineReader``) to read both daq root files and offline root files without any dependency to aanet, Jpp or ROOT. -Data in km3io is often returned as a "lazyarray", a "jagged lazyarray", "a jagged array" or a Numpy array. A lazyarray is an array-like object that reads data on demand! In a lazyarray, only the first and the last chunks of data are read in memory. A lazyarray can be used with all Numpy's universal `functions <https://docs.scipy.org/doc/numpy/referenceufuncs.html>`__. Here is how a lazyarray looks like: +Data in km3io is often returned as a "lazyarray", a "jagged lazyarray" or a `Numpy <https://docs.scipy.org/doc/numpy>`__ array. A lazyarray is an array-like object that reads data on demand! In a lazyarray, only the first and the last chunks of data are read in memory. A lazyarray can be used with all Numpy's universal `functions <https://docs.scipy.org/doc/numpy/reference/ufuncs.html>`__. Here is how a lazyarray looks like: .. code-block:: python3 @@ -112,10 +122,10 @@ That's it! Now let's have a look at the hits data: .. code-block:: python3 - events - # Number of events: 17023 - events[23].snapshot_hits.tot - # array([28, 22, 17, 29, 5, 27, 24, 26, 21, 28, 26, 21, 26, 24, 17, 28, 23,29, 27, 24, 23, 26, 29, 25, 18, 28, 24, 28, 26, 20, 25, 31, 28, 23, 26, 21, 30, 33, 27, 16, 23, 24, 19, 24, 27, 22, 23, 21, 25, 16, 28, 22, 22, 29, 24, 29, 24, 24, 25, 25, 21, 31, 26, 28, 30, 42, 28], dtype=uint8) + >>> events + Number of events: 17023 + >>> events[23].snapshot_hits.tot + array([28, 22, 17, 29, 5, 27, 24, 26, 21, 28, 26, 21, 26, 24, 17, 28, 23,29, 27, 24, 23, 26, 29, 25, 18, 28, 24, 28, 26, 20, 25, 31, 28, 23, 26, 21, 30, 33, 27, 16, 23, 24, 19, 24, 27, 22, 23, 21, 25, 16, 28, 22, 22, 29, 24, 29, 24, 24, 25, 25, 21, 31, 26, 28, 30, 42, 28], dtype=uint8) Offline files reader @@ -123,137 +133,496 @@ Offline files reader Let's have a look at some muons data from ORCA 4 lines simulations - run id 5971 (``datav6.0test.jchain.aanet.00005971.root``). -To get a lazy ragged array of all data:: - - >>> import km3io as ki - >>> reader = ki.AanetReader('datav6.0test.jchain.aanet.00005971.root') - -That's it! Now let's take a look at all the available branches in our file:: - - >>> reader - Number of events: 145028 - Events keys are: - id - det_id - mc_id - run_id - mc_run_id - frame_index - trigger_mask - trigger_counter - overlays - hits - trks - w - w2list - w3list - mc_t - mc_hits - mc_trks - comment - index - flags - t.fSec - t.fNanoSec - Hits keys are: - hits.id - hits.dom_id - hits.channel_id - hits.tdc - hits.tot - hits.trig - hits.pmt_id - hits.t - hits.a - hits.pos.x - hits.pos.y - hits.pos.z - hits.dir.x - hits.dir.y - hits.dir.z - hits.pure_t - hits.pure_a - hits.type - hits.origin - hits.pattern_flags - Tracks keys are: - trks.fUniqueID - trks.fBits - trks.usr_data - trks.usr_names - trks.id - trks.pos.x - trks.pos.y - trks.pos.z - trks.dir.x - trks.dir.y - trks.dir.z - trks.t - trks.E - trks.len - trks.lik - trks.type - trks.rec_type - trks.rec_stages - trks.status - trks.mother_id - trks.fitinf - trks.hit_ids - trks.error_matrix - trks.comment - -Now that you have seen all the available branches, you can choose any key from -the above (key refers to a branch name) and display the corresponding data. For -example, we will check that we are indeed reading data from the run 5971:: - - >>> reader['run_id'] - <ChunkedArray [5971 5971 5971 ... 5971 5971 5971] at 0x7fb2341ad810> - -Let's look at the number of hits and tracks in the event number 5:: - - >>> reader[5]['hits'] +**Note:** this file was cropped to 10 events only, so don't be surprised in this tutorial if you see few events in the file. + +First, let's read our file: + +.. code-block:: python3 + + >>> import km3io as ki + >>> file = 'datav6.0test.jchain.aanet.00005971.root' + >>> r = ki.OfflineReader(file) + <km3io.aanet.OfflineReader at 0x7f24cc2bd550> + +and that's it! Note that `file` can be either an str of your file path, or a path-like object. + +To explore all the available branches in our offline file: + +.. code-block:: python3 + + >>> r.keys + Events keys are: + id + det_id + mc_id + run_id + mc_run_id + frame_index + trigger_mask + trigger_counter + overlays + hits + trks + w + w2list + w3list + mc_t + mc_hits + mc_trks + comment + index + flags + t.fSec + t.fNanoSec + Hits keys are: + hits.id + hits.dom_id + hits.channel_id + hits.tdc + hits.tot + hits.trig + hits.pmt_id + hits.t + hits.a + hits.pos.x + hits.pos.y + hits.pos.z + hits.dir.x + hits.dir.y + hits.dir.z + hits.pure_t + hits.pure_a + hits.type + hits.origin + hits.pattern_flags + Tracks keys are: + trks.fUniqueID + trks.fBits + trks.id + trks.pos.x + trks.pos.y + trks.pos.z + trks.dir.x + trks.dir.y + trks.dir.z + trks.t + trks.E + trks.len + trks.lik + trks.type + trks.rec_type + trks.rec_stages + trks.status + trks.mother_id + trks.fitinf + trks.hit_ids + trks.error_matrix + trks.comment + Mc hits keys are: + mc_hits.id + mc_hits.dom_id + mc_hits.channel_id + mc_hits.tdc + mc_hits.tot + mc_hits.trig + mc_hits.pmt_id + mc_hits.t + mc_hits.a + mc_hits.pos.x + mc_hits.pos.y + mc_hits.pos.z + mc_hits.dir.x + mc_hits.dir.y + mc_hits.dir.z + mc_hits.pure_t + mc_hits.pure_a + mc_hits.type + mc_hits.origin + mc_hits.pattern_flags + Mc tracks keys are: + mc_trks.fUniqueID + mc_trks.fBits + mc_trks.id + mc_trks.pos.x + mc_trks.pos.y + mc_trks.pos.z + mc_trks.dir.x + mc_trks.dir.y + mc_trks.dir.z + mc_trks.t + mc_trks.E + mc_trks.len + mc_trks.lik + mc_trks.type + mc_trks.rec_type + mc_trks.rec_stages + mc_trks.status + mc_trks.mother_id + mc_trks.fitinf + mc_trks.hit_ids + mc_trks.error_matrix + mc_trks.comment + +In an offline file, there are 5 main trees with data: + +* events tree +* hits tree +* tracks tree +* mc hits tree +* mc tracks tree + +with km3io, these trees can be accessed with a simple tab completion: + +.. image:: https://git.km3net.de/km3py/km3io/reader.png + +In the following, we will explore each tree using km3io package. + +reading events data +""""""""""""""""""" + +to read data in events tree with km3io: + +.. code-block:: python3 + + >>> r.events + <OfflineEvents: 10 parsed events> + +to get the total number of events in the events tree: + +.. code-block:: python3 + + >>> len(r.events) + 10 + +the branches stored in the events tree in an offline file can be easily accessed with a tab completion as seen below: + +.. image:: /home/zineb/Desktop/events.png + +to get data from the events tree, chose any branch of interest with the tab completion, the following is a non exaustive set of examples. + +to get event ids: + +.. code-block:: python3 + + >>> r.events.id + <ChunkedArray [1 2 3 ... 8 9 10] at 0x7f249eeb6f10> + +to get detector ids: + +.. code-block:: python3 + + >>> r.events.det_id + <ChunkedArray [44 44 44 ... 44 44 44] at 0x7f249eeba050> + +to get frame_index: + +.. code-block:: python3 + + >>> r.events.frame_index + <ChunkedArray [182 183 202 ... 185 185 204] at 0x7f249eeba410> + +to get snapshot hits: + +.. code-block:: python3 + + >>> r.events.hits + <ChunkedArray [176 125 318 ... 84 255 105] at 0x7f249eebaa10> + +to illustrate the strength of this data structure, we will play around with `r.events.hits` using Numpy universal `functions <https://docs.scipy.org/doc/numpy/reference/ufuncs.html>`__. + +.. code-block:: python3 + + >>> import numpy as np + >>> np.log(r.events.hits) + <ChunkedArray [5.170483995038151 4.8283137373023015 5.762051382780177 ... 4.430816798843313 5.541263545158426 4.653960350157523] at 0x7f249b8ebb90> + +to get all data from one specific event (for example event 0): + +.. code-block:: python3 + + >>> r.events[0] + offline event: + id : 1 + det_id : 44 + mc_id : 0 + run_id : 5971 + mc_run_id : 0 + frame_index : 182 + trigger_mask : 22 + trigger_counter : 0 + overlays : 60 + hits : 176 + trks : 56 + w : [] + w2list : [] + w3list : [] + mc_t : 0.0 + mc_hits : 0 + mc_trks : 0 + comment : b'' + index : 0 + flags : 0 + t_fSec : 1567036818 + t_fNanoSec : 200000000 + +to get a specific value from event 0, for example the number of overlays: + +.. code-block:: python3 + + >>> r.events[0].overlays 60 - >>> reader[5]['trks'] - 56 - -So event 5 has exactly 60 hits and 56 tracks. Let's explore in more details -hits and tracks data in event 5:: - - >>> reader['hits.dom_id'][5] - array([806455814, 806487219, 806487219, 806487219, 806487226, 808432835, - 808432835, 808432835, 808432835, 808432835, 808432835, 808432835, - 808451904, 808451904, 808451907, 808451907, 808469129, 808469129, - 808469129, 808493910, 808949744, 808949744, 808951460, 808951460, - 808956908, 808961655, 808964908, 808969848, 808969857, 808972593, - 808972593, 808972598, 808972598, 808972698, 808972698, 808974758, - 808974811, 808976377, 808981510, 808981523, 808981812, 808982005, - 808982005, 808982018, 808982077, 808982077, 808982547, 809007627, - 809521500, 809521500, 809521500, 809524432, 809526097, 809526097, - 809526097, 809526097, 809526097, 809526097, 809526097, 809544058], - dtype=int32) - -One can access the dom_id for the first hit in event 5 as follows:: - - >>> reader['hits.dom_id'][5][0] - 806455814 - -Now let's read tracks data in event 5:: - - >>> reader['trks.dir.z'][5] - array([-0.60246049, -0.60246049, -0.60246049, -0.51420541, -0.5475772 , - -0.5772408 , -0.56068238, -0.64907684, -0.67781799, -0.66565114, - -0.63014839, -0.64566464, -0.62691012, -0.58465493, -0.59287533, - -0.63655091, -0.63771247, -0.73446841, -0.7456636 , -0.70941246, - -0.66312268, -0.66312268, -0.56806477, -0.56806477, -0.66312268, - -0.66312268, -0.74851077, -0.74851077, -0.66312268, -0.74851077, - -0.56806477, -0.74851077, -0.66312268, -0.74851077, -0.56806477, - -0.66312268, -0.56806477, -0.66312268, -0.56806477, -0.56806477, - -0.66312268, -0.74851077, -0.66312268, -0.93501626, -0.56806477, - -0.74851077, -0.66312268, -0.56806477, -0.82298389, -0.74851077, - -0.66312268, -0.56806477, -0.82298389, -0.56806477, -0.66312268, - -0.97094183]) - -One can access the 'trks.dir.z' for the first track in event 5 as follows:: - - >>> reader['trks.dir.z'][5][0] - -0.60246049 + +or the number of hits: + +.. code-block:: python3 + + >>> r.events[0].hits + 176 + + +reading hits data +""""""""""""""""" + +to read data in hits tree with km3io: + +.. code-block:: python3 + + >>> r.hits + <OfflineHits: 10 parsed elements> + +this shows that in our offline file, there are 10 hits trees (because we have 10 events and each event has a hits tree!). + +to have access to all data in a specific branche from the hits tree, you can use the tab completion: + +.. image:: /home/zineb/Desktop/hits.png + +to get ALL the dom ids in all hits trees in our offline file: + +.. code-block:: python3 + + >>> r.hits.dom_id + <ChunkedArray [[806451572 806451572 806451572 ... 809544061 809544061 809544061] [806451572 806451572 806451572 ... 809524432 809526097 809544061] [806451572 806451572 806451572 ... 809544061 809544061 809544061] ... [806451572 806455814 806465101 ... 809526097 809544058 809544061] [806455814 806455814 806455814 ... 809544061 809544061 809544061] [806455814 806455814 806455814 ... 809544058 809544058 809544061]] at 0x7f249eebac50> + +to get ALL the time over threshold (tot) in all hits trees in our offline file: + +.. code-block:: python3 + + >>> r.hits.tot + <ChunkedArray [[24 30 22 ... 38 26 23] [29 26 22 ... 26 28 24] [27 19 13 ... 27 24 16] ... [22 22 9 ... 27 32 27] [30 32 17 ... 30 24 29] [27 41 36 ... 29 24 28]] at 0x7f249eec9050> + + +if you are interested in a specific event (let's say event 0), you can access the corresponding hits tree by doing the following: + +.. code-block:: python3 + + >>> r[0].hits + <OfflineHits: 176 parsed elements> + +notice that now there are 176 parsed elements (as opposed to 10 elements parsed when r.hits is called). This means that in event 0 there are 176 hits! To get the dom ids from this event: + +.. code-block:: python3 + + >>> r[0].hits.dom_id + array([806451572, 806451572, 806451572, 806451572, 806455814, 806455814, + 806455814, 806483369, 806483369, 806483369, 806483369, 806483369, + 806483369, 806483369, 806483369, 806483369, 806483369, 806487219, + 806487226, 806487231, 806487231, 808432835, 808435278, 808435278, + 808435278, 808435278, 808435278, 808447180, 808447180, 808447180, + 808447180, 808447180, 808447180, 808447180, 808447180, 808447186, + 808451904, 808451904, 808472265, 808472265, 808472265, 808472265, + 808472265, 808472265, 808472265, 808472265, 808488895, 808488990, + 808488990, 808488990, 808488990, 808488990, 808489014, 808489014, + 808489117, 808489117, 808489117, 808489117, 808493910, 808946818, + 808949744, 808951460, 808951460, 808951460, 808951460, 808951460, + 808956908, 808956908, 808959411, 808959411, 808959411, 808961448, + 808961448, 808961504, 808961504, 808961655, 808961655, 808961655, + 808964815, 808964815, 808964852, 808964908, 808969857, 808969857, + 808969857, 808969857, 808969857, 808972593, 808972698, 808972698, + 808972698, 808974758, 808974758, 808974758, 808974758, 808974758, + 808974758, 808974758, 808974758, 808974758, 808974758, 808974758, + 808974773, 808974773, 808974773, 808974773, 808974773, 808974972, + 808974972, 808976377, 808976377, 808976377, 808979567, 808979567, + 808979567, 808979721, 808979721, 808979721, 808979721, 808979721, + 808979721, 808979721, 808979729, 808979729, 808979729, 808981510, + 808981510, 808981510, 808981510, 808981672, 808981672, 808981672, + 808981672, 808981672, 808981672, 808981672, 808981672, 808981672, + 808981672, 808981672, 808981672, 808981672, 808981672, 808981672, + 808981672, 808981672, 808981812, 808981812, 808981812, 808981864, + 808981864, 808982005, 808982005, 808982005, 808982018, 808982018, + 808982018, 808982041, 808982041, 808982077, 808982077, 808982547, + 808982547, 808982547, 808997793, 809006037, 809524432, 809526097, + 809526097, 809544061, 809544061, 809544061, 809544061, 809544061, + 809544061, 809544061], dtype=int32 + +to get all data of a specific hit (let's say hit 0) from event 0: + +.. code-block:: python3 + + >>>r[0].hits[0] + offline hit: + id : 0 + dom_id : 806451572 + channel_id : 8 + tdc : 0 + tot : 24 + trig : 1 + pmt_id : 0 + t : 70104010.0 + a : 0.0 + pos_x : 0.0 + pos_y : 0.0 + pos_z : 0.0 + dir_x : 0.0 + dir_y : 0.0 + dir_z : 0.0 + pure_t : 0.0 + pure_a : 0.0 + type : 0 + origin : 0 + pattern_flags : 0 + +to get a specific value from hit 0 in event 0, let's say for example the dom id: + +.. code-block:: python3 + + >>>r[0].hits[0].dom_id + 806451572 + +reading tracks data +""""""""""""""""""" + +to read data in tracks tree with km3io: + +.. code-block:: python3 + + >>> r.tracks + <OfflineTracks: 10 parsed elements> + +this shows that in our offline file, there are 10 parsed elements (events), each event is associated with tracks data. + +to have access to all data in a specific branche from the tracks tree, you can use the tab completion: + +.. image:: /home/zineb/Desktop/tracks.png + +to get ALL the cos(zenith angle) in all tracks tree in our offline file: + +.. code-block:: python3 + + >>> r.tracks.dir_z + <ChunkedArray [[-0.872885221293917 -0.872885221293917 -0.872885221293917 ... -0.6631226836266504 -0.5680647731737454 -0.5680647731737454] [-0.8351996698137462 -0.8351996698137462 -0.8351996698137462 ... -0.7485107718446855 -0.8229838871876581 -0.239315690284641] [-0.989148723802379 -0.989148723802379 -0.989148723802379 ... -0.9350162572437829 -0.88545604390297 -0.88545604390297] ... [-0.5704611045902105 -0.5704611045902105 -0.5704611045902105 ... -0.9350162572437829 -0.4647231989130516 -0.4647231989130516] [-0.9779941383490359 -0.9779941383490359 -0.9779941383490359 ... -0.88545604390297 -0.88545604390297 -0.8229838871876581] [-0.7396916780974963 -0.7396916780974963 -0.7396916780974963 ... -0.6631226836266504 -0.7485107718446855 -0.7485107718446855]] at 0x7f249eed2090> + +to get ALL the tracks likelihood in our offline file: + +.. code-block:: python3 + + >>> r.tracks.lik + <ChunkedArray [[294.6407542676734 294.6407542676734 294.6407542676734 ... 67.81221253265059 67.7756405143316 67.77250505700384] [96.75133289411137 96.75133289411137 96.75133289411137 ... 39.21916536442286 39.184645826013806 38.870325146341884] [560.2775306614813 560.2775306614813 560.2775306614813 ... 118.88577278801066 118.72271313687405 117.80785995187605] ... [71.03251451148226 71.03251451148226 71.03251451148226 ... 16.714140573909347 16.444395245214945 16.34639241716669] [326.440133294878 326.440133294878 326.440133294878 ... 87.79818671079849 87.75488082571873 87.74839444768625] [159.77779654216795 159.77779654216795 159.77779654216795 ... 33.8669134999348 33.821631538334984 33.77240735670646]] at 0x7f249eed2590> + + +if you are interested in a specific event (let's say event 0), you can access the corresponding tracks tree by doing the following: + +.. code-block:: python3 + + >>> r[0].tracks + <OfflineTracks: 56 parsed elements> + +notice that now there are 56 parsed elements (as opposed to 10 elements parsed when r.tracks is called). This means that in event 0 there is data about 56 possible tracks! To get the tracks likelihood from this event: + +.. code-block:: python3 + + >>> r[0].tracks.lik + array([294.64075427, 294.64075427, 294.64075427, 291.64653113, + 291.27392663, 290.69031512, 289.19290546, 289.08449217, + 289.03373947, 288.19030836, 282.92343367, 282.71527118, + 282.10762402, 280.20553861, 275.93183966, 273.01809111, + 257.46433694, 220.94357656, 194.99426403, 190.47809685, + 79.95235686, 78.94389763, 78.90791169, 77.96122466, + 77.9579604 , 76.90769883, 75.97546175, 74.91530508, + 74.9059469 , 72.94007716, 72.90467038, 72.8629316 , + 72.81280833, 72.80229533, 72.78899435, 71.82404165, + 71.80085542, 71.71028058, 70.91130096, 70.89150223, + 70.85845637, 70.79081796, 70.76929743, 69.80667603, + 69.64058976, 68.93085058, 68.84304037, 68.83154232, + 68.79944298, 68.79019375, 68.78581291, 68.72340328, + 67.86628937, 67.81221253, 67.77564051, 67.77250506]) + +to get all data of a specific track (let's say track 0) from event 0: + +.. code-block:: python3 + + >>>r[0].tracks[0] + offline track: + fUniqueID : 0 + fBits : 33554432 + id : 1 + pos_x : 445.835395997812 + pos_y : 615.1089636184813 + pos_z : 125.1448339836911 + dir_x : 0.0368711082700674 + dir_y : -0.48653048395923415 + dir_z : -0.872885221293917 + t : 70311446.46401498 + E : 99.10458562488608 + len : 0.0 + lik : 294.6407542676734 + type : 0 + rec_type : 4000 + rec_stages : [1, 3, 5, 4] + status : 0 + mother_id : -1 + hit_ids : [] + error_matrix : [] + comment : 0 + JGANDALF_BETA0_RAD : 0.004957442219414389 + JGANDALF_BETA1_RAD : 0.003417848024252858 + JGANDALF_CHI2 : -294.6407542676734 + JGANDALF_NUMBER_OF_HITS : 142.0 + JENERGY_ENERGY : 99.10458562488608 + JENERGY_CHI2 : 1.7976931348623157e+308 + JGANDALF_LAMBDA : 4.2409761837248484e-12 + JGANDALF_NUMBER_OF_ITERATIONS : 10.0 + JSTART_NPE_MIP : 24.88469697331908 + JSTART_NPE_MIP_TOTAL : 55.88169412579765 + JSTART_LENGTH_METRES : 98.89582506402911 + JVETO_NPE : 0.0 + JVETO_NUMBER_OF_HITS : 0.0 + JENERGY_MUON_RANGE_METRES : 344.9767431592819 + JENERGY_NOISE_LIKELIHOOD : -333.87773581129136 + JENERGY_NDF : 1471.0 + JENERGY_NUMBER_OF_HITS : 101.0 + +to get a specific value from track 0 in event 0, let's say for example the liklihood: + +.. code-block:: python3 + + >>>r[0].tracks[0].lik + 294.6407542676734 + + +reading mc hits data +"""""""""""""""""""" + +to read mc hits data: + +.. code-block:: python3 + + >>>r.mc_hits + <OfflineHits: 10 parsed elements> + +that's it! All branches in mc hits tree can be accessed in the exact same way described in the section `reading hits data <#reading-hits-data>`__ . All data is easily accesible and if you are stuck, hit tab key to see all the available branches: + +.. image:: /home/zineb/Desktop/mc_hits.png + +reading mc tracks data +"""""""""""""""""""""" + +to read mc tracks data: + +.. code-block:: python3 + + >>>r.mc_tracks + <OfflineTracks: 10 parsed elements> + +that's it! All branches in mc tracks tree can be accessed in the exact same way described in the section `reading tracks data <#reading-tracks-data>`__ . All data is easily accesible and if you are stuck, hit tab key to see all the available branches: + +.. image:: /home/zineb/Desktop/mc_tracks.png