Skip to content
Snippets Groups Projects
Commit 96b39544 authored by Zineb Aly's avatar Zineb Aly
Browse files

add detailed doc

parent e5322030
No related branches found
No related tags found
No related merge requests found
......@@ -14,8 +14,7 @@ The km3io Python package
:target: https://km3py.pages.km3net.de/km3io
This software provides a set of Python classes to read KM3NeT ROOT files
without having ROOT, Jpp or aanet installed. It only depends on Python 3.5+ and
the amazing uproot package and gives you access to the data via numpy arrays.
without having ROOT, Jpp or aanet installed. It only depends on Python 3.5+ and the amazing `uproot <https://github.com/scikit-hep/uproot>`__ package and gives you access to the data via numpy arrays.
It's very easy to use and according to the `uproot <https://github.com/scikit-hep/uproot>`__ benchmarks, it is able to outperform the ROOT I/O performance.
......@@ -59,6 +58,17 @@ Tutorial
* `Offline files reader <#offline-file-reader>`__
* `reading events data <#reading-events-data>`__
* `reading hits data <#reading-hits-data>`__
* `reading tracks data <#reading-tracks-data>`__
* `reading mc hits data <#reading-mc-hits-data>`__
* `reading mc tracks data <#reading-mc-tracks-data>`__
Introduction
------------
......@@ -67,7 +77,7 @@ Most of km3net data is stored in root files. These root files are either created
km3io is a Python package that provides a set of classes (``DaqReader`` and ``OfflineReader``) to read both daq root files and offline root files without any dependency to aanet, Jpp or ROOT.
Data in km3io is often returned as a "lazyarray", a "jagged lazyarray", "a jagged array" or a Numpy array. A lazyarray is an array-like object that reads data on demand! In a lazyarray, only the first and the last chunks of data are read in memory. A lazyarray can be used with all Numpy's universal `functions <https://docs.scipy.org/doc/numpy/referenceufuncs.html>`__. Here is how a lazyarray looks like:
Data in km3io is often returned as a "lazyarray", a "jagged lazyarray" or a `Numpy <https://docs.scipy.org/doc/numpy>`__ array. A lazyarray is an array-like object that reads data on demand! In a lazyarray, only the first and the last chunks of data are read in memory. A lazyarray can be used with all Numpy's universal `functions <https://docs.scipy.org/doc/numpy/reference/ufuncs.html>`__. Here is how a lazyarray looks like:
.. code-block:: python3
......@@ -112,10 +122,10 @@ That's it! Now let's have a look at the hits data:
.. code-block:: python3
events
# Number of events: 17023
events[23].snapshot_hits.tot
# array([28, 22, 17, 29, 5, 27, 24, 26, 21, 28, 26, 21, 26, 24, 17, 28, 23,29, 27, 24, 23, 26, 29, 25, 18, 28, 24, 28, 26, 20, 25, 31, 28, 23, 26, 21, 30, 33, 27, 16, 23, 24, 19, 24, 27, 22, 23, 21, 25, 16, 28, 22, 22, 29, 24, 29, 24, 24, 25, 25, 21, 31, 26, 28, 30, 42, 28], dtype=uint8)
>>> events
Number of events: 17023
>>> events[23].snapshot_hits.tot
array([28, 22, 17, 29, 5, 27, 24, 26, 21, 28, 26, 21, 26, 24, 17, 28, 23,29, 27, 24, 23, 26, 29, 25, 18, 28, 24, 28, 26, 20, 25, 31, 28, 23, 26, 21, 30, 33, 27, 16, 23, 24, 19, 24, 27, 22, 23, 21, 25, 16, 28, 22, 22, 29, 24, 29, 24, 24, 25, 25, 21, 31, 26, 28, 30, 42, 28], dtype=uint8)
Offline files reader
......@@ -123,137 +133,496 @@ Offline files reader
Let's have a look at some muons data from ORCA 4 lines simulations - run id 5971 (``datav6.0test.jchain.aanet.00005971.root``).
To get a lazy ragged array of all data::
>>> import km3io as ki
>>> reader = ki.AanetReader('datav6.0test.jchain.aanet.00005971.root')
That's it! Now let's take a look at all the available branches in our file::
>>> reader
Number of events: 145028
Events keys are:
id
det_id
mc_id
run_id
mc_run_id
frame_index
trigger_mask
trigger_counter
overlays
hits
trks
w
w2list
w3list
mc_t
mc_hits
mc_trks
comment
index
flags
t.fSec
t.fNanoSec
Hits keys are:
hits.id
hits.dom_id
hits.channel_id
hits.tdc
hits.tot
hits.trig
hits.pmt_id
hits.t
hits.a
hits.pos.x
hits.pos.y
hits.pos.z
hits.dir.x
hits.dir.y
hits.dir.z
hits.pure_t
hits.pure_a
hits.type
hits.origin
hits.pattern_flags
Tracks keys are:
trks.fUniqueID
trks.fBits
trks.usr_data
trks.usr_names
trks.id
trks.pos.x
trks.pos.y
trks.pos.z
trks.dir.x
trks.dir.y
trks.dir.z
trks.t
trks.E
trks.len
trks.lik
trks.type
trks.rec_type
trks.rec_stages
trks.status
trks.mother_id
trks.fitinf
trks.hit_ids
trks.error_matrix
trks.comment
Now that you have seen all the available branches, you can choose any key from
the above (key refers to a branch name) and display the corresponding data. For
example, we will check that we are indeed reading data from the run 5971::
>>> reader['run_id']
<ChunkedArray [5971 5971 5971 ... 5971 5971 5971] at 0x7fb2341ad810>
Let's look at the number of hits and tracks in the event number 5::
>>> reader[5]['hits']
**Note:** this file was cropped to 10 events only, so don't be surprised in this tutorial if you see few events in the file.
First, let's read our file:
.. code-block:: python3
>>> import km3io as ki
>>> file = 'datav6.0test.jchain.aanet.00005971.root'
>>> r = ki.OfflineReader(file)
<km3io.aanet.OfflineReader at 0x7f24cc2bd550>
and that's it! Note that `file` can be either an str of your file path, or a path-like object.
To explore all the available branches in our offline file:
.. code-block:: python3
>>> r.keys
Events keys are:
id
det_id
mc_id
run_id
mc_run_id
frame_index
trigger_mask
trigger_counter
overlays
hits
trks
w
w2list
w3list
mc_t
mc_hits
mc_trks
comment
index
flags
t.fSec
t.fNanoSec
Hits keys are:
hits.id
hits.dom_id
hits.channel_id
hits.tdc
hits.tot
hits.trig
hits.pmt_id
hits.t
hits.a
hits.pos.x
hits.pos.y
hits.pos.z
hits.dir.x
hits.dir.y
hits.dir.z
hits.pure_t
hits.pure_a
hits.type
hits.origin
hits.pattern_flags
Tracks keys are:
trks.fUniqueID
trks.fBits
trks.id
trks.pos.x
trks.pos.y
trks.pos.z
trks.dir.x
trks.dir.y
trks.dir.z
trks.t
trks.E
trks.len
trks.lik
trks.type
trks.rec_type
trks.rec_stages
trks.status
trks.mother_id
trks.fitinf
trks.hit_ids
trks.error_matrix
trks.comment
Mc hits keys are:
mc_hits.id
mc_hits.dom_id
mc_hits.channel_id
mc_hits.tdc
mc_hits.tot
mc_hits.trig
mc_hits.pmt_id
mc_hits.t
mc_hits.a
mc_hits.pos.x
mc_hits.pos.y
mc_hits.pos.z
mc_hits.dir.x
mc_hits.dir.y
mc_hits.dir.z
mc_hits.pure_t
mc_hits.pure_a
mc_hits.type
mc_hits.origin
mc_hits.pattern_flags
Mc tracks keys are:
mc_trks.fUniqueID
mc_trks.fBits
mc_trks.id
mc_trks.pos.x
mc_trks.pos.y
mc_trks.pos.z
mc_trks.dir.x
mc_trks.dir.y
mc_trks.dir.z
mc_trks.t
mc_trks.E
mc_trks.len
mc_trks.lik
mc_trks.type
mc_trks.rec_type
mc_trks.rec_stages
mc_trks.status
mc_trks.mother_id
mc_trks.fitinf
mc_trks.hit_ids
mc_trks.error_matrix
mc_trks.comment
In an offline file, there are 5 main trees with data:
* events tree
* hits tree
* tracks tree
* mc hits tree
* mc tracks tree
with km3io, these trees can be accessed with a simple tab completion:
.. image:: https://git.km3net.de/km3py/km3io/reader.png
In the following, we will explore each tree using km3io package.
reading events data
"""""""""""""""""""
to read data in events tree with km3io:
.. code-block:: python3
>>> r.events
<OfflineEvents: 10 parsed events>
to get the total number of events in the events tree:
.. code-block:: python3
>>> len(r.events)
10
the branches stored in the events tree in an offline file can be easily accessed with a tab completion as seen below:
.. image:: /home/zineb/Desktop/events.png
to get data from the events tree, chose any branch of interest with the tab completion, the following is a non exaustive set of examples.
to get event ids:
.. code-block:: python3
>>> r.events.id
<ChunkedArray [1 2 3 ... 8 9 10] at 0x7f249eeb6f10>
to get detector ids:
.. code-block:: python3
>>> r.events.det_id
<ChunkedArray [44 44 44 ... 44 44 44] at 0x7f249eeba050>
to get frame_index:
.. code-block:: python3
>>> r.events.frame_index
<ChunkedArray [182 183 202 ... 185 185 204] at 0x7f249eeba410>
to get snapshot hits:
.. code-block:: python3
>>> r.events.hits
<ChunkedArray [176 125 318 ... 84 255 105] at 0x7f249eebaa10>
to illustrate the strength of this data structure, we will play around with `r.events.hits` using Numpy universal `functions <https://docs.scipy.org/doc/numpy/reference/ufuncs.html>`__.
.. code-block:: python3
>>> import numpy as np
>>> np.log(r.events.hits)
<ChunkedArray [5.170483995038151 4.8283137373023015 5.762051382780177 ... 4.430816798843313 5.541263545158426 4.653960350157523] at 0x7f249b8ebb90>
to get all data from one specific event (for example event 0):
.. code-block:: python3
>>> r.events[0]
offline event:
id : 1
det_id : 44
mc_id : 0
run_id : 5971
mc_run_id : 0
frame_index : 182
trigger_mask : 22
trigger_counter : 0
overlays : 60
hits : 176
trks : 56
w : []
w2list : []
w3list : []
mc_t : 0.0
mc_hits : 0
mc_trks : 0
comment : b''
index : 0
flags : 0
t_fSec : 1567036818
t_fNanoSec : 200000000
to get a specific value from event 0, for example the number of overlays:
.. code-block:: python3
>>> r.events[0].overlays
60
>>> reader[5]['trks']
56
So event 5 has exactly 60 hits and 56 tracks. Let's explore in more details
hits and tracks data in event 5::
>>> reader['hits.dom_id'][5]
array([806455814, 806487219, 806487219, 806487219, 806487226, 808432835,
808432835, 808432835, 808432835, 808432835, 808432835, 808432835,
808451904, 808451904, 808451907, 808451907, 808469129, 808469129,
808469129, 808493910, 808949744, 808949744, 808951460, 808951460,
808956908, 808961655, 808964908, 808969848, 808969857, 808972593,
808972593, 808972598, 808972598, 808972698, 808972698, 808974758,
808974811, 808976377, 808981510, 808981523, 808981812, 808982005,
808982005, 808982018, 808982077, 808982077, 808982547, 809007627,
809521500, 809521500, 809521500, 809524432, 809526097, 809526097,
809526097, 809526097, 809526097, 809526097, 809526097, 809544058],
dtype=int32)
One can access the dom_id for the first hit in event 5 as follows::
>>> reader['hits.dom_id'][5][0]
806455814
Now let's read tracks data in event 5::
>>> reader['trks.dir.z'][5]
array([-0.60246049, -0.60246049, -0.60246049, -0.51420541, -0.5475772 ,
-0.5772408 , -0.56068238, -0.64907684, -0.67781799, -0.66565114,
-0.63014839, -0.64566464, -0.62691012, -0.58465493, -0.59287533,
-0.63655091, -0.63771247, -0.73446841, -0.7456636 , -0.70941246,
-0.66312268, -0.66312268, -0.56806477, -0.56806477, -0.66312268,
-0.66312268, -0.74851077, -0.74851077, -0.66312268, -0.74851077,
-0.56806477, -0.74851077, -0.66312268, -0.74851077, -0.56806477,
-0.66312268, -0.56806477, -0.66312268, -0.56806477, -0.56806477,
-0.66312268, -0.74851077, -0.66312268, -0.93501626, -0.56806477,
-0.74851077, -0.66312268, -0.56806477, -0.82298389, -0.74851077,
-0.66312268, -0.56806477, -0.82298389, -0.56806477, -0.66312268,
-0.97094183])
One can access the 'trks.dir.z' for the first track in event 5 as follows::
>>> reader['trks.dir.z'][5][0]
-0.60246049
or the number of hits:
.. code-block:: python3
>>> r.events[0].hits
176
reading hits data
"""""""""""""""""
to read data in hits tree with km3io:
.. code-block:: python3
>>> r.hits
<OfflineHits: 10 parsed elements>
this shows that in our offline file, there are 10 hits trees (because we have 10 events and each event has a hits tree!).
to have access to all data in a specific branche from the hits tree, you can use the tab completion:
.. image:: /home/zineb/Desktop/hits.png
to get ALL the dom ids in all hits trees in our offline file:
.. code-block:: python3
>>> r.hits.dom_id
<ChunkedArray [[806451572 806451572 806451572 ... 809544061 809544061 809544061] [806451572 806451572 806451572 ... 809524432 809526097 809544061] [806451572 806451572 806451572 ... 809544061 809544061 809544061] ... [806451572 806455814 806465101 ... 809526097 809544058 809544061] [806455814 806455814 806455814 ... 809544061 809544061 809544061] [806455814 806455814 806455814 ... 809544058 809544058 809544061]] at 0x7f249eebac50>
to get ALL the time over threshold (tot) in all hits trees in our offline file:
.. code-block:: python3
>>> r.hits.tot
<ChunkedArray [[24 30 22 ... 38 26 23] [29 26 22 ... 26 28 24] [27 19 13 ... 27 24 16] ... [22 22 9 ... 27 32 27] [30 32 17 ... 30 24 29] [27 41 36 ... 29 24 28]] at 0x7f249eec9050>
if you are interested in a specific event (let's say event 0), you can access the corresponding hits tree by doing the following:
.. code-block:: python3
>>> r[0].hits
<OfflineHits: 176 parsed elements>
notice that now there are 176 parsed elements (as opposed to 10 elements parsed when r.hits is called). This means that in event 0 there are 176 hits! To get the dom ids from this event:
.. code-block:: python3
>>> r[0].hits.dom_id
array([806451572, 806451572, 806451572, 806451572, 806455814, 806455814,
806455814, 806483369, 806483369, 806483369, 806483369, 806483369,
806483369, 806483369, 806483369, 806483369, 806483369, 806487219,
806487226, 806487231, 806487231, 808432835, 808435278, 808435278,
808435278, 808435278, 808435278, 808447180, 808447180, 808447180,
808447180, 808447180, 808447180, 808447180, 808447180, 808447186,
808451904, 808451904, 808472265, 808472265, 808472265, 808472265,
808472265, 808472265, 808472265, 808472265, 808488895, 808488990,
808488990, 808488990, 808488990, 808488990, 808489014, 808489014,
808489117, 808489117, 808489117, 808489117, 808493910, 808946818,
808949744, 808951460, 808951460, 808951460, 808951460, 808951460,
808956908, 808956908, 808959411, 808959411, 808959411, 808961448,
808961448, 808961504, 808961504, 808961655, 808961655, 808961655,
808964815, 808964815, 808964852, 808964908, 808969857, 808969857,
808969857, 808969857, 808969857, 808972593, 808972698, 808972698,
808972698, 808974758, 808974758, 808974758, 808974758, 808974758,
808974758, 808974758, 808974758, 808974758, 808974758, 808974758,
808974773, 808974773, 808974773, 808974773, 808974773, 808974972,
808974972, 808976377, 808976377, 808976377, 808979567, 808979567,
808979567, 808979721, 808979721, 808979721, 808979721, 808979721,
808979721, 808979721, 808979729, 808979729, 808979729, 808981510,
808981510, 808981510, 808981510, 808981672, 808981672, 808981672,
808981672, 808981672, 808981672, 808981672, 808981672, 808981672,
808981672, 808981672, 808981672, 808981672, 808981672, 808981672,
808981672, 808981672, 808981812, 808981812, 808981812, 808981864,
808981864, 808982005, 808982005, 808982005, 808982018, 808982018,
808982018, 808982041, 808982041, 808982077, 808982077, 808982547,
808982547, 808982547, 808997793, 809006037, 809524432, 809526097,
809526097, 809544061, 809544061, 809544061, 809544061, 809544061,
809544061, 809544061], dtype=int32
to get all data of a specific hit (let's say hit 0) from event 0:
.. code-block:: python3
>>>r[0].hits[0]
offline hit:
id : 0
dom_id : 806451572
channel_id : 8
tdc : 0
tot : 24
trig : 1
pmt_id : 0
t : 70104010.0
a : 0.0
pos_x : 0.0
pos_y : 0.0
pos_z : 0.0
dir_x : 0.0
dir_y : 0.0
dir_z : 0.0
pure_t : 0.0
pure_a : 0.0
type : 0
origin : 0
pattern_flags : 0
to get a specific value from hit 0 in event 0, let's say for example the dom id:
.. code-block:: python3
>>>r[0].hits[0].dom_id
806451572
reading tracks data
"""""""""""""""""""
to read data in tracks tree with km3io:
.. code-block:: python3
>>> r.tracks
<OfflineTracks: 10 parsed elements>
this shows that in our offline file, there are 10 parsed elements (events), each event is associated with tracks data.
to have access to all data in a specific branche from the tracks tree, you can use the tab completion:
.. image:: /home/zineb/Desktop/tracks.png
to get ALL the cos(zenith angle) in all tracks tree in our offline file:
.. code-block:: python3
>>> r.tracks.dir_z
<ChunkedArray [[-0.872885221293917 -0.872885221293917 -0.872885221293917 ... -0.6631226836266504 -0.5680647731737454 -0.5680647731737454] [-0.8351996698137462 -0.8351996698137462 -0.8351996698137462 ... -0.7485107718446855 -0.8229838871876581 -0.239315690284641] [-0.989148723802379 -0.989148723802379 -0.989148723802379 ... -0.9350162572437829 -0.88545604390297 -0.88545604390297] ... [-0.5704611045902105 -0.5704611045902105 -0.5704611045902105 ... -0.9350162572437829 -0.4647231989130516 -0.4647231989130516] [-0.9779941383490359 -0.9779941383490359 -0.9779941383490359 ... -0.88545604390297 -0.88545604390297 -0.8229838871876581] [-0.7396916780974963 -0.7396916780974963 -0.7396916780974963 ... -0.6631226836266504 -0.7485107718446855 -0.7485107718446855]] at 0x7f249eed2090>
to get ALL the tracks likelihood in our offline file:
.. code-block:: python3
>>> r.tracks.lik
<ChunkedArray [[294.6407542676734 294.6407542676734 294.6407542676734 ... 67.81221253265059 67.7756405143316 67.77250505700384] [96.75133289411137 96.75133289411137 96.75133289411137 ... 39.21916536442286 39.184645826013806 38.870325146341884] [560.2775306614813 560.2775306614813 560.2775306614813 ... 118.88577278801066 118.72271313687405 117.80785995187605] ... [71.03251451148226 71.03251451148226 71.03251451148226 ... 16.714140573909347 16.444395245214945 16.34639241716669] [326.440133294878 326.440133294878 326.440133294878 ... 87.79818671079849 87.75488082571873 87.74839444768625] [159.77779654216795 159.77779654216795 159.77779654216795 ... 33.8669134999348 33.821631538334984 33.77240735670646]] at 0x7f249eed2590>
if you are interested in a specific event (let's say event 0), you can access the corresponding tracks tree by doing the following:
.. code-block:: python3
>>> r[0].tracks
<OfflineTracks: 56 parsed elements>
notice that now there are 56 parsed elements (as opposed to 10 elements parsed when r.tracks is called). This means that in event 0 there is data about 56 possible tracks! To get the tracks likelihood from this event:
.. code-block:: python3
>>> r[0].tracks.lik
array([294.64075427, 294.64075427, 294.64075427, 291.64653113,
291.27392663, 290.69031512, 289.19290546, 289.08449217,
289.03373947, 288.19030836, 282.92343367, 282.71527118,
282.10762402, 280.20553861, 275.93183966, 273.01809111,
257.46433694, 220.94357656, 194.99426403, 190.47809685,
79.95235686, 78.94389763, 78.90791169, 77.96122466,
77.9579604 , 76.90769883, 75.97546175, 74.91530508,
74.9059469 , 72.94007716, 72.90467038, 72.8629316 ,
72.81280833, 72.80229533, 72.78899435, 71.82404165,
71.80085542, 71.71028058, 70.91130096, 70.89150223,
70.85845637, 70.79081796, 70.76929743, 69.80667603,
69.64058976, 68.93085058, 68.84304037, 68.83154232,
68.79944298, 68.79019375, 68.78581291, 68.72340328,
67.86628937, 67.81221253, 67.77564051, 67.77250506])
to get all data of a specific track (let's say track 0) from event 0:
.. code-block:: python3
>>>r[0].tracks[0]
offline track:
fUniqueID : 0
fBits : 33554432
id : 1
pos_x : 445.835395997812
pos_y : 615.1089636184813
pos_z : 125.1448339836911
dir_x : 0.0368711082700674
dir_y : -0.48653048395923415
dir_z : -0.872885221293917
t : 70311446.46401498
E : 99.10458562488608
len : 0.0
lik : 294.6407542676734
type : 0
rec_type : 4000
rec_stages : [1, 3, 5, 4]
status : 0
mother_id : -1
hit_ids : []
error_matrix : []
comment : 0
JGANDALF_BETA0_RAD : 0.004957442219414389
JGANDALF_BETA1_RAD : 0.003417848024252858
JGANDALF_CHI2 : -294.6407542676734
JGANDALF_NUMBER_OF_HITS : 142.0
JENERGY_ENERGY : 99.10458562488608
JENERGY_CHI2 : 1.7976931348623157e+308
JGANDALF_LAMBDA : 4.2409761837248484e-12
JGANDALF_NUMBER_OF_ITERATIONS : 10.0
JSTART_NPE_MIP : 24.88469697331908
JSTART_NPE_MIP_TOTAL : 55.88169412579765
JSTART_LENGTH_METRES : 98.89582506402911
JVETO_NPE : 0.0
JVETO_NUMBER_OF_HITS : 0.0
JENERGY_MUON_RANGE_METRES : 344.9767431592819
JENERGY_NOISE_LIKELIHOOD : -333.87773581129136
JENERGY_NDF : 1471.0
JENERGY_NUMBER_OF_HITS : 101.0
to get a specific value from track 0 in event 0, let's say for example the liklihood:
.. code-block:: python3
>>>r[0].tracks[0].lik
294.6407542676734
reading mc hits data
""""""""""""""""""""
to read mc hits data:
.. code-block:: python3
>>>r.mc_hits
<OfflineHits: 10 parsed elements>
that's it! All branches in mc hits tree can be accessed in the exact same way described in the section `reading hits data <#reading-hits-data>`__ . All data is easily accesible and if you are stuck, hit tab key to see all the available branches:
.. image:: /home/zineb/Desktop/mc_hits.png
reading mc tracks data
""""""""""""""""""""""
to read mc tracks data:
.. code-block:: python3
>>>r.mc_tracks
<OfflineTracks: 10 parsed elements>
that's it! All branches in mc tracks tree can be accessed in the exact same way described in the section `reading tracks data <#reading-tracks-data>`__ . All data is easily accesible and if you are stuck, hit tab key to see all the available branches:
.. image:: /home/zineb/Desktop/mc_tracks.png
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment