The km3io Python package
========================

.. image:: https://git.km3net.de/km3py/km3io/badges/master/build.svg
    :target: https://git.km3net.de/km3py/km3io/pipelines

.. image:: https://git.km3net.de/km3py/km3io/badges/master/coverage.svg
    :target: https://km3py.pages.km3net.de/km3io/coverage

.. image:: https://api.codacy.com/project/badge/Grade/0660338483874475ba04f324de2123ec
    :target: https://www.codacy.com/manual/tamasgal/km3io?utm_source=github.com&utm_medium=referral&utm_content=KM3NeT/km3io&utm_campaign=Badge_Grade

.. image:: https://examples.pages.km3net.de/km3badges/docs-latest-brightgreen.svg
    :target: https://km3py.pages.km3net.de/km3io

This software provides a set of Python classes to read KM3NeT ROOT files
without having ROOT, Jpp or aanet installed. It only depends on Python 3.5+ and
the amazing uproot package and gives you access to the data via numpy arrays.

It is easy to use and, according to the `uproot <https://github.com/scikit-hep/uproot>`__ benchmarks, it can outperform ROOT I/O.

**Note:** This package is still in the development phase, so the API may change until version ``1.0.0`` is released!

Installation
============

Install km3io using pip::

    pip install km3io 

To get the latest (stable) development release::

    pip install git+https://git.km3net.de/km3py/km3io.git

**Reminder:** km3io is **not** dependent on aanet, ROOT or Jpp! 

Questions
=========

If you have a question about km3io, please proceed as follows:

- Read the documentation below.
- Explore the `examples <https://km3py.pages.km3net.de/km3io/examples.html>`__ in the documentation.
- If you have not found an answer in the documentation, open a git issue with your question, showing an example of what you have tried and what you would like to do.
- If you have noticed a bug, please report it in a git issue. We appreciate your contribution.

Tutorial
========

**Table of contents:**

* `Introduction <#introduction>`__

  * `Overview of daq files <#overview-of-daq-files>`__

  * `Overview of offline files <#overview-of-offline-files>`__

* `Daq files reader <#daq-files-reader>`__

* `Offline files reader <#offline-files-reader>`__

Introduction
------------

Most KM3NeT data is stored in ROOT files. These ROOT files are created with either the `Jpp <https://git.km3net.de/common/jpp>`__ or the `aanet <https://git.km3net.de/common/aanet>`__ software. A ROOT file created with `Jpp <https://git.km3net.de/common/jpp>`__ is often referred to as "a Jpp ROOT file". Similarly, a ROOT file created with `aanet <https://git.km3net.de/common/aanet>`__ is often referred to as "an aanet file". In km3io, an aanet ROOT file will always be referred to as an ``offline file``, while a Jpp ROOT file will always be referred to as a ``daq file``.

km3io is a Python package that provides a set of classes (``DaqReader`` and ``OfflineReader``) to read both daq ROOT files and offline ROOT files without any dependency on aanet, Jpp or ROOT.

Data in km3io is often returned as a "lazyarray", a "jagged lazyarray", a "jagged array" or a NumPy array. A lazyarray is an array-like object that reads data on demand: only the first and the last chunks of data are read into memory. A lazyarray can be used with all of NumPy's `universal functions <https://docs.scipy.org/doc/numpy/reference/ufuncs.html>`__. Here is what a lazyarray looks like:

.. code-block:: python3

    # <ChunkedArray [5971 5971 5971 ... 5971 5971 5971] at 0x7fb2341ad810>

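The on-demand behaviour can be illustrated without km3io. The following is only a minimal sketch in plain NumPy: the ``LazyChunks`` class and the ``fake_loader`` function are hypothetical and are not part of km3io or uproot. A chunk is loaded only when an index inside it is requested.

.. code-block:: python3

    import numpy as np


    class LazyChunks:
        """Hypothetical array-like object that loads fixed-size chunks on demand."""

        def __init__(self, loader, n_items, chunk_size):
            self._loader = loader        # callable: chunk index -> NumPy array
            self._n_items = n_items
            self._chunk_size = chunk_size
            self._cache = {}             # chunks that have been loaded so far

        def __len__(self):
            return self._n_items

        def __getitem__(self, index):
            if not 0 <= index < self._n_items:
                raise IndexError(index)
            chunk_id, offset = divmod(index, self._chunk_size)
            if chunk_id not in self._cache:
                self._cache[chunk_id] = self._loader(chunk_id)
            return self._cache[chunk_id][offset]

        @property
        def loaded_chunks(self):
            return sorted(self._cache)


    def fake_loader(chunk_id):
        # Stand-in for reading one chunk from a file: every entry is run 5971.
        return np.full(4, 5971)


    run_id = LazyChunks(fake_loader, n_items=12, chunk_size=4)
    first, last = run_id[0], run_id[11]

After these two accesses ``run_id.loaded_chunks`` contains only the first and the last chunk; the middle chunk has never been loaded.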

A jagged array is an array with two or more dimensions whose sub-arrays have different lengths. In other words, a jagged array is an array of arrays of different sizes. A jagged lazyarray is therefore simply a jagged array of lazyarrays with different sizes. Here is what a jagged lazyarray looks like:


.. code-block:: python3

    # <JaggedArray [[102 102 102 ... 11517 11518 11518] [] [101 101 102 ... 11518 11518 11518] ... [101 101 102 ... 11516 11516 11517] [] [101 101 101 ... 11517 11517 11518]] at 0x7f74b0ef8810>

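One common way to picture a jagged array is as a flat array of values plus the number of entries per sub-array. The following is a minimal sketch in plain NumPy with made-up values. It is an illustration of the idea only and makes no claim about how km3io or uproot store the data internally.

.. code-block:: python3

    import numpy as np

    # Three "events" with 3, 0 and 2 entries; the values are made up.
    content = np.array([102, 102, 103, 101, 101])
    counts = np.array([3, 0, 2])

    # Offsets mark where each sub-array starts and stops in the flat content.
    offsets = np.concatenate(([0], np.cumsum(counts)))
    jagged = [content[start:stop] for start, stop in zip(offsets[:-1], offsets[1:])]

    lengths = [len(sub) for sub in jagged]

Here ``jagged`` is a list of three arrays of different lengths, and the second one is empty, just like the ``[]`` entries in the output above.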

Overview of daq files
"""""""""""""""""""""
# info needed here

Overview of offline files
"""""""""""""""""""""""""

# info needed here

Daq files reader
----------------

# an update is needed here?

Currently only events (the ``KM3NET_EVENT`` tree) are supported but timeslices and summaryslices will be implemented very soon.

Let's have a look at some ORCA data (``KM3NeT_00000044_00005404.root``).

To get a lazy ragged array of the events:

.. code-block:: python3

  import km3io as ki
  events = ki.JppReader("KM3NeT_00000044_00005404.root").events


That's it! Now let's have a look at the hits data:

.. code-block:: python3

  events
  # Number of events: 17023
  events[23].snapshot_hits.tot
  # array([28, 22, 17, 29,  5, 27, 24, 26, 21, 28, 26, 21, 26, 24, 17, 28, 23,
  #        29, 27, 24, 23, 26, 29, 25, 18, 28, 24, 28, 26, 20, 25, 31, 28, 23,
  #        26, 21, 30, 33, 27, 16, 23, 24, 19, 24, 27, 22, 23, 21, 25, 16, 28,
  #        22, 22, 29, 24, 29, 24, 24, 25, 25, 21, 31, 26, 28, 30, 42, 28],
  #       dtype=uint8)

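Since the ToT values come back as a NumPy array, ordinary NumPy operations apply to them. The following is a minimal sketch that selects hits above a ToT threshold. The array holds only the first eight values of the output above, copied by hand so that the example runs without km3io or a data file, and the threshold of 20 is an arbitrary value chosen for illustration.

.. code-block:: python3

    import numpy as np

    # First eight ToT values of event 23, copied from the output above.
    tot = np.array([28, 22, 17, 29, 5, 27, 24, 26], dtype=np.uint8)

    threshold = 20  # arbitrary cut, for illustration only
    selected = tot[tot > threshold]  # boolean mask keeps hits above the cut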

Offline files reader
--------------------

Let's have a look at some muon data from an ORCA 4-line simulation, run ID 5971 (``datav6.0test.jchain.aanet.00005971.root``).

To get a lazy ragged array of all data::

    >>> import km3io as ki
    >>> reader = ki.AanetReader('datav6.0test.jchain.aanet.00005971.root')

That's it! Now let's take a look at all the available branches in our file::

    >>> reader
    Number of events: 145028
    Events keys are:
      id
      det_id
      mc_id
      run_id
      mc_run_id
      frame_index
      trigger_mask
      trigger_counter
      overlays
      hits
      trks
      w
      w2list
      w3list
      mc_t
      mc_hits
      mc_trks
      comment
      index
      flags
      t.fSec
      t.fNanoSec
    Hits keys are:
      hits.id
      hits.dom_id
      hits.channel_id
      hits.tdc
      hits.tot
      hits.trig
      hits.pmt_id
      hits.t
      hits.a
      hits.pos.x
      hits.pos.y
      hits.pos.z
      hits.dir.x
      hits.dir.y
      hits.dir.z
      hits.pure_t
      hits.pure_a
      hits.type
      hits.origin
      hits.pattern_flags
    Tracks keys are:
      trks.fUniqueID
      trks.fBits
      trks.usr_data
      trks.usr_names
      trks.id
      trks.pos.x
      trks.pos.y
      trks.pos.z
      trks.dir.x
      trks.dir.y
      trks.dir.z
      trks.t
      trks.E
      trks.len
      trks.lik
      trks.type
      trks.rec_type
      trks.rec_stages
      trks.status
      trks.mother_id
      trks.fitinf
      trks.hit_ids
      trks.error_matrix
      trks.comment

Now that you have seen all the available branches, you can choose any key from
the list above (a key is a branch name) and display the corresponding data. For
example, we can check that we are indeed reading data from run 5971::

    >>> reader['run_id']
    <ChunkedArray [5971 5971 5971 ... 5971 5971 5971] at 0x7fb2341ad810>

Let's look at the number of hits and tracks in event number 5::

    >>> reader[5]['hits']
    60
    >>> reader[5]['trks']
    56

So event 5 has exactly 60 hits and 56 tracks. Let's explore the hit and track
data of event 5 in more detail::

    >>> reader['hits.dom_id'][5]
    array([806455814, 806487219, 806487219, 806487219, 806487226, 808432835,
       808432835, 808432835, 808432835, 808432835, 808432835, 808432835,
       808451904, 808451904, 808451907, 808451907, 808469129, 808469129,
       808469129, 808493910, 808949744, 808949744, 808951460, 808951460,
       808956908, 808961655, 808964908, 808969848, 808969857, 808972593,
       808972593, 808972598, 808972598, 808972698, 808972698, 808974758,
       808974811, 808976377, 808981510, 808981523, 808981812, 808982005,
       808982005, 808982018, 808982077, 808982077, 808982547, 809007627,
       809521500, 809521500, 809521500, 809524432, 809526097, 809526097,
       809526097, 809526097, 809526097, 809526097, 809526097, 809544058],
      dtype=int32)

One can access the dom_id for the first hit in event 5 as follows:: 

    >>> reader['hits.dom_id'][5][0]
    806455814
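
Because the DOM IDs of an event come back as a NumPy array, they can be summarised with ordinary NumPy calls. The following is a minimal sketch that counts the hits per DOM. The array holds only the first twelve DOM IDs of event 5, copied by hand from the output above so that the example runs without km3io or a data file.

.. code-block:: python3

    import numpy as np

    # First twelve DOM IDs of event 5, copied from the output above.
    dom_ids = np.array([806455814, 806487219, 806487219, 806487219,
                        806487226, 808432835, 808432835, 808432835,
                        808432835, 808432835, 808432835, 808432835],
                       dtype=np.int32)

    # Sorted unique DOM IDs and how often each one occurs.
    unique_doms, hits_per_dom = np.unique(dom_ids, return_counts=True)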

Now let's read tracks data in event 5::

    >>> reader['trks.dir.z'][5]
    array([-0.60246049, -0.60246049, -0.60246049, -0.51420541, -0.5475772 ,
       -0.5772408 , -0.56068238, -0.64907684, -0.67781799, -0.66565114,
       -0.63014839, -0.64566464, -0.62691012, -0.58465493, -0.59287533,
       -0.63655091, -0.63771247, -0.73446841, -0.7456636 , -0.70941246,
       -0.66312268, -0.66312268, -0.56806477, -0.56806477, -0.66312268,
       -0.66312268, -0.74851077, -0.74851077, -0.66312268, -0.74851077,
       -0.56806477, -0.74851077, -0.66312268, -0.74851077, -0.56806477,
       -0.66312268, -0.56806477, -0.66312268, -0.56806477, -0.56806477,
       -0.66312268, -0.74851077, -0.66312268, -0.93501626, -0.56806477,
       -0.74851077, -0.66312268, -0.56806477, -0.82298389, -0.74851077,
       -0.66312268, -0.56806477, -0.82298389, -0.56806477, -0.66312268,
       -0.97094183])

One can access the 'trks.dir.z' for the first track in event 5 as follows::

    >>> reader['trks.dir.z'][5][0]
    -0.60246049