Title: Quality management
Author: Rodri
Topics:
  - quality checks on event data
  - quality indicators
status: dump

The quality of the scientific results produced by KM3NeT will be influenced by the performance of the different processes involved in the data generation and processing chain. In the following, the implementation of quality control procedures at the level of detector components, data acquisition, calibration and simulations is described. This description focuses on the optical data. The quality control procedures for slow-control data and for acoustic data are not yet in place, but their implementation is imminent and will follow the same approach as for the optical data.

During the data acquisition process, the online monitoring software presents real-time plots that allow the shifters to promptly identify problems with the data acquisition. It includes an alert system that sends notifications to the shifters if problems requiring human intervention appear during data taking. The online monitor uses the same data that are stored for offline analyses (note: this is currently not the case and should be revised), which implies that any anomaly observed during detector operation can be reproduced offline.
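The logic of such an alert check can be sketched as follows. This is only an illustration: the parameter names, the allowed ranges and the notify_shifters() helper are hypothetical stand-ins for the actual online monitoring software.

```python
# Minimal sketch of a real-time quality check with shifter alerts.
# Parameter names, thresholds and notify_shifters() are hypothetical.

ALERT_LIMITS = {
    "trigger_rate_hz": (10.0, 1000.0),   # acceptable (min, max) range
    "active_pmt_fraction": (0.9, 1.0),
}

def notify_shifters(message: str) -> None:
    """Placeholder for the alert system (e-mail, chat message, ...)."""
    print(f"[ALERT] {message}")

def check_monitored_values(values: dict[str, float]) -> None:
    """Compare the latest monitored values against the allowed ranges."""
    for name, (low, high) in ALERT_LIMITS.items():
        value = values.get(name)
        if value is None:
            notify_shifters(f"{name} is not being reported")
        elif not (low <= value <= high):
            notify_shifters(f"{name} = {value} outside [{low}, {high}]")

# Example: one snapshot of monitored values as produced during data taking.
check_monitored_values({"trigger_rate_hz": 5.2, "active_pmt_fraction": 0.95})
```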

As explained in XX.YY, the optical data obtained from the detector operation are stored in .root files and moved to a high-performance storage environment. The offline data quality control procedures start with a first analysis of these files, which is performed daily. It focuses mainly, but not exclusively, on the summary data stored in the .root files; the summary data contain information related to the performance of the data acquisition procedures for each optical module in the detector. This first analysis produces a set of key-value pairs, where each key corresponds to a parameter representing a given dimension of data quality and the value is the evaluation of this parameter over the livetime of the analysed data. The results are tagged with a unique identifier corresponding to the analysed data set and uploaded to the database. In the present implementation the analysis is performed for each available file, where each file corresponds to a data-taking run, although this may change in the future as the data volume generated per run increases with the detector size.
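The structure of the resulting key-value pairs can be illustrated with a short sketch. The parameter names, the run identifier and the serialisation step below are hypothetical; the real analysis reads the summary data from the .root files of each run and uploads the result to the central database.

```python
# Sketch of the per-run quality summary: a flat set of key-value pairs
# tagged with a unique run identifier. Parameter names are illustrative.

import json

def summarise_run(run_id: int, summary_data: dict) -> dict:
    """Reduce the summary data of one run to a set of quality parameters."""
    return {
        "run_id": run_id,                                   # unique identifier of the data set
        "livetime_s": summary_data["livetime_s"],
        "mean_hit_rate_khz": summary_data["mean_hit_rate_khz"],
        "active_dom_fraction": summary_data["active_dom_fraction"],
    }

record = summarise_run(4521, {"livetime_s": 3580.0,
                              "mean_hit_rate_khz": 6.1,
                              "active_dom_fraction": 0.97})

# In the real workflow this record would be uploaded to the database;
# here it is simply serialised to show the key-value structure.
print(json.dumps(record, indent=2))
```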

A further analysis of the results stored in the database compares the values of the different parameters with reference values, allowing data periods to be classified according to their quality. The reference values are typically set according to the accuracy with which the current detector simulations reproduce the different quality parameters. In addition, the evolution of the different quality parameters can be monitored and made available to the full collaboration as reports. Currently this is done every week by the shifters, and the reports are posted on an electronic logbook (ELOG).
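As an illustration, such a classification against reference values could look like the following sketch. The parameter names, reference values and tolerances are invented for the example; in practice the tolerances reflect how accurately the detector simulation reproduces each parameter.

```python
# Sketch of the classification step: each quality parameter stored in the
# database is compared with a reference value and an allowed tolerance.
# References and tolerances below are placeholders.

REFERENCES = {
    # parameter:           (reference value, allowed relative deviation)
    "mean_hit_rate_khz":   (6.0, 0.20),
    "active_dom_fraction": (1.0, 0.10),
}

def classify_run(record: dict) -> str:
    """Label a run as 'good' or 'bad' depending on its quality parameters."""
    for name, (reference, tolerance) in REFERENCES.items():
        value = record.get(name)
        if value is None or abs(value - reference) > tolerance * reference:
            return "bad"
    return "good"

print(classify_run({"mean_hit_rate_khz": 6.1, "active_dom_fraction": 0.97}))  # good
print(classify_run({"mean_hit_rate_khz": 9.5, "active_dom_fraction": 0.97}))  # bad
```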

The first step in the data processing chain is to determine the detector calibration parameters from the data obtained during detector operation. These parameters include the time offsets of the PMTs, their gains and efficiencies, and the positions and orientations of the optical modules. The PMT time offsets and the positions of the optical modules are used in later stages of the data processing chain for event reconstruction, as well as by the real-time data filter during detector operation. While the event reconstruction requires an accurate knowledge of these parameters, the algorithms used by the real-time data filter depend only loosely on them, and its performance is not affected by variations occurring on a timescale of the order of months. Nevertheless, it is still necessary to monitor them and, if necessary, correct the values used by the data filter. The performance of the detector operation also depends on the response of the PMTs, which is partly determined by their gains. The gains evolve over time and can be brought back to their nominal values by tuning the high voltage applied to each PMT. Monitoring the PMT gains is therefore also necessary to maximise the detector performance. In addition, the PMT gains and efficiencies are used offline by the detector simulation.
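The gain-monitoring step can be sketched as follows; the nominal gain, the tolerance and the PMT identifiers are hypothetical and only illustrate the idea of flagging PMTs whose gain has drifted far enough to warrant a high-voltage retuning.

```python
# Sketch of a PMT gain check: gains measured from calibration data are
# compared with the nominal value, and PMTs whose gain has drifted beyond
# a tolerance are flagged for high-voltage retuning. Values are placeholders.

NOMINAL_GAIN = 1.0          # gains normalised to the nominal working point
MAX_RELATIVE_DRIFT = 0.10   # flag PMTs drifting by more than 10%

def pmts_needing_hv_tuning(measured_gains: dict[str, float]) -> list[str]:
    """Return the identifiers of PMTs whose gain drifted beyond the tolerance."""
    return [
        pmt_id
        for pmt_id, gain in measured_gains.items()
        if abs(gain - NOMINAL_GAIN) > MAX_RELATIVE_DRIFT * NOMINAL_GAIN
    ]

# Example with hypothetical PMT identifiers and measured gains.
print(pmts_needing_hv_tuning({"DOM1-PMT0": 0.98, "DOM1-PMT1": 1.17, "DOM2-PMT5": 0.85}))
# -> ['DOM1-PMT1', 'DOM2-PMT5']
```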

Within the context of data quality assessment, software tools have been developed that make it possible to monitor the parameters described above and compare them to reference values. The reference values should be determined by the impact of miscalibrations on the scientific goals of KM3NeT; their determination has not yet been addressed.

Once the calibration constants have been determined, the data processing chain continues with the reconstruction of the events and with the simulation of an equivalent data set.