Title: The FAIR principles
Author: Jutta
Topics:
  - Policy basics
  - Dedication to open science

status: review

Publishing FAIR data

Requirements for FAIR data

The widely accepted paradigm for open science data publication requires the implementation of the FAIR principles for research data. This involves the

  • definition of descriptive, standardized metadata and the application of persistent identifiers to create a transparent and self-descriptive data regime,
  • interlinking of this data with common science platforms and registries to increase findability,
  • possibility to harvest the data through commonly implemented interfaces and
  • definition of a policy standard including licensing and access rights management.

In all these fields, the standards of KM3NeT are currently under development. This process draws on existing standards, especially from the astrophysics community, on dedicated KM3NeT software solutions, and on the efforts developed during KM3NeT-INFRADEV, which are integrated into the ESCAPE project. ESCAPE forms the main development environment for open data publication in KM3NeT.

Compliance with the FAIR principles

The FAIR principles provide a solid set of requirements for the development of an open data regime. To meet these requirements, the following solutions have been established in KM3NeT to enable FAIR data sharing and open science.

Findable data

  • Unique identifiers have been defined for digital objects within KM3NeT, including data files, software, collections and workflow steps, as well as for relevant data concepts such as a particle detection ("event") in the detector.
  • At the publication level, extended metadata sets are assigned to each published data product.
  • The data sets can be accessed both via their UIDs directly on the data servers and through external community-relevant repositories.
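As an illustration, a metadata record keyed by such a unique identifier could be sketched as below. The UID format and the field names are assumptions chosen for this example, not the actual KM3NeT schema.

```python
# Hypothetical sketch of a findable-data metadata record; the UID scheme
# "KM3NeT_<detector>_<run>_<type>" and the field names are illustrative
# assumptions, not the official KM3NeT conventions.

def make_metadata_record(detector, run, data_type, description):
    """Build a minimal metadata dictionary keyed by a unique identifier."""
    uid = f"KM3NeT_{detector}_{run:08d}_{data_type}"
    return {
        "uid": uid,
        "detector": detector,
        "run": run,
        "data_type": data_type,
        "description": description,
    }

record = make_metadata_record(
    "ORCA6", 9137, "event", "Reconstructed neutrino event sample"
)
```

A record like this can then be registered in a repository and resolved back to the data file via its UID.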

Accessible data

  • The data can, at this point, be accessed directly via a web page and, where it cannot be offered through VO protocols, through a REST API.
  • At this point, no authentication is implemented; in the future, an authentication scheme is planned to give associated scientists access to unpublished data sets.
  • Records will be kept, and the transfer of high-level data sets to long-term data repositories is envisioned for archiving.
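Retrieving a data set through such a REST API amounts to composing a query URL from the UID. The sketch below assumes a hypothetical base URL and parameter names; the real endpoint may differ.

```python
from urllib.parse import urlencode

# Hypothetical sketch of building a REST-API retrieval URL for a
# published data set; the base URL, path and query parameters are
# assumptions, not the actual KM3NeT interface.

def build_dataset_url(base_url, uid, fmt="json"):
    """Compose a retrieval URL for a data set identified by its UID."""
    query = urlencode({"uid": uid, "format": fmt})
    return f"{base_url}/datasets?{query}"

url = build_dataset_url(
    "https://open-data.example.org/api", "KM3NeT_ORCA6_00009137_event"
)
```

The resulting URL can then be fetched with any HTTP client, which is what makes the access layer machine-actionable in the FAIR sense.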

Interoperable data

  • Vocabularies and content descriptors are introduced that draw on external standards like VO standards or W3C standards where possible.
  • Documentation on the metadata and vocabularies is provided.
  • Metadata classes are interlinked to allow cross-referencing between different digital objects and extended metadata.
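Drawing on external standards can be as simple as mapping internal content descriptors to terms from an established vocabulary. The sketch below uses IVOA UCD1+ terms, which are real VO vocabulary entries, but the mapping itself is an illustrative assumption rather than the official KM3NeT vocabulary.

```python
# Illustrative mapping of internal content descriptors to external
# vocabulary terms (IVOA UCD1+); the internal names and the selection
# of entries are assumptions for this example.

VOCABULARY = {
    "energy": "phys.energy",     # UCD1+ term for an energy quantity
    "zenith": "pos.az.zd",       # UCD1+ term for zenith distance
    "event_time": "time.epoch",  # UCD1+ term for an epoch / time stamp
}

def to_external_term(internal_name):
    """Resolve an internal descriptor to its external vocabulary term, if known."""
    return VOCABULARY.get(internal_name)
```

Exposing such standard terms alongside the internal names is what lets VO tools interpret the data columns without KM3NeT-specific knowledge.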

Reusable data

  • Licensing standards for data, software and supplementary material have been introduced.
  • Basic provenance information is provided with the data, serving as a starting point for propagating provenance management through the full, complex data processing workflow in the future.
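Basic provenance can be recorded in the spirit of the W3C PROV data model, which links an entity to the activity that generated it and the entity it was derived from. The sketch below is a minimal illustration under that assumption; the file and activity names are invented, not the actual KM3NeT processing chain.

```python
# Minimal sketch of a provenance record inspired by the W3C PROV data
# model (prov:wasGeneratedBy, prov:wasDerivedFrom); entity and activity
# names are hypothetical examples.

def derive(entity, activity, source):
    """Record that `entity` was generated by `activity` from `source`."""
    return {
        "entity": entity,
        "wasGeneratedBy": activity,
        "wasDerivedFrom": source,
    }

prov = derive("events_v2.root", "calibration_pass_2", "events_v1.root")
```

Chaining such records across processing steps is what would eventually propagate provenance through the full workflow.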