km3mon


Online monitoring suite for the KM3NeT neutrino detectors.

Requirements

  • Docker

Everything is containerised, so there is no need to install any other software. The version of the pre-built Docker base images is determined by the KM3MON_VERSION variable in the .env file (see example.env). Ideally, you want to keep the same version as the checked-out Git tag, but scripts which are not part of the base images can be updated independently. The backend, for example, contains many Python scripts which can easily be updated without touching the base image, which itself consists only of a Python installation with the required packages.

Setup

  1. Create a file called .env from the example.env template and adjust the detector ID, the IP/port of the servers and the KM3MON_VERSION variable, which should ideally be set to the latest version. This determines the Docker images to be used for each service. A hedged sketch of such a file is shown after this list.

  2. Next, create a backend/supervisord.conf from the template file backend/supervisord.conf.example and adjust if needed.

  3. Create a backend/pipeline.toml from the backend/pipeline.toml.example and adapt the settings if needed. Don't forget to add operators and shifters.

  4. Optionally, adapt the layout of the plots in frontend/app/routes.py.
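
The following is a minimal sketch of a .env file, as mentioned in step 1. Apart from KM3MON_VERSION and KM3NET_DB_COOKIE, which are referenced elsewhere in this README, the variable names and all values are placeholders; check example.env for the actual ones.

# illustrative values only; see example.env for the real variable names
KM3MON_VERSION=v2.0.0
DETECTOR_ID=49
DAQ_LIGIER_IP=192.168.0.110
DAQ_LIGIER_PORT=5553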

Start and stop

The monitoring system can be started using

docker compose up -d

This will download and build all the required images and launch the containers for each service. It will also create an overlay network.
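
To verify that all services came up properly, list them and their states with

docker compose ps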

To stop it, run

docker compose down

Monitoring the monitoring

Log files are kept in logs/, data dumps in data/ and plots in plots/. These folders can be mounted by all services defined in the docker-compose.yml file. The frontend service, for example, which runs the web server, mounts plots/ to have access to the plots.

To check the logs or follow them in real time (-f), limiting the output to the last N lines with --tail=N, e.g.

docker compose logs -f --tail=10 SERVICE_NAME

The SERVICE_NAME can be any of backend, frontend, ligier, ligiermirror, ligierlogmirror, reco or livelog.

Back-end configuration file

The file backend/pipeline.toml is the heart of most monitoring processes and can be used to set different kinds of parameters, like plot attributes or ranges. This configuration file is accessible within each Python script under backend/scripts.
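
As a rough illustration of how a script could read these settings (a minimal sketch only: the loading mechanism, the section name DOMRates and the parameter lowest_rate are assumptions, not the actual km3mon API):

import tomllib  # Python >= 3.11

# parse the back-end configuration file
with open("pipeline.toml", "rb") as f:
    config = tomllib.load(f)

# hypothetical section and parameter, for illustration only
dom_rates = config.get("DOMRates", {})
lowest_rate = dom_rates.get("lowest_rate", 150)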

The monitoring back-end runs inside a Docker container and is controlled by supervisord. You can enter the backend with

docker exec -it monitoring-backend-1 bash

supervisorctl is the tool to communicate with the monitoring back-end system. To see the status of the processes, use supervisorctl status, which lists each process one by one (make sure to call it from the directory where supervisord was launched):

$ supervisorctl status
alerts:timesync_monitor               RUNNING   pid 26, uptime 1 day, 5:21:06
logging:chatbot                       RUNNING   pid 11, uptime 1 day, 5:21:06
logging:log_analyser                  RUNNING   pid 10, uptime 1 day, 5:21:06
logging:msg_dumper                    RUNNING   pid 9, uptime 1 day, 5:21:06
monitoring_process:acoustics          RUNNING   pid 1567, uptime 1 day, 5:20:59
monitoring_process:ahrs_calibration   RUNNING   pid 91859, uptime 1:09:14
monitoring_process:dom_activity       RUNNING   pid 1375, uptime 1 day, 5:21:00
monitoring_process:dom_rates          RUNNING   pid 1378, uptime 1 day, 5:21:00
monitoring_process:pmt_rates_10       RUNNING   pid 1376, uptime 1 day, 5:21:00
monitoring_process:pmt_rates_11       RUNNING   pid 1379, uptime 1 day, 5:21:00
monitoring_process:pmt_rates_13       RUNNING   pid 1377, uptime 1 day, 5:21:00
monitoring_process:pmt_rates_14       RUNNING   pid 1568, uptime 1 day, 5:20:59
monitoring_process:pmt_rates_18       RUNNING   pid 21, uptime 1 day, 5:21:06
monitoring_process:pmt_rates_9        RUNNING   pid 1566, uptime 1 day, 5:20:59
monitoring_process:rttc               RUNNING   pid 118444, uptime 0:17:20
monitoring_process:trigger_rates      RUNNING   pid 22, uptime 1 day, 5:21:06
monitoring_process:triggermap         RUNNING   pid 1796, uptime 1 day, 5:20:58
monitoring_process:ztplot             RUNNING   pid 24, uptime 1 day, 5:21:06
reconstruction:time_residuals         RUNNING   pid 27, uptime 1 day, 5:21:06

The processes are grouped by purpose (logging, monitoring_process etc.) and automatically started in the right order.

You can stop and start individual services using supervisorctl stop group:process_name and supervisorctl start group:process_name.

Since the system knows the order, you can safely restart all or just a group of processes. Use supervisorctl help to find out more and supervisorctl help COMMAND to get a detailed description of the corresponding command.
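
For example, to restart a single process or a whole group, using the names from the status output above:

supervisorctl restart monitoring_process:ztplot
supervisorctl restart logging:*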

Frontend

The frontend is a simple webserver and uses HTML templates to render the pages. The layout of the plots can be changed in frontend/app/routes.py using nested lists. Each URL endpoint can be assigned to a specific function which uses a specific template to render the actual page. The templates for the base layout (including the menubar) and each page are under frontend/app/templates.
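
As a hedged sketch of such a nested-list layout (the variable name PLOTS and the exact plot names are assumptions; the real definitions live in frontend/app/routes.py), each inner list could represent one row of plots on a page:

# hypothetical layout sketch: one inner list per row of plots
PLOTS = [
    ["dom_activity", "dom_rates"],
    ["trigger_rates", "triggermap"],
    ["ztplot"],
]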

Chatbot

The km3mon suite comes with a chatbot which can join a channel defined in the pipeline.toml file under the [Alerts] section:

[Alerts]
botname = "monitoring"
password = "supersecretpassword"
channel = "operations_fr"
operators = [ "a_enzenhoefer", "tamasgal",]

The password is the actual login password of the bot. Once the chatbot service is running, the bot will notify about important events like a sudden drop of the trigger rate, and it can also be used to retrieve information from the monitoring system, set the current shifters and even control the monitoring services through the supervisorctl interface. Only the operators defined in the configuration file are allowed to modify services or change the shifters. To see the bot's capabilities, simply ask it for help via @monitoring help:

Hi Tamas Gal, I was built to take care of the monitoring alerts.
Here is how you can use me:
- @monitoring shifters are cnorris and bspencer
-> set the new shifters who I may annoy with chat messages and
emails.
- @monitoring status -> show the status of the monitoring system
- @monitoring supervisorctl -> take control over the monitoring system
- @monitoring help -> show this message

Troubleshooting

Database connection needed

The monitoring processes talk to the KM3NeT Oracle DB service and need a valid session cookie. The monitoring servers of the ORCA and ARCA shore stations are whitelisted and use the following session cookies (defined in the .env file):

  • ARCA: KM3NET_DB_COOKIE=_tgal_192.84.151.58_48d829f9ef104d82b4c1d8557ee5bb55
  • ORCA: KM3NET_DB_COOKIE=_tgal_188.194.66.108_4c0d9307fb4a423cb0bd8c2b34ba4790

If you run the system on other machines, you need to provide the cookie string for that specific machine. To get the cookie string, run the monitoring system with docker compose up -d and connect to the backend with

# docker exec -it monitoring-backend-1 bash

To get a session cookie, query the database however you like, e.g.

# streamds get detectors

It will ask you for your KM3NeT (external) credentials, and the required cookie value is the last column in the file ~/.km3netdb_cookie:

# cat ~/.km3netdb_cookie
.in2p3.fr	TRUE	/	TRUE	0	sid	_tgal_131.42.5.23_6d132a51d884b22b2ba861a8847346c

Create a new environment variable in the .env file on the host system (not in the Docker container!) with the following entry (of course with your own cookie string):

KM3NET_DB_COOKIE=_tgal_131.42.5.23_6d132a51d884b22b2ba861a8847346c

and restart the whole monitoring system with

docker compose down && docker compose up -d