# km3mon

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3268538.svg)](https://doi.org/10.5281/zenodo.3268538)

Online monitoring suite for the KM3NeT neutrino detectors.

## Requirements

 - Docker and Docker Compose
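
A quick way to check that both are available on the host:

    docker --version
    docker compose version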

Everything is containerised, so there is no need to install any other software.
The version of the pre-built Docker images is determined by the `KM3MON_VERSION`
variable in the `.env` file (see `example.env`). Ideally, you keep this at the same
version as the checked-out Git tag. If you want to experiment, feel free to
uncomment the `build` lines in the `docker-compose.yml`.
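
For illustration, a minimal `.env` could pin the images to a release tag (the
value below is just a placeholder; the remaining variables, like the detector ID
and the server IP/port, are listed in `example.env`):

    KM3MON_VERSION=v1.2.3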

## Setup

1. Create a file called `.env` from the `example.env` template (a command sketch
   for the template copies follows this list) and adjust the detector ID, the
   IP/port of the servers, and the `KM3MON_VERSION` variable, which should
   ideally be set to the latest version. This variable determines the Docker
   images to be used for each service.

2. Next, create a `backend/supervisord.conf` from the template file
   `backend/supervisord.conf.example` and adjust if needed.

3. Create a `backend/pipeline.toml` from the `backend/pipeline.toml.example`
   and adapt the settings if needed. Don't forget to add operators and shifters.

4. Optionally, adapt the layout of the plots in `frontend/app/routes.py`.
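
As a rough sketch, the template copies from steps 1-3 might look like this
(followed by editing each file to match your setup):

    cp example.env .env
    cp backend/supervisord.conf.example backend/supervisord.conf
    cp backend/pipeline.toml.example backend/pipeline.toml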

## Start and stop

The monitoring system can be started using

    docker compose up -d

This will download and build all the required images and launch the containers
for each service. It will also create an overlay network.
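
A quick way to verify that all services are up and running:

    docker compose ps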

To stop it, run

    docker compose down

## Monitoring the monitoring

Log files are kept in `logs/`, data dumps in `data/` and plots in `plots/`.

To check the logs, optionally following them in real time (`-f`) and limiting
the output to the last N lines (`--tail=N`), run e.g.

    docker compose logs -f --tail=10 SERVICE_NAME

The `SERVICE_NAME` can be any of `backend`, `frontend`, `ligier`, `ligiermirror`,
`ligierlogmirror`, `reco` or `livelog`.

The monitoring back-end runs inside a Docker container and is controlled
by `supervisord`. You can enter the `backend` container with

    docker exec -it monitoring-backend-1 bash

`supervisorctl` is the tool to communicate with the monitoring back-end
system. To see the status of the processes, use `supervisorctl status`, which
lists each process one by one (make sure you call it from the folder where
`supervisord` was launched):

```
$ supervisorctl status
alerts:timesync_monitor               RUNNING   pid 26, uptime 1 day, 5:21:06
logging:chatbot                       RUNNING   pid 11, uptime 1 day, 5:21:06
logging:log_analyser                  RUNNING   pid 10, uptime 1 day, 5:21:06
logging:msg_dumper                    RUNNING   pid 9, uptime 1 day, 5:21:06
monitoring_process:acoustics          RUNNING   pid 1567, uptime 1 day, 5:20:59
monitoring_process:ahrs_calibration   RUNNING   pid 91859, uptime 1:09:14
monitoring_process:dom_activity       RUNNING   pid 1375, uptime 1 day, 5:21:00
monitoring_process:dom_rates          RUNNING   pid 1378, uptime 1 day, 5:21:00
monitoring_process:pmt_rates_10       RUNNING   pid 1376, uptime 1 day, 5:21:00
monitoring_process:pmt_rates_11       RUNNING   pid 1379, uptime 1 day, 5:21:00
monitoring_process:pmt_rates_13       RUNNING   pid 1377, uptime 1 day, 5:21:00
monitoring_process:pmt_rates_14       RUNNING   pid 1568, uptime 1 day, 5:20:59
monitoring_process:pmt_rates_18       RUNNING   pid 21, uptime 1 day, 5:21:06
monitoring_process:pmt_rates_9        RUNNING   pid 1566, uptime 1 day, 5:20:59
monitoring_process:rttc               RUNNING   pid 118444, uptime 0:17:20
monitoring_process:trigger_rates      RUNNING   pid 22, uptime 1 day, 5:21:06
monitoring_process:triggermap         RUNNING   pid 1796, uptime 1 day, 5:20:58
monitoring_process:ztplot             RUNNING   pid 24, uptime 1 day, 5:21:06
reconstruction:time_residuals         RUNNING   pid 27, uptime 1 day, 5:21:06
```

The processes are grouped into categories (`logging`, `monitoring_process`, etc.)
and automatically started in the right order.

You can stop and start individual services using `supervisorctl stop
group:process_name` and `supervisorctl start group:process_name`.

Since the system knows the order, you can safely `restart all` or restart just
a group of processes. Use `supervisorctl help` to find out more and
`supervisorctl help COMMAND` to get a detailed description of the
corresponding command.
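
For example, to restart a single process, a whole group (using the `group:*`
wildcard) or everything at once, using the process names from the status output
above:

    supervisorctl restart logging:chatbot
    supervisorctl restart monitoring_process:*
    supervisorctl restart all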

## Back-end configuration file

The file `backend/pipeline.toml` is the heart of all monitoring processes and
can be used to set different kinds of parameters, like plot attributes or ranges.

## Chatbot

The `km3mon` suite comes with a chatbot which can join a channel defined
in the `pipeline.toml` file under the `[Alerts]` section:

``` toml
[Alerts]
botname = "monitoring"
password = "supersecretpassword"
channel = "operations_fr"
operators = [ "a_enzenhoefer", "tamasgal",]
```

The password is the actual login password of the bot. Once the `chatbot` service
is running, the bot will notify about important events, like a sudden drop of the
trigger rate, and can also be used to retrieve information from the monitoring
system, set the current shifters and even control the monitoring services through
the `supervisorctl` interface. Only the operators defined in the configuration
file are allowed to modify services or change the shifters.
To see the bot's capabilities, simply ask it for help via
`@monitoring help`:

```
Hi Tamas Gal, I was built to take care of the monitoring alerts.
Here is how you can use me:
- @monitoring shifters are cnorris and bspencer
-> set the new shifters who I may annoy with chat messages and
emails.
- @monitoring status -> show the status of the monitoring system
- @monitoring supervisorctl -> take control over the monitoring system
- @monitoring help -> show this message
```

### Troubleshooting

#### Database connection needed

The monitoring processes talk to the KM3NeT Oracle DB service and need a
valid session cookie. The monitoring servers of the ORCA and ARCA shore stations
are whitelisted, but if you run the system on other machines, you need to provide
the cookie string for that specific machine. To get the cookie string, run the
monitoring system with `docker compose up -d` and connect to the backend with

    # docker exec -it monitoring-backend-1 bash

To get a session cookie, query the database however you like, e.g.

    # streamds get detectors

It will ask you for your KM3NeT (external) credentials, and the required
cookie value is the last column in the file `~/.km3netdb_cookie`:

    # cat ~/.km3netdb_cookie
    .in2p3.fr	TRUE	/	TRUE	0	sid	_tgal_131.42.5.23_6d132a51d884b22b2ba861a8847346c
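
Since the cookie value is the last (tab-separated) column, one way to extract
it from the file shown above is:

    awk '{print $NF}' ~/.km3netdb_cookie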

Add a new environment variable to the `.env` file on the host system (not inside
the Docker container!) with the following entry (of course with your own cookie
string):

    KM3NET_DB_COOKIE=_tgal_131.42.5.23_6d132a51d884b22b2ba861a8847346c

and restart the whole monitoring system with

    docker compose down && docker compose up -d