Improve handling of detector-specific details (geometry, empty DOMs, data normalisation, etc.) and relation to coarsening

Created by: asogaard

Things to be improved (non-exhaustive):

Currently, the clustering of pulses is done in a "vectorised" manner, so that it (A) works on batches of events and (B) is fast. The downside is that the code is very hard to read, and thus hard to maintain and extend. It would be great if it were possible to apply per-event logic (for clustering, detector geometry, etc.) even on batched events. This might be possible using torch.jit.script or torch.compile in conjunction with pyg.data.batch.Batch.{to_data_list,from_data_list}. Using only the latter pair on its own to split a batch into its constituent events, apply per-event logic, and re-form the batch is prohibitively slow.
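To make the readability concern concrete, here is a minimal sketch of what "vectorised" clustering over a batch can look like. It is not the current implementation: the function name `cluster_pulses_per_dom`, the choice of averaging pulse times per DOM, and inferring the number of DOMs from the batch are all assumptions made for illustration. It only relies on the usual PyG convention of a `batch` vector that maps each node to its event index.

```python
import torch


def cluster_pulses_per_dom(dom_id, time, batch):
    """Average pulse times per (event, DOM) pair without looping over events.

    Hypothetical helper: `dom_id`, `time` and `batch` are 1D tensors with one
    entry per pulse; `batch` holds the event index of each pulse.
    """
    # Assumption: the number of DOMs is inferred from the data. A real
    # detector would supply this as a fixed constant.
    n_doms = int(dom_id.max()) + 1

    # Fold event index and DOM id into one key, so that the same DOM in two
    # different events ends up in two different clusters.
    key = batch * n_doms + dom_id
    unique_key, inverse = torch.unique(key, return_inverse=True)

    # Scatter-add the pulse times and the pulse counts into their clusters.
    n_clusters = unique_key.numel()
    sums = torch.zeros(n_clusters, dtype=time.dtype, device=time.device)
    sums.index_add_(0, inverse, time)
    counts = torch.zeros(n_clusters, dtype=time.dtype, device=time.device)
    counts.index_add_(0, inverse, torch.ones_like(time))

    # Unfold the key again to recover DOM id and event index per cluster.
    cluster_dom = unique_key % n_doms
    cluster_batch = torch.div(unique_key, n_doms, rounding_mode="floor")
    return cluster_dom, sums / counts, cluster_batch


# Two events with three pulses each.
dom_id = torch.tensor([0, 0, 2, 0, 1, 1])
time = torch.tensor([1.0, 3.0, 5.0, 2.0, 4.0, 6.0])
batch = torch.tensor([0, 0, 0, 1, 1, 1])
cluster_dom, mean_time, cluster_batch = cluster_pulses_per_dom(dom_id, time, batch)
```

The key-folding and scatter steps are what make this fast on batches, and also what obscure the simple per-event intent ("group pulses by DOM, then average"). A per-event formulation would express that intent directly, provided it can be made fast enough.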
There are some detector-specific operations (like adding empty DOMs to a graph) that do not involve any trainable parameters and that could benefit from being moved to the CPU as part of the Dataset class's loading of graphs. However, we don't want to maintain a myriad of different Dataset subclasses that implement different pre-processing, and we still want deployed models to be self-contained (i.e. all models targeting the same detector should act on the same original graph, and should apply all necessary preprocessing internally). It would be possible to prepare a GraphBuilder class, as suggested in #462 (closed), that could be given as an optional argument to Dataset and thus be run in parallel when loading data during training. The same GraphBuilder could also be included in the Model, either as a conditional layer that is only applied during inference, or simply "tacked on" to the trained model when it is saved for deployment.
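A rough sketch of how such a GraphBuilder could be shared between Dataset and Model is shown below, using the "conditional layer" variant. Everything here is hypothetical and deliberately framework-free: graphs are plain dicts, the `Dataset` and `Model` classes are stand-ins rather than the project's real classes, and the `training` flag only mimics the usual train/eval switch.

```python
from typing import Dict, List, Optional

Graph = Dict[str, list]  # stand-in for a real graph object


class GraphBuilder:
    """Hypothetical parameter-free preprocessing: add empty DOMs to a graph."""

    def __init__(self, all_dom_ids: List[int]):
        self.all_dom_ids = all_dom_ids

    def __call__(self, graph: Graph) -> Graph:
        present = set(graph["dom_ids"])
        missing = [d for d in self.all_dom_ids if d not in present]
        # Return a new graph so that the original one is left untouched.
        return {
            "dom_ids": graph["dom_ids"] + missing,
            "charge": graph["charge"] + [0.0] * len(missing),
        }


class Dataset:
    """Stand-in dataset that optionally applies the builder when loading."""

    def __init__(self, raw_graphs: List[Graph],
                 graph_builder: Optional[GraphBuilder] = None):
        self.raw_graphs = raw_graphs
        self.graph_builder = graph_builder

    def __getitem__(self, index: int) -> Graph:
        graph = self.raw_graphs[index]
        if self.graph_builder is not None:
            graph = self.graph_builder(graph)  # runs in the data-loading workers
        return graph


class Model:
    """Stand-in model that applies the builder only at inference time."""

    def __init__(self, graph_builder: Optional[GraphBuilder] = None):
        self.graph_builder = graph_builder
        self.training = True

    def forward(self, graph: Graph) -> float:
        # During training the Dataset has already built the graph.
        if not self.training and self.graph_builder is not None:
            graph = self.graph_builder(graph)
        # Placeholder for the actual network: mean charge over all DOMs.
        return sum(graph["charge"]) / len(graph["dom_ids"])


raw = {"dom_ids": [1, 3], "charge": [2.0, 4.0]}
builder = GraphBuilder(all_dom_ids=[1, 2, 3, 4])
dataset = Dataset([raw], graph_builder=builder)
model = Model(graph_builder=builder)

train_out = model.forward(dataset[0])  # builder applied by the Dataset
model.training = False
infer_out = model.forward(raw)         # builder applied inside the Model
```

With this arrangement the deployed model accepts the original graph and produces the same result as during training, while training itself keeps the preprocessing on the CPU in the data-loading path.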

NB: Each item to be improved might be split out into its own issue.