Should we retire training/utils.py?

Created by: asogaard

The module training/utils.py contains a number of utility functions that are not written in the object-oriented style used in the rest of the repo. Their functionality has largely been superseded by other parts of the repo, and they generally score poorly on maintainability and test coverage. Specifically:

  • make_dataloader: Could be removed in favour of DataLoader{,.from_dataset_config}
  • make_train_test_dataloader: Could be removed in favour of DataLoader{,.from_dataset_config}
  • get_predictions: Could be removed in favour of Model.predict{,_as_dataframe}
  • save_results: No other code directly replaces this, so I figure it's the one most worth keeping. It could be replaced with a few lines of code specific to the training in question, but I can see the argument for having a standard way of saving results. If we keep it, I would recommend simplifying some of the internal logic (e.g., doing away with the separate db, archive, and tag arguments).
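
To make the save_results point concrete, here is a rough sketch of what a simplified version might look like. Everything in it is an assumption for illustration: the name `save_results_simple`, the single `out_dir` argument (standing in for the separate db, archive, and tag arguments), and the fixed `results.csv` file name are hypothetical, not existing repo API. It only assumes that the results object has a pandas-style `to_csv` method, and it leaves out saving the model.

```python
from pathlib import Path
from typing import Protocol


class SupportsToCsv(Protocol):
    """Anything with a pandas-style `to_csv` method, e.g. a `pandas.DataFrame`."""

    def to_csv(self, path: str) -> None: ...


def save_results_simple(results: SupportsToCsv, out_dir: str) -> Path:
    """Write `results` to `<out_dir>/results.csv` and return the file path.

    Hypothetical sketch: the single `out_dir` argument replaces the
    separate `db`, `archive`, and `tag` arguments.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    csv_path = out_path / "results.csv"
    results.to_csv(str(csv_path))  # delegate serialisation to the results object
    return csv_path
```

With something like this, the caller composes the output directory once (e.g. from an archive path and a run name) instead of the function doing it internally.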

Is the above functionality still actively being used in preference to the newer alternatives? If so, why?