couldn't connect to display "localhost:29.0"
Hello everyone,
I am using the e_reg_model and e_reg_config files from orcanet_contrib/ORCA4_neutrino_analysis_configs_and_models, together with a Python program that calls orcanet functions to train the model, so I need to run it on a GPU.
When I use the GPU interactively, I sometimes get the error couldn't connect to display "localhost:10.0", which I guess appears because the port is busy. Nevertheless, the training works and creates all the required results.
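A minimal diagnostic sketch for the display error, in case it helps narrow things down. It only reports the DISPLAY setting that the Python process sees. It then requests a non-interactive plotting backend, under the assumption (not confirmed by anything above) that the error comes from matplotlib opening a Tk window. The function names are hypothetical. `MPLBACKEND` is matplotlib's environment variable for choosing a backend and has to be set before matplotlib is imported.

```python
import os


def describe_display(env=None):
    """Return a short description of the DISPLAY setting seen by this process."""
    env = os.environ if env is None else env
    display = env.get("DISPLAY")
    if display is None:
        return "DISPLAY is not set"
    return f"DISPLAY is set to {display!r}"


def prefer_headless_backend(env=None):
    """Request the non-interactive Agg backend unless one is already chosen.

    Assumption: the plots are made with matplotlib. This must run before
    matplotlib is imported, otherwise it has no effect.
    """
    env = os.environ if env is None else env
    env.setdefault("MPLBACKEND", "Agg")
    return env["MPLBACKEND"]


if __name__ == "__main__":
    print(describe_display())
    print("MPLBACKEND:", prefer_headless_backend())
```

If DISPLAY points at a forwarded address such as "localhost:10.0", the process will try to reach an X server through the SSH session, which may not be available from a batch job.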
On the other hand, when I submit the same training as a job to the GPU, I get the following error: couldn't connect to display "localhost:29.0". The traceback is shown here:
I am not quite sure what is going wrong. I ran it more than once and got the same error each time, except for one run where I got: FileNotFoundError: [Errno 2] No such file or directory: 'output/sum_model/train_log/log_epoch_1_file_1.txt'
This file is supposed to be created by the model, not by me. Could it be that running in batch mode does not give the same permissions to create a folder?
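One way to test the permissions guess is a small check run at the start of the batch job. This is a minimal sketch: the helper name `check_writable_dir` is hypothetical, and the directory is taken from the error message above. It tries to create the folder and write a temporary file in it, then reports the absolute path that was used.

```python
import os
import tempfile


def check_writable_dir(path):
    """Create `path` if needed and verify that a file can be written inside it.

    Returns a tuple (ok, message), where the message contains the absolute path.
    """
    try:
        os.makedirs(path, exist_ok=True)
        # The temporary file is deleted again when the with-block exits.
        with tempfile.NamedTemporaryFile(dir=path, prefix="write_test_"):
            pass
    except OSError as err:
        return False, f"cannot write to {os.path.abspath(path)}: {err}"
    return True, f"can write to {os.path.abspath(path)}"
```

For example, `print(os.getcwd(), check_writable_dir("output/sum_model/train_log"))` shows both the working directory of the job and whether the log folder is usable. Since the path in the error is relative, it is resolved against the working directory, which may differ between an interactive session and a batch job.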
I would be glad if someone could help me solve this. Many thanks!