I moved this back to km3pipe, that's the only place where we need a follow-up.
I'm running into an error when reading detector files with km3pipe when they come from the data processing. Full reproducible example at Lyon:
module load km3net_soft_env/
python -c "import km3pipe as kp; d = kp.hardware.Detector('/sps/km3net/repo/data_processing/tag/v9.0/workdirs/KM3NeT_00000049/00000049/00007742/detector/KM3NeT_00000049_00007742_offline.detx')"
Which returns:
++ Detector: Parsing the DETX header
++ Detector: Reading PMT information...
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/pbs/throng/km3net/software/python/3.7.5/lib/python3.7/site-packages/km3pipe/hardware.py", line 76, in __init__
self._init_from_file(filename)
File "/pbs/throng/km3net/software/python/3.7.5/lib/python3.7/site-packages/km3pipe/hardware.py", line 102, in _init_from_file
self._parse()
File "/pbs/throng/km3net/software/python/3.7.5/lib/python3.7/site-packages/km3pipe/hardware.py", line 173, in _parse
dom_id, du, floor, n_pmts = split(line, int)
ValueError: not enough values to unpack (expected 4, got 2)
From what I understand, the problem comes from the header of the detector file, which contains multi-line comments like these:
# namespace KM3NET
# system Linux cca010 3.10.0-1160.99.1.el7.x86_64 #1 SMP Wed Sep 13 14:19:20 UTC 2023 x86_64
# GIT 18.5.0
# ROOT 6.22/06
# application JCalibrateK40
# command /pbs/throng/km3net/src/Jpp/v18.5.0/out//Linux/bin//JCalibrateK40 -d 1 -f \"./intra/KM3NeT_00000049_00007223.root ./intra/KM3NeT_00000049_00007224.root ./intra/KM3NeT_00000049_00007225.root ./intra/KM3NeT_00000049_00007231.root ./intra/KM3NeT_00000049_00007232.root ./intra/KM3NeT_00000049_00007233.root ./intra/KM3NeT_00000049_00007234.root ./intra/KM3NeT_00000049_00007235.root\" -a ./detector.datx -o calibrateK40_sea.root -b rates -C JDAQTimesliceL1 -V0+1e5 -M2+31
# namespace KM3NET
# system Linux cca010 3.10.0-1160.99.1.el7.x86_64 #1 SMP Wed Sep 13 14:19:20 UTC 2023 x86_64
# GIT 18.5.0
# ROOT 6.22/06
# application JMergeCalibrateK40
# command /pbs/throng/km3net/src/Jpp/v18.5.0/out//Linux/bin//JMergeCalibrateK40 -d 1 -f calibrateK40_sea.root -o mergecalibrateK40_sea.root
# namespace KM3NET
# system Linux cca010 3.10.0-1160.99.1.el7.x86_64 #1 SMP Wed Sep 13 14:19:20 UTC 2023 x86_64
# GIT 18.5.0
# ROOT 6.22/06
# application JFitK40
# command /pbs/throng/km3net/src/Jpp/v18.5.0/out//Linux/bin//JFitK40 -d 1 -f mergecalibrateK40_sea.root -a ./detector_seaCalibration_pmtMixed.datx -o fitK40_pmtMixed_sea.root -! \"808961448 15
808951460 15
808996773 15
808493910 15
808489117 15
806451572 15
808961655 15
808981864 15
808961504 15
808982005 15
809544058 15
808949744 15
808469129 15
808451904 15
808964883 15
808964908 15
808982066 15
808956908 15
808435278 15
808981812 15
808976377 15
808981510 15
808959411 15
808982547 15
808969857 15
808974811 15
806487226 15
808984711 15
809524432 15
808472260 15
808972593 15
808981523 15
808451907 15
809521500 15
806487219 15
808972598 15
808946818 15
809006037 15
808982041 15
808979729 15
806483369 15
808974758 15
808981672 15
808447180 15
808472265 15
806455814 15
808493231 15
808974972 15
806487231 15
808961480 15
806465101 15
808488895 15
808447186 15
808964852 15
808972698 15
808982077 15
808969848 15
808974773 15
808979567 15
809544061 15
808488990 15
808964815 15
808982018 15
808979721 15
809007627 15
809503416 15
808489014 15
808997793 15
808992657 15
808488997 15
809526097 15
808432835 15\" -A -w
# namespace KM3NET
# system Linux cca010 3.10.0-1160.99.1.el7.x86_64 #1 SMP Wed Sep 13 14:19:20 UTC 2023 x86_64
# GIT 18.5.0
# ROOT 6.22/06
# application JConvertDetectorFormat
# command /Jpp/out//Linux/bin//JConvertDetectorFormat -a /tmp//detectorTHEvYf.datx -o detectors/ORCA/00000049/E_1.0.0/KM3NeT_00000049_E_1.0.0_00007406.datx --!
# namespace KM3NET
When parsing the file, the module/PMT ID pairs are interpreted as actual content of the detector file, triggering e.g. this line and making the detector look like a v1.
Not sure how to log that, any idea, @tgal?
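To make the failure mode concrete, here is a minimal sketch that reproduces the same `ValueError` outside of km3pipe. The `split` helper below is a hypothetical stand-in for the one called in the traceback, and the four-field module line is made up for illustration; only the two-field line is taken from the header shown above.

```python
def split(text, callback=None):
    """Hypothetical stand-in: split on whitespace and convert each field."""
    fields = text.split()
    return [callback(field) for field in fields] if callback else fields


# Illustrative module line with the four fields the parser expects
# (dom_id, du, floor, n_pmts). The values are invented for this sketch.
module_line = "808961448 1 1 31"
dom_id, du, floor, n_pmts = split(module_line, int)

# A continuation line of the multi-line "# command" comment. It does not
# start with '#', so a line-based parser treats it as detector content.
continuation_line = "808951460 15"
try:
    dom_id, du, floor, n_pmts = split(continuation_line, int)
except ValueError as error:
    message = str(error)
    print(message)
```

The second unpacking fails because the line carries only two integers, which matches the error in the traceback.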
Whatever "magic" is there to read in multi-line headers, it should either be part of the specification (v6 then) or we should simply not allow them.
Of course KM3io.jl will crash as well
julia> using KM3io
julia> Detector("/sps/km3net/repo/data_processing/tag/v9.0/workdirs/KM3NeT_00000049/00000049/00007742/detector/KM3NeT_00000049_00007742_offline.detx")
ERROR: BoundsError: attempt to access 2-element Vector{SubString{String}} at index [1:3]
Stacktrace:
[1] throw_boundserror(A::Vector{SubString{String}}, I::Tuple{UnitRange{Int64}})
@ Base ./abstractarray.jl:737
[2] checkbounds
@ ./abstractarray.jl:702 [inlined]
[3] getindex
@ ./array.jl:973 [inlined]
[4] read_detx(io::IOStream)
@ KM3io ~/.julia/packages/KM3io/A0mdY/src/hardware.jl:479
[5] #15
@ ~/.julia/packages/KM3io/A0mdY/src/hardware.jl:369 [inlined]
[6] open(::KM3io.var"#15#17", ::String, ::Vararg{String}; kwargs::@Kwargs{})
@ Base ./io.jl:396
[7] open
@ ./io.jl:393 [inlined]
[8] Detector(filename::String)
@ KM3io ~/.julia/packages/KM3io/A0mdY/src/hardware.jl:368
[9] top-level scope
@ REPL[2]:1
What should we do then?
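One possible reader-side option, purely as a sketch: filter out comment lines together with their continuation lines before the actual parsing starts. The function name `strip_header_comments` is hypothetical, and the quote-counting rule is an assumption based only on the header shown in this thread (a comment line with an odd number of double quotes opens a quoted argument that runs until the next line with an odd number of double quotes). It is not part of any DETX specification, and the final line of the sample is invented.

```python
def strip_header_comments(lines):
    """Yield lines that are neither comments nor comment continuations.

    Assumption: a comment starts with '#', and an odd number of double
    quotes on a comment line means a quoted argument is still open.
    """
    inside_quoted_comment = False
    for line in lines:
        odd_quotes = line.count('"') % 2 == 1
        if inside_quoted_comment:
            # Continuation of a multi-line comment; an odd count closes it.
            if odd_quotes:
                inside_quoted_comment = False
            continue
        if line.lstrip().startswith("#"):
            inside_quoted_comment = odd_quotes
            continue
        yield line


sample = [
    '# application JFitK40',
    '# command JFitK40 -o fitK40_pmtMixed_sea.root -! \\"808961448 15',
    '808951460 15',
    '808432835 15\\" -A -w',
    '# namespace KM3NET',
    '808961448 1 1 31',  # invented module line standing in for real content
]
payload = list(strip_header_comments(sample))
print(payload)
```

The heuristic breaks as soon as a comment legitimately contains an odd number of quotes, so it is at best a stopgap until the format question is settled.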
Fwiw, I found a workaround by using JConvertDetectorFormat to "squash" the header, therefore getting rid of the problematic field.
The DETX file is not formatted according to the specifications. I raised this issue with meta data a while ago but I don't think it has ever been addressed. The same problem happens with some JSON conversions.
Ah sorry, the original problem was something else, but at least we can resolve the thread above.
Indeed ;)
I think this can be closed here, as this is a storage and not a km3pipe issue.
Thanks Kay!
That is not sustainable: in principle nobody will look beforehand if a file is small and on disk only, but simply try to retrieve it. I will open an issue in comp&soft - we need a general solution.
There is no other way around it, I think, but we will lose those files eventually!!!
OK interesting. Thanks ;) So now we need to use iRODS where we already have extremely limited slots (and is the only option to use on other CCs) to get small files. Great
Hi @tgal
I got this answer from CC Lyon.
For disk-only files, using iget is acceptable: in this case, HPSS will not be called upon, as these small files are stored on the iRODS disk.
Conversely, if the files are in HPSS, you need to use XRootD. In this case, these large files will not be stored on the iRODS disk, and going via iRODS will require HPSS for recovery, which is not optimal.
You should open a ticket at the Lyon CC helpdesk. iget should be avoided!
Thank you Santiago.
I was facing a similar issue. I believe this might have to do with the fact that xrootd is looking at hpss. I understand that recently (I don't know since when) the Lyon admins introduced a condition such that only files larger than 100 MB are copied to hpss. The file you are trying to retrieve is 44 MB, so it won't be in hpss. This is only the conclusion I came to, though; someone please correct me if I am wrong.
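The size rule described here can be written down as a small sketch. It assumes that "100Mb" means 100 megabytes (10^6 bytes each) and that the cut is strictly "larger than"; both are guesses, and the function and constant names are hypothetical. It only encodes the rule as understood in this comment and does not query any storage system.

```python
# Assumption: the threshold quoted in the thread is 100 megabytes.
HPSS_MIN_SIZE_BYTES = 100 * 1000**2


def expected_storage(size_bytes, threshold=HPSS_MIN_SIZE_BYTES):
    """Return 'hpss' for files above the threshold, otherwise 'disk-only'."""
    return "hpss" if size_bytes > threshold else "disk-only"


# The 44 MB file from this thread would stay on disk only under this rule.
location = expected_storage(44 * 1000**2)
print(location)
```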
I was wondering if this would be a problem with the new mass processing being copied to iRODS. We might have to use iget to retrieve these files, which I believe is not a good solution.