Is it possible to access the aanet usr fields from OfflineReader?
r = ki.OfflineReader(file)
u = r['usr']
correctly instantiates an object, but I am not sure whether its content can be accessed, and if so how (I would expect uproot to provide dictionary-like access for free?)
Thanks Tom. What I want to access is what one would get in aanet via evt.getusr(key). In this case it's a set of custom variables used by the classifier in the online event selection.
I am not sure I understand the usr stuff from tracks and MC hits definition, sorry for my naivety.
In general, I agree we should not encourage the proliferation of user-defined fields without some discussion of whether and how to support them. On the other hand, there are a few situations in which they are used as a temporary or intermediate solution before a definitive data format is defined.
Alright, it seems that uproot fails to interpret it automatically. This is basically true for everything that comes from aanet.
The Jpp stuff works 95% of the time...
Anyways, I extracted the data and now need to find out where this DeltaPosZ is hidden:
In [52]: f['E']['Evt']['usr_names'].array(uproot.asdebug)[0].tostring()
Out[52]: b'@\x00\x00\xc8\x00\t\x00\x00\x00\x11\x0bRecoQuality\x07RecoNDF\x03CoC\x03ToT\x0bChargeAbove\x0bChargeBelow\x0bChargeRatio\tDeltaPosZ\rFirstPartPosZ\x0cLastPartPosZ\tNSnapHits\tNTrigHits\tNTrigDOMs\nNTrigLines\x0eNSpeedVetoHits\x11NGeometryVetoHits\x12ClassficationScore'
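For anyone who wants to locate DeltaPosZ in that dump by hand, here is a minimal sketch of decoding it. It assumes the layout that the bytes above suggest: a 6-byte header (read here as a 4-byte count word plus a 2-byte version, which is an interpretation and not something confirmed in this thread), a 4-byte big-endian number of entries, and then one length byte followed by the characters of each name. `parse_usr_names` is a hypothetical helper for illustration, not part of km3io or uproot.

```python
import struct

# The raw bytes exactly as printed in Out[52] above.
raw = (
    b'@\x00\x00\xc8\x00\t\x00\x00\x00\x11'
    b'\x0bRecoQuality\x07RecoNDF\x03CoC\x03ToT'
    b'\x0bChargeAbove\x0bChargeBelow\x0bChargeRatio'
    b'\tDeltaPosZ\rFirstPartPosZ\x0cLastPartPosZ'
    b'\tNSnapHits\tNTrigHits\tNTrigDOMs\nNTrigLines'
    b'\x0eNSpeedVetoHits\x11NGeometryVetoHits\x12ClassficationScore'
)


def parse_usr_names(buf):
    """Decode length-prefixed names (hypothetical helper, assumed layout)."""
    # Assumption: bytes 0-5 are a header we can skip; bytes 6-9 hold the
    # number of entries as a big-endian unsigned 32-bit integer.
    (n_entries,) = struct.unpack_from(">I", buf, 6)
    pos = 10
    names = []
    for _ in range(n_entries):
        length = buf[pos]  # single length byte, so names shorter than 255 chars
        names.append(buf[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return names


names = parse_usr_names(raw)
delta_index = names.index("DeltaPosZ")
print(len(names), delta_index)
```

The position of a name in this list should then tell you which entry of the per-event usr values to pick, assuming names and values are stored in the same order.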
Looks pretty scary but seems to be what I need! Thanks very much. I will get back to this later this afternoon (hopefully), but I guess the solution is provided and the issue can be closed :)
I released that in v0.9.0, which is currently also deployed to CC Lyon.
Let me know if you need some tweaks. The first call to f.usr takes a few seconds due to the structure of the tree but afterwards it uses an optimised lookup with a very low memory overhead.
I am afraid that a few seconds for the first call to f.usr will not be very sustainable when processing multiple files at once. Is there any way this lookup table can be reused when opening a new file? I see this could get hacky and dangerous, as one must be sure that all the files have the same structure :/
Actually the big overhead is the read-in, which is a nested structure.
I'd kindly ask you to run some usability tests first, before we optimise the wrong thing.
The structure of the usr data format is not ideal at all, so there is not much one can optimise there. I already used a trick of examining only the first event, but technically this is wrong, as it assumes that the usr structure is the same for every event. That is only an unwritten rule, and it could be violated in the future.
That's another reason why I dislike this design... If I do not assume this rule, the performance is horrible (and the same is true for any other language as well, it is not related to Python).
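To illustrate the trade-off described here, a minimal sketch in plain Python: the name-to-index lookup is built from the first event only, and an optional strict check shows what it costs to drop the "same structure in every event" assumption. The data and the names `usr_names`, `usr_values` and `usr_field` are made up for illustration; this is not the km3io implementation.

```python
# Hypothetical stand-in data: three events, each carrying the same usr names.
usr_names = [
    ["CoC", "ToT", "DeltaPosZ", "NSnapHits"],
    ["CoC", "ToT", "DeltaPosZ", "NSnapHits"],
    ["CoC", "ToT", "DeltaPosZ", "NSnapHits"],
]
usr_values = [
    [0.1, 12.0, -3.5, 40.0],
    [0.2, 15.0, 1.25, 38.0],
    [0.3, 9.0, 7.0, 51.0],
]

# Fast path: build the lookup from the first event only.
# This relies on the unwritten rule that every event has the same structure.
lookup = {name: i for i, name in enumerate(usr_names[0])}


def usr_field(name):
    """Return one usr field for all events via the first-event lookup."""
    column = lookup[name]
    return [values[column] for values in usr_values]


delta_pos_z = usr_field("DeltaPosZ")

# Strict path: verifying the rule means touching the names of every event,
# which is the expensive part when the names are stored in a nested structure.
consistent = all(names == usr_names[0] for names in usr_names)

print(delta_pos_z, consistent)
```

Running the strict check once per file, rather than on every access, might be a reasonable middle ground, but that is a design suggestion and not something decided in this thread.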
I’ll still improve the readout to be lazy, however, so that each usr field is loaded only when it’s accessed. I think this will get rid of the whole lag.
Here are some numbers: file opening takes 260 ms, the first access to any usr.field takes roughly 2.5 s, and any other access (no matter which field) takes just ~40 µs.
Note that usr is now nested under the corresponding branch, so it is f.events.usr rather than f.usr.
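A minimal sketch of what lazy, cached field access could look like, to make the idea concrete. The class `LazyUsr` and the `load_column` callback are hypothetical and only stand in for the real readout; the actual km3io code may be structured quite differently.

```python
class LazyUsr:
    """Attribute-style access to usr fields, loaded on first use (sketch)."""

    def __init__(self, names, load_column):
        # Map each field name to its position; nothing is read yet.
        self._index = {name: i for i, name in enumerate(names)}
        self._load_column = load_column  # hypothetical reader callback
        self._cache = {}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. for usr fields.
        index = self.__dict__.get("_index", {})
        if name not in index:
            raise AttributeError(name)
        cache = self.__dict__["_cache"]
        if name not in cache:
            # The expensive read happens here, once per field.
            cache[name] = self._load_column(index[name])
        return cache[name]


# Hypothetical in-memory table standing in for the file contents.
table = [[0.1, -3.5], [0.2, 1.25]]
calls = []


def load_column(i):
    calls.append(i)  # record every real read
    return [row[i] for row in table]


usr = LazyUsr(["CoC", "DeltaPosZ"], load_column)
first = usr.DeltaPosZ   # triggers one read
second = usr.DeltaPosZ  # served from the cache
print(first, calls)
```

In this sketch only the requested field is ever read, and repeated access returns the cached object without touching the reader again.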
Everything is on the latest 37-user-parameters-seem-to-be-transposed branch.