Datatype issues with Pandas and Km3db
I found an issue with how PROMISID values are represented when they are retrieved through km3db with the pandas container. The code example below reproduces the problem:
>>> import km3db
>>> import numpy as np
>>> mac = '08:00:30:38:34:D6'
>>> detoid = 'D0ARCA006'
>>> sds = km3db.tools.StreamDS(container='pd')
>>> clb = sds.clbid(MACADDR=mac)
>>> clb_upi = clb.iloc[0]['CLBUPI']
>>> di = sds.detectorintegration(clbupi=clb_upi, detoid=detoid)
>>> dom_upi = di.iloc[0]['DOMUPI']
>>> dom_ser = di.iloc[0]['DOMSER']
>>> pmt_upis = di.iloc[1:]['PMTUPI']
>>> pmt_sers = di.iloc[1:]['PMTSER'].astype(int)
>>> print(f'DOMUPI: {dom_upi}, DOMSER: {dom_ser}, CLBUPI: {clb_upi}')
>>> pmt_sers
1 12049
…
31 11368
Name: PMTSER, dtype: int64
>>> pmt_upis
1 3.4.2.3/HAMA-R12199/1.12049
…
31 3.4.2.3/HAMA-R12199/2.11368
Name: PMTUPI, dtype: object
The lines above extract the DOMUPI, its CLB, and the related PMT serials and UPIs, starting from the MAC address and the detector ID. To obtain the PROMISID, I used:
>>> sds.pmt_2_base_2_promis(pmtupi=pmt_upis[1])
PMTUPI BASEUPI PROMISID
0 3.4.2.3/HAMA-R12199/1.12049 3.4.2.2/HAMA-R12199/5.1247 0021AB <- Correct hexadecimal representation
Here the PROMISID is extracted correctly, but there are two problematic cases:
>>> sds.pmt_2_base_2_promis(pmtupi=pmt_upis[6])
PMTUPI BASEUPI PROMISID
0 3.4.2.3/HAMA-R12199/2.9556 3.4.2.2/HAMA-R12199/5.971 52000.0 <- Instead of 0052e3
The 'eXX' suffix makes the value look like a number in scientific notation, so somewhere it is converted to a float (52e3 = 52000.0).
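To illustrate why only some PROMISIDs are affected, here is a minimal sketch using plain Python's `float()`. It is not the km3db code path, which I have not traced; the helper `looks_like_float` is a hypothetical name introduced only for this example.

```python
def looks_like_float(text):
    """Return True if Python's float() accepts the string as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# '0052e3' is a valid hex string, but it is also valid scientific notation.
print(float('0052e3'))             # 52000.0
# '0021AB' contains letters that are not part of any float syntax.
print(looks_like_float('0021AB'))  # False
```

Any parser that tries a numeric conversion before falling back to a string would treat these two IDs differently in the same way.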
>>> sds.pmt_2_base_2_promis(pmtupi=pmt_upis[3])
PMTUPI BASEUPI PROMISID
0 3.4.2.3/HAMA-R12199/2.9558 3.4.2.2/HAMA-R12199/5.12800 5770 <- Instead of 005770
Here the value consists only of decimal digits, so somewhere it is converted to an integer and the leading zeros are lost.
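Both cases can be reproduced with pandas alone, which makes me suspect the type inference that pandas applies when parsing text. The sketch below assumes the table arrives as tab-separated text and is parsed with `pandas.read_csv`; I have not verified that km3db does this internally. The helper `read_promis` and the placeholder PMTUPI value are hypothetical.

```python
import io

import pandas as pd


def read_promis(promis_id, **kwargs):
    """Parse a one-row, tab-separated table and return its PROMISID value."""
    text = f"PMTUPI\tPROMISID\ndummy\t{promis_id}\n"
    frame = pd.read_csv(io.StringIO(text), sep="\t", **kwargs)
    return frame["PROMISID"].iloc[0]


for raw in ("0021AB", "0052e3", "005770"):
    # Default behaviour: pandas infers the column type from the text.
    inferred = read_promis(raw)
    # Forcing the column to str keeps the original hexadecimal text.
    as_text = read_promis(raw, dtype={"PROMISID": str})
    print(raw, repr(inferred), repr(as_text))
```

With inference enabled, '0052e3' comes back as a float and '005770' as an integer, while forcing the column to `str` preserves all three IDs. The original text cannot be reliably recovered after the conversion, so the fix probably has to happen at parse time. Whether km3db offers a way to pass such a dtype through is something I do not know.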
Edited by KM3NeT Collaboration