Losing Timezone-awareness When Saving Hierarchical Pandas DatetimeIndex To HDF5 In Python
I'm on pandas 0.14.1. Assume I need to index data by two timezone-aware timestamps in a hierarchical index. When saving the resulting DataFrame to HDF5 I seem to lose timezone-awar
Solution 1:
This is not supported under the fixed format (the default for to_hdf) when using a multi-index. It should probably raise NotImplementedError rather than silently dropping the timezone. Here's an issue to track this.
See the full HDF5 interface docs here.
In [11]: pd.read_hdf('/tmp/my.h5', 'data').index.levels[0]
Out[11]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-01 05:00:00, 2000-01-02 05:00:00]
Length: 2, Freq: None, Timezone: None
But if you specify the table format, it works.
In [13]: df.to_hdf('/tmp/my.h5', 'data2', format='table')
In [14]: pd.read_hdf('/tmp/my.h5', 'data2')
Out[14]:
                                                     a
2000-01-01 00:00:00-05:00 2000-01-02 00:00:00-05:00  0
2000-01-02 00:00:00-05:00 2000-01-03 00:00:00-05:00  0
In [15]: pd.read_hdf('/tmp/my.h5', 'data2').index.levels[0]
Out[15]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-01 00:00:00-05:00, 2000-01-02 00:00:00-05:00]
Length: 2, Freq: None, Timezone: EST5EDT
In [16]: pd.read_hdf('/tmp/my.h5', 'data2').index.levels[1]
Out[16]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-02 00:00:00-05:00, 2000-01-03 00:00:00-05:00]
Length: 2, Freq: None, Timezone: EST5EDT
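For reference, here is a minimal sketch of the whole round trip as a script. The question's original setup code is not shown above, so the frame construction is an assumption reconstructed from the output: the level names `start` and `end` and the helper names `make_frame`, `level_timezones` and `roundtrip` are hypothetical. It is written against a recent pandas with PyTables installed, not 0.14.1.

```python
import pandas as pd


def make_frame():
    # Assumed setup: two timezone-aware levels matching the output above.
    # The level names "start" and "end" are hypothetical.
    start = pd.date_range("2000-01-01", periods=2, freq="D", tz="EST5EDT")
    end = pd.date_range("2000-01-02", periods=2, freq="D", tz="EST5EDT")
    index = pd.MultiIndex.from_arrays([start, end], names=["start", "end"])
    return pd.DataFrame({"a": [0, 0]}, index=index)


def level_timezones(df):
    # Report the timezone of every index level as a string ("None" if naive).
    return [str(level.tz) for level in df.index.levels]


def roundtrip(df, path, key):
    # format="table" is the variant that kept the timezone in the answer above;
    # the default fixed format is the one that dropped it.
    df.to_hdf(path, key=key, format="table")
    return pd.read_hdf(path, key)
```

Calling `level_timezones(roundtrip(make_frame(), '/tmp/my.h5', 'data2'))` should then report a timezone for both levels if the table format behaves as shown above. Comparing that against `level_timezones(make_frame())` is a quick way to check whether a given pandas version preserves the timezone.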