
Losing Timezone-awareness When Saving Hierarchical Pandas DatetimeIndex To HDF5 In Python

I'm on pandas 0.14.1. Assume I need to index data by two timestamps in a hierarchical index using timezones. When saving the resulting DataFrame to HDF5 I seem to lose timezone-awar

Solution 1:

This is not supported under the fixed format when using a multi-index. It should probably raise NotImplementedError rather than silently dropping the timezone. Here's an issue to track this

See full-hdf5-interface docs here

In [11]: pd.read_hdf('/tmp/my.h5', 'data').index.levels[0]
Out[11]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-01 05:00:00, 2000-01-02 05:00:00]
Length: 2, Freq: None, Timezone: None

But if you specify table format it works.

In [13]: df.to_hdf('/tmp/my.h5', 'data2', format='table')

In [14]: pd.read_hdf('/tmp/my.h5', 'data2')
Out[14]: 
                                                     a
2000-01-01 00:00:00-05:00 2000-01-02 00:00:00-05:00  0
2000-01-02 00:00:00-05:00 2000-01-03 00:00:00-05:00  0

In [15]: pd.read_hdf('/tmp/my.h5', 'data2').index.levels[0]
Out[15]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-01 00:00:00-05:00, 2000-01-02 00:00:00-05:00]
Length: 2, Freq: None, Timezone: EST5EDT

In [16]: pd.read_hdf('/tmp/my.h5', 'data2').index.levels[1]
Out[16]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-02 00:00:00-05:00, 2000-01-03 00:00:00-05:00]
Length: 2, Freq: None, Timezone: EST5EDT
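
For anyone who wants to reproduce the check, here is a minimal sketch of the table-format round trip. The original question's code is not shown in full, so the frame below is an assumption: the helper names (`build_frame`, `level_timezones`, `roundtrip_table`), the level names `start` and `end`, the `US/Eastern` timezone, and the temporary file path are all made up for illustration. It uses the keyword form `key=...`, which newer pandas versions expect, and it needs the optional PyTables package for the HDF5 part.

```python
import os
import tempfile

import pandas as pd


def level_timezones(index):
    """Return the timezone of each MultiIndex level as a string (None if naive)."""
    return [
        None if getattr(level, "tz", None) is None else str(level.tz)
        for level in index.levels
    ]


def build_frame(tz="US/Eastern"):
    """Build a frame with a two-level, timezone-aware index (illustrative data)."""
    start = pd.date_range("2000-01-01", periods=2, freq="D", tz=tz)
    end = start + pd.Timedelta(days=1)
    index = pd.MultiIndex.from_arrays([start, end], names=["start", "end"])
    return pd.DataFrame({"a": [0, 0]}, index=index)


def roundtrip_table(df, key="data2"):
    """Write df in table format to a temporary HDF5 file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "my.h5")
        df.to_hdf(path, key=key, format="table")
        return pd.read_hdf(path, key=key)


df = build_frame()
print(level_timezones(df.index))  # ['US/Eastern', 'US/Eastern']

try:
    restored = roundtrip_table(df)
    # Compare the timezones of the levels before and after the round trip.
    print(level_timezones(restored.index))
except ImportError:
    # pandas raises ImportError when the optional PyTables package is missing.
    restored = None
    print("PyTables is not installed; skipping the HDF5 round trip")
```

Checking `.tz` on each level is a bit more direct than reading the repr. To compare against the fixed format, drop `format="table"` from the `to_hdf` call; the behaviour may differ from the 0.14.1 output shown above depending on your pandas version.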
