Skip to content Skip to sidebar Skip to footer

Cumulative Sum On Time Series Split By Consecutive Negative Or Positive Values

I have a time series data that looks like this: date values 2017-05-01 1 2017-05-02 0.5 2017-05-03 -2 2017-05-04 -1 2017-05-05 -1.25 2017-05-06 0.

Solution 1:

Putting 0 in with the positives, you can use the shift-compare-cumsum pattern:

In [33]: sign = df["values"] >= 0

In [34]: df["vsum"] = df["values"].groupby((sign != sign.shift()).cumsum()).cumsum()

In [35]: df
Out[35]: 
         date  values  vsum
0  2017-05-01    1.00  1.00
1  2017-05-02    0.50  1.50
2  2017-05-03   -2.00 -2.00
3  2017-05-04   -1.00 -3.00
4  2017-05-05   -1.25 -4.25
5  2017-05-06    0.50  0.50
6  2017-05-07    0.50  1.00

which works because (sign != sign.shift()).cumsum() gives us a new number for each contiguous group:

In [36]:sign!=sign.shift()Out[36]:0True1False2True3False4False5True6FalseName:values,dtype:boolIn [37]:(sign!=sign.shift()).cumsum()Out[37]:01112232425363Name:values,dtype:int64

Solution 2:

Create a groups:

g = np.sign(df['values']).diff().ne(0).cumsum()
g

Output:

01112232425363Name:values,dtype:int64

Now, use g as a groupby with cumsum

df.groupby(g).cumsum()

Output:

values01.0011.502   -2.003   -3.004   -4.2550.5061.00

Post a Comment for "Cumulative Sum On Time Series Split By Consecutive Negative Or Positive Values"