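9. Time Series

Pandas offers simple, powerful, and efficient functionality for resampling during frequency conversion (for example, converting data sampled every second into data sampled at 5-minute intervals). This kind of operation is very common in finance. For details, see: Time Series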

  In [108]: rng = pd.date_range('1/1/2012', periods=100, freq='S')
  In [109]: ts = pd.Series(np.random.randint(0, 500, len(rng)), index=rng)
  In [110]: ts.resample('5Min').sum()
  Out[110]:
  2012-01-01    25083
  Freq: 5T, dtype: int64
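Because the session above uses random values, its sum cannot be checked by hand. The following is a minimal sketch with deterministic data: the series of ones and the names `five_min` and `rng` are illustrative assumptions, not part of the original session. It uses the lowercase aliases `'s'` and `'5min'`, since newer pandas releases deprecate the uppercase `'S'` spelling.

```python
import numpy as np
import pandas as pd

# 600 one-second samples (10 minutes), every value equal to 1 (illustrative data)
rng = pd.date_range("2012-01-01", periods=600, freq="s")
ts = pd.Series(np.ones(len(rng), dtype="int64"), index=rng)

# Downsample to 5-minute bins; each bin sums the 300 samples that fall inside it
five_min = ts.resample("5min").sum()
print(five_min)
```

Each 5-minute bin is labeled by its left edge, so the two bins here start at 00:00:00 and 00:05:00 and each holds a sum of 300.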

1. Time zone representation:

  In [111]: rng = pd.date_range('3/6/2012 00:00', periods=5, freq='D')
  In [112]: ts = pd.Series(np.random.randn(len(rng)), rng)
  In [113]: ts
  Out[113]:
  2012-03-06    0.464000
  2012-03-07    0.227371
  2012-03-08   -0.496922
  2012-03-09    0.306389
  2012-03-10   -2.290613
  Freq: D, dtype: float64

  In [114]: ts_utc = ts.tz_localize('UTC')
  In [115]: ts_utc
  Out[115]:
  2012-03-06 00:00:00+00:00    0.464000
  2012-03-07 00:00:00+00:00    0.227371
  2012-03-08 00:00:00+00:00   -0.496922
  2012-03-09 00:00:00+00:00    0.306389
  2012-03-10 00:00:00+00:00   -2.290613
  Freq: D, dtype: float64
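As a minimal sketch of what `tz_localize` does: it attaches a time zone to a naive index without shifting the wall-clock values. The fixed values and variable names below are illustrative assumptions.

```python
import pandas as pd

# A naive (time-zone-unaware) daily index with fixed illustrative values
rng = pd.date_range("2012-03-06", periods=3, freq="D")
ts = pd.Series([1.0, 2.0, 3.0], index=rng)

# Attach UTC to the index; the wall-clock times stay the same
ts_utc = ts.tz_localize("UTC")

print(ts.index.tz)      # None (naive index)
print(ts_utc.index.tz)  # UTC
```

The data values are untouched; only the index gains time zone information.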

2. Time zone conversion:

  In [116]: ts_utc.tz_convert('US/Eastern')
  Out[116]:
  2012-03-05 19:00:00-05:00    0.464000
  2012-03-06 19:00:00-05:00    0.227371
  2012-03-07 19:00:00-05:00   -0.496922
  2012-03-08 19:00:00-05:00    0.306389
  2012-03-09 19:00:00-05:00   -2.290613
  Freq: D, dtype: float64
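Unlike localization, `tz_convert` keeps each instant fixed and only changes how it is displayed. The sketch below illustrates this under a few assumptions: the values are made up, and the zone name `"America/New_York"` is used in place of `'US/Eastern'` (both refer to the same zone in the tz database, which is assumed to be installed).

```python
import pandas as pd

# A UTC-aware daily index with fixed illustrative values
rng = pd.date_range("2012-03-06", periods=3, freq="D", tz="UTC")
ts_utc = pd.Series([1.0, 2.0, 3.0], index=rng)

# Re-express the same instants in US Eastern time
ts_eastern = ts_utc.tz_convert("America/New_York")

# Elementwise comparison of aware timestamps compares the underlying instants
same_instant = bool((ts_utc.index == ts_eastern.index).all())
print(ts_eastern)
print(same_instant)
```

Midnight UTC on 6 March 2012 falls before daylight saving time began that year, so it displays as 19:00 on 5 March with a -05:00 offset, matching the output above.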

3. Converting between time span representations:

  In [117]: rng = pd.date_range('1/1/2012', periods=5, freq='M')
  In [118]: ts = pd.Series(np.random.randn(len(rng)), index=rng)
  In [119]: ts
  Out[119]:
  2012-01-31   -1.134623
  2012-02-29   -1.561819
  2012-03-31   -0.260838
  2012-04-30    0.281957
  2012-05-31    1.523962
  Freq: M, dtype: float64

  In [120]: ps = ts.to_period()
  In [121]: ps
  Out[121]:
  2012-01   -1.134623
  2012-02   -1.561819
  2012-03   -0.260838
  2012-04    0.281957
  2012-05    1.523962
  Freq: M, dtype: float64

  In [122]: ps.to_timestamp()
  Out[122]:
  2012-01-01   -1.134623
  2012-02-01   -1.561819
  2012-03-01   -0.260838
  2012-04-01    0.281957
  2012-05-01    1.523962
  Freq: MS, dtype: float64
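The round trip above can be sketched with explicit dates, which avoids relying on a month-end frequency alias (the spelling of that alias differs between pandas versions). The dates, values, and names `stamps` and `starts` are illustrative assumptions.

```python
import pandas as pd

# Month-end timestamps written out explicitly (illustrative data)
stamps = pd.to_datetime(["2012-01-31", "2012-02-29", "2012-03-31"])
ts = pd.Series([10, 20, 30], index=stamps)

# Timestamps -> monthly periods: each date collapses to the month containing it
ps = ts.to_period("M")

# Periods -> timestamps: by default each period maps to its start
starts = ps.to_timestamp()

print(ps)
print(starts)
```

Note that the round trip is not an identity: month-end dates come back as the first day of each month, as in the session above.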

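4. Converting between periods and timestamps makes some convenient arithmetic functions available. The example below takes quarterly periods whose fiscal year ends in November and converts each one to 9:00 a.m. on the first day of the month following the quarter end.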

  In [123]: prng = pd.period_range('1990Q1', '2000Q4', freq='Q-NOV')
  In [124]: ts = pd.Series(np.random.randn(len(prng)), prng)
  In [125]: ts.index = (prng.asfreq('M', 'e') + 1).asfreq('H', 's') + 9
  In [126]: ts.head()
  Out[126]:
  1990-03-01 09:00   -0.902937
  1990-06-01 09:00    0.068159
  1990-09-01 09:00   -0.057873
  1990-12-01 09:00   -0.368204
  1991-03-01 09:00   -1.144073
  Freq: H, dtype: float64
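The one-line index expression above packs four steps together. A minimal sketch that unpacks them follows; the intermediate names are illustrative assumptions, and the lowercase `'h'` alias is used because newer pandas releases deprecate the uppercase `'H'` spelling.

```python
import pandas as pd

# Quarterly periods whose fiscal year ends in November
prng = pd.period_range("1990Q1", "1990Q4", freq="Q-NOV")

# Step 1: the last month of each quarter (1990Q1 covers Dec 1989 to Feb 1990)
last_month = prng.asfreq("M", how="end")

# Step 2: adding 1 to a monthly PeriodIndex moves forward one month
next_month = last_month + 1

# Step 3: the first hour of that month
first_hour = next_month.asfreq("h", how="start")

# Step 4: adding 9 to an hourly PeriodIndex moves forward nine hours
nine_am = first_hour + 9

print(nine_am)
```

Integer addition on a period index always steps in units of that index's own frequency, which is why the same `+` means "one month" in step 2 and "nine hours" in step 4.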