TSAP(1): Dates & Times

Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission. https://blog.csdn.net/u014281392/article/details/83186883

TSAP : TimeSeries Analysis with Python

Page views of Wikipedia pages in different languages

import pandas as pd
import numpy as np

DatetimeIndex

# DatetimeIndex
date_index = pd.date_range('2016 Jul 1', periods = 5, freq = 'D')
date_index

DatetimeIndex(['2016-07-01', '2016-07-02', '2016-07-03', '2016-07-04',
               '2016-07-05'],
              dtype='datetime64[ns]', freq='D')

Timestamp, Time spans

# TIME STAMPS VS TIME SPANS
pd.Timestamp('2016-07-10')

Timestamp('2016-07-10 00:00:00')

# add hours
pd.Timestamp('2016-07-10 10')

Timestamp('2016-07-10 10:00:00')

# add minutes
pd.Timestamp('2016-07-10 10:15')

Timestamp('2016-07-10 10:15:00')

hint: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-date-components
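The components mentioned in the linked section can be read directly off a Timestamp as attributes. A minimal sketch, reusing the timestamp from above; the variable names are illustrative only:

```python
import pandas as pd

ts = pd.Timestamp('2016-07-10 10:15')

# Individual calendar and clock fields of the timestamp
year, month, day = ts.year, ts.month, ts.day
hour, minute = ts.hour, ts.minute

# dayofweek counts from Monday = 0, so Sunday is 6
weekday_number = ts.dayofweek
weekday_name = ts.day_name()

print(year, month, day, hour, minute)
print(weekday_number, weekday_name)
```

The same attributes are available element-wise on a DatetimeIndex, for example `date_index.dayofweek`.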

Time Spans

# TIME SPANS (the smallest unit present in the date string sets the frequency)
pd.Period('2016-01')

Period('2016-01', 'M')

pd.Period('2016-01-01')

Period('2016-01-01', 'D')

pd.Period('2016-01-01 10')

Period('2016-01-01 10:00', 'H')

pd.Period('2016-01-01 10:10')

Period('2016-01-01 10:10', 'T')

pd.Period('2016-01-01 10:10:10')

Period('2016-01-01 10:10:10', 'S')
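Unlike a Timestamp, a Period stands for a whole interval. The sketch below (variable names are illustrative) shows the interval bounds and how integer arithmetic moves by whole periods:

```python
import pandas as pd

month = pd.Period('2016-01')      # monthly period
day = pd.Period('2016-01-01')     # daily period

# A Period covers an interval; start_time and end_time are its bounds
month_start = month.start_time    # first instant of 2016-01-01
month_end = month.end_time        # last instant of 2016-01-31

# Adding an integer moves by whole periods of the same frequency
next_month = month + 1

# A timestamp can be tested against the bounds of a period
stamp = pd.Timestamp('2016-01-15 12:00')
inside = month.start_time <= stamp <= month.end_time

print(month_start, month_end, next_month, inside)
```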

Time Offsets

# TIME OFFSETS
pd.Timedelta('1 day')

Timedelta('1 days 00:00:00')

pd.Period('2016-01-01 10:10') + pd.Timedelta('1 day')

Period('2016-01-02 10:10', 'T')

pd.Timestamp('2016-01-01 10:10') + pd.Timedelta('1 day')

Timestamp('2016-01-02 10:10:00')

pd.Timestamp('2016-01-01 10:10') + pd.Timedelta('15 ns')

Timestamp('2016-01-01 10:10:00.000000015')
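Timedelta also works in the other direction: subtracting two timestamps produces one. A small sketch with made-up start and end times:

```python
import pandas as pd

start = pd.Timestamp('2016-01-01 10:10')
end = pd.Timestamp('2016-01-02 12:40')

# Subtracting two timestamps yields a Timedelta (1 day, 2 hours, 30 minutes)
elapsed = end - start

# The same duration built from keyword arguments
same = pd.Timedelta(days=1, hours=2, minutes=30)

# Express the duration as a plain number of hours
total_hours = elapsed.total_seconds() / 3600

# Timedeltas can be scaled before being added to a timestamp
two_days_later = start + 2 * pd.Timedelta('1 day')

print(elapsed, total_hours, two_days_later)
```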

Frequency Setting

# Only want business days (Monday to Friday)
pd.period_range('2016-01-01 10:10', freq = 'B', periods = 10)

PeriodIndex(['2016-01-01', '2016-01-04', '2016-01-05', '2016-01-06',
             '2016-01-07', '2016-01-08', '2016-01-11', '2016-01-12',
             '2016-01-13', '2016-01-14'],
            dtype='period[B]', freq='B')

# One period every 25 hours
pd.period_range('2016-01-01 10:10', freq = '25H', periods = 10)

PeriodIndex(['2016-01-01 10:00', '2016-01-02 11:00', '2016-01-03 12:00',
             '2016-01-04 13:00', '2016-01-05 14:00', '2016-01-06 15:00',
             '2016-01-07 16:00', '2016-01-08 17:00', '2016-01-09 18:00',
             '2016-01-10 19:00'],
            dtype='period[25H]', freq='25H')

# '1D1H' (1 day + 1 hour) is the same 25-hour frequency
pd.period_range('2016-01-01 10:10', freq = '1D1H', periods = 10)

PeriodIndex(['2016-01-01 10:00', '2016-01-02 11:00', '2016-01-03 12:00',
             '2016-01-04 13:00', '2016-01-05 14:00', '2016-01-06 15:00',
             '2016-01-07 16:00', '2016-01-08 17:00', '2016-01-09 18:00',
             '2016-01-10 19:00'],
            dtype='period[25H]', freq='25H')

hint: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
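A few more of the offset aliases from the linked table, as a rough sketch. The aliases '2D' and 'W-MON' and the `pd.offsets.BDay` object are standard pandas; the variable names are only illustrative:

```python
import pandas as pd

# Every second calendar day
every_other_day = pd.date_range('2016-07-01', periods=4, freq='2D')

# Weekly, anchored on Mondays (2016-07-01 is a Friday)
mondays = pd.date_range('2016-07-01', periods=3, freq='W-MON')

# An offset can also be used as an object instead of a string alias
next_business_day = pd.Timestamp('2016-07-01') + pd.offsets.BDay(1)

print(list(every_other_day.strftime('%Y-%m-%d')))
print(list(mondays.strftime('%Y-%m-%d')))
print(next_business_day)
```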

date_range()

rng = pd.date_range('2016 Jul 1', periods = 10, freq = 'D')
rng

DatetimeIndex(['2016-07-01', '2016-07-02', '2016-07-03', '2016-07-04',
               '2016-07-05', '2016-07-06', '2016-07-07', '2016-07-08',
               '2016-07-09', '2016-07-10'],
              dtype='datetime64[ns]', freq='D')

pd.Series(range(len(rng)), index = rng)

2016-07-01 0
2016-07-02 1
2016-07-03 2
2016-07-04 3
2016-07-05 4
2016-07-06 5
2016-07-07 6
2016-07-08 7
2016-07-09 8
2016-07-10 9
Freq: D, dtype: int64

# Custom time spans used as a time index
periods = [pd.Period('2016-01'), pd.Period('2016-02'), pd.Period('2016-03')]
ts = pd.Series(np.random.randn(len(periods)), index = periods)
ts

2016-01 2.417748
2016-02 0.934770
2016-03 0.086499
Freq: M, dtype: float64
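For regularly spaced periods, the hand-built list can presumably be replaced by `pd.period_range`, the period counterpart of `pd.date_range`. A minimal sketch comparing the two; `manual` and `generated` are illustrative names:

```python
import pandas as pd

# Hand-built list of monthly periods, as in the example above
manual = pd.PeriodIndex([pd.Period('2016-01'),
                         pd.Period('2016-02'),
                         pd.Period('2016-03')])

# The same index generated from a start point, a length and a frequency
generated = pd.period_range('2016-01', periods=3, freq='M')

ts_generated = pd.Series(range(3), index=generated)

print(manual.equals(generated))
print(ts_generated)
```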

# business day
time_index = pd.date_range('2016 Jul 1', periods=3, freq='B')
pd.Series(np.random.randn(len(time_index)), index = time_index)

2016-07-01 1.166464
2016-07-04 2.334886
2016-07-05 0.702889
Freq: B, dtype: float64

# type of index
type(ts.index)

pandas.core.indexes.period.PeriodIndex

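# '2016-07-10' is spelled out in ISO order; the original '07-10-16' relied on month-first parsing
ts = pd.Series(range(10), pd.date_range('2016-07-10 8:00', periods = 10, freq = 'H'))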
ts

2016-07-10 08:00:00 0
2016-07-10 09:00:00 1
2016-07-10 10:00:00 2
2016-07-10 11:00:00 3
2016-07-10 12:00:00 4
2016-07-10 13:00:00 5
2016-07-10 14:00:00 6
2016-07-10 15:00:00 7
2016-07-10 16:00:00 8
2016-07-10 17:00:00 9
Freq: H, dtype: int64

ts_period = ts.to_period()
ts_period

2016-07-10 08:00 0
2016-07-10 09:00 1
2016-07-10 10:00 2
2016-07-10 11:00 3
2016-07-10 12:00 4
2016-07-10 13:00 5
2016-07-10 14:00 6
2016-07-10 15:00 7
2016-07-10 16:00 8
2016-07-10 17:00 9
Freq: H, dtype: int64
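The conversion also runs in reverse with `to_timestamp`, and `to_period` accepts a coarser target frequency. A small sketch on a daily series (daily rather than hourly data is used here only to keep the output short):

```python
import pandas as pd

idx = pd.date_range('2016-07-10', periods=3, freq='D')
daily = pd.Series(range(3), index=idx)

# DatetimeIndex -> PeriodIndex, frequency inferred from the index
as_periods = daily.to_period()

# PeriodIndex -> DatetimeIndex, using the start of each period by default
back = as_periods.to_timestamp()

# A coarser frequency maps all three days onto the same monthly period
monthly = daily.to_period('M')

print(type(as_periods.index).__name__, type(back.index).__name__)
print(monthly.index.unique())
```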

# Slicing a period index: each bound falls inside an hourly period, so 09:30 rounds down to the 09:00 period
ts_period['2016-07-10 09:30':'2016-07-10 11:45']  

2016-07-10 09:00 1
2016-07-10 10:00 2
2016-07-10 11:00 3
Freq: H, dtype: int64

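# Slicing a timestamp index: only timestamps inside the range are kept, so the 08:30 start effectively rounds up to 09:00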
ts['2016-07-10 08:30':'2016-07-10 11:45']

2016-07-10 09:00:00 1
2016-07-10 10:00:00 2
2016-07-10 11:00:00 3
Freq: H, dtype: int64
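One way to see why the two slices behave differently is sketched below. It assumes `pd.offsets.Hour()` as a stand-in for the 'H' alias (recent pandas releases deprecate the uppercase alias in favor of 'h'); the other names are illustrative:

```python
import pandas as pd

hour = pd.offsets.Hour()   # offset object equivalent to the hourly alias
idx = pd.date_range('2016-07-10 08:00', periods=10, freq=hour)
ts = pd.Series(range(10), index=idx)

# Label slicing on a DatetimeIndex keeps timestamps inside the closed range
by_label = ts['2016-07-10 08:30':'2016-07-10 11:45']

# The same selection written as an explicit boolean mask
start = pd.Timestamp('2016-07-10 08:30')
end = pd.Timestamp('2016-07-10 11:45')
by_mask = ts[(ts.index >= start) & (ts.index <= end)]

# A timestamp inside an hour belongs to the period that starts on that hour,
# which is why a PeriodIndex slice starting at 09:30 still includes 09:00
containing_period = pd.Timestamp('2016-07-10 09:30').to_period(hour)

print(by_label)
print(containing_period)
```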
