ARIMA
Notation
- T-n: a prior or lag time
- T: current time and point of reference
- T+n: future or forecast time
Components
- Level: baseline value
- Trend: often a linear increase or decrease over time
- Seasonality: repeating patterns over time
- Noise: random variation that cannot be explained by the model
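A minimal sketch of how the four components can combine, assuming an additive model (a multiplicative one is equally possible). The function name make_series, its parameters, and the sinusoidal seasonality are illustrative assumptions, not part of the notes above.

```python
import math
import random


def make_series(n, level=10.0, slope=0.5, amplitude=2.0, period=12,
                noise_sd=1.0, seed=0):
    """Build y[t] = level + trend + seasonality + noise (additive model).

    All parameter names and defaults are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    series = []
    for t in range(n):
        trend = slope * t                                       # linear trend
        seasonal = amplitude * math.sin(2 * math.pi * t / period)  # repeating pattern
        noise = rng.gauss(0.0, noise_sd)                        # unexplained part
        series.append(level + trend + seasonal + noise)
    return series


y = make_series(48)  # e.g. four "years" of monthly observations
```

Setting noise_sd=0.0 gives the purely deterministic part (level + trend + seasonality), which is handy for checking a decomposition by eye.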
Some concerns
- Sample size
- Forecasts updated frequently over time, or made once and left static
- Down-sampling or up-sampling
- Frequency
- Outliers
- Missing values
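The resampling and missing-value concerns can be illustrated with a small pure-Python sketch. The helper names are hypothetical, and averaging, linear interpolation, and forward fill are just one choice each; in practice pandas resample would usually do this work.

```python
def downsample_mean(values, factor):
    """Down-sample by averaging consecutive blocks of `factor` observations."""
    usable = len(values) - len(values) % factor  # drop an incomplete last block
    return [sum(values[i:i + factor]) / factor
            for i in range(0, usable, factor)]


def upsample_linear(values, factor):
    """Up-sample by inserting linearly interpolated points between neighbours."""
    out = []
    for a, b in zip(values, values[1:]):
        for step in range(factor):
            out.append(a + (b - a) * step / factor)
    out.append(values[-1])  # keep the final observation
    return out


def fill_missing_forward(values):
    """Replace None with the most recent observed value (forward fill)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


daily = [1, 2, 3, 4, 5]
print(downsample_mean(daily, 2))             # block means; the trailing 5 is dropped
print(upsample_linear([0.0, 2.0, 4.0], 2))   # one interpolated point per gap
print(fill_missing_forward([1, None, None, 4]))
```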
As a supervised machine learning problem
- Sliding window with univariate time series / multivariate time series
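A sketch of the sliding-window idea for a univariate series: the previous n_lags observations become the input features and a later observation becomes the target. The function name sliding_window and its parameters are assumptions made for illustration.

```python
def sliding_window(series, n_lags, horizon=1):
    """Turn a series into (X, y) pairs for supervised learning.

    X[i] holds n_lags consecutive past values; y[i] is the value
    `horizon` steps after the last value in X[i].
    """
    X, y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])     # window of lagged values
        y.append(series[t + horizon - 1])  # value to predict
    return X, y


X, y = sliding_window([1, 2, 3, 4, 5], n_lags=2)
for features, target in zip(X, y):
    print(features, "->", target)
```

For a multivariate series the same loop applies if each element of series is a tuple of variables observed at that time; each window then holds the lagged values of every variable.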
Q&A
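- (Python) What is the difference between autocorrelation_plot and plot_acf / plot_pacf?
- autocorrelation_plot (pandas) and plot_acf (statsmodels) both plot the sample autocorrelation function and differ mainly in presentation; plot_pacf plots the partial autocorrelation function instead, i.e. the correlation at lag k after the effect of shorter lags is removed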
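To make the ACF / PACF distinction concrete, here is a pure-Python sketch of the two quantities those plots display. The helpers acf and pacf below are hypothetical stand-ins written for illustration, not the library functions; the PACF uses the Durbin-Levinson recursion, and library defaults may use a slightly different estimator.

```python
def acf(x, nlags):
    """Sample autocorrelations r_0..r_nlags (biased estimator, divisor n)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(nlags + 1)]


def pacf(x, nlags):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(x, nlags)
    result = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1} from the previous step
    for k in range(1, nlags + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # update the lower-order coefficients, then append phi_kk
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result


x = [1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4]
print(acf(x, 3))
print(pacf(x, 3))
```

At lag 1 the two agree by construction; from lag 2 onward the PACF discounts what shorter lags already explain, which is why it is used to pick the AR order while the ACF is used for the MA order.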
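- Definition of stationarity
- { Yt } is strictly stationary if, for all lags k and all time points t1, t2, …, tn, the joint distribution of Y_{t1}, Y_{t2}, …, Y_{tn} is the same as that of Y_{t1-k}, Y_{t2-k}, …, Y_{tn-k}
- { Yt } is weakly stationary if
- the mean function is constant over time
- gamma_{t, t-k} = gamma_{0, k} for all times t and lags k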
- Methods for checking stationarity
- Line plot
- Split the data into 2 or more contiguous parts, then compare the mean and variance of each part
- Statistical test: ADF (augmented Dickey-Fuller test)
- Explanation
- H0: the time series has a unit root, meaning it is non-stationary
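The split-and-compare check can be sketched with the standard library alone. The helper name split_summary is hypothetical, and the parts are taken as contiguous chunks so that time order is preserved. For the formal test, statsmodels provides adfuller in statsmodels.tsa.stattools; it is not used here so the sketch stays dependency-free.

```python
from statistics import mean, variance


def split_summary(series, n_parts=2):
    """Return (mean, variance) for each contiguous chunk of the series.

    Any remainder at the end that does not fill a whole chunk is ignored.
    """
    size = len(series) // n_parts
    parts = [series[i * size:(i + 1) * size] for i in range(n_parts)]
    return [(mean(p), variance(p)) for p in parts]


trending = list(range(100))   # steadily increasing: the mean shifts between halves
flat = [0, 1] * 50            # same behaviour throughout

print(split_summary(trending))
print(split_summary(flat))
```

Clearly different means or variances across the chunks suggest non-stationarity; similar values are consistent with stationarity but do not prove it.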
- Transforms
- Difference
- Log
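- Useful when the dispersion of the series is positively related to its level, i.e. the larger the value, the larger the fluctuation around it
- The difference of the logs is commonly called the return (log return)
- Box-Cox / power transform
- Estimate lambda
- When lambda = 0, it reduces to the log transform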
- Add Seasonality
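The transforms listed above can be sketched in a few lines. The helper names are hypothetical, and lambda is passed in by hand here rather than estimated; scipy.stats.boxcox can estimate it by maximum likelihood. Both the log and Box-Cox transforms assume strictly positive values.

```python
import math


def difference(series, lag=1):
    """Lag-k difference: y[t] - y[t - lag]. Use lag = season length for seasonal differencing."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def log_returns(prices):
    """Difference of the logs, commonly called the (log) return."""
    return difference([math.log(p) for p in prices])


def box_cox(series, lam):
    """Box-Cox power transform; reduces to the log transform when lam == 0."""
    if lam == 0:
        return [math.log(v) for v in series]
    return [(v ** lam - 1.0) / lam for v in series]


print(difference([1, 3, 6, 10]))          # first differences
print(log_returns([100.0, 110.0, 99.0]))  # log returns of a price series
print(box_cox([1.0, 4.0, 9.0], 0.5))      # square-root-like transform
```

As lam approaches 0, (v ** lam - 1) / lam approaches log(v), which is why the log transform is treated as the lambda = 0 case.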
- How to interpret the key results for ARIMA:
- For each coefficient, the null hypothesis is that the coefficient equals 0, i.e. no association exists between the term and the response. A p-value at or below the chosen significance level (e.g. 0.05) indicates that the term is statistically significant.
- https://support.minitab.com/en-us/minitab/18/help-and-how-to/modeling-statistics/time-series/how-to/arima/interpret-the-results/key-results/?SID=117600
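A rough sketch of the significance check for a single coefficient, assuming a normal approximation for the test statistic (software may use a t distribution instead). The function name and the numbers in the usage lines are hypothetical, not output from a fitted model.

```python
from statistics import NormalDist


def coef_p_value(coef, std_err):
    """Two-sided p-value for H0: coefficient == 0, using a normal approximation."""
    z = coef / std_err
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical estimates and standard errors, for illustration only.
for name, coef, se in [("ar.L1", 0.50, 0.10), ("ma.L1", 0.05, 0.10)]:
    z, p = coef_p_value(coef, se)
    verdict = "significant" if p <= 0.05 else "not significant"
    print(f"{name}: z = {z:.2f}, p = {p:.4f} -> {verdict} at the 5% level")
```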
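- Residual tests
- Time series plot of residuals -> check whether any trend remains
- Q-Q plot -> points should lie close to a straight line
- ACF plot of residuals -> no significant autocorrelation should remain
- Ljung-Box test on residuals
- Tests the autocorrelation coefficients jointly as a group, using the statistic Q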
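The Ljung-Box statistic can be written out directly: Q = n (n + 2) * sum over k = 1..h of r_k^2 / (n - k), where r_k is the lag-k sample autocorrelation of the residuals. The sketch below computes only Q; the helper name is hypothetical, and the p-value step (comparing Q with a chi-square distribution) is left to a library such as statsmodels, which offers acorr_ljungbox in statsmodels.stats.diagnostic.

```python
def ljung_box_q(residuals, h):
    """Ljung-Box Q statistic over lags 1..h.

    Under H0 (no autocorrelation) Q is approximately chi-square distributed;
    for ARMA residuals the degrees of freedom are usually reduced by the
    number of fitted AR and MA terms.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, h + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0  # lag-k autocorrelation
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# A strongly alternating series is far from white noise, so Q should be large.
alternating = [1, -1] * 10
print(ljung_box_q(alternating, h=1))
```

A large Q relative to the chi-square critical value (3.841 at the 5% level with 1 degree of freedom) means the residuals still carry autocorrelation, so the model has not captured all the structure.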