Abstract
In previous chapters, we mainly discussed copula models for bivariate/multivariate random variables. Now we ask two other questions that commonly arise in hydrology and water resources engineering. Can we use the stochastic approach to predict streamflow at a downstream location from streamflow at an upstream location? If streamflow is time dependent, it cannot be treated as a random variable in the way it is in frequency analysis. Can we model the temporal dependence of an at-site streamflow sequence (e.g., monthly streamflow) more robustly than with the classical time series and Markov modeling approaches (e.g., by modeling the nonlinearity of the time series freely)? This chapter addresses these questions and introduces how to model a time series with the copula approach.
9.1 General Concept of Time Series Modeling
In this section, we briefly introduce time series modeling. The reader may refer to Box et al. (2008) for a complete discussion. A time series (more specifically, one with even time intervals) may be stationary, nonstationary, or long memory; linear or nonlinear. Following Box et al. (2008), a general form of a linear time series $Y_t$ may be written as follows:

$$\phi(B)(1 - B)^d Y_t = c + \theta(B)\, a_t \quad (9.1)$$

where $Y_t$ is the time series; $B$ is the backward-shift operator; $d$ is the differencing order, with $d = 0$ for a stationary series, $d$ a positive integer (usually $d = 1$ or $2$) for a nonstationary series, and $d \in (0, 1)$ for a long-memory series; $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$ is the autoregressive term; $\theta(B) = 1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q$ is the moving average term; $c$ is a constant; and $a_t$ is the innovation (i.e., white noise, more specifically white Gaussian noise).
The classic time series model given in Equation (9.1) may be identified with the following procedures:
1. Graph the sample autocorrelation function (ACF) and partial autocorrelation function (PACF) for the time series $\{X_t : t = 1, \ldots, n\}$ (see the code sketch following Figure 9.1).
2. Identify the possible model order from the sample ACF and PACF, based on the visual evidence:
i. If the sample ACF falls within the 95% confidence bound quickly, the time series $X_t$ may be considered stationary (shown in Figure 9.1(a)); otherwise, the time series is nonstationary or long memory (Figure 9.1(b)), and differencing is needed to convert the nonstationary time series into a stationary one (Figure 9.1(c)).
ii. For the stationary time series, the model order may then be estimated from the sample ACF and PACF as follows: (a) if the ACF has a cutoff point while the PACF falls gradually within the 95% confidence bound, we have a moving average (MA) time series model (Figure 9.2(a)); (b) if the PACF has a cutoff point while the ACF falls gradually within the 95% confidence bound, we have an autoregressive (AR) time series model (Figure 9.2(b)); and (c) if both the ACF and PACF fall gradually within the 95% confidence bound, we have an autoregressive moving average (ARMA) time series model (Figure 9.2(c)).
3. Estimate the model parameters for the stationary time series under the assumption on the model residual: $a_t \sim N(0, \sigma_a^2)$.
Figure 9.1 Sample autocorrelation function illustration plots.
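As a minimal sketch of steps 1 and 2 (assuming the MATLAB Econometrics Toolbox; the column vector flow is a hypothetical placeholder for the observed series), the sample ACF and PACF may be plotted as follows:

% Plot sample ACF and PACF with 95% confidence bounds.
% 'flow' is a hypothetical column vector holding the observed series.
figure;
subplot(2,1,1); autocorr(flow); % sample ACF: quick decay into the bound suggests stationarity
subplot(2,1,2); parcorr(flow); % sample PACF: a cutoff at lag p suggests an AR(p) term

A cutoff in the ACF at lag q with a gradually decaying PACF would instead suggest an MA(q) term, per step 2(ii).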
With the preceding initial introduction, we now further illustrate Equation (9.1) using streamflow as an example. The differencing order $d = 0$ most likely applies to a watershed before it experiences climate change and/or alteration by human activities; $d = 1$ most likely applies to a watershed subject to these impacts; and $d \in (0, 0.5)$ usually applies to reservoir operations. In other words, the original stationary streamflow series (or the stationary streamflow series after necessary differencing) at time $t$ depends on the values at the previous $p$ time steps (i.e., on the streamflow at $t-1, t-2, \ldots, t-p$). The constant $c$ relates to the long-term average of the stationary series, given in Equation (9.2b). $\theta(B) = 1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q$ represents the moving average term. Substituting $w_t = (1 - B)^d Y_t$, so that $w_t$ is the stationary time series after necessary differencing, Equation (9.1) may be rewritten as follows:

$$w_t = c + \phi_1 w_{t-1} + \cdots + \phi_p w_{t-p} + a_t + \theta_1 a_{t-1} + \cdots + \theta_q a_{t-q} \quad (9.2)$$
Taking the expectation of Equation (9.2), we have the following:

$$E(w_t) = c + \phi_1 E(w_{t-1}) + \cdots + \phi_p E(w_{t-p}) + E(a_t) + \theta_1 E(a_{t-1}) + \cdots + \theta_q E(a_{t-q}) \quad (9.2a)$$
Substituting $E(w_t) = E(w_{t-i}),\ i = 1, \ldots, p$, and $E(a_t) = E(a_{t-j}) = 0$ for the stationary time series into Equation (9.2a), we have the following:

$$E(w_t) = \frac{c}{1 - \phi_1 - \phi_2 - \cdots - \phi_p} \quad (9.2b)$$
To further evaluate whether differencing is necessary, two statistical tests can help make a reasonable and formal decision. The Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test (Kwiatkowski et al., 1992) takes the null hypothesis that the time series is stationary, while the augmented Dickey–Fuller (ADF) test (Dickey and Fuller, 1979) takes the null hypothesis that the time series is a unit root process (or, simply, nonstationary). The KPSS and ADF tests are complementary to each other (Arya and Zhang, 2015) as follows (a code sketch follows the list):
i. Time series stationary: the KPSS null hypothesis (stationarity) is not rejected, while the ADF null hypothesis (unit root) is rejected.
ii. Time series nonstationary: the KPSS null hypothesis is rejected, while the ADF null hypothesis is not rejected.
iii. Time series belonging to a long-memory process: both the KPSS and ADF null hypotheses are rejected. In this case, the Hurst coefficient (Hurst, 1951) is applied to evaluate the necessary fractional differencing order.
iv. Not enough evidence to decide whether the time series is stationary or nonstationary: neither the KPSS nor the ADF null hypothesis is rejected.
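As a hedged sketch of this decision rule (assuming the Econometrics Toolbox functions kpsstest and adftest; flow again denotes the hypothetical series):

% H0 of kpsstest: the series is stationary; h = 1 rejects H0.
% H0 of adftest: the series contains a unit root; h = 1 rejects H0.
hKPSS = kpsstest(flow);
hADF = adftest(flow);
if ~hKPSS && hADF
    disp('Case i: stationary.');
elseif hKPSS && ~hADF
    disp('Case ii: nonstationary; difference the series.');
elseif hKPSS && hADF
    disp('Case iii: possible long-memory process; evaluate the Hurst coefficient.');
else
    disp('Case iv: inconclusive.');
end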
Furthermore, heteroscedasticity (i.e., changing variance) may exist in the time series; as a simple illustration, such a series tends to have a large value following a large value and a small value following a small value. For a time series with heteroscedasticity, the model error of Equation (9.1) needs to be further described using (Generalized) Autoregressive Conditional Heteroscedastic ((G)ARCH) models. A (G)ARCH model indicates a second-order dependent time series; in other words, the conditional variability depends on the past history of the time series. An ARCH model can be written as follows:

$$h_t = w_0 + \sum_{i=1}^{s} w_i a_{t-i}^2 \quad (9.3)$$
and a generalized ARCH (GARCH) model (Bollerslev, 1986) can be written as follows:

$$h_t = w_0 + \sum_{i=1}^{s} w_i a_{t-i}^2 + \sum_{j=1}^{r} q_j h_{t-j} \quad (9.4)$$
In Equations (9.3) and (9.4), $h_t$ denotes the conditional variance (variability) of $a_t$ given $a_{t-1}, a_{t-2}, \ldots$; $w_0 > 0$, $w_i \geq 0,\ i = 1, \ldots, s$, and $q_j \geq 0,\ j = 1, \ldots, r$; $w_i$ denotes the coefficients of the ARCH effects (i.e., for the correlated squared model errors $a_{t-i}^2$); and $q_j$ denotes the coefficients of the correlated conditional variances $h_{t-j}$.
In addition, there exists a relation among the conditional variance $h_t$, the innovation (i.e., the model residual $a_t$), and standard white Gaussian noise $e_t$, $e_t \sim N(0, 1)$, as follows:

$$a_t = \sqrt{h_t}\, e_t \quad (9.5)$$
The parameters of the time series model may be estimated with the use of the maximum likelihood method.
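For instance, a GARCH(1,1) special case of Equation (9.4) may be fitted by maximum likelihood as a minimal sketch (assuming the Econometrics Toolbox; res is a hypothetical vector of model residuals):

% Fit a GARCH(1,1): h_t = w0 + q1*h_{t-1} + w1*a_{t-1}^2.
mdl = garch(1,1); % garch(P,Q): P GARCH lags (h terms), Q ARCH lags (a^2 terms)
estMdl = estimate(mdl,res); % maximum likelihood estimation of w0, w1, q1
ht = infer(estMdl,res); % inferred conditional variances h_t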
Example 9.1 Fit an AR(1) model to the annual flow data listed in Table 9.1. Plot the original time series and the residual sequence. Is the residual sequence a white Gaussian noise?
Table 9.1. Annual flow data (cfs).

Year | Flow (cfs) | Year | Flow (cfs)
---|---|---|---
1960 | 517.9 | 1988 | 280.8 |
1961 | 367.8 | 1989 | 500.9 |
1962 | 252.2 | 1990 | 528.9 |
1963 | 258.6 | 1991 | 602.8 |
1964 | 281.3 | 1992 | 386 |
1965 | 308.4 | 1993 | 627.2 |
1966 | 317.3 | 1994 | 520 |
1967 | 372.4 | 1995 | 345.3 |
1968 | 349.1 | 1996 | 575 |
1969 | 504.3 | 1997 | 663.2 |
1970 | 330.9 | 1998 | 412.4 |
1971 | 413.3 | 1999 | 311 |
1972 | 461.7 | 2000 | 385.3 |
1973 | 567 | 2001 | 299.3 |
1974 | 500.6 | 2002 | 417.5 |
1975 | 654.9 | 2003 | 578.5 |
1976 | 550.3 | 2004 | 715.8 |
1977 | 401 | 2005 | 676.3 |
1978 | 593.8 | 2006 | 507 |
1979 | 508 | 2007 | 736 |
1980 | 543.5 | 2008 | 677.4 |
1981 | 442.3 | 2009 | 508.9 |
1982 | 477.3 | 2010 | 418.3 |
1983 | 473 | 2011 | 721.2 |
1984 | 548.3 | 2012 | 609 |
1985 | 467.7 | 2013 | 536.3 |
1986 | 539.3 | 2014 | 608.1 |
1987 | 431.7 | 2015 | 552.4 |
Solution: To fit an AR(1) model to the time series listed in Table 9.1, we can simply use the MATLAB functions as follows:
1. Assign TS as the time series listed in Table 9.1.
2. Set up the AR(1) model to be fitted using arima:
model=arima(1,0,0); % ARIMA (P, D, Q): P=1, AR term; D=0, stationary series; Q=0, MA term.
3. Apply the estimate function (through MLE):
[param,Var,LogL]=estimate(model,TS);
% param: the fitted model (an arima object containing the estimated parameters).
% Var: variance–covariance matrix of the estimated parameters. Here, we have three parameters: the constant c, the autoregressive coefficient, and the variance of the model residual.
% LogL: the log-likelihood value of the objective function after optimization.
Using the preceding functions, we get the results listed in Table 9.2.
Table 9.2. Estimated parameters of the ARIMA(1,0,0) (i.e., AR(1)) model; conditional probability distribution: Gaussian.

Parameter | Value | Standard error | T-statistic
---|---|---|---
Constant (cfs) | 255.13 | 71.80 | 3.55
AR{1} | 0.475 | 0.15 | 3.18
Variance (cfs²) | 12751.6 | 3141.41 | 4.06

LogL = –434.18
The fitted AR(1) time series model may now be written as follows:

$$Y_t = 255.13 + 0.475\, Y_{t-1} + a_t, \quad a_t \sim N(0, 12751.6)$$
4. Apply the infer function to compute the model residual sequence listed in Table 9.3:
res=infer(param,TS);
Table 9.3. Fitted model residuals (cfs).

Year | Residual (cfs) | Year | Residual (cfs)
---|---|---|---
1960 | 24.90 | 1988 | –179.29 |
1961 | –133.22 | 1989 | 112.45 |
1962 | –177.55 | 1990 | 35.95 |
1963 | –116.27 | 1991 | 96.56 |
1964 | –96.61 | 1992 | –155.32 |
1965 | –80.29 | 1993 | 188.81 |
1966 | –84.25 | 1994 | –32.91 |
1967 | –33.38 | 1995 | –156.71 |
1968 | –82.84 | 1996 | 155.93 |
1969 | 83.43 | 1997 | 135.07 |
1970 | –163.66 | 1998 | –157.60 |
1971 | 1.07 | 1999 | –139.93 |
1972 | 10.34 | 2000 | –17.49 |
1973 | 92.67 | 2001 | –138.76 |
1974 | –23.73 | 2002 | 20.27 |
1975 | 162.10 | 2003 | 125.15 |
1976 | –15.76 | 2004 | 186.01 |
1977 | –115.40 | 2005 | 81.33 |
1978 | 148.28 | 2006 | –69.22 |
1979 | –29.05 | 2007 | 240.16 |
1980 | 47.18 | 2008 | 72.84 |
1981 | –70.87 | 2009 | –67.84 |
1982 | 12.18 | 2010 | –78.44 |
1983 | –8.74 | 2011 | 267.47 |
1984 | 68.60 | 2012 | 11.46 |
1985 | –47.75 | 2013 | –7.97 |
1986 | 62.12 | 2014 | 98.35 |
1987 | –79.48 | 2015 | 8.56 |
Figure 9.3 plots the original time series, the fitted model residuals, and the residual histogram compared with the hypothesized white Gaussian noise density. From the histogram, the hypothesized distribution appears to represent the distribution of the fitted model residuals properly. To formally assess whether the fitted model residuals are white Gaussian noise, we apply the Kolmogorov–Smirnov (KS) test, which evaluates the maximum distance between the empirical and parametric CDFs. Its test statistic $D_n$ can be written as follows:

$$D_n = \max_x \left| F_n(x) - F(x) \right|$$

where $F_n(x)$ is the empirical CDF and $F(x)$ is the hypothesized parametric CDF.
The null hypothesis ($H_0$) is that the fitted model residuals follow $N(0, \sigma_a^2)$, i.e., $N(0, 12751.6)$. Under this null hypothesis, we can either use the parametric bootstrap method or the MATLAB function kstest directly. Here we simply use kstest. Table 9.4 lists the empirical and parametric CDFs for the fitted model residuals, computed as sketched below.
Figure 9.3 Original time series, fitted model residual plots, and histogram.
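The two CDF columns of Table 9.4 may be computed as follows (a sketch assuming the Weibull plotting position for the empirical CDF; res and param come from the code above):

n = numel(res);
empCDF = tiedrank(res)/(n+1); % empirical CDF (Weibull plotting position, assumed)
parCDF = normcdf(res,0,param.Variance^0.5); % parametric CDF under N(0, 12751.6)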
Table 9.4. Empirical and parametric CDFs of the fitted model residuals.

Residual | Parametric CDF | Empirical CDF | Residual | Parametric CDF | Empirical CDF
---|---|---|---|---|---
24.90 | 0.59 | 0.63 | –179.29 | 0.06 | 0.02 |
–133.22 | 0.12 | 0.16 | 112.45 | 0.84 | 0.82 |
–177.55 | 0.06 | 0.04 | 35.95 | 0.62 | 0.65 |
–116.27 | 0.15 | 0.18 | 96.56 | 0.80 | 0.79 |
–96.61 | 0.20 | 0.21 | –155.32 | 0.08 | 0.11 |
–80.29 | 0.24 | 0.26 | 188.81 | 0.95 | 0.95 |
–84.25 | 0.23 | 0.23 | –32.91 | 0.39 | 0.40 |
–33.38 | 0.38 | 0.39 | –156.71 | 0.08 | 0.09 |
–82.84 | 0.23 | 0.25 | 155.93 | 0.92 | 0.89 |
83.43 | 0.77 | 0.75 | 135.07 | 0.88 | 0.86 |
–163.66 | 0.07 | 0.05 | –157.60 | 0.08 | 0.07 |
1.07 | 0.50 | 0.53 | –139.93 | 0.11 | 0.12 |
10.34 | 0.54 | 0.56 | –17.49 | 0.44 | 0.46 |
92.67 | 0.79 | 0.77 | –138.76 | 0.11 | 0.14 |
–23.73 | 0.42 | 0.44 | 20.27 | 0.57 | 0.61 |
162.10 | 0.92 | 0.91 | 125.15 | 0.87 | 0.84 |
–15.76 | 0.44 | 0.47 | 186.01 | 0.95 | 0.93 |
–115.40 | 0.15 | 0.19 | 81.33 | 0.76 | 0.74 |
148.28 | 0.91 | 0.88 | –69.22 | 0.27 | 0.33 |
–29.05 | 0.40 | 0.42 | 240.16 | 0.98 | 0.96 |
47.18 | 0.66 | 0.67 | 72.84 | 0.74 | 0.72 |
–70.87 | 0.27 | 0.32 | –67.84 | 0.27 | 0.35 |
12.18 | 0.54 | 0.60 | –78.44 | 0.24 | 0.30 |
–8.74 | 0.47 | 0.49 | 267.47 | 0.99 | 0.98 |
68.60 | 0.73 | 0.70 | 11.46 | 0.54 | 0.58 |
–47.75 | 0.34 | 0.37 | –7.97 | 0.47 | 0.51 |
62.12 | 0.71 | 0.68 | 98.35 | 0.81 | 0.81 |
–79.48 | 0.24 | 0.28 | 8.56 | 0.53 | 0.54 |
Applying kstest as follows
[H,Pvalue,stat]=kstest(res,'CDF',[res,normcdf(res,0,param.Variance^0.5)],'Alpha',0.05)
% Note: older MATLAB releases accept the positional form kstest(res,[res,normcdf(...)],0.05).
we have H = 0, Pvalue = 0.803, and test statistic = 0.083. Since the null hypothesis is not rejected (H = 0) and the P-value exceeds 0.05, the fitted model residuals may be regarded as white Gaussian noise.
9.2 Spatially Dependent Bivariate or Multivariate Time Series
In stock markets, the stock values among major exchanges (e.g., London, Hong Kong, New York, Tokyo) tend to impact one another. In other words, these major exchanges tend to follow each other. In the field of hydrology and water resources engineering, a similar tendency (or spatial dependence) also exists. For example, streamflow (or flood) at a downstream location is generally positively dependent on that at an upstream location. In this section, we show how to evaluate spatially dependent time series.
As discussed in the previous chapters, copulas are applied to bivariate/multivariate random variables whose observations are serially independent. Thus, to employ copula theory for a bivariate (or multivariate) time-dependent sequence (e.g., the spatial dependence of bivariate/multivariate time series), we first need to investigate each individual time series and fit it with a proper model (e.g., the equations in Section 9.1). Three steps are needed for bivariate/multivariate time series analysis using copulas (a code sketch follows the list):
1. Investigate each univariate time series separately, including the assessment of stationarity, time series model identification, and estimation of model parameters;
2. Compute the model residuals from the fitted univariate time series models; and
3. Apply copulas to the model residuals obtained from step 2.
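A minimal sketch of these three steps (assuming the Econometrics and Statistics Toolboxes, hypothetical equal-length series ts1 and ts2, and illustrative model orders):

m1 = estimate(arima(1,0,1),ts1); % step 1: fit a univariate model to each series
m2 = estimate(arima(2,0,0),ts2);
r1 = infer(m1,ts1); % step 2: compute the model residuals
r2 = infer(m2,ts2);
U = [tiedrank(r1),tiedrank(r2)]/(numel(r1)+1); % pseudo-observations in (0,1)
theta = copulafit('Frank',U); % step 3: fit a candidate copula (pseudo-MLE)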
In what follows, we will use a simple example to illustrate how to model a bivariate time series.

Example 9.2 Model the spatially dependent bivariate time series TS1 and TS2 listed in Table 9.5.
Table 9.5. Bivariate time series data.

No. | TS1 | TS2 | No. | TS1 | TS2
---|---|---|---|---|---
1 | 60.25 | 476.35 | 26 | 45.24 | 475.13 |
2 | 55.87 | 475.48 | 27 | 11.28 | 475.01 |
3 | 76.18 | 476.02 | 28 | 30.96 | 475.44 |
4 | 74.84 | 476.75 | 29 | 78.03 | 475.49 |
5 | 84.79 | 476.89 | 30 | 97.69 | 475.91 |
6 | 68.39 | 477.27 | 31 | 90.31 | 476.52 |
7 | 73.14 | 477.51 | 32 | 60.44 | 475.51 |
8 | 63.01 | 476.50 | 33 | 52.28 | 474.24 |
9 | 63.28 | 475.87 | 34 | 35.00 | 473.57 |
10 | 80.17 | 475.74 | 35 | 38.98 | 473.33 |
11 | 72.42 | 475.29 | 36 | 49.74 | 473.42 |
12 | 57.82 | 476.75 | 37 | 75.43 | 473.87 |
13 | 86.49 | 476.60 | 38 | 81.49 | 471.65 |
14 | 99.88 | 475.69 | 39 | 56.48 | 470.84 |
15 | 75.34 | 474.94 | 40 | 51.48 | 472.87 |
16 | 66.07 | 473.83 | 41 | 50.26 | 473.17 |
17 | 56.61 | 473.63 | 42 | 52.20 | 474.16 |
18 | 69.76 | 473.84 | 43 | 76.67 | 475.32 |
19 | 94.75 | 475.58 | 44 | 83.07 | 475.63 |
20 | 62.69 | 476.17 | 45 | 79.80 | 475.90 |
21 | 64.71 | 475.11 | 46 | 81.26 | 474.64 |
22 | 88.54 | 476.29 | 47 | 73.69 | 474.19 |
23 | 84.75 | 478.09 | 48 | 72.70 | 476.01 |
24 | 94.86 | 478.04 | 49 | 77.68 | 476.85 |
25 | 81.05 | 476.20 | 50 | 89.06 | 476.79 |
Solution: Applying a procedure similar to that of Example 9.1, time series TS1 and TS2 are fitted with ARIMA(1,0,1) (i.e., ARMA(1,1)) and ARIMA(2,0,0) (i.e., AR(2)) models, respectively. The fitted parameters are given in Table 9.6. Figure 9.4 plots the original time series and the empirical frequencies (histograms) of the model residuals.
Table 9.6. Estimated parameters of the univariate time series models.

TS1: ARIMA(1,0,1) model; conditional probability distribution: Gaussian

Parameter | Value | Standard error | T-statistic
---|---|---|---
Constant | 55.24 | 13.18 | 4.19
AR{1} | 0.21 | 0.19 | 1.09
MA{1} | 0.66 | 0.15 | 4.34
Variance | 185.21 | 42.53 | 4.36

TS2: ARIMA(2,0,0) model; conditional probability distribution: Gaussian

Parameter | Value | Standard error | T-statistic
---|---|---|---
Constant | 125.18 | 38.95 | 3.16
AR{1} | 1.1 | 0.15 | 7.15
AR{2} | –0.36 | 0.16 | –2.3
Variance | 0.7 | 0.12 | 5.63
Figure 9.4 Plots of original time series and histograms of the fitted model residuals.
Acceptance by the KS test indicates that the fitted model residuals may be treated as white noise.
We can write the residuals as functions of the time series, using the fitted parameters in Table 9.6, as follows:

$$res_t^{TS1} = TS1_t - 55.24 - 0.21\, TS1_{t-1} - 0.66\, res_{t-1}^{TS1}; \qquad res_t^{TS2} = TS2_t - 125.18 - 1.1\, TS2_{t-1} + 0.36\, TS2_{t-2} \quad (9.8)$$
Now, we may apply a copula to $(res_t^{TS1}, res_t^{TS2})$, which may be considered independent random variables. Figure 9.5 shows the scatter plot of these random variables.
Figure 9.5 Scatter plot of the fitted model residuals $res_t^{TS1}$ and $res_t^{TS2}$.
From Figure 9.5, it is seen that the fitted model residuals are positively correlated. Using Equations (3.70) and (3.73), the empirical rank-based correlation coefficients, Spearman's $\rho$ and Kendall's $\tau$, are computed as $\rho_n \approx 0.28$ and $\tau_n \approx 0.16$. To this end, we apply the Archimedean copulas (i.e., the Gumbel–Hougaard, Clayton, and Frank copulas presented in Chapter 4) and the meta-elliptical copulas (i.e., the Gaussian and Student t copulas presented in Chapter 7). Similar to the discussion in previous chapters, we apply the pseudo- (i.e., semiparametric) and two-stage maximum likelihood methods for parameter estimation (a code sketch follows). Table 9.7 lists the empirical and parametric CDFs of the fitted model residuals. Table 9.8 lists the parameters and the corresponding log-likelihood values. The likelihood values in Table 9.8 suggest that the Frank copula attains the best overall performance, followed by the Gaussian copula. Figure 9.6 compares the CDFs of the model residuals to the simulated random variates from the fitted parametric copulas. Figure 9.7 compares the model residuals to those computed from the two-stage estimation. The comparison indicates similar performance between the Frank and Gaussian copulas.
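As a hedged sketch of this estimation step (r1 and r2 denote the fitted model residuals; Statistics Toolbox assumed), the rank-based coefficients and the candidate copulas may be obtained as follows:

tau = corr(r1,r2,'type','Kendall'); % Kendall's tau (about 0.16 here)
rho = corr(r1,r2,'type','Spearman'); % Spearman's rho (about 0.28 here)
U = [tiedrank(r1),tiedrank(r2)]/(numel(r1)+1); % pseudo-observations
thetaC = copulafit('Clayton',U); % Archimedean candidates
thetaF = copulafit('Frank',U);
thetaG = copulafit('Gumbel',U);
rhoN = copulafit('Gaussian',U); % meta-elliptical candidates
[rhoT,nuT] = copulafit('t',U);

The log-likelihood of each fitted copula (e.g., sum(log(copulapdf('Frank',U,thetaF)))) can then be compared, as summarized in Table 9.8.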
Table 9.7. Empirical and parametric CDFs of the fitted model residuals.

Residual TS1 | Residual TS2 | Empirical CDF (TS1) | Empirical CDF (TS2) | Parametric CDF (TS1) | Parametric CDF (TS2)
---|---|---|---|---|---
–9.03 | 0.13 | 0.25 | 0.59 | 0.25 | 0.56 |
–5.91 | –0.59 | 0.31 | 0.20 | 0.33 | 0.24 |
13.25 | 0.89 | 0.78 | 0.86 | 0.83 | 0.86 |
–4.93 | 0.72 | 0.39 | 0.82 | 0.36 | 0.80 |
17.28 | 0.24 | 0.92 | 0.67 | 0.90 | 0.61 |
–15.82 | 0.73 | 0.14 | 0.84 | 0.12 | 0.81 |
14.15 | 0.60 | 0.80 | 0.80 | 0.85 | 0.76 |
–16.72 | –0.53 | 0.12 | 0.24 | 0.11 | 0.26 |
5.99 | 0.03 | 0.67 | 0.57 | 0.67 | 0.51 |
7.86 | 0.23 | 0.69 | 0.65 | 0.72 | 0.61 |
–4.62 | –0.29 | 0.41 | 0.35 | 0.37 | 0.36 |
–9.39 | 1.60 | 0.24 | 0.94 | 0.24 | 0.97 |
25.46 | –0.31 | 0.98 | 0.33 | 0.97 | 0.36 |
9.94 | –0.53 | 0.71 | 0.22 | 0.77 | 0.26 |
–7.15 | –0.33 | 0.27 | 0.31 | 0.30 | 0.35 |
–0.08 | –0.94 | 0.51 | 0.14 | 0.50 | 0.13 |
–12.28 | –0.19 | 0.18 | 0.37 | 0.18 | 0.41 |
10.88 | –0.16 | 0.73 | 0.43 | 0.79 | 0.42 |
17.88 | 1.28 | 0.94 | 0.90 | 0.91 | 0.94 |
–23.98 | 0.02 | 0.06 | 0.53 | 0.04 | 0.51 |
12.28 | –1.06 | 0.76 | 0.10 | 0.82 | 0.10 |
11.79 | 1.51 | 0.75 | 0.92 | 0.81 | 0.96 |
3.38 | 1.62 | 0.55 | 0.96 | 0.60 | 0.97 |
19.83 | 0.01 | 0.96 | 0.51 | 0.93 | 0.50 |
–6.93 | –1.12 | 0.29 | 0.06 | 0.31 | 0.09 |
–22.24 | –0.19 | 0.08 | 0.39 | 0.05 | 0.41 |
–38.68 | 0.22 | 0.02 | 0.61 | 0.00 | 0.60 |
–1.12 | 0.38 | 0.47 | 0.73 | 0.47 | 0.68 |
17.11 | –0.08 | 0.90 | 0.47 | 0.90 | 0.46 |
14.99 | 0.45 | 0.86 | 0.75 | 0.86 | 0.70 |
4.93 | 0.60 | 0.59 | 0.78 | 0.64 | 0.76 |
–16.78 | –0.93 | 0.10 | 0.16 | 0.11 | 0.13 |
–4.43 | –0.85 | 0.43 | 0.18 | 0.37 | 0.15 |
–28.16 | –0.51 | 0.04 | 0.25 | 0.02 | 0.27 |
–4.96 | –0.45 | 0.37 | 0.27 | 0.36 | 0.30 |
–10.31 | –0.35 | 0.22 | 0.29 | 0.22 | 0.34 |
16.67 | –0.08 | 0.88 | 0.49 | 0.89 | 0.46 |
–0.37 | –2.76 | 0.49 | 0.02 | 0.49 | 0.00 |
–15.41 | –0.97 | 0.16 | 0.12 | 0.13 | 0.12 |
–5.32 | 1.16 | 0.35 | 0.88 | 0.35 | 0.92 |
–12.14 | –1.08 | 0.20 | 0.08 | 0.19 | 0.10 |
–5.46 | 0.32 | 0.33 | 0.69 | 0.34 | 0.65 |
14.21 | 0.49 | 0.82 | 0.76 | 0.85 | 0.72 |
2.57 | –0.11 | 0.53 | 0.45 | 0.57 | 0.45 |
5.65 | 0.23 | 0.63 | 0.63 | 0.66 | 0.61 |
5.76 | –1.21 | 0.65 | 0.04 | 0.66 | 0.07 |
–2.19 | –0.18 | 0.45 | 0.41 | 0.44 | 0.42 |
3.63 | 1.68 | 0.57 | 0.98 | 0.61 | 0.98 |
4.98 | 0.36 | 0.61 | 0.71 | 0.64 | 0.67 |
14.43 | 0.03 | 0.84 | 0.55 | 0.86 | 0.51 |
Table 9.8. Estimated copula parameters and the corresponding log-likelihoods.

Copula | Parameter (pseudo-MLE) | Log-likelihood | Parameter (two-stage MLE) | Log-likelihood
---|---|---|---|---
Gumbel–Hougaard | 1.15 | 0.75 | 1.13 | 0.61 |
Clayton | 0.31 | 0.88 | 0.08 | 0.15 |
Frank | 1.63 | 1.79 | 1.56 | 1.69 |
Gaussian | 0.26 | 1.30 | 0.20 | 0.99 |
Student t | [0.26, 1.38 × 10⁷] | 1.30 | [0.20, 1.38 × 10⁷] | 0.99
Figure 9.6 Comparison of simulated random variates to pseudo-observations and fitted parametric marginals.
Figure 9.7 Comparison of fitted model residuals to those computed from copula with parameter estimated using two-stage MLE.
From this example, we also note that the rank-based correlation coefficient of the model residuals may differ from that of the original time series. In this example, we have $\tau_n \approx 0.16$ for the model residuals, while $\tau_n \approx 0.35$ for the original time series. The reduction in the degree of association for the model residuals may be due to the autoregressive component of the time series models.
From Equation (9.8), we can also reconstruct the time series with the use of random variates simulated from the fitted copula. Here, we again use the copulas estimated with the two-stage MLE as an example. Additionally, we use the last two values of the time series (i.e., Table 9.5) as initial estimates. Figure 9.8 plots the reconstructed time series, which reasonably follows the same pattern as the original time series; a code sketch follows Figure 9.8.
Figure 9.8 Reconstructed time series using the copula estimated with two-stage MLE.
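A minimal sketch of this reconstruction (assuming the fitted models m1 and m2 and the two-stage Frank copula parameter thetaF from earlier; copula-simulated uniforms are back-transformed to Gaussian innovations and filtered through the fitted models):

N = 50; % length of the reconstructed series
U = copularnd('Frank',thetaF,N); % dependent uniform pairs from the fitted copula
e1 = norminv(U(:,1),0,sqrt(m1.Variance)); % innovations for TS1
e2 = norminv(U(:,2),0,sqrt(m2.Variance)); % innovations for TS2
Y1 = filter(m1,e1,'Y0',[77.68;89.06]); % reconstructed TS1; 'Y0' presample set to the last observed values (Table 9.5)
Y2 = filter(m2,e2,'Y0',[476.85;476.79]); % reconstructed TS2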
Now we have explained how to study the spatial dependence of sequences that are also time dependent. In the previous example, we studied the time-dependent sequences and their spatial dependence separately: we first built a time series (i.e., autoregressive and moving average) model for each univariate time-dependent sequence, and we then built a copula model on the residuals (also called innovations) of the time series models, since the residuals are now random variables.
Besides the bivariate/multivariate spatial-temporal dependence discussed previously (i.e., spatial dependence of time-dependent sequences), copula modeling can also be applied to study the serial dependence of a univariate time series. In the following section, we introduce how to model the serial dependence of a univariate time series.
9.3 Copula Modeling for Univariate Time Series with Serial Dependence: General Discussion
Darsow et al. (1992) introduced the condition (i.e., equivalent to the Chapman–Kolmogorov equations) for a copula-based time series to be a Markov process. Joe (1997) introduced a class of parametric stationary Markov models based on parametric copulas and parametric marginal distributions. Similar to the copula applications in the bivariate or multivariate frequency analyses discussed previously, a copula-based time series model also allows one to consider the serial dependence and the marginal behavior of the time series separately.
Following Joe (1997) and Chen and Fan (2006), copulas can be applied to stationary time series belonging to (i) Markov chain models (both discrete and continuous, including autoregressive models); (ii) k-dependent time series models (i.e., moving average models of order k); and (iii) convolution-closed infinitely divisible univariate marginal models. For a stationary time series $\{Z_t : t = 1, 2, \ldots\}$, let $\{e_t\}$ be i.i.d. random variables that are independent of $\{Z_{t-1}, Z_{t-2}, \ldots\}$ (i.e., the innovations of the time series). We may express the preceding three cases using one of the following models (Joe, 1997).
Markov Chain models:
➣ kth-order autoregressive model: