Abstract
This chapter discusses how to apply copulas in water quality analysis. For monthly water quality observations, applications will include (i) a copula-based Markov process to study the water quality sequence with temporal dependence; and (ii) a copula-based multivariate water quality time series analysis. This chapter is in line with Chapter 9.
12.1 Case-Study Sites
Based on the availability of water quality data, two watersheds are used as case studies: one is a natural watershed and the other is an urban watershed. The watershed boundaries and stream data were retrieved from the NHDPlus High Resolution National Hydrography Dataset and Watershed Boundary Dataset (https://nhd.usgs.gov/). The land use and land cover (LULC) data were retrieved from the National Land Cover Database (www.nrlc.gov).
12.1.1 Snohomish River Watershed
According to the Department of Ecology of the State of Washington (ecy.wa.gov), the Snohomish River is formed near the city of Monroe, where the Skykomish and Snoqualmie rivers meet. The Snohomish River continues its way through the estuary of the city of Snohomish before entering Puget Sound. The Snohomish watershed covers an area of 1,978 square miles (about 5,123 square kilometers) and provides important water recreation activities. In the past, agriculture and forest were the two main LULC types within the watershed; however, throughout the last century, more human activity has been introduced into the watershed. The Department of Ecology clearly stated (ecy.wa.gov): “over the last century, diking and other engineering activities in the lower part of the basin greatly changed how water is stored and managed in floodplain areas. More recently, cities and suburban areas have grown rapidly, creating more change to the natural water cycle.”
Besides the change in the natural water cycle induced by human activities, water quality issues (including but not limited to bacteria, dissolved oxygen (DO), temperature, and pH) have also been identified in some areas. Four stations located in the Snohomish watershed are selected for the case study: A90, C70, D50, and D130 (shown in Figure 12.1). Total persulfate nitrogen (TPN) and DO at C70 are chosen as the target water quality parameters for the temporal dependence case study. DO at all four stations is chosen for the spatial dependence study.
Figure 12.1 Snohomish watershed map and its LULC in 2011 (retrieved from USGS and NLCD).
12.1.2 Chattahoochee River Watershed
As a tributary of the Apalachicola River, the Chattahoochee River originates south of the Alabama and Georgia border and joins the Apalachicola River at the Georgia and Florida border. The Chattahoochee River watershed is the largest subwatershed of the Apalachicola–Chattahoochee–Flint river basin.
The city of Atlanta is located within the watershed. There are gauging stations upstream and downstream of the metropolis, i.e., the Belton Bridge station (USGS02332017, upstream) and the Whitesburg station (USGS02338000, downstream). The subwatershed upstream of the Belton Bridge station may be classified as a forest watershed. Because it contains the major metropolitan area of Atlanta, the subwatershed upstream of Whitesburg is more developed (by 2011, developed land alone accounted for about 34%) and may be considered an urban watershed (shown in Figure 12.2). The target water quality parameters are total nitrogen (TKN, mg/L), DO, and phosphorus (mg/L).
Figure 12.2 Chattahoochee River watershed upstream of the Whitesburg station and its LULC in 2011 (retrieved from USGS and NLCD).
12.2 Dependence Study at the Snohomish River Watershed
In this section, we investigate the temporal and spatial–temporal dependence of the water quality parameters in the Snohomish River watershed. Monthly TPN and DO at station C70 are used to study temporal dependence, and monthly DO at all four stations (C70, D50, D130, and A90) is used to study spatial–temporal dependence.
12.2.1 Study of Temporal Dependence Using Copulas
Temporal Dependence of Monthly TPN and DO at Station C70
The TPN and DO at station C70 are chosen for the study of temporal dependence. Table 12.1 lists the dataset for both the temporal and the spatial dependence studies. We use the water quality data before 2012 to build the copula-based Markov process model, and the data of 2012 and 2013 for model validation.
Dates | TPN (C70) | DO (C70) | DO (D50) | DO (D130) | DO (A90) |
---|---|---|---|---|---|
Oct-94 | 0.116 | 11.3 | 10.8 | 10.5 | 10.9 |
Nov-94 | 0.312 | 12 | 11.7 | 11.9 | 11.9 |
Dec-94 | 0.287 | 12.7 | 12.6 | 12.6 | 12.6 |
Jan-95 | 0.285 | 12.8 | 11.9 | 12.3 | 12.15 |
Feb-95 | 0.21 | 12.4 | 12.4 | 12.8 | 12.8 |
Mar-95 | 0.212 | 12.1 | 11.6 | 11.8 | 11.7 |
Apr-95 | 0.141 | 12.2 | 11.4 | 11.9 | 11.6 |
May-95 | 0.081 | 11.6 | 10.4 | 11.1 | 10.8 |
Jun-95 | 0.086 | 11.2 | 10.2 | 10.9 | 10.6 |
Jul-95 | 0.066 | 10.2 | 9.2 | 9.4 | 9.7 |
Aug-95 | 0.117 | 10.5 | 9.6 | 9.8 | 9.8 |
Sep-95 | 0.125 | 10.3 | 9.6 | 9.3 | 9 |
Oct-95 | 0.253 | 11 | 10.4 | 10.8 | 10.6 |
Nov-95 | 0.178 | 12.2 | 10.9 | 12 | 11.5 |
Dec-95 | 0.203 | 12.5 | 11.4 | 12.1 | 11.8 |
Jan-96 | 0.223 | 12.8 | 11.9 | 12.4 | 12.3 |
Feb-96 | 0.155 | 12.4 | 12.3 | 12.2 | 12.2 |
Mar-96 | 0.119 | 12.6 | 11.7 | 12.2 | 12 |
Apr-96 | 0.114 | 11.7 | 10.8 | 11.3 | 11.1 |
May-96 | 0.09 | 12 | 11.3 | 11.7 | 11.6 |
Jun-96 | 0.043 | 11.3 | 10.1 | 10.8 | 10.7 |
Jul-96 | 0.045 | 10.5 | 9.7 | 9.7 | 10 |
Aug-96 | 0.107 | 10.1 | 9.4 | 9.6 | 9.8 |
Sep-96 | 0.194 | 10.7 | 9.9 | 10.5 | 10.2 |
Oct-96 | 0.331 | 11.6 | 11.4 | 11.7 | 11.3 |
Nov-96 | 0.237 | 12.3 | 11.8 | 12.2 | 12 |
Dec-96 | 0.286 | 12.7 | 12 | 12.5 | 12.2 |
Jan-97 | 0.201 | 13 | 12.3 | 12.6 | 12.4 |
Feb-97 | 0.232 | 12.5 | 12 | 12.3 | 12 |
Mar-97 | 0.299 | 12.5 | 11.9 | 12.3 | 12.2 |
Apr-97 | 0.218 | 12.7 | 12.7 | 12.5 | 12.6 |
May-97 | 0.077 | 11.9 | 10.9 | 11.8 | 11.2 |
Jun-97 | 0.076 | 11.6 | 10.8 | 11.2 | 11.1 |
Jul-97 | 0.058 | 10.2 | 9 | 9.3 | 9.3 |
Aug-97 | 0.055 | 10.2 | 9.3 | 9.4 | 9.2 |
Sep-97 | 0.204 | 10.4 | 9.9 | 10.3 | 10 |
Oct-97 | 0.163 | 11.1 | 10.7 | 11 | 11 |
Nov-97 | 0.187 | 11.8 | 11.3 | 11.3 | 11.4 |
Dec-97 | 0.282 | 12.6 | 11.7 | 11.6 | 11.8 |
Jan-98 | 0.387 | 12.2 | 11.6 | 11 | 11.7 |
Feb-98 | 0.258 | 12.4 | 11.5 | 11.7 | 11.8 |
Mar-98 | 0.217 | 12.1 | 11.5 | 12.2 | 11.8 |
Apr-98 | 0.165 | 12 | 10.9 | 11.5 | 11.5 |
May-98 | 0.14 | 12 | 11.1 | 11.9 | 11.4 |
Jun-98 | 0.065 | 10.8 | 9.7 | 10.3 | 10.1 |
Jul-98 | 0.077 | 10.4 | 9 | 9.9 | 9.3 |
Aug-98 | 0.075 | 10.1 | 8.3 | 9.4 | 9.1 |
Sep-98 | 0.097 | 10.3 | 9.6 | 9.4 | 9.4 |
Oct-98 | 0.23 | 12.1 | 11.7 | 11.5 | 11.2 |
Nov-98 | 0.378 | 11.7 | 11.2 | 11.5 | 11.4 |
Dec-98 | 0.392 | 12.6 | 12.4 | 12.6 | 12.2 |
Jan-99 | 0.333 | 12.2 | 11.9 | 12.1 | 12 |
Feb-99 | 0.275 | 12.8 | 11.6 | 12.4 | 12.4 |
Mar-99 | 0.19 | 12.3 | 12 | 12.5 | 12 |
Apr-99 | 0.162 | 12.4 | 12.1 | 12.7 | 11.9 |
May-99 | 0.146 | 12.2 | 10.9 | 12.2 | 11.5 |
Jun-99 | 0.064 | 11.5 | 11.1 | 11.5 | 11.5 |
Jul-99 | 0.057 | 11.3 | 10.3 | 11.1 | 10.6 |
Aug-99 | 0.084 | 11.3 | 9.8 | 10.5 | 10.5 |
Sep-99 | 0.124 | 9.8 | 9.8 | 9.3 | 9 |
Oct-99 | 0.189 | 11.3 | 11 | 11.2 | 10.9 |
Nov-99 | 0.29 | 12 | 11.7 | 11.9 | 11.7 |
Dec-99 | 0.242 | 12 | 11.3 | 11.7 | 11.6 |
Jan-00 | 0.271 | 13.1 | 12 | 12.4 | 12.3 |
Feb-00 | 0.481 | 12.8 | 12.1 | 12.3 | 12.2 |
Mar-00 | 0.195 | 13 | 12.2 | 12.7 | 12.5 |
Apr-00 | 0.141 | 12.1 | 11.7 | 12.1 | 11.8 |
May-00 | 0.146 | 11.9 | 10.7 | 11.7 | 11.1 |
Jun-00 | 0.061 | 12.2 | 11.1 | 11.6 | 11.4 |
Jul-00 | 0.049 | 10.9 | 9.8 | 10.5 | 10.5 |
Aug-00 | 0.082 | 11 | 9.5 | 9.69 | 9.69 |
Sep-00 | 0.082 | 10.4 | 9.79 | 9.89 | 9.59 |
Oct-00 | 0.148 | 11.1 | 11 | 11.2 | 10.7 |
Nov-00 | 0.149 | 13.36 | 12.57 | 12.57 | 12.67 |
Dec-00 | 0.229 | 12.86 | 12.46 | 12.48 | 12.46 |
Jan-01 | 0.229 | 13.06 | 12.44 | 13.36 | 12.85 |
Feb-01 | 0.224 | 13.57 | 12.95 | 13.06 | 13.06 |
Mar-01 | 0.257 | 12.62 | 13.03 | 13.73 | 12.42 |
Apr-01 | 0.187 | 12.32 | 11.81 | 11.81 | 11.71 |
May-01 | 0.11 | 12.18 | 11.97 | 12.08 | 11.47 |
Jun-01 | 0.074 | 11.4 | 11.6 | 11.7 | 11 |
Jul-01 | 0.089 | 10.71 | 9.89 | 10.51 | 11.22 |
Aug-01 | 0.135 | 9.89 | 10 | 10 | 9.28 |
Sep-01 | 0.181 | 9.9 | 10.1 | 10 | 9.2 |
Oct-01 | 0.309 | 12.22 | 11.81 | 12.42 | 11.81 |
Nov-01 | 0.193 | 12.09 | 11.29 | 11.49 | 11.49 |
Dec-01 | 0.305 | 12.82 | 12.32 | 11.91 | 13.33 |
Jan-02 | 0.22 | 13.26 | 12.07 | 12.57 | 12.57 |
Feb-02 | 0.207 | 14.03 | 12.74 | 13.83 | 13.33 |
Mar-02 | 0.245 | 13.83 | 12.53 | 13.23 | 12.83 |
Apr-02 | 0.149 | 12.9 | 12.31 | 12.51 | 12.21 |
May-02 | 0.099 | 11.96 | 11.47 | 11.76 | 11.47 |
Jun-02 | 0.077 | 11.73 | 11.25 | 11.35 | 11.35 |
Jul-02 | 0.068 | 11 | 9.69 | 10.8 | 10.1 |
Aug-02 | 0.046 | 9.8 | 9.19 | 9.8 | 9.69 |
Sep-02 | 0.096 | 11.34 | 9.75 | 10.34 | 9.95 |
Oct-02 | 0.08 | 10.65 | 10.85 | 10.25 | 10.65 |
Nov-02 | 0.349 | 12.32 | 11.51 | 11.71 | 11.81 |
Dec-02 | 0.205 | 12.69 | 12.18 | 11.97 | 12.48 |
Jan-03 | 0.21 | 13.16 | 13.06 | 12.65 | 12.55 |
Feb-03 | 0.219 | 13.6 | 12.89 | 13.6 | 12.99 |
Mar-03 | 0.18 | 12.5 | 11.4 | 12.4 | 11.8 |
Apr-03 | 0.153 | 12.4 | 11 | 11.2 | 10.4 |
May-03 | 0.105 | 12.08 | 11.67 | 12.18 | 11.26 |
Jun-03 | 0.054 | 11.06 | 9.64 | 10.86 | 10.15 |
Jul-03 | 0.11 | 10.3 | 8.97 | 10.2 | 8.87 |
Aug-03 | 0.094 | 9.2 | 9.6 | 9 | 8.8 |
Sep-03 | 0.19 | 10.6 | 10 | 10.1 | 9.19 |
Oct-03 | 0.25 | 10.9 | 10.7 | 11.11 | 10.8 |
Nov-03 | 0.2745 | 12.46 | 11.71 | 11.91 | 11.91 |
Dec-03 | 0.23 | 12.56 | 12.16 | 12.56 | 12.16 |
Jan-04 | 0.273 | 13.06 | 12.36 | 12.56 | 12.46 |
Feb-04 | 0.193 | 12.7 | 11.7 | 12.5 | 12.1 |
Mar-04 | 0.15 | 12.3 | 11.7 | 11.7 | 11.8 |
Apr-04 | 0.12 | 11.9 | 11.2 | 11.8 | 11.1 |
May-04 | 0.08 | 11.7 | 10.8 | 12 | 11.2 |
Jun-04 | 0.073 | 10.6 | 9.69 | 10.5 | 9.8 |
Jul-04 | 0.076 | 11.1 | 8.8 | 9.4 | 8.8 |
Aug-04 | 0.12 | 9.4 | 8.69 | 8.69 | 8.6 |
Sep-04 | 0.15 | 11.11 | 10.5 | 11.11 | 10.7 |
Oct-04 | 0.18 | 11.2 | 11.2 | 11.3 | 11.2 |
Nov-04 | 0.2 | 11.7 | 11 | 11.3 | 11.3 |
Dec-04 | 0.24 | 12.2 | 12.53 | 12.1 | 11.8 |
Jan-05 | 0.15 | 12.5 | 11 | 12.3 | 11.6 |
Feb-05 | 0.19 | 13.2 | 12.1 | 12.3 | 12.5 |
Mar-05 | 0.251 | 12.2 | 11.8 | 12.1 | 11.3 |
Apr-05 | 0.2 | 12 | 11.7 | 12 | 11.5 |
May-05 | 0.094 | 11.5 | 11.4 | 12 | 10.7 |
Jun-05 | 0.087 | 11.4 | 10.5 | 11.1 | 10.5 |
Jul-05 | 0.13 | 10.19 | 9.3 | 9.69 | 8.9 |
Aug-05 | 0.13 | 9.31 | 8.81 | 9.21 | 8.51 |
Sep-05 | 0.13 | 10.19 | 9.69 | 9.6 | 9.5 |
Oct-05 | 0.19 | 10.8 | 10.4 | 10.7 | 10.19 |
Nov-05 | 0.329 | 12.6 | 12 | 12.3 | 12.4 |
Dec-05 | 0.22 | 13.3 | 12.9 | 13 | 13.1 |
Jan-06 | 0.263 | 12.8 | 11.7 | 12.4 | 12.3 |
Feb-06 | 0.24 | 13.3 | 11.9 | 12.7 | 12.4 |
Mar-06 | 0.253 | 13.1 | 11.6 | 12.4 | 12.3 |
Apr-06 | 0.2 | 12.5 | 11.9 | 12.5 | 12.2 |
May-06 | 0.1 | 12.2 | 10.8 | 11.8 | 11.2 |
Jun-06 | 0.052 | 11.9 | 10.8 | 11.6 | 11.2 |
Jul-06 | 0.067 | 11.1 | 9.5 | 10.4 | 9.9 |
Aug-06 | 0.094 | 9.5 | 9.19 | 8.9 | 9.9 |
Sep-06 | 0.11 | 10.4 | 10 | 9.8 | 9.5 |
Oct-06 | 0.308 | 10.7 | 10.6 | 10.6 | 10.3 |
Nov-06 | 0.2645 | 12.61 | 11.2 | 11.8 | 11.4 |
Dec-06 | 0.23 | 12.19 | 13 | 12.9 | 12.8 |
Jan-07 | 0.24 | 13 | 12.3 | 12.7 | 12.8 |
Feb-07 | 0.14 | 12.9 | 11.8 | 12.7 | 12.2 |
Mar-07 | 0.12 | 12.8 | 12.4 | 12.8 | 12.3 |
Apr-07 | 0.099 | 12.6 | 11.6 | 12.4 | 11.9 |
May-07 | 0.061 | 12.26 | 11.25 | 11.85 | 11.95 |
Jun-07 | 0.073 | 11.7 | 11.2 | 11.2 | 11.5 |
Jul-07 | 0.1 | 10.5 | 9.4 | 9.8 | 9.9 |
Aug-07 | 0.098 | 10 | 8.9 | 9.4 | 9.19 |
Sep-07 | 0.12 | 11.18 | 10.29 | 9.9 | 9.5 |
Oct-07 | 0.25 | 11.8 | 11.9 | 11.8 | 11.5 |
Nov-07 | 0.2 | 12.63 | 12.13 | 12.33 | 12.33 |
Dec-07 | 0.24 | 12.5 | 11.6 | 11.9 | 11.9 |
Jan-08 | 0.28 | 13.23 | 12.33 | 12.33 | 12.63 |
Feb-08 | 0.2 | 12.95 | 12.04 | 12.44 | 12.34 |
Mar-08 | 0.2 | 13.1 | 12 | 12.4 | 12.3 |
Apr-08 | 0.215 | 12.63 | 11.94 | 12.33 | 12.03 |
May-08 | 0.13 | 12.7 | 12.1 | 12.5 | 12.3 |
Jun-08 | 0.098 | 12.1 | 11 | 12 | 11.4 |
Jul-08 | 0.06 | 11.1 | 10.1 | 10.4 | 10.7 |
Aug-08 | 0.055 | 10 | 8.69 | 9.1 | 9.19 |
Sep-08 | 0.12 | 11.1 | 10 | 10.19 | 9.8 |
Oct-08 | 0.18 | 12.3 | 11.8 | 11.8 | 11.7 |
Nov-08 | 0.21 | 9.6 | 11.7 | 11.4 | 10.5 |
Dec-08 | 0.216 | 13.6 | 13.4 | 13.6 | 13.6 |
Jan-09 | 0.212 | 13.5 | 12.4 | 12.9 | 12.9 |
Feb-09 | 0.19 | 12.3 | 11.5 | 11.4 | 12.4 |
Mar-09 | 0.214 | 13.4 | 12.8 | 13.5 | 12.8 |
Apr-09 | 0.227 | 12.2 | 11.6 | 12.6 | 11.6 |
May-09 | 0.091 | 12.5 | 12 | 12 | 12 |
Jun-09 | 0.04 | 11.3 | 10.4 | 10.8 | 10.6 |
Jul-09 | 0.083 | 10.1 | 9 | 9.1 | 9.19 |
Aug-09 | 0.067 | 10 | 9.5 | 9.3 | 9.5 |
Sep-09 | 0.159 | 9.4 | 9.3 | 9.19 | 8.19 |
Oct-09 | 0.327 | 10.6 | 10.19 | 10.6 | 10 |
Nov-09 | 0.163 | 12.4 | 12 | 12.3 | 12 |
Dec-09 | 0.279 | 13 | 12.6 | 12.6 | 12.7 |
Jan-10 | 0.276 | 12.5 | 11.8 | 12.1 | 12.1 |
Feb-10 | 0.182 | 12.8 | 12.6 | 12.1 | 12.6 |
Mar-10 | 0.161 | 12.4 | 11.6 | 12.2 | 12 |
Apr-10 | 0.14 | 12.3 | 11.9 | 12.4 | 11.6 |
May-10 | 0.076 | 12.1 | 11.3 | 11.9 | 10.8 |
Jun-10 | 0.049 | 11.3 | 10.6 | 11.1 | 11.1 |
Jul-10 | 0.078 | 10.3 | 9 | 9.4 | 8.8 |
Aug-10 | 0.083 | 10.4 | 9.59 | 9.89 | 9.49 |
Sep-10 | 0.109 | 11.1 | 10 | 10.4 | 10.5 |
Oct-10 | 0.17 | 10.94 | 10.74 | 10.94 | 11.34 |
Nov-10 | 0.162 | 11.85 | 11.75 | 11.75 | 11.85 |
Dec-10 | 0.267 | 13.13 | 11.6 | 12.2 | 11.9 |
Jan-11 | 0.217 | 13.1 | 12.6 | 12.8 | 13.2 |
Feb-11 | 0.161 | 12.6 | 12 | 12.2 | 12.6 |
Mar-11 | 0.224 | 12.3 | 11.9 | 11.7 | 11.8 |
Apr-11 | 0.154 | 12 | 11.8 | 11.7 | 11.9 |
May-11 | 0.095 | 12.6 | 12.3 | 12.2 | 12.2 |
Jun-11 | 0.042 | 11.87 | 11.16 | 11.47 | 11.47 |
Jul-11 | 0.04 | 11.4 | 10.8 | 10.6 | 10.8 |
Aug-11 | 0.05 | 10.65 | 9.44 | 9.64 | 10.05 |
Sep-11 | 0.149 | 10.02 | 9.12 | 9.62 | 9.82 |
Oct-11 | 0.188 | 11.12 | 10.92 | 11.22 | 10.5 |
Nov-11 | 0.233 | 13.7 | 12.8 | 13.2 | 12.7 |
Dec-11 | 0.255 | 13.6 | 13.1 | 13 | 12.8 |
Jan-12 | 0.356 | 13.2 | 12.6 | 12.8 | 12.8 |
Feb-12 | 0.191 | 12.8 | 12 | 12.3 | 12.8 |
Mar-12 | 0.233 | 12.8 | 12.2 | 12.6 | 13.2 |
Apr-12 | 0.123 | 12.83 | 12.42 | 13.03 | 12.73 |
May-12 | 0.115 | 12.73 | 11.52 | 12.73 | 12.02 |
Jun-12 | 0.094 | 12.2 | 11.9 | 11.9 | 11.5 |
Jul-12 | 0.041 | 11.2 | 10.1 | 10.5 | 10 |
Aug-12 | 0.061 | 10.8 | 9 | 9.5 | 9.6 |
Sep-12 | 0.081 | 10.8 | 10.1 | 9.5 | 10.16 |
Oct-12 | 0.21 | 11.6 | 11.3 | 11.3 | 11.2 |
Nov-12 | 0.227 | 12.6 | 11.5 | 12.2 | 11.7 |
Dec-12 | 0.42 | 12 | 11.8 | 11.9 | 12.1 |
Jan-13 | 0.337 | 13.3 | 12.9 | 12.7 | 12.7 |
Feb-13 | 0.244 | 12.5 | 11.7 | 12.1 | 11.3 |
Mar-13 | 0.175 | 12.8 | 12.2 | 12.6 | 11.7 |
Apr-13 | 0.171 | 12.2 | 11.6 | 12 | 11.8 |
May-13 | 0.067 | 11.8 | 11.2 | 11.6 | 11.3 |
Jun-13 | 0.058 | 11.4 | 10.2 | 11 | 10.3 |
Jul-13 | 0.084 | 9.8 | 8.9 | 9.2 | 9 |
Aug-13 | 0.226 | 9.6 | 8.7 | 8.7 | 8.9 |
Sep-13 | 0.177 | 10 | 9.2 | 10.3 | 9.3 |
In general, before investigating the temporal dependence using copulas, we first evaluate whether the sequence exhibits periodicity (or seasonality). For monthly TPN and DO, we expect seasonality to be present. We can use the sample autocorrelation function plot or the cumulative periodogram from spectral analysis (Box et al., 2007) to assess the seasonality.
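The sample autocorrelation coefficient $\gamma_k$ for time series $x_t$ at lag $k$ can be written as follows:

$$\gamma_k = \frac{\sum_{t=1}^{N-k}(x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{N}(x_t - \bar{x})^2} \qquad (12.1)$$

where $\bar{x}$ is the sample mean and $N$ is the length of the series.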
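The cumulative periodogram $C(f_k)$ for time series $x_t$ can be written as follows:

$$C(f_k) = \frac{\sum_{j=1}^{k} I(f_j)}{N\hat{\sigma}_x^2} \qquad (12.2a)$$

$$I(f_j) = \frac{2}{N}\left[\left(\sum_{t=1}^{N} x_t \cos(2\pi f_j t)\right)^2 + \left(\sum_{t=1}^{N} x_t \sin(2\pi f_j t)\right)^2\right] \qquad (12.2b)$$

In Equations (12.2a) and (12.2b), $I(f_j)$ stands for the periodogram function; $f_j = j/N$, $j = 1, \ldots, \lfloor N/2 \rfloor$; and $\hat{\sigma}_x^2$ is the estimated variance of the time series.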
Applying Equations (12.1) and (12.2), Figure 12.3 plots the sample autocorrelation function and cumulative periodogram for the TPN and DO at station C70. From the sample autocorrelation function plots in Figure 12.3, we clearly see that both DO and TPN have a 12-month cycle. From the cumulative periodogram plot for TPN at C70, we notice a discontinuity at frequency $f = 0.0833 \approx 1/12$. A discontinuity in the cumulative periodogram indicates the existence of periodicity (or seasonality). The cumulative periodogram plot for DO at C70 shows a discontinuity at the same frequency as that of TPN, and another very small discontinuity at frequency $f = 0.1667 \approx 1/6$, which means that a six-month period may also exist for the DO sequence. Comparatively speaking, the six-month subcycle is not significant, and we will only deal with the dominant 12-month periodicity for both the TPN and DO sequences.
Figure 12.3 Autocorrelation and cumulative periodogram plots for original monthly TPN and DO series.
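The two diagnostics can be sketched in a few lines of standard-library Python. This is a minimal illustration, assuming the usual definitions of the sample autocorrelation and the normalized cumulative periodogram; the function names and the synthetic 10-year series with a pure 12-month cycle are hypothetical and are not the C70 data.

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelation coefficients at lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

def cumulative_periodogram(x):
    """Return frequencies f_j = j/N and the normalized cumulative periodogram."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev) / n          # estimated variance of the series
    freqs, cum, total = [], [], 0.0
    for j in range(1, n // 2 + 1):
        f = j / n
        a = sum(d * math.cos(2 * math.pi * f * (t + 1)) for t, d in enumerate(dev))
        b = sum(d * math.sin(2 * math.pi * f * (t + 1)) for t, d in enumerate(dev))
        total += (2.0 / n) * (a * a + b * b)   # periodogram ordinate I(f_j)
        freqs.append(f)
        cum.append(total / (n * var))
    return freqs, cum

# Hypothetical series: 120 months with a pure 12-month cycle.
x = [10 + 2 * math.cos(2 * math.pi * t / 12) for t in range(120)]
acf = sample_acf(x, 24)
freqs, cum = cumulative_periodogram(x)
print(round(acf[12], 3), round(freqs[9], 4), round(cum[8], 3), round(cum[9], 3))
```

For a purely periodic series the cumulative periodogram jumps from about 0 to about 1 at $f = 1/12$, which is the kind of discontinuity read off Figure 12.3; a series without periodicity would instead rise roughly along the diagonal.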
To remove the periodicity, we introduce a simple but effective method, called the full deseasonalization method. For the monthly water quality study, we subtract the monthly average from the water quality time series and divide by the monthly standard deviation:

$$y_{r,m} = \frac{x_{r,m} - \hat{\mu}_m}{\hat{\sigma}_m}, \quad m = 1, \ldots, S \qquad (12.3)$$

where $x_{r,m}$ is the observation in month $m$ of year $r$, and $\hat{\mu}_m$ and $\hat{\sigma}_m$ are the sample mean and sample standard deviation for month $m$.
In this case study, $S = 12$ because the series has a monthly period. After applying Equation (12.3), we use the deseasonalized sequence to reevaluate whether the periodicity has been removed. As seen in Figure 12.4, the periodicity has been successfully removed. Table 12.2 tabulates the monthly sample mean and sample standard deviation for the TPN and DO time series, respectively.
Figure 12.4 Autocorrelation and cumulative periodogram plots for deseasonalized TPN and DO series.
Month | TPN μ̂ (mg/L) | TPN σ̂ (mg/L) | DO μ̂ (mg/L) | DO σ̂ (mg/L) |
---|---|---|---|---|
January | 0.26 | 0.06 | 12.94 | 0.36 |
February | 0.22 | 0.07 | 12.87 | 0.47 |
March | 0.21 | 0.05 | 12.67 | 0.46 |
April | 0.16 | 0.04 | 12.31 | 0.33 |
May | 0.10 | 0.03 | 12.10 | 0.35 |
June | 0.07 | 0.02 | 11.50 | 0.44 |
July | 0.07 | 0.02 | 10.65 | 0.48 |
August | 0.09 | 0.04 | 10.09 | 0.58 |
September | 0.14 | 0.04 | 10.48 | 0.53 |
October | 0.21 | 0.07 | 11.28 | 0.52 |
November | 0.24 | 0.07 | 12.21 | 0.82 |
December | 0.26 | 0.06 | 12.71 | 0.47 |
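Full deseasonalization amounts to standardizing each observation by the mean and standard deviation of its calendar month. The sketch below is a minimal illustration, assuming the sample (n − 1) standard deviation; the function name is hypothetical. It uses only the first 24 monthly TPN values from Table 12.1 (Oct-94 to Sep-96), so the monthly statistics it produces are illustrative and will not match Table 12.2, which is based on the full calibration record.

```python
import math

def full_deseasonalize(x, start_month, period=12):
    """Standardize each value by the mean and std. dev. of its calendar month.

    start_month is the calendar month (1-12) of the first observation.
    Returns the deseasonalized series and a list of (mean, std) per month.
    """
    months = [(start_month - 1 + i) % period for i in range(len(x))]
    stats = []
    for m in range(period):
        vals = [v for v, mm in zip(x, months) if mm == m]
        mu = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - mu) ** 2 for v in vals) / (len(vals) - 1))
        stats.append((mu, sd))
    z = [(v - stats[m][0]) / stats[m][1] for v, m in zip(x, months)]
    return z, stats

# First 24 monthly TPN values at C70 from Table 12.1 (Oct-94 to Sep-96).
tpn = [0.116, 0.312, 0.287, 0.285, 0.21, 0.212, 0.141, 0.081, 0.086, 0.066,
       0.117, 0.125, 0.253, 0.178, 0.203, 0.223, 0.155, 0.119, 0.114, 0.09,
       0.043, 0.045, 0.107, 0.194]
z, stats = full_deseasonalize(tpn, start_month=10)
print("October mean and std:", stats[9])
print("Deseasonalized Oct-94 and Oct-95:", z[0], z[12])
```

Keeping the per-month `(mean, std)` pairs is a deliberate choice: the same pairs are needed later to transform simulated or forecast values back to the original scale.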
With the periodicity removed, we can now proceed to study the temporal dependence using the copula-based Markov process. As stated in Chapter 9, with the copula-based Markov process, the time series does not need to be Gaussian or to be transformed into a Gaussian process. In addition, the marginals and the serial dependence can be studied separately to avoid possible misidentification. Following the discussion in Sections 9.3–9.5, we illustrate the application of the copula-based Markov process to the water quality time series. As stated in Chapter 9, the procedure for the copula-based Markov process is as follows:
i. Identify the Markov order for the stationary time series.
ii. Investigate the marginal distribution of the Markov process.
iii. Study the serial dependence using copula.
iv. Perform one-step ahead forecasting with the copula-based Markov process.
Identification of the Proper Markov Order for the Deseasonalized TPN and DO Time Series
The Markov order is identified using the method discussed in Section 9.5.2. The meta-Gaussian copula is applied as the building block for order identification only. The kernel density method is applied to estimate the marginals nonparametrically. Following the order identification procedure, we find that the deseasonalized TPN and DO may be modeled using the first- and second-order Markov process, respectively (as listed in Table 12.3).
Variable | τ for (F_t, F_{t−1}) | p-value | τ for (F_{t∣t−1}, F_{t−2∣t−1}) | p-value | τ for (F_{t∣t−1,t−2}, F_{t−3∣t−1,t−2}) | p-value | Order |
---|---|---|---|---|---|---|---|
TPN | 0.18 | < 0.01 | 0.02 | 0.64 | — | — | 1 |
DO | 0.14 | < 0.01 | 0.16 | < 0.01 | 0.07 | 0.14 | 2 |
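The logic behind Table 12.3 can be sketched as a sequence of Kendall's τ independence tests on successively conditioned pairs. The code below is only an illustration under several assumptions that are not taken from the chapter: rank-based pseudo-observations stand in for the kernel-density marginals, the Gaussian copula parameter is estimated by the correlation of normal scores rather than by pseudo-MLE, the p-value uses the large-sample normal approximation without tie correction, and the input is a synthetic first-order series. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

N01 = NormalDist()

def pseudo_obs(x):
    """Rank-based pseudo-observations in (0, 1); assumes no ties."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    u = [0.0] * len(x)
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (len(x) + 1)
    return u

def kendall_tau(a, b):
    """Kendall's tau-a and a two-sided normal-approximation p-value."""
    n = len(a)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (a[i] - a[j]) * (b[i] - b[j])
            s += (prod > 0) - (prod < 0)
    tau = s / (n * (n - 1) / 2)
    z = tau / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    return tau, 2 * (1 - N01.cdf(abs(z)))

def gaussian_rho(u, v):
    """Correlation of normal scores (a simple stand-in for pseudo-MLE)."""
    a = [N01.inv_cdf(p) for p in u]
    b = [N01.inv_cdf(p) for p in v]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return num / den

def h_gauss(u, v, rho):
    """Conditional CDF C(u | v) of the Gaussian copula."""
    return N01.cdf((N01.inv_cdf(u) - rho * N01.inv_cdf(v)) / math.sqrt(1 - rho ** 2))

# Synthetic first-order series (hypothetical, not the C70 data).
random.seed(1)
series = [0.0]
for _ in range(199):
    series.append(0.3 * series[-1] + random.gauss(0, 1))
u = pseudo_obs(series)

# Step 1: test the lag-1 pair (F_t, F_{t-1}).
tau1, p1 = kendall_tau(u[1:], u[:-1])
rho = gaussian_rho(u[1:], u[:-1])

# Step 2: test the conditional pair (F_{t|t-1}, F_{t-2|t-1}).
h_fwd = [h_gauss(u[t], u[t - 1], rho) for t in range(2, len(u))]
h_bwd = [h_gauss(u[t - 2], u[t - 1], rho) for t in range(2, len(u))]
tau2, p2 = kendall_tau(h_fwd, h_bwd)
print("lag-1:", tau1, p1, " conditional lag-2:", tau2, p2)
```

The order is taken as the last step at which independence is rejected; in Table 12.3 that is step 1 for TPN and step 2 for DO.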
With the identified Markov order, we can move on to choose the best-fitted copula functions. For the deseasonalized TPN series, the most common bivariate copulas (i.e., Gumbel–Hougaard, meta-Gaussian, meta-Student t, and Frank) are selected as the candidates. For the deseasonalized DO series, the D-vine copula application to time series discussed in Chapter 9 is selected. The pseudo-MLE discussed in Section 9.5.3 is applied for parameter estimation, using the empirical distribution estimated from kernel densities. To illustrate the empirical distribution obtained from the kernel density, we selected a simple Gaussian kernel with bandwidths of 0.3097 and 0.3507 for deseasonalized TPN and DO, respectively. As shown in Figure 12.5, the kernel density fits the histogram very well. The CDF computed from the kernel density also fits the empirical CDF computed with the Weibull plotting-position formula very well. Figure 12.5 verifies that the kernel density may be applied to model the marginal distribution of the time series.
Figure 12.5 Plots of deseasonalized TPN and DO time series, kernel density, as well as the CDF computed from kernel density.
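For a Gaussian kernel, the kernel-density CDF is simply the average of normal CDFs centered at the observations, which makes the comparison with the Weibull plotting positions easy to reproduce. The sketch below is a minimal illustration: the six-point sample is hypothetical, the bandwidth 0.3097 is the value quoted above for deseasonalized TPN, and the bisection-based quantile function is an added convenience, not a method described in the chapter.

```python
from statistics import NormalDist

N01 = NormalDist()

def kernel_cdf(x, sample, bandwidth):
    """Gaussian-kernel estimate of the CDF at x."""
    return sum(N01.cdf((x - s) / bandwidth) for s in sample) / len(sample)

def weibull_positions(sample):
    """Sorted values paired with Weibull plotting positions i / (n + 1)."""
    n = len(sample)
    return [(v, i / (n + 1)) for i, v in enumerate(sorted(sample), start=1)]

def kernel_quantile(p, sample, bandwidth, tol=1e-10):
    """Invert the kernel CDF by bisection (the CDF is strictly increasing)."""
    lo = min(sample) - 10 * bandwidth
    hi = max(sample) + 10 * bandwidth
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, sample, bandwidth) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical deseasonalized values; the bandwidth is the one quoted for TPN.
sample = [-1.2, -0.5, -0.1, 0.3, 0.8, 1.5]
h = 0.3097
for value, position in weibull_positions(sample):
    print(value, round(position, 3), round(kernel_cdf(value, sample, h), 3))
x80 = kernel_quantile(0.8, sample, h)
print("0.8 quantile:", x80)
```

The quantile function plays the same role as the table lookup and interpolation used later to map a simulated copula variate back to a deseasonalized value.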
Parameter Estimation for the Deseasonalized TPN and DO Series
Deseasonalized TPN Series
Table 12.4 lists the parameter, likelihood, and AIC values estimated using the four previously discussed copula candidates. From Table 12.4, it is seen that the Gaussian copula is the best choice based on the AIC criterion. Results of the SnB goodness-of-fit test (SnB = 0.034, P = 0.23) further confirm that the Gaussian copula may properly model the deseasonalized TPN series.
Statistic | Gumbel–Hougaard θ | Gaussian ρ | Student t [ρ, ν] | Frank θ |
---|---|---|---|---|
Parameters | 1.23 | 0.29 | [0.30, 11.40] | 1.97 |
ML | 6.83 | 9.34 | 9.71 | 8.73 |
AIC | –11.67 | –16.68 | –15.43 | –15.48 |
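The AIC row can be checked directly from the log-likelihood row with AIC = 2k − 2 ln L, where k is the number of copula parameters. The short sketch below is an illustration; it takes k = 1 for the one-parameter copulas and k = 2 for the Student t copula, and uses the rounded log-likelihoods printed in Table 12.4, so the recomputed values can differ from the table in the second decimal place.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

# (maximized log-likelihood, number of parameters) from Table 12.4.
table_12_4 = {
    "Gumbel-Hougaard": (6.83, 1),
    "Gaussian": (9.34, 1),
    "Student t": (9.71, 2),
    "Frank": (8.73, 1),
}
scores = {name: aic(ll, k) for name, (ll, k) in table_12_4.items()}
best = min(scores, key=scores.get)   # smallest AIC is preferred
for name, value in scores.items():
    print(name, round(value, 2))
print("selected:", best)
```

The Student t copula has the largest likelihood, but its extra parameter is penalized, which is why the Gaussian copula has the smallest AIC.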
Deseasonalized DO Series
As discussed in Chapter 9, the copula-based second-order Markov process is fully governed by the joint distribution of $(DO_{t-2}, DO_{t-1}, DO_t)$ through the trivariate copula, i.e., the three-dimensional D-vine copula shown in Figure 9.10 in Chapter 9. In this structure, $(DO_{t-1}, DO_t)$ and $(DO_{t-2}, DO_{t-1})$ for the lag-1 dependence possess the same copula. Table 12.5 lists the results for parameter estimation, including the SnB goodness-of-fit statistical test. The results in Table 12.5 show that (1) the Gumbel–Hougaard copula can be applied to model the lag-1 temporal dependence; and (2) the Gaussian copula can be applied to model the conditional dependence of $(t \mid t-1)$ and $(t-2 \mid t-1)$. With the selected copula models (i.e., Gaussian for deseasonalized TPN, Gumbel–Gaussian for deseasonalized DO), we show the simulation and forecast in what follows.
T1 | Gumbel–Hougaard | Gaussian | Student t | Frank |
---|---|---|---|---|
Parameters | 1.18 | 0.16 | [0.18, 4.01] | 1.24 |
ML | 6.83 | 2.85 | 7.52 | 3.41 |
AIC | –11.66 | –3.71 | –11.04 | –4.82 |
Note: SnB = 0.024, P = 0.56 for (t, t-1); SnB = 0.024, P = 0.59 for (t-2, t-1)
T2 | Gumbel–Hougaard | Gaussian | Student t | Frank |
---|---|---|---|---|
Parameters | 1.16 | 0.22 | [0.23, 3E+06] | 1.45 |
ML | 4.77 | 5.15 | 5.15 | 4.89 |
AIC | –7.55 | –8.31 | –6.31 | –7.78 |
Note: SnB = 0.033, P = 0.25 (t|t-1, t-2|t-1)
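To make the D-vine structure concrete, the conditional distribution of the current value given the two previous values can be written as a composition of conditional copula CDFs (h-functions): the Gumbel–Hougaard h-function is applied in the first tree and the Gaussian h-function in the second. The sketch below is a minimal illustration using the fitted parameters from Table 12.5 (θ = 1.18, ρ = 0.22); the function names and the example values of the two previous months are hypothetical.

```python
import math
from statistics import NormalDist

N01 = NormalDist()

def h_gumbel(u, v, theta):
    """Conditional CDF C(u | v) of the Gumbel-Hougaard copula (theta >= 1)."""
    a, b = -math.log(u), -math.log(v)
    s = a ** theta + b ** theta
    c = math.exp(-s ** (1.0 / theta))            # copula CDF C(u, v)
    return c * s ** (1.0 / theta - 1.0) * b ** (theta - 1.0) / v

def h_gauss(u, v, rho):
    """Conditional CDF C(u | v) of the Gaussian copula."""
    return N01.cdf((N01.inv_cdf(u) - rho * N01.inv_cdf(v)) / math.sqrt(1.0 - rho ** 2))

def dvine_conditional_cdf(u_t, u_t1, u_t2, theta, rho):
    """F(u_t | u_{t-1}, u_{t-2}) for the three-dimensional D-vine."""
    a = h_gumbel(u_t, u_t1, theta)     # tree 1: F(t | t-1)
    b = h_gumbel(u_t2, u_t1, theta)    # tree 1: F(t-2 | t-1)
    return h_gauss(a, b, rho)          # tree 2: Gaussian copula on the pair

theta, rho = 1.18, 0.22                # Table 12.5, T1 and T2
u_prev1, u_prev2 = 0.6, 0.4            # hypothetical values for t-1 and t-2
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
cond = [dvine_conditional_cdf(u, u_prev1, u_prev2, theta, rho) for u in grid]
for u, p in zip(grid, cond):
    print(u, round(p, 4))
```

Inverting this conditional CDF at a chosen probability level (numerically, for example by bisection) gives simulated values or conditional quantiles for the second-order DO series.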
Monthly TPN and DO Simulation and Forecast
Deseasonalized TPN Series
The simulation method discussed in Section 9.4.3 is applied to the first-order TPN series. Likewise, Section 9.4.4 is applied for the one-step ahead median and VaR forecasts. Using one simple example, we show how a variate simulated in the probability domain (i.e., on the [0, 1] scale of the copula) is transformed back to the original scale.
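Suppose that we simulated $U_{\text{June}} = 0.8$ from the Gaussian copula fitted to the first-order deseasonalized TPN series. Looking up the CDF computed from the kernel density function, we see that the simulated $U_{\text{June}} = 0.8$ is bracketed by the [CDF, deseasonalized TPN] pairs [0.787, 0.761] and [0.804, 0.844]. Applying linear interpolation, we compute the simulated deseasonalized TPN as follows:

$$z_{\text{June}} = 0.761 + \frac{0.8 - 0.787}{0.804 - 0.787}\,(0.844 - 0.761) \approx 0.824$$

Adding back the monthly average and standard deviation for the month of June (with the rounded values $\hat{\mu}_{\text{June}} = 0.07$ and $\hat{\sigma}_{\text{June}} = 0.02$ from Table 12.2), we can compute the simulated TPN of June as follows:

$$\mathrm{TPN}_{\text{June}} = \hat{\mu}_{\text{June}} + \hat{\sigma}_{\text{June}}\, z_{\text{June}} \approx 0.07 + 0.02 \times 0.824 \approx 0.086 \text{ mg/L}$$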
Applying the one-step ahead forecast discussed in Example 9.3, we can proceed with the median forecast as well as the 95% and 5% VaR. To compute the VaR, Equation (9.22) can be rewritten as follows:
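For the Gaussian copula selected for TPN, the conditional quantile has a closed form, so the median and VaR forecasts in the probability domain can be sketched as below. This is a minimal illustration assuming the standard conditional distribution of the Gaussian copula, which may differ in notation from Equation (9.22); ρ = 0.29 is taken from Table 12.4, while the function name and the previous-month value of 0.8 are hypothetical.

```python
import math
from statistics import NormalDist

N01 = NormalDist()

def gaussian_conditional_quantile(q, u_prev, rho):
    """q-quantile of U_t given U_{t-1} = u_prev under a Gaussian copula."""
    z = rho * N01.inv_cdf(u_prev) + math.sqrt(1.0 - rho ** 2) * N01.inv_cdf(q)
    return N01.cdf(z)

rho = 0.29       # Gaussian copula parameter for deseasonalized TPN (Table 12.4)
u_prev = 0.8     # hypothetical probability-domain value of the previous month

# q = 0.5 gives the median forecast; q = 0.05 and 0.95 give the VaR bounds.
forecast = {q: gaussian_conditional_quantile(q, u_prev, rho)
            for q in (0.05, 0.5, 0.95)}
for q, u in forecast.items():
    print(q, round(u, 4))
```

Each quantile is then mapped to a deseasonalized value through the inverse of the kernel CDF and rescaled with the monthly mean and standard deviation, in the same way as the simulated June value above.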
Figure 12.6 compares the simulated monthly TPN with the observed TPN. It also plots the forecasted monthly TPN and its 5% and 95% VaR against the observed monthly TPN. Figure 12.6 indicates that (a) the deseasonalized TPN simulated from the fitted Gaussian copula reproduces the lag-1 temporal dependence of the observed deseasonalized TPN series well; (b) the simulated monthly TPN also reproduces the dependence of the observed monthly TPN series well; (c) the one-step ahead monthly TPN forecast captures the main trend of monthly TPN; and (d) although there is an obvious error for the extreme TPN values, the VaR values may help identify these extreme values. The forecasted and VaR values are listed in Table 12.6.
Figure 12.6 Simulations of deseasonalized monthly TPN, monthly TPN, and monthly TPN forecast with 95% and 5% VaRs.
Date | TPN observed (mg/L) | TPN forecast (mg/L) | TPN 5% VaR (mg/L) | TPN 95% VaR (mg/L) | DO observed (mg/L) | DO forecast (mg/L) | DO 5% VaR (mg/L) | DO 95% VaR (mg/L) |
---|---|---|---|---|---|---|---|---|
Jan-12 | 0.356 | 0.253 | 0.176 | 0.366 | 13.2 | 13.29 | 12.58 | 13.80 |
Feb-12 | 0.191 | 0.242 | 0.141 | 0.382 | 12.8 | 13.12 | 12.37 | 13.84 |
Mar-12 | 0.233 | 0.195 | 0.135 | 0.284 | 12.8 | 12.69 | 12.02 | 13.47 |
Apr-12 | 0.123 | 0.165 | 0.115 | 0.236 | 12.83 | 12.29 | 11.81 | 12.84 |
May-12 | 0.115 | 0.089 | 0.058 | 0.136 | 12.73 | 12.24 | 11.64 | 12.81 |
Jun-12 | 0.094 | 0.068 | 0.044 | 0.101 | 12.2 | 11.86 | 11.04 | 12.50 |
Jul-12 | 0.041 | 0.080 | 0.046 | 0.126 | 11.2 | 11.02 | 10.17 | 11.70 |
Aug-12 | 0.061 | 0.072 | 0.021 | 0.146 | 10.8 | 10.40 | 9.46 | 11.28 |
Sep-12 | 0.081 | 0.124 | 0.076 | 0.195 | 10.8 | 10.71 | 9.84 | 11.55 |
Oct-12 | 0.21 | 0.175 | 0.091 | 0.298 | 11.6 | 11.44 | 10.64 | 12.28 |
Nov-12 | 0.227 | 0.231 | 0.144 | 0.359 | 12.6 | 12.35 | 11.11 | 13.69 |
Dec-12 | 0.42 | 0.256 | 0.182 | 0.366 | 12 | 12.78 | 12.08 | 13.55 |
Jan-13 | 0.337 | 0.297 | 0.207 | 0.420 | 13.3 | 12.87 | 12.34 | 13.51 |
Feb-13 | 0.244 | 0.237 | 0.137 | 0.375 | 12.5 | 12.76 | 12.03 | 13.58 |
Mar-13 | 0.175 | 0.205 | 0.142 | 0.295 | 12.8 | 12.67 | 12.01 | 13.46 |
Apr-13 | 0.171 | 0.151 | 0.105 | 0.221 | 12.2 | 12.25 | 11.77 | 12.80 |
May-13 | 0.067 | 0.100 | 0.066 | 0.148 | 11.8 | 12.08 | 11.57 | 12.67 |
Jun-13 | 0.058 | 0.058 | 0.037 | 0.089 | 11.4 | 11.38 | 10.75 | 12.14 |
Jul-13 | 0.084 | 0.067 | 0.037 | 0.111 | 9.8 | 10.50 | 9.82 | 11.31 |
Aug-13 | 0.226 | 0.095 | 0.040 | 0.175 | 9.6 | 9.89 | 9.05 | 10.90 |
Sep-13 | 0.177 | 0.163 | 0.104 | 0.244 | 10 | 10.16 | 9.38 | 11.04 |