12 – Water Quality Analysis




Abstract




This chapter discusses how to apply copulas in water quality analysis. For monthly water quality observations, applications will include (i) a copula-based Markov process to study the water quality sequence with temporal dependence; and (ii) a copula-based multivariate water quality time series analysis. This chapter is in line with Chapter 9.









12.1 Case-Study Sites


Based on the availability of water quality data, two watersheds are used as case studies: one is a natural watershed and the other is an urban watershed. The watershed boundaries and stream data were retrieved from the NHDPlus High Resolution National Hydrography Dataset and the Watershed Boundary Dataset (https://nhd.usgs.gov/). The land use and land cover (LULC) data were retrieved from the National Land Cover Database (www.nrlc.gov).



12.1.1 Snohomish River Watershed


According to the Department of Ecology of the State of Washington (ecy.wa.gov), the Snohomish River is formed near the city of Monroe, where the Skykomish and Snoqualmie rivers meet. The Snohomish River continues through the estuary of the city of Snohomish before entering Puget Sound. The Snohomish watershed covers an area of 1,978 square miles (about 5,123 square kilometers) and supports important water recreation activities. In the past, agriculture and forest were the two main LULC types within the watershed; however, throughout the last century, more human activity has been introduced into the watershed. The Department of Ecology states (ecy.wa.gov): “over the last century, diking and other engineering activities in the lower part of the basin greatly changed how water is stored and managed in floodplain areas. More recently, cities and suburban areas have grown rapidly, creating more change to the natural water cycle.”


Besides the change in the natural water cycle induced by human activities, water quality issues (including but not limited to bacteria, dissolved oxygen (DO), temperature, and pH) have also been identified in some areas. Four stations located in the Snohomish watershed are selected for the case study: A90, C70, D50, and D130 (shown in Figure 12.1). The total persulfate nitrogen (TPN) and DO at C70 are chosen as the target water quality parameters for the temporal dependence case study. DO at all four stations is chosen for the spatial dependence study.





Figure 12.1 Snohomish watershed map and its LULC in 2011 (retrieved from USGS and NLCD).



12.1.2 Chattahoochee River Watershed


As a tributary of the Apalachicola River, the Chattahoochee River originates south of the Alabama and Georgia border and joins the Apalachicola River at the Georgia and Florida border. The Chattahoochee River watershed is the largest subwatershed of the Apalachicola–Chattahoochee–Flint river basin.


The city of Atlanta is located within the watershed. There are gauging stations upstream and downstream of the metropolis, i.e., the Belton Bridge station (USGS02332017, upstream) and the Whitesburg station (USGS02338000, downstream). The subwatershed upstream of the Belton Bridge station may be classified as a forest watershed. Because it contains the major metropolitan area of Atlanta, the subwatershed upstream of Whitesburg is more developed (by 2011, developed land alone accounted for about 34%) and may be considered an urban watershed (shown in Figure 12.2). The target water quality parameters are total Kjeldahl nitrogen (TKN, mg/L), DO, and phosphorus (mg/L).





Figure 12.2 Chattahoochee River watershed upstream of the Whitesburg station and its LULC in 2011 (retrieved from USGS and NLCD).



12.2 Dependence Study at the Snohomish River Watershed


In this section, we investigate the temporal and spatial–temporal dependence of the water quality parameters at the Snohomish River watershed. Monthly TPN (C70 only) and monthly DO (C70, D50, D130, and A90) are used. The monthly TPN and DO at station C70 are used to study the temporal dependence, and the monthly DO at all four stations is used to study the spatial–temporal dependence.



12.2.1 Study of Temporal Dependence Using Copulas



Temporal Dependence of Monthly TPN and DO at Station C70

The TPN and DO at station C70 are chosen for the study of temporal dependence. Table 12.1 lists the dataset for both the temporal and spatial dependence studies. We will use the water quality data before 2012 to build a copula-based Markov process model; the data of 2012 and 2013 will be used for model validation.




Table 12.1. TPN and DO monthly dataset from the Snohomish River watershed (all values in mg/L).

Date TPN (C70) DO (C70) DO (D50) DO (D130) DO (A90)
Oct-94 0.116 11.3 10.8 10.5 10.9
Nov-94 0.312 12 11.7 11.9 11.9
Dec-94 0.287 12.7 12.6 12.6 12.6
Jan-95 0.285 12.8 11.9 12.3 12.15
Feb-95 0.21 12.4 12.4 12.8 12.8
Mar-95 0.212 12.1 11.6 11.8 11.7
Apr-95 0.141 12.2 11.4 11.9 11.6
May-95 0.081 11.6 10.4 11.1 10.8
Jun-95 0.086 11.2 10.2 10.9 10.6
Jul-95 0.066 10.2 9.2 9.4 9.7
Aug-95 0.117 10.5 9.6 9.8 9.8
Sep-95 0.125 10.3 9.6 9.3 9
Oct-95 0.253 11 10.4 10.8 10.6
Nov-95 0.178 12.2 10.9 12 11.5
Dec-95 0.203 12.5 11.4 12.1 11.8
Jan-96 0.223 12.8 11.9 12.4 12.3
Feb-96 0.155 12.4 12.3 12.2 12.2
Mar-96 0.119 12.6 11.7 12.2 12
Apr-96 0.114 11.7 10.8 11.3 11.1
May-96 0.09 12 11.3 11.7 11.6
Jun-96 0.043 11.3 10.1 10.8 10.7
Jul-96 0.045 10.5 9.7 9.7 10
Aug-96 0.107 10.1 9.4 9.6 9.8
Sep-96 0.194 10.7 9.9 10.5 10.2
Oct-96 0.331 11.6 11.4 11.7 11.3
Nov-96 0.237 12.3 11.8 12.2 12
Dec-96 0.286 12.7 12 12.5 12.2
Jan-97 0.201 13 12.3 12.6 12.4
Feb-97 0.232 12.5 12 12.3 12
Mar-97 0.299 12.5 11.9 12.3 12.2
Apr-97 0.218 12.7 12.7 12.5 12.6
May-97 0.077 11.9 10.9 11.8 11.2
Jun-97 0.076 11.6 10.8 11.2 11.1
Jul-97 0.058 10.2 9 9.3 9.3
Aug-97 0.055 10.2 9.3 9.4 9.2
Sep-97 0.204 10.4 9.9 10.3 10
Oct-97 0.163 11.1 10.7 11 11
Nov-97 0.187 11.8 11.3 11.3 11.4
Dec-97 0.282 12.6 11.7 11.6 11.8
Jan-98 0.387 12.2 11.6 11 11.7
Feb-98 0.258 12.4 11.5 11.7 11.8
Mar-98 0.217 12.1 11.5 12.2 11.8
Apr-98 0.165 12 10.9 11.5 11.5
May-98 0.14 12 11.1 11.9 11.4
Jun-98 0.065 10.8 9.7 10.3 10.1
Jul-98 0.077 10.4 9 9.9 9.3
Aug-98 0.075 10.1 8.3 9.4 9.1
Sep-98 0.097 10.3 9.6 9.4 9.4
Oct-98 0.23 12.1 11.7 11.5 11.2
Nov-98 0.378 11.7 11.2 11.5 11.4
Dec-98 0.392 12.6 12.4 12.6 12.2
Jan-99 0.333 12.2 11.9 12.1 12
Feb-99 0.275 12.8 11.6 12.4 12.4
Mar-99 0.19 12.3 12 12.5 12
Apr-99 0.162 12.4 12.1 12.7 11.9
May-99 0.146 12.2 10.9 12.2 11.5
Jun-99 0.064 11.5 11.1 11.5 11.5
Jul-99 0.057 11.3 10.3 11.1 10.6
Aug-99 0.084 11.3 9.8 10.5 10.5
Sep-99 0.124 9.8 9.8 9.3 9
Oct-99 0.189 11.3 11 11.2 10.9
Nov-99 0.29 12 11.7 11.9 11.7
Dec-99 0.242 12 11.3 11.7 11.6
Jan-00 0.271 13.1 12 12.4 12.3
Feb-00 0.481 12.8 12.1 12.3 12.2
Mar-00 0.195 13 12.2 12.7 12.5
Apr-00 0.141 12.1 11.7 12.1 11.8
May-00 0.146 11.9 10.7 11.7 11.1
Jun-00 0.061 12.2 11.1 11.6 11.4
Jul-00 0.049 10.9 9.8 10.5 10.5
Aug-00 0.082 11 9.5 9.69 9.69
Sep-00 0.082 10.4 9.79 9.89 9.59
Oct-00 0.148 11.1 11 11.2 10.7
Nov-00 0.149 13.36 12.57 12.57 12.67
Dec-00 0.229 12.86 12.46 12.48 12.46
Jan-01 0.229 13.06 12.44 13.36 12.85
Feb-01 0.224 13.57 12.95 13.06 13.06
Mar-01 0.257 12.62 13.03 13.73 12.42
Apr-01 0.187 12.32 11.81 11.81 11.71
May-01 0.11 12.18 11.97 12.08 11.47
Jun-01 0.074 11.4 11.6 11.7 11
Jul-01 0.089 10.71 9.89 10.51 11.22
Aug-01 0.135 9.89 10 10 9.28
Sep-01 0.181 9.9 10.1 10 9.2
Oct-01 0.309 12.22 11.81 12.42 11.81
Nov-01 0.193 12.09 11.29 11.49 11.49
Dec-01 0.305 12.82 12.32 11.91 13.33
Jan-02 0.22 13.26 12.07 12.57 12.57
Feb-02 0.207 14.03 12.74 13.83 13.33
Mar-02 0.245 13.83 12.53 13.23 12.83
Apr-02 0.149 12.9 12.31 12.51 12.21
May-02 0.099 11.96 11.47 11.76 11.47
Jun-02 0.077 11.73 11.25 11.35 11.35
Jul-02 0.068 11 9.69 10.8 10.1
Aug-02 0.046 9.8 9.19 9.8 9.69
Sep-02 0.096 11.34 9.75 10.34 9.95
Oct-02 0.08 10.65 10.85 10.25 10.65
Nov-02 0.349 12.32 11.51 11.71 11.81
Dec-02 0.205 12.69 12.18 11.97 12.48
Jan-03 0.21 13.16 13.06 12.65 12.55
Feb-03 0.219 13.6 12.89 13.6 12.99
Mar-03 0.18 12.5 11.4 12.4 11.8
Apr-03 0.153 12.4 11 11.2 10.4
May-03 0.105 12.08 11.67 12.18 11.26
Jun-03 0.054 11.06 9.64 10.86 10.15
Jul-03 0.11 10.3 8.97 10.2 8.87
Aug-03 0.094 9.2 9.6 9 8.8
Sep-03 0.19 10.6 10 10.1 9.19
Oct-03 0.25 10.9 10.7 11.11 10.8
Nov-03 0.2745 12.46 11.71 11.91 11.91
Dec-03 0.23 12.56 12.16 12.56 12.16
Jan-04 0.273 13.06 12.36 12.56 12.46
Feb-04 0.193 12.7 11.7 12.5 12.1
Mar-04 0.15 12.3 11.7 11.7 11.8
Apr-04 0.12 11.9 11.2 11.8 11.1
May-04 0.08 11.7 10.8 12 11.2
Jun-04 0.073 10.6 9.69 10.5 9.8
Jul-04 0.076 11.1 8.8 9.4 8.8
Aug-04 0.12 9.4 8.69 8.69 8.6
Sep-04 0.15 11.11 10.5 11.11 10.7
Oct-04 0.18 11.2 11.2 11.3 11.2
Nov-04 0.2 11.7 11 11.3 11.3
Dec-04 0.24 12.2 12.53 12.1 11.8
Jan-05 0.15 12.5 11 12.3 11.6
Feb-05 0.19 13.2 12.1 12.3 12.5
Mar-05 0.251 12.2 11.8 12.1 11.3
Apr-05 0.2 12 11.7 12 11.5
May-05 0.094 11.5 11.4 12 10.7
Jun-05 0.087 11.4 10.5 11.1 10.5
Jul-05 0.13 10.19 9.3 9.69 8.9
Aug-05 0.13 9.31 8.81 9.21 8.51
Sep-05 0.13 10.19 9.69 9.6 9.5
Oct-05 0.19 10.8 10.4 10.7 10.19
Nov-05 0.329 12.6 12 12.3 12.4
Dec-05 0.22 13.3 12.9 13 13.1
Jan-06 0.263 12.8 11.7 12.4 12.3
Feb-06 0.24 13.3 11.9 12.7 12.4
Mar-06 0.253 13.1 11.6 12.4 12.3
Apr-06 0.2 12.5 11.9 12.5 12.2
May-06 0.1 12.2 10.8 11.8 11.2
Jun-06 0.052 11.9 10.8 11.6 11.2
Jul-06 0.067 11.1 9.5 10.4 9.9
Aug-06 0.094 9.5 9.19 8.9 9.9
Sep-06 0.11 10.4 10 9.8 9.5
Oct-06 0.308 10.7 10.6 10.6 10.3
Nov-06 0.2645 12.61 11.2 11.8 11.4
Dec-06 0.23 12.19 13 12.9 12.8
Jan-07 0.24 13 12.3 12.7 12.8
Feb-07 0.14 12.9 11.8 12.7 12.2
Mar-07 0.12 12.8 12.4 12.8 12.3
Apr-07 0.099 12.6 11.6 12.4 11.9
May-07 0.061 12.26 11.25 11.85 11.95
Jun-07 0.073 11.7 11.2 11.2 11.5
Jul-07 0.1 10.5 9.4 9.8 9.9
Aug-07 0.098 10 8.9 9.4 9.19
Sep-07 0.12 11.18 10.29 9.9 9.5
Oct-07 0.25 11.8 11.9 11.8 11.5
Nov-07 0.2 12.63 12.13 12.33 12.33
Dec-07 0.24 12.5 11.6 11.9 11.9
Jan-08 0.28 13.23 12.33 12.33 12.63
Feb-08 0.2 12.95 12.04 12.44 12.34
Mar-08 0.2 13.1 12 12.4 12.3
Apr-08 0.215 12.63 11.94 12.33 12.03
May-08 0.13 12.7 12.1 12.5 12.3
Jun-08 0.098 12.1 11 12 11.4
Jul-08 0.06 11.1 10.1 10.4 10.7
Aug-08 0.055 10 8.69 9.1 9.19
Sep-08 0.12 11.1 10 10.19 9.8
Oct-08 0.18 12.3 11.8 11.8 11.7
Nov-08 0.21 9.6 11.7 11.4 10.5
Dec-08 0.216 13.6 13.4 13.6 13.6
Jan-09 0.212 13.5 12.4 12.9 12.9
Feb-09 0.19 12.3 11.5 11.4 12.4
Mar-09 0.214 13.4 12.8 13.5 12.8
Apr-09 0.227 12.2 11.6 12.6 11.6
May-09 0.091 12.5 12 12 12
Jun-09 0.04 11.3 10.4 10.8 10.6
Jul-09 0.083 10.1 9 9.1 9.19
Aug-09 0.067 10 9.5 9.3 9.5
Sep-09 0.159 9.4 9.3 9.19 8.19
Oct-09 0.327 10.6 10.19 10.6 10
Nov-09 0.163 12.4 12 12.3 12
Dec-09 0.279 13 12.6 12.6 12.7
Jan-10 0.276 12.5 11.8 12.1 12.1
Feb-10 0.182 12.8 12.6 12.1 12.6
Mar-10 0.161 12.4 11.6 12.2 12
Apr-10 0.14 12.3 11.9 12.4 11.6
May-10 0.076 12.1 11.3 11.9 10.8
Jun-10 0.049 11.3 10.6 11.1 11.1
Jul-10 0.078 10.3 9 9.4 8.8
Aug-10 0.083 10.4 9.59 9.89 9.49
Sep-10 0.109 11.1 10 10.4 10.5
Oct-10 0.17 10.94 10.74 10.94 11.34
Nov-10 0.162 11.85 11.75 11.75 11.85
Dec-10 0.267 13.13 11.6 12.2 11.9
Jan-11 0.217 13.1 12.6 12.8 13.2
Feb-11 0.161 12.6 12 12.2 12.6
Mar-11 0.224 12.3 11.9 11.7 11.8
Apr-11 0.154 12 11.8 11.7 11.9
May-11 0.095 12.6 12.3 12.2 12.2
Jun-11 0.042 11.87 11.16 11.47 11.47
Jul-11 0.04 11.4 10.8 10.6 10.8
Aug-11 0.05 10.65 9.44 9.64 10.05
Sep-11 0.149 10.02 9.12 9.62 9.82
Oct-11 0.188 11.12 10.92 11.22 10.5
Nov-11 0.233 13.7 12.8 13.2 12.7
Dec-11 0.255 13.6 13.1 13 12.8
Jan-12 0.356 13.2 12.6 12.8 12.8
Feb-12 0.191 12.8 12 12.3 12.8
Mar-12 0.233 12.8 12.2 12.6 13.2
Apr-12 0.123 12.83 12.42 13.03 12.73
May-12 0.115 12.73 11.52 12.73 12.02
Jun-12 0.094 12.2 11.9 11.9 11.5
Jul-12 0.041 11.2 10.1 10.5 10
Aug-12 0.061 10.8 9 9.5 9.6
Sep-12 0.081 10.8 10.1 9.5 10.16
Oct-12 0.21 11.6 11.3 11.3 11.2
Nov-12 0.227 12.6 11.5 12.2 11.7
Dec-12 0.42 12 11.8 11.9 12.1
Jan-13 0.337 13.3 12.9 12.7 12.7
Feb-13 0.244 12.5 11.7 12.1 11.3
Mar-13 0.175 12.8 12.2 12.6 11.7
Apr-13 0.171 12.2 11.6 12 11.8
May-13 0.067 11.8 11.2 11.6 11.3
Jun-13 0.058 11.4 10.2 11 10.3
Jul-13 0.084 9.8 8.9 9.2 9
Aug-13 0.226 9.6 8.7 8.7 8.9
Sep-13 0.177 10 9.2 10.3 9.3

In general, before we proceed to investigate the temporal dependence using copulas, we first evaluate whether there exists periodicity (or seasonality) in the sequence. For monthly TPN and DO, we suspect that there should exist seasonality. We can use the sample autocorrelation function plot or cumulative periodogram through spectral analysis (Box et al., 2007) to assess the seasonality.


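The sample autocorrelation coefficient $\gamma_k$ of the time series $x_t$ at lag $k$ can be written as follows:

$$c_k = \frac{1}{N}\sum_{t=1}^{N-k}\left(x_t-\bar{x}\right)\left(x_{t+k}-\bar{x}\right) \tag{12.1a}$$

$$\gamma_k = \frac{c_k}{c_0}; \quad c_0 = \frac{1}{N}\sum_{t=1}^{N}\left(x_t-\bar{x}\right)^2 \tag{12.1b}$$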
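
As a minimal sketch of Equations (12.1a) and (12.1b), the following pure-Python function computes the sample autocorrelation for a short series. The function name `sample_acf` is only illustrative, and the input is the first 24 monthly DO values at C70 from Table 12.1 (Oct-94 to Sep-96), not the full record used for Figure 12.3.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag, per Eqs. (12.1a)-(12.1b)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n          # lag-0 autocovariance (1/N divisor)
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


# First 24 monthly DO values at C70 (Oct-94 .. Sep-96), copied from Table 12.1
do_c70 = [11.3, 12, 12.7, 12.8, 12.4, 12.1, 12.2, 11.6, 11.2, 10.2, 10.5, 10.3,
          11, 12.2, 12.5, 12.8, 12.4, 12.6, 11.7, 12, 11.3, 10.5, 10.1, 10.7]

acf = sample_acf(do_c70, 12)
for lag in (1, 6, 12):
    print(f"lag {lag:2d}: {acf[lag]: .3f}")
```

For a series with an annual cycle, one would expect a negative coefficient near lag 6 and a positive one near lag 12, which is the pattern the text reads off the autocorrelation plot.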

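The cumulative periodogram $C(f_k)$ of the time series $x_t$ can be written as follows:

$$I(f_j) = \frac{2}{N}\left|\sum_{t=1}^{N} x_t e^{-2\pi i f_j t}\right|^2 = \frac{2}{N}\left[\left(\sum_{t=1}^{N} x_t \cos 2\pi f_j t\right)^2 + \left(\sum_{t=1}^{N} x_t \sin 2\pi f_j t\right)^2\right] \tag{12.2a}$$

$$C(f_k) = \frac{\sum_{j=1}^{k} I(f_j)}{N\hat{\sigma}_x^2} \tag{12.2b}$$

In Equations (12.2a) and (12.2b), $I(f_j)$ stands for the periodogram function; $f_j = j/N$, $j = 1, \ldots, \lfloor N/2 \rfloor$; and $\hat{\sigma}_x^2$ is the estimated variance of the time series.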
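
The cumulative periodogram can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the computation behind Figure 12.3: the periodogram is taken as the squared cosine and sine sums scaled by 2/N, the variance uses a 1/N divisor, and the input is a synthetic 48-month series with a pure 12-month cycle (not the observed data). The function name is hypothetical.

```python
import math


def cumulative_periodogram(x):
    """Return (freqs, cum) with f_j = j/N and cum[k-1] = C(f_k)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n   # assumed 1/N variance estimate
    freqs, cum = [], []
    running = 0.0
    for j in range(1, n // 2 + 1):
        f = j / n
        a = sum(v * math.cos(2 * math.pi * f * t) for t, v in enumerate(x, start=1))
        b = sum(v * math.sin(2 * math.pi * f * t) for t, v in enumerate(x, start=1))
        running += (2.0 / n) * (a * a + b * b)  # periodogram ordinate I(f_j)
        freqs.append(f)
        # Note: for even N the last ordinate (f = 0.5) gets the same 2/N weight,
        # so the curve can end marginally above 1.
        cum.append(running / (n * var))
    return freqs, cum


# Synthetic series: constant level plus a pure 12-month cosine (illustrative only)
series = [11.5 + 1.2 * math.cos(2 * math.pi * t / 12) for t in range(1, 49)]
freqs, cum = cumulative_periodogram(series)
jump = max(range(len(cum)), key=lambda k: cum[k] - (cum[k - 1] if k else 0.0))
print(f"largest jump at f = {freqs[jump]:.4f}")
```

For this synthetic input all the variance sits at one frequency, so the cumulative periodogram steps from about 0 to about 1 at f = 1/12, which is the kind of discontinuity the text associates with seasonality.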


Applying Equations (12.1) and (12.2), Figure 12.3 plots the sample autocorrelation function and cumulative periodogram for the TPN and DO at station C70. From the sample autocorrelation function plots in Figure 12.3, we clearly see that both DO and TPN have a 12-month cycle. In the cumulative periodogram plot for TPN at C70, we notice a discontinuity at frequency $f = 0.0833 \approx 1/12$. A discontinuity in the cumulative periodogram indicates the existence of periodicity (or seasonality). In the cumulative periodogram plot for DO at C70, we again see a discontinuity at the same frequency as that of TPN, together with a very small discontinuity at frequency $f = 0.1667 \approx 1/6$, which means that a six-month period may also exist for the DO sequence. Comparatively speaking, the six-month subcycle is not significant, and we will only deal with the dominant 12-month periodicity for both the TPN and DO sequences.





Figure 12.3 Autocorrelation and cumulative periodogram plots for original monthly TPN and DO series.


To remove the periodicity, we introduce a simple but effective method (called the full deseasonalization method). For our monthly water quality study, we subtract the monthly mean from the water quality time series and divide by the monthly standard deviation:


$$x_{r,m}^{\text{de-season}} = \frac{x_{r,m} - \hat{\mu}_m}{\hat{\sigma}_m}, \quad m = 1, 2, \ldots, S \tag{12.3}$$

In this case study, $S = 12$ because the period is monthly. After applying Equation (12.3), we use the deseasonalized sequence to reevaluate whether the periodicity has been removed. As seen in Figure 12.4, the periodicity has been successfully removed. Table 12.2 tabulates the monthly sample mean and sample standard deviation for the TPN and DO time series, respectively.
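
A minimal sketch of Equation (12.3) follows. It assumes that the first subscript of x indexes the year and the second the calendar month, and that the standard deviation is the sample (n - 1) version; neither detail is spelled out in the text. The helper names are hypothetical, and the input is only the first 36 monthly TPN values at C70 from Table 12.1, so the monthly statistics will not match Table 12.2.

```python
from statistics import mean, stdev


def deseasonalize(values, start_month, period=12):
    """Standardize each value by its calendar-month mean and std (Eq. 12.3)."""
    months = [(start_month - 1 + i) % period + 1 for i in range(len(values))]
    mu, sigma = {}, {}
    for m in set(months):
        group = [v for v, mm in zip(values, months) if mm == m]
        mu[m] = mean(group)
        sigma[m] = stdev(group)      # assumed sample (n - 1) standard deviation
    z = [(v - mu[m]) / sigma[m] for v, m in zip(values, months)]
    return z, mu, sigma, months


def reseasonalize(z, months, mu, sigma):
    """Invert Eq. (12.3): multiply by the monthly std and add the monthly mean."""
    return [zi * sigma[m] + mu[m] for zi, m in zip(z, months)]


# First 36 monthly TPN values at C70 (Oct-94 .. Sep-97), copied from Table 12.1
tpn_c70 = [0.116, 0.312, 0.287, 0.285, 0.21, 0.212, 0.141, 0.081, 0.086, 0.066, 0.117, 0.125,
           0.253, 0.178, 0.203, 0.223, 0.155, 0.119, 0.114, 0.09, 0.043, 0.045, 0.107, 0.194,
           0.331, 0.237, 0.286, 0.201, 0.232, 0.299, 0.218, 0.077, 0.076, 0.058, 0.055, 0.204]

z, mu, sigma, months = deseasonalize(tpn_c70, start_month=10)
restored = reseasonalize(z, months, mu, sigma)
print(f"October mean = {mu[10]:.4f}, std = {sigma[10]:.4f}")
```

By construction, the values belonging to any one calendar month have zero mean and unit standard deviation after the transform, and `reseasonalize` recovers the original series.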





Figure 12.4 Autocorrelation and cumulative periodogram plots for deseasonalized TPN and DO series.




Table 12.2. Monthly sample mean and standard deviation of TPN and DO series.

Month TPN μ̂ (mg/L) TPN σ̂ (mg/L) DO μ̂ (mg/L) DO σ̂ (mg/L)
January 0.26 0.06 12.94 0.36
February 0.22 0.07 12.87 0.47
March 0.21 0.05 12.67 0.46
April 0.16 0.04 12.31 0.33
May 0.10 0.03 12.10 0.35
June 0.07 0.02 11.50 0.44
July 0.07 0.02 10.65 0.48
August 0.09 0.04 10.09 0.58
September 0.14 0.04 10.48 0.53
October 0.21 0.07 11.28 0.52
November 0.24 0.07 12.21 0.82
December 0.26 0.06 12.71 0.47

With the periodicity removed, we can now proceed to study the temporal dependence using the copula-based Markov process. As stated in Chapter 9, with the copula-based Markov process the time series need not be Gaussian, nor does it need to be transformed to a Gaussian process. In addition, the marginals and the serial dependence can be studied separately to avoid possible misidentification. Following the discussion in Sections 9.3–9.5, we will illustrate the application of the copula-based Markov process to the water quality time series. As stated in Chapter 9, the procedure for the copula-based Markov process is as follows:




  i. Identify the Markov order for the stationary time series.

  ii. Investigate the marginal distribution of the Markov process.

  iii. Study the serial dependence using copulas.

  iv. Perform one-step ahead forecasting with the copula-based Markov process.



Identification of the Proper Markov Order for the Deseasonalized TPN and DO Time Series

The Markov order is identified using the method discussed in Section 9.5.2. The meta-Gaussian copula is applied as the building block for order identification only. The kernel density method is applied to estimate the marginals nonparametrically. Following the order identification procedure, we find that the deseasonalized TPN and DO may be modeled using first- and second-order Markov processes, respectively (as listed in Table 12.3).




Table 12.3. Markov order identification using the meta-Gaussian copula.

Pairs tested: (1) $(F_t, F_{t-1})$; (2) $(F_{t \mid t-1}, F_{t-2 \mid t-1})$; (3) $(F_{t \mid t-1,t-2}, F_{t-3 \mid t-1,t-2})$.

Variable τ(1) p-value(1) τ(2) p-value(2) τ(3) p-value(3) Order
TPN 0.18 <0.01 0.02 0.64 – – 1
DO 0.14 <0.01 0.16 <0.01 0.07 0.14 2

With the identified Markov order, we can move on to choose the best-fitted copula functions. For the deseasonalized TPN series, the most common bivariate copulas (i.e., Gumbel, meta-Gaussian, meta-Student t, and Frank) are selected as the candidates. For the deseasonalized DO series, the D-vine copula application to time series discussed in Chapter 9 is selected. The pseudo-MLE discussed in Section 9.5.3 is applied for parameter estimation, using the empirical distribution estimated from kernel densities. To illustrate the kernel-based empirical distribution, we select a simple Gaussian kernel with bandwidths of 0.3097 and 0.3507 for deseasonalized TPN and DO, respectively. As shown in Figure 12.5, the kernel density fits the histogram very well. The CDF computed from the kernel density also fits the empirical CDF computed with the Weibull plotting-position formula very well. Figure 12.5 verifies that the kernel density may be applied to model the marginal distribution of the time series.
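
The kernel-based CDF and the Weibull plotting positions mentioned above can be sketched as follows. The Gaussian-kernel CDF is taken here as the average of standard normal CDFs centered at the observations, and the Weibull plotting position as i/(n + 1); both are standard forms but are assumptions about what the chapter computed. The bandwidth 0.3097 is the value quoted for deseasonalized TPN, while the sample values are made up for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def kernel_cdf(x, sample, bandwidth):
    """CDF implied by a Gaussian kernel density estimate with the given bandwidth."""
    return sum(_STD_NORMAL.cdf((x - xi) / bandwidth) for xi in sample) / len(sample)


def weibull_plotting_positions(sample):
    """Pairs (value, i / (n + 1)) for the sorted sample, i = 1..n."""
    n = len(sample)
    return [(v, i / (n + 1)) for i, v in enumerate(sorted(sample), start=1)]


# Hypothetical deseasonalized values (illustrative only, not the TPN record)
sample = [-1.4, -0.6, -0.1, 0.1, 0.6, 1.4]
h = 0.3097  # bandwidth quoted in the text for deseasonalized TPN

for value, empirical in weibull_plotting_positions(sample):
    print(f"x = {value:5.2f}  kernel CDF = {kernel_cdf(value, sample, h):.3f}  "
          f"Weibull = {empirical:.3f}")
```

Comparing the two columns point by point is a numerical counterpart to the visual check of Figure 12.5.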





Figure 12.5 Plots of deseasonalized TPN and DO time series, kernel density, as well as the CDF computed from kernel density.



Parameter Estimation for the Deseasonalized TPN and DO Series


Deseasonalized TPN Series

Table 12.4 lists the parameter, likelihood, and AIC values estimated using the four previously discussed copula candidates. From Table 12.4, it is seen that the Gaussian copula is the best choice based on the AIC criterion. Results of the SnB goodness-of-fit test (SnB = 0.034, P = 0.23) further confirm that the Gaussian copula may properly model the deseasonalized TPN series.




Table 12.4. Results from the four copula candidates for first-order deseasonalized TPN series.

Copula Gumbel–Hougaard (θ) Gaussian (ρ) Student t [ρ, ν] Frank (θ)
Parameters 1.23 0.29 [0.30, 11.40] 1.97
ML 6.83 9.34 9.71 8.73
AIC –11.67 –16.68 –15.43 –15.48
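
The AIC ranking in Table 12.4 can be checked with a few lines of arithmetic. The sketch below assumes that the ML row is the maximized log-likelihood and that AIC = 2k - 2 ln L, with k the number of copula parameters (1 for Gumbel–Hougaard, Gaussian, and Frank; 2 for Student t). Under that assumption the tabulated AIC values are reproduced to within rounding.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion, AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


# (maximized log-likelihood, number of parameters, AIC reported in Table 12.4)
candidates = {
    "Gumbel-Hougaard": (6.83, 1, -11.67),
    "Gaussian": (9.34, 1, -16.68),
    "Student t": (9.71, 2, -15.43),
    "Frank": (8.73, 1, -15.48),
}

computed = {name: aic(ll, k) for name, (ll, k, _) in candidates.items()}
best = min(computed, key=computed.get)   # smallest AIC is preferred

for name, value in computed.items():
    print(f"{name:16s} AIC = {value:7.2f} (reported {candidates[name][2]:7.2f})")
print("preferred copula:", best)
```

The Student t copula attains the largest likelihood, but its extra parameter is penalized, which is why the Gaussian copula ends up with the smallest AIC.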


Deseasonalized DO Series

As discussed in Chapter 9, the copula-based second-order Markov process is fully governed by the joint distribution of $(DO_{t-2}, DO_{t-1}, DO_t)$ through the trivariate copula, i.e., the three-dimensional D-vine copula shown in Figure 9.10 of Chapter 9. In this structure, the lag-1 pairs $(DO_{t-1}, DO_t)$ and $(DO_{t-2}, DO_{t-1})$ share the same copula. Table 12.5 lists the parameter estimation results, including the SnB goodness-of-fit test. The results in Table 12.5 show that (1) the Gumbel–Hougaard copula can be applied to model the lag-1 temporal dependence (T1); and (2) the Gaussian copula can be applied to model the conditional dependence of $(t \mid t-1,\ t-2 \mid t-1)$ (T2). With the selected copula models (i.e., Gaussian for deseasonalized TPN, Gumbel–Hougaard and Gaussian for deseasonalized DO), we show the simulation and forecast in what follows.




Table 12.5. Results from the four copula candidates for second-order deseasonalized DO series.

T1 Gumbel–Hougaard Gaussian Student t Frank
Parameters 1.18 0.16 [0.18, 4.01] 1.24
ML 6.83 2.85 7.52 3.41
AIC –11.66 –3.71 –11.04 –4.82
Note: SnB = 0.024, P = 0.56 for (t, t-1); SnB = 0.024, P = 0.59 for (t-2, t-1).

T2 Gumbel–Hougaard Gaussian Student t Frank
Parameters 1.16 0.22 [0.23, 3E+06] 1.45
ML 4.77 5.15 5.15 4.89
AIC –7.55 –8.31 –6.31 –7.78
Note: SnB = 0.033, P = 0.25 for (t|t-1, t-2|t-1).



Monthly TPN and DO Simulation and Forecast


Deseasonalized TPN Series

The simulation method discussed in Section 9.4.3 is applied to the first-order TPN series. Likewise, the method of Section 9.4.4 is applied for the one-step ahead median and VaR forecasts. Using one simple example, we show how a simulated variate in the frequency (probability) domain is inverted back to the real domain.


Suppose that we simulated $U_{\text{June}} = 0.8$ from the Gaussian copula fitted to the first-order deseasonalized TPN series. Looking up the empirical CDF computed from the kernel density function, we see that the simulated $U_{\text{June}} = 0.8$ is bracketed by the [CDF, TPN] pairs [0.787, 0.761] and [0.804, 0.844]. Applying linear interpolation, we compute the simulated deseasonalized TPN as follows:


$$TPN_{\text{sim}}^{\text{de-season}} = 0.761 + \frac{0.844 - 0.761}{0.804 - 0.787}\,(0.8 - 0.787) = 0.8245.$$

Multiplying by the monthly standard deviation and adding back the monthly average for the month of June, we can compute the simulated TPN of June as follows:


$$TPN_{\text{sim}} = 0.8245\,(0.0175) + 0.0666 = 0.0811 \text{ mg/L}.$$
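
The worked example above can be reproduced with a short sketch. The function name `interpolate_quantile` is illustrative; the bracketing pairs, the June mean (0.0666), and the June standard deviation (0.0175) are the values quoted in the text.

```python
def interpolate_quantile(u, lower, upper):
    """Linearly interpolate the value whose CDF equals u between two (cdf, value) pairs."""
    (p0, z0), (p1, z1) = lower, upper
    return z0 + (z1 - z0) / (p1 - p0) * (u - p0)


u_june = 0.8
z_sim = interpolate_quantile(u_june, lower=(0.787, 0.761), upper=(0.804, 0.844))

june_mean, june_std = 0.0666, 0.0175       # June statistics quoted in the text
tpn_sim = z_sim * june_std + june_mean     # invert the deseasonalization

print(f"deseasonalized TPN = {z_sim:.4f}")
print(f"simulated June TPN = {tpn_sim:.4f} mg/L")
```

With the rounded inputs shown here the result comes out at roughly 0.081 mg/L; small differences in the last digit relative to the text would be expected if the original calculation carried more decimals.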

Applying the one-step ahead forecast discussed in Example 9.3, we can proceed with the median forecast as well as the 95% and 5% VaR. To compute the VaR, Equation (9.22) can be rewritten as follows:


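$$Z_{t+1}^{95\%} = F_n^{-1}\left(C^{-1}_{F_n(z_{t+1}) \mid F_n(z_t)}\left(0.95 \mid F_n(z_t); \hat{\alpha}\right)\right) \tag{12.4a}$$

$$Z_{t+1}^{5\%} = F_n^{-1}\left(C^{-1}_{F_n(z_{t+1}) \mid F_n(z_t)}\left(0.05 \mid F_n(z_t); \hat{\alpha}\right)\right) \tag{12.4b}$$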
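
For the Gaussian copula, the conditional quantile inside Equations (12.4a) and (12.4b) has a well-known closed form: the q-quantile of the next probability-scale value, given the current value u, is Φ(ρ Φ⁻¹(u) + √(1 − ρ²) Φ⁻¹(q)). The sketch below evaluates it for the median and the 5% and 95% VaR levels. It assumes that the VaR equations use the inverse of the conditional copula, takes ρ = 0.29 from Table 12.4, and uses a hypothetical current value u = 0.80; mapping the results back through the inverse marginal CDF and the reseasonalization step is left out.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def gaussian_conditional_quantile(q, u_prev, rho):
    """q-quantile of U_{t+1} given U_t = u_prev under a Gaussian copula with correlation rho."""
    z_prev = _STD_NORMAL.inv_cdf(u_prev)
    z_q = _STD_NORMAL.inv_cdf(q)
    return _STD_NORMAL.cdf(rho * z_prev + sqrt(1.0 - rho * rho) * z_q)


rho_tpn = 0.29   # Gaussian copula parameter from Table 12.4
u_prev = 0.80    # hypothetical current value F_n(z_t), for illustration only

forecast = {q: gaussian_conditional_quantile(q, u_prev, rho_tpn)
            for q in (0.05, 0.50, 0.95)}
for q, u_next in forecast.items():
    print(f"q = {q:.2f} -> U(t+1) = {u_next:.3f}")
```

Because ρ is positive, a current value above the median shifts all three quantiles upward relative to the unconditional levels; with ρ = 0 the function simply returns q.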

Figure 12.6 compares the simulated monthly TPN with the observed TPN. It also plots the forecasted monthly TPN and its 5% and 95% VaR against the observed monthly TPN. Figure 12.6 indicates that (a) the deseasonalized TPN simulated from the fitted Gaussian copula represents the lag-1 temporal dependence of the observed deseasonalized TPN series well; (b) the simulated monthly TPN also represents the dependence of the observed monthly TPN series well; (c) the one-step ahead monthly TPN forecast captures the main trend of monthly TPN; and (d) although there is an obvious error for the extreme TPN values, the VaR values may help identify these extremes. The forecasted and VaR values are listed in Table 12.6.





Figure 12.6 Simulations of deseasonalized monthly TPN, monthly TPN, and monthly TPN forecast with 95% and 5% VaRs.




Table 12.6. Forecast and VaR results computed from the fitted copula-based Markov model (TPN and DO in mg/L).

Date TPN-Observed TPN-Forecast TPN-5%VaR TPN-95%VaR DO-Observed DO-Forecast DO-5%VaR DO-95%VaR
Jan-12 0.356 0.253 0.176 0.366 13.2 13.29 12.58 13.80
Feb-12 0.191 0.242 0.141 0.382 12.8 13.12 12.37 13.84
Mar-12 0.233 0.195 0.135 0.284 12.8 12.69 12.02 13.47
Apr-12 0.123 0.165 0.115 0.236 12.83 12.29 11.81 12.84
May-12 0.115 0.089 0.058 0.136 12.73 12.24 11.64 12.81
Jun-12 0.094 0.068 0.044 0.101 12.2 11.86 11.04 12.50
Jul-12 0.041 0.080 0.046 0.126 11.2 11.02 10.17 11.70
Aug-12 0.061 0.072 0.021 0.146 10.8 10.40 9.46 11.28
Sep-12 0.081 0.124 0.076 0.195 10.8 10.71 9.84 11.55
Oct-12 0.21 0.175 0.091 0.298 11.6 11.44 10.64 12.28
Nov-12 0.227 0.231 0.144 0.359 12.6 12.35 11.11 13.69
Dec-12 0.42 0.256 0.182 0.366 12 12.78 12.08 13.55
Jan-13 0.337 0.297 0.207 0.420 13.3 12.87 12.34 13.51
Feb-13 0.244 0.237 0.137 0.375 12.5 12.76 12.03 13.58
Mar-13 0.175 0.205 0.142 0.295 12.8 12.67 12.01 13.46
Apr-13 0.171 0.151 0.105 0.221 12.2 12.25 11.77 12.80
May-13 0.067 0.100 0.066 0.148 11.8 12.08 11.57 12.67
Jun-13 0.058 0.058 0.037 0.089 11.4 11.38 10.75 12.14
Jul-13 0.084 0.067 0.037 0.111 9.8 10.50 9.82 11.31
Aug-13 0.226 0.095 0.040 0.175 9.6 9.89 9.05 10.90
Sep-13 0.177 0.163 0.104 0.244 10 10.16 9.38 11.04
