Abstract
In this chapter, copula modeling is applied to analyze compound extremes. The number of warm days (NWDs) and monthly precipitation are used for the case study. The time-varying generalized extreme value (GEV) distribution with a linear trend in the location parameter is applied to model the NWDs after the change point. The time-varying copula is applied to model the compound risk of hot and dry, as well as wet and cold, conditions.
14.1 Introduction
Extreme events (e.g., peak flow and heat waves) have conventionally been analyzed as univariate variables using distributions such as the generalized extreme value (GEV) distribution. These events have also been analyzed in bivariate (multivariate) frameworks considering their intrinsic characteristics (e.g., peak discharge, flood volume, and flood duration in flood frequency analysis; drought severity, duration, and interarrival time in drought frequency analysis). Such a multivariate framework uses these intrinsic properties to better represent the risk induced by the events. However, other variables (factors) may either increase or decrease the risk of occurrence of extreme events. For example, a heat wave (or high temperature) in general increases drought severity, stresses plant growth, increases evapotranspiration, and affects bacterial or viral activity. When more than one variable (or extremes of different types) is analyzed, the analysis of extremes is called compound (or concurrent) analysis. In what follows, we first briefly review recent studies.
Using the hypothesis that flood and sea surge are more likely to occur concurrently on the east coast of Britain than on the north coast, Svensson and Jones (2002) proposed the empirical dependence measure χ to evaluate the spatial dependence of flood, surge, or precipitation among different stations, as well as the dependence across variables, under the assumption that flood, surge, and precipitation are independent and identically distributed (i.i.d.) random variables. The χ dependence measure may be applied to investigate the concurrence of extremes, i.e., the probability of one variable being extreme given that the other one is extreme.
Hao et al. (2013) evaluated the occurrence of compound monthly precipitation and temperature extremes using data from the Climatic Research Unit and the University of Delaware, together with simulations from CMIP5 models. For precipitation and temperature, four combinations were considered: wet/warm (P75/T75); dry/warm (P25/T75); wet/cold (P75/T25); and dry/cold (P25/T25). Their investigation concluded that occurrences of wet/warm and dry/warm conditions have increased in some regions of the world, while occurrences of wet/cold and dry/cold conditions have decreased for a majority of the world.
Wahl et al. (2015) studied the compound flooding risk from storm surge and heavy rainfall for major coastal cities in the United States. Using rank-based correlation, their study revealed that the compounding flood risk was higher at the Atlantic/Gulf coast than at the Pacific coast. Additionally, the number of events increased due to the long-term sea level rise in the past century (Wahl et al., 2015). Using the copula theory, Miao et al. (2016) studied the stochastic relation of precipitation and temperature in the Loess Plateau in China.
Sedlmeier et al. (2016) investigated compound extremes under climate change. In their study, heavy precipitation and low temperature in winter, and high temperature and dry days in summer, were applied for compound extreme analysis using the Markov Chain method. Through the study, they were able to identify three regions that may be more likely to be impacted due to the future change in terms of heavy precipitation and low temperature in the winter. They also identified one region likely to be impacted by the future change of dry and hot summer. In this chapter, we will focus on applying the copula theory to analyze compound extremes.
14.2 Dataset
To illustrate the analysis, maximum daily temperature and daily precipitation were collected from NOAA at station USC00411720 (Choke Canyon Dam, Texas). The record starts in water year 1983 and covers October 1, 1983, through April 7, 2017. In the data collected from NOAA, five entire months of data were missing, as listed in Table 14.1.
Table 14.1. Months with entirely missing precipitation and temperature data.
Jan. 1985 | Oct. 1986 | Aug. 1988 | Dec. 1989 | Oct. 2003 |
To obtain the complete time series, the nearby station, i.e., USC00411337 (Calliham, Texas), close to USC00411720, is chosen to fill the missing precipitation and temperature. By replacing the missing precipitation and temperature with those at USC00411337, we see that the missing precipitation is successfully replaced. However, the missing temperature cannot be successfully filled for the months listed in Table 14.1 except for October 2003. Thus, to keep the continuity of daily precipitation and temperature, daily information starting from the calendar year of 1990 is applied for analysis.
Besides the missing values listed in Table 14.1, Table 14.2 lists the days with missing precipitation (and/or temperature) as well as the replaced values. These missing values are filled, with the rules as follows:
i. Replacing the missing precipitation (and/or temperature) with the available observation at USC00411337 on the same day;
ii. Otherwise, replacing the missing precipitation (and/or temperature) with the average of the values one day before and one day after at both stations. Using February 4, 2011, as an example, the missing temperature of that day is filled using the temperatures of February 3, 2011, and February 5, 2011, at both stations USC00411720 and USC00411337.
Table 14.2. Days of missing daily precipitation and temperature after 1990.
Date | Precipitation^a (mm/day) | Date | Precipitation^a (mm/day) | Date | Precipitation^a (mm/day) | Date | Temperature^b (°C) |
---|---|---|---|---|---|---|---|
01/13/1997 | 0 | 02/13/2012 | 0.5 | 03/29/2012 | 39.1 | 02/04/2011 | 2.2 |
09/18/2011 | 0 | 03/09/2012 | 5.1 | 07/11/2012 | 5.1 | 05/05/2011 | 27.8 |
12/11/2011 | 14.5 | 03/10/2012 | 5.1 | 09/14/2012 | 44.5 | 05/25/2014 | 30.6 |
01/25/2012 | 9.1 | 03/11/2012 | 17.8 | 09/29/2012 | 81.3 | 04/05/2015 | 20 |
Note: a Applied rule (i); b applied rule (ii).
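The two filling rules above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: missing values are coded as `None`, the function name `fill_missing` is hypothetical, and the sample values are made up rather than taken from the station records.

```python
from statistics import mean


def fill_missing(target, neighbor):
    """Fill None entries of `target` (daily values at the main station).

    Rule (i): use the neighbor station's observation on the same day.
    Rule (ii): otherwise, average the available values one day before and
    one day after, taken from both stations.
    """
    filled = list(target)
    for day, value in enumerate(target):
        if value is not None:
            continue
        if neighbor[day] is not None:  # rule (i)
            filled[day] = neighbor[day]
            continue
        # rule (ii): adjacent days at both stations (original values only)
        candidates = [
            series[k]
            for series in (target, neighbor)
            for k in (day - 1, day + 1)
            if 0 <= k < len(series) and series[k] is not None
        ]
        if candidates:
            filled[day] = mean(candidates)
    return filled


# Made-up daily maximum temperatures (deg C) for two nearby stations.
station_a = [3.0, None, 5.0, None]
station_b = [2.0, None, 4.0, 6.0]
filled_a = fill_missing(station_a, station_b)
```

In this made-up example, the second day is missing at both stations and is filled by rule (ii), while the last day is filled by rule (i).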
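With missing daily precipitation and maximum temperature data filled, we may compute monthly precipitation and the number of warm days (NWD) for each month. The NWD is computed as follows:

$$\mathrm{NWD}_{i,j} = \sum_{k=1}^{n_j} \mathbf{1}\left(T_{i,j,k} > \bar{T}_j\right)$$

in which: $i, j$ represent the year and month of observation; $T_{i,j,k}$ represents the maximum temperature on day $k$ of that month; $n_j$ represents the number of days for month $j$; $\bar{T}_j$ represents the sample average monthly maximum temperature computed from the entire dataset; and $\mathbf{1}(\cdot)$ is the indicator function.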
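As a minimal sketch of the NWD computation, assume that a day counts as warm when its maximum temperature strictly exceeds the long-term average maximum temperature of the same calendar month. The function name and the input layout (a list of `(year, month, tmax)` tuples) are our own choices for illustration.

```python
from collections import defaultdict
from statistics import mean


def number_of_warm_days(records):
    """records: list of (year, month, tmax), one tuple per day on record.

    Returns {(year, month): NWD}. A day is counted as warm when its maximum
    temperature exceeds the average maximum temperature of that calendar
    month computed over the entire record (assumed definition).
    """
    by_month = defaultdict(list)
    for _, month, tmax in records:
        by_month[month].append(tmax)
    monthly_mean = {m: mean(values) for m, values in by_month.items()}

    nwd = defaultdict(int)
    for year, month, tmax in records:
        nwd[(year, month)] += int(tmax > monthly_mean[month])
    return dict(nwd)


# Made-up example: two Januaries with three days each.
sample = [(1990, 1, 10.0), (1990, 1, 20.0), (1990, 1, 30.0),
          (1991, 1, 20.0), (1991, 1, 40.0), (1991, 1, 60.0)]
warm_days = number_of_warm_days(sample)
```

Here the long-term January average is 30, so no day of January 1990 and two days of January 1991 are counted as warm.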
Figure 14.1 plots the individual time series and the scatter plot. The scatter plot indicates the negative relation between monthly precipitation and NWDs. The negative relation is supported by the rank-based sample Kendall’s tau coefficient of correlation, and we get $\tau_N \approx -0.38$. To assess the stationarity for the time series, the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) and Mann–Kendall tests are performed.
Figure 14.1 Time series of monthly precipitation and NWD.
The null hypothesis of the KPSS test is that the time series is trend stationary (or level stationary, i.e., no trend). The alternative hypothesis is that the time series is a unit-root process. To perform the KPSS test, the time series $\{X_t : t = 1, 2, \ldots, n\}$ is expressed as a sum of three components, namely a deterministic trend, a random walk, and a stationary residual, as follows:

$$X_t = \alpha t + r_t + e_{1t}, \qquad r_t = r_{t-1} + e_{2t} \tag{14.1}$$

In Equation (14.1), $\alpha$ is the coefficient of the deterministic trend, with $\alpha = 0$ for the test of level stationarity; $r_t$ represents the random walk; $e_{1t}$ represents the stationary process; and $e_{2t} \sim \text{i.i.d.}(0, \sigma^2)$. With Equation (14.1), the null hypothesis may be rewritten as follows:

$$H_0: \sigma^2 = 0 \quad \text{versus} \quad H_1: \sigma^2 > 0$$
To assess the stationarity of a univariate time series, we can directly apply the KPSS test function in MATLAB as follows: [h, pValue, stat, cValue] = kpsstest(X, 'lags', a, 'trend', true/false, 'alpha', alpha), where X is the time series tested; a is the number of lags considered; 'trend' set to true tests trend stationarity (default), and set to false tests level stationarity; and 'alpha' is the significance level (default = 0.05).
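For readers without access to MATLAB, the level-stationarity version of the KPSS statistic can be sketched by hand. The code below is not the MATLAB routine. It is a simplified illustration that assumes the usual form of the statistic (squared partial sums of the demeaned series, scaled by a Newey–West long-run variance with Bartlett weights), and the function name is ours.

```python
def kpss_level_statistic(x, lags=0):
    """KPSS statistic for the null hypothesis of level stationarity.

    The long-run variance uses Bartlett weights up to `lags` (Newey-West).
    """
    n = len(x)
    mean_x = sum(x) / n
    resid = [value - mean_x for value in x]

    # Partial sums of the demeaned series
    partial, running = [], 0.0
    for e in resid:
        running += e
        partial.append(running)

    # Long-run variance estimate
    lrv = sum(e * e for e in resid) / n
    for s in range(1, lags + 1):
        weight = 1.0 - s / (lags + 1.0)
        gamma = sum(resid[t] * resid[t - s] for t in range(s, n)) / n
        lrv += 2.0 * weight * gamma

    return sum(p * p for p in partial) / (n * n * lrv)


# Made-up series: one without a trend, one with a clear linear trend.
no_trend = [0.0, 1.0] * 50
with_trend = [float(t) for t in range(100)]
stat_no_trend = kpss_level_statistic(no_trend)
stat_with_trend = kpss_level_statistic(with_trend)
```

The statistic is compared with the critical value (0.463 at the 5% significance level for level stationarity, as listed in Table 14.3); values above the critical value lead to rejection of the null hypothesis.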
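Originally proposed by Mann (1945) and Kendall (1970), the nonparametric Mann–Kendall test evaluates whether there exists a monotonic trend in the dataset. The null hypothesis is that the data are i.i.d. random variables, with the alternative hypothesis that a monotonic trend exists in the dataset. The Mann–Kendall test statistic is computed using the S-score as follows:

$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sgn}\left(X_j - X_i\right), \qquad \operatorname{sgn}(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0 \end{cases} \tag{14.2a}$$

The S-score in Equation (14.2a) has the following statistics:

$$E(S) = 0, \qquad \mathrm{Var}(S) = \frac{1}{18}\left[n(n-1)(2n+5) - \sum_{j=1}^{p} t_j\left(t_j - 1\right)\left(2t_j + 5\right)\right] \tag{14.2b}$$

In Equation (14.2b), $n$ represents the sample size; $p$ represents the number of tied groups in the dataset; and $t_j$ represents the number of data in the $j$th tied group. Furthermore, the test statistic S may be transformed to a Z-score (i.e., following the standard normal distribution) as follows:

$$Z = \begin{cases} \dfrac{S - 1}{\sqrt{\mathrm{Var}(S)}}, & S > 0 \\ 0, & S = 0 \\ \dfrac{S + 1}{\sqrt{\mathrm{Var}(S)}}, & S < 0 \end{cases} \tag{14.2c}$$

The P-value can then be computed from the two-sided exceedance probability as follows:

$$P = 2\left[1 - \Phi\left(|Z|\right)\right] \tag{14.2d}$$

in which $\Phi$ is the standard normal distribution function.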
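The Mann–Kendall computation can be sketched as follows. This is a plain O(n²) illustration, assuming the standard tie-corrected variance and a two-sided P-value; the function name is ours.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def mann_kendall(x):
    """Return (S, Z, two-sided P-value) of the Mann-Kendall trend test."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)  # sign of the difference

    # Variance of S with the correction for tied groups
    ties = [count for count in Counter(x).values() if count > 1]
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18.0

    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return s, z, p_value


# Made-up, strictly increasing series of ten values.
s_score, z_score, p_value = mann_kendall(list(range(1, 11)))
```

Under the two-sided assumption, a Z-score of –1.99 gives P = 2[1 – Φ(1.99)] ≈ 0.047, which agrees with the value reported for the NWDs in Table 14.3.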
Based on the sample autocorrelation and partial autocorrelation plots shown in Figure 14.2, the KPSS test is performed up to a two-month lag for the monthly time series using the MATLAB function kpsstest. Table 14.3 lists the results of the KPSS test with the null hypothesis of level stationarity, and of the Mann–Kendall test with the null hypothesis of the observed data being i.i.d. random variables. Results listed in Table 14.3 show that (1) monthly precipitation may be viewed as a stationary time series (i.e., it is level stationary at all lags, and no monotonic trend is detected by the Mann–Kendall test); and (2) there exists a trend in the NWDs per month. Applying a linear regression of NWD with respect to the sequential month, we have $\mathrm{NWD} = b_1 + b_2 x$, $x = 1, \ldots, 327$, with $b_1 = 17.774$, $b_2 = -0.007$, and a P-value of 0.076. The P-value is slightly higher than 0.05, which means that the null hypothesis of no linear trend cannot be rejected at the significance level of $\alpha = 0.05$; however, the rejection by the Mann–Kendall test suggests that there may be a monotonic trend or a sudden change in the NWDs.
Figure 14.2 Sample autocorrelation and partial autocorrelation plots for monthly precipitation and number of the warm days.
Table 14.3. Results of KPSS and Mann–Kendall tests.
Variables | Lag | KPSS H | KPSS Stat. | KPSS Cri. | Mann–Kendall S_score | Mann–Kendall P-value |
---|---|---|---|---|---|---|
Precipitation | 0 | 0 | 0.059 | 0.463 | –0.6 | 0.5456 |
Precipitation | 1 | 0 | 0.054 | 0.463 | | |
Precipitation | 2 | 0 | 0.048 | 0.463 | | |
NWDs per month | 0 | 1 | 0.014 | 0.463 | –1.99 | 0.047 |
NWDs per month | 1 | 1 | 0.043 | 0.463 | | |
NWDs per month | 2 | 0 | 0.074 | 0.463 | | |
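In this case study, the Pettitt test (Pettitt, 1979) is applied to detect the change point of the NWDs. The Pettitt test is a version of the Mann–Whitney U-test. The null hypothesis of the Pettitt test is that there is no change point. Similar to the Mann–Kendall test, the U-score of the Pettitt test is given as follows:

$$U_{t,N} = \sum_{i=1}^{t} \sum_{j=t+1}^{N} \operatorname{sgn}\left(X_i - X_j\right), \qquad t = 1, \ldots, N-1 \tag{14.3a}$$

The test statistic is then given as follows:

$$K_N = \max_{1 \le t < N} \left|U_{t,N}\right| \tag{14.3b}$$

and the P-value is approximated as follows:

$$P \approx 2\exp\left(\frac{-6K_N^2}{N^3 + N^2}\right) \tag{14.3c}$$

In Equation (14.3), N is the sample size, and X is the observed series.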
Applying the Pettitt test, we detect the change point at month 150 (i.e., June 2002). Now, with the initial analysis, we can proceed to further analyze the monthly precipitation and NWDs.
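A compact sketch of the Pettitt test is given below. It assumes the standard form of the test, with U-score U_t = Σ_{i≤t} Σ_{j>t} sgn(X_i – X_j), statistic K = max|U_t|, and the approximation P ≈ 2 exp[–6K²/(N³ + N²)], and it uses the recursion U_t = U_{t–1} + Σ_j sgn(X_t – X_j) to avoid a triple loop. The function name and the test series are ours.

```python
from math import exp


def _sign(d):
    return (d > 0) - (d < 0)


def pettitt(x):
    """Return (change_point, K, approximate P-value).

    change_point is the 1-based index t of the last observation before
    the detected change (the split that maximizes |U_t|).
    """
    n = len(x)
    u, k_n, change_point = 0, 0, None
    for t in range(1, n):
        # Recursion: U_t = U_{t-1} + sum_j sgn(x_t - x_j)
        u += sum(_sign(x[t - 1] - xj) for xj in x)
        if abs(u) > k_n:
            k_n, change_point = abs(u), t
    p_value = min(1.0, 2.0 * exp(-6.0 * k_n ** 2 / (n ** 3 + n ** 2)))
    return change_point, k_n, p_value


# Made-up series with a step change after the 10th observation.
step_series = [0.0] * 10 + [1.0] * 10
change_point, k_stat, p_val = pettitt(step_series)
```

For the made-up step series, the maximum of |U_t| is reached at t = 10, i.e., at the last observation before the shift.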
14.3 Univariate Analysis of Monthly Precipitation and NWDs
In the previous section, we have shown that monthly precipitation is a stationary series, while there exists a change point at month 150 (June 2002) for the NWDs. Accordingly, the exponential distribution is fitted to monthly precipitation, and the time-varying GEV distribution is applied to model the NWDs. For the GEV distribution, we only consider a linear change in the location parameter. Table 14.4 lists the fitted parameters and goodness-of-fit (GoF) statistics for the fitted univariate distributions, and Figure 14.3 plots the histogram and fitted probability density functions as well as the change of the location parameter for the NWDs after month 150 (June 2002).
Table 14.4. Results of univariate analysis.
Variables | Period | Distribution | Parameters | GoF test stat. | GoF P-value |
---|---|---|---|---|---|
Monthly precipitation | | Exponential^a | $\mu = 64.65$ mm | 0.05 | 0.46 |
NWDs | Before June 2002 | GEV^b | $k = -0.32$, $s = 8.21$, $\mu = 15.70$ | 0.21 | 0.19 |
NWDs | After June 2002 | Trend | $\mu_t = 16.91 - 0.014t$, $t = 151:327$ | | |
Notes: a KS test for GoF evaluation; b generalized extreme value distribution.
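The fitted marginals in Table 14.4 can be evaluated with a few lines of code. The sketch below assumes that the GEV shape parameter k follows the sign convention in which k < 0 gives a bounded upper tail, and that the shape and scale stay fixed after the change point while only the location varies; both are assumptions on our part, as are the function names.

```python
from math import exp


def exponential_cdf(x, mean=64.65):
    """CDF of monthly precipitation (mm), exponential with the fitted mean."""
    return 1.0 - exp(-x / mean) if x > 0 else 0.0


def gev_location(month):
    """Location parameter: constant up to month 150, linear trend afterwards."""
    return 15.70 if month <= 150 else 16.91 - 0.014 * month


def gev_cdf(x, month, k=-0.32, s=8.21):
    """GEV CDF of the NWDs at a given sequential month (1..327)."""
    z = 1.0 + k * (x - gev_location(month)) / s
    if z <= 0.0:
        # Outside the support: above the upper bound (k < 0)
        # or below the lower bound (k > 0)
        return 1.0 if k < 0 else 0.0
    return exp(-z ** (-1.0 / k))


# Non-exceedance probability of 20 warm days, before and after the change.
p_before = gev_cdf(20.0, month=100)
p_after = gev_cdf(20.0, month=200)
```

Because the location parameter decreases after month 150, the non-exceedance probability of a given NWD value increases with time in this sketch.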
Figure 14.3 Fitted distributions for monthly precipitation, NWDs, as well as the change of location parameter of GEV distribution for NWDs after month 150 with moving window size 1.
14.4 Bivariate Analysis of Monthly Precipitation and NWDs
The bivariate analysis of monthly precipitation and NWDs is carried out with the use of copula theory. Unlike the stationary copula models applied in the previous chapters, a time-varying copula is applied to model monthly precipitation and NWDs. The time-varying copula may be written as $C(u, v; \theta_t)$, where the stationary copula is applied before June 2002 (month 150, before the change) and the time-varying copula with a moving average window size 1 is applied after the change. Figure 14.4 plots the sample Kendall’s tau coefficients for monthly precipitation and NWDs before the change, for the entire dataset (assuming the NWDs to be stationary), and with the moving average window size 1 after the change point. Figure 14.4 shows a decreasing trend after June 2002, i.e., monthly precipitation and NWDs become more negatively correlated, or equivalently longer (more severe) droughts may be expected with less precipitation.
Figure 14.4 Sample Kendall correlation coefficients computed.
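The sample Kendall's tau used above can be computed directly from its definition. The sketch below uses the tie-adjusted form (tau-b), which is an assumption on our part; it is a reasonable choice here because the NWDs are integer counts and therefore contain ties. The function name is ours.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Sample Kendall's tau with the tau-b adjustment for ties."""
    n = len(x)
    concordant = discordant = tied_x = tied_y = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0:
                tied_x += 1
            if dy == 0:
                tied_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    pairs = n * (n - 1) // 2
    return (concordant - discordant) / sqrt((pairs - tied_x) * (pairs - tied_y))


# Made-up example: precipitation (mm) decreasing as warm days increase.
precip = [120.0, 80.0, 60.0, 30.0]
warm = [5, 9, 14, 20]
tau = kendall_tau_b(precip, warm)
```

Applying the same function to the observations inside each window gives a windowed series of tau values of the kind plotted in Figure 14.4.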
With the negative Kendall correlation coefficient estimated, the Frank copula (Archimedean family) and meta-Student t and meta-Gaussian copulas (the meta-elliptic family) are applied to model the monthly precipitation and NWDs. The stationary copula is applied for the bivariate data before June 2002, while the time-varying copula is applied for the bivariate data after June 2002.
Applying the pseudo-MLE to the monthly precipitation and NWDs before June 2002, Table 14.5 lists the parameter and log-likelihood estimated for each copula candidate. It is seen from Table 14.5 that the meta-Student t copula converges to the meta-Gaussian copula (its estimated degrees of freedom are very large). Comparing the log-likelihood values obtained from all three candidates, the Frank copula yields the lowest value, and the meta-Student t copula offers almost no gain over the meta-Gaussian copula; therefore, the meta-Gaussian copula is applied to model the monthly precipitation and NWDs before June 2002 ($S_n^B = 0.028$, $P = 0.623$). Figure 14.5 compares simulated variables with observed variables before June 2002. The comparison shows that the Gaussian copula properly models monthly precipitation and NWDs before the change point.
Table 14.5. Estimated parameters and corresponding LogLs.
Copula | Frank | Gaussian | Student t |
---|---|---|---|
Parameters | –3.282 | –0.513 | $[-0.532,\ 4.67 \times 10^{6}]$ |
LogL | 19.25 | 22.86 | 22.91 |
Figure 14.5 Comparison of simulated variables with observed variables before June 2002.
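The pseudo-MLE step for the meta-Gaussian copula can be sketched as follows. The sketch assumes rank-based pseudo-observations r/(n + 1) and maximizes the Gaussian copula log-likelihood over the correlation parameter with a golden-section search, which presumes a single peak on (–1, 1). The function names and the synthetic data are ours, not the case-study data.

```python
from math import log
from statistics import NormalDist


def pseudo_observations(values):
    """Rescaled ranks r / (n + 1); tied values receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n + 1.0) for r in ranks]


def gaussian_copula_loglik(rho, a, b):
    """Log-likelihood of the Gaussian copula; a, b are normal scores."""
    det = 1.0 - rho * rho
    return sum(-0.5 * log(det)
               - (rho * rho * (ai * ai + bi * bi) - 2.0 * rho * ai * bi)
               / (2.0 * det)
               for ai, bi in zip(a, b))


def fit_gaussian_copula(x, y, tol=1e-6):
    """Pseudo-MLE of rho by golden-section search on (-0.999, 0.999)."""
    inv = NormalDist().inv_cdf
    a = [inv(u) for u in pseudo_observations(x)]
    b = [inv(v) for v in pseudo_observations(y)]
    lo, hi = -0.999, 0.999
    golden = (5 ** 0.5 - 1) / 2
    while hi - lo > tol:
        c = hi - golden * (hi - lo)
        d = lo + golden * (hi - lo)
        if gaussian_copula_loglik(c, a, b) > gaussian_copula_loglik(d, a, b):
            hi = d  # the maximum lies in [lo, d]
        else:
            lo = c  # the maximum lies in [c, hi]
    rho = (lo + hi) / 2
    return rho, gaussian_copula_loglik(rho, a, b)


# Synthetic, negatively associated sample (not the case-study data).
xs = list(range(1, 21))
ys = [20 - i + 3 * (-1) ** i for i in xs]
rho_hat, loglik = fit_gaussian_copula(xs, ys)
```

The log-likelihood equals zero at rho = 0 (independence), so a positive maximized value indicates an improvement over the independence copula.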
With the moving window size 1, the time-varying Gaussian copula is applied to monthly precipitation and NWDs after the change point, with the estimated parameters plotted in Figure 14.6. Figure 14.6 shows an overall decreasing trend similar to that of the Kendall correlation coefficient.
Figure 14.6 Parameters estimated after the change point with moving window size 1 (the meta-Gaussian copula).
14.5 Risk Analysis with Meta-Gaussian Copula
To assess the compound risk, 25 and 75 percentiles of monthly precipitation and NWDs are computed from the original dataset as follows:
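To assess the duration and severity of drought, one may look at two types of compound risks, i.e.,

$$P\left(X \le x_{25},\ Y > y_{75}\right) \tag{14.4a}$$

$$P\left(Y > y_{75} \mid X \le x_{25}\right) \tag{14.4b}$$

in which $X$ denotes monthly precipitation with 25th percentile $x_{25}$, and $Y$ denotes the NWDs with 75th percentile $y_{75}$. Equations (14.4a)–(14.4b) indicate the risk (probability) of the concurrent occurrence of dry conditions and warm days, as well as that of warm days conditioned on dry conditions. According to the return periods discussed in Chapter 3, Equations (14.4a)–(14.4b) may be rewritten as follows:

$$P\left(X \le x_{25},\ Y > y_{75}\right) = u_{25} - C\left(u_{25}, v_{75}; \theta_t\right) \tag{14.5a}$$

$$P\left(Y > y_{75} \mid X \le x_{25}\right) = \frac{u_{25} - C\left(u_{25}, v_{75}; \theta_t\right)}{u_{25}} \tag{14.5b}$$

in which $u_{25} = F_X(x_{25})$ and $v_{75} = F_Y(y_{75})$ are the values of the fitted marginal distributions at the two percentiles.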
In Equations (14.5a)–(14.5b), the fitted time-varying Gaussian copula is applied, in which the copula parameter is constant before June 2002 and changes with moving window size 1 after June 2002. Before June 2002, the joint probability is computed as P = 0.112 using Equation (14.5a), and the conditional probability is computed as P = 0.649 using Equation (14.5b), with the use of the stationary meta-Gaussian copula with parameter $\theta = -0.513$. After June 2002, the joint and conditional probabilities are computed for each moving window and plotted in Figure 14.7. Figure 14.7 shows that the joint probabilities of the concurrence of high NWDs and dry conditions are within the range of [0.092, 0.111] with an average of 0.099, and the conditional probabilities (i.e., high NWDs given dry weather conditions) are within the range of [0.532, 0.642] with an average of 0.578. Compared with the conditional probabilities, the joint probabilities are more stable. The risk of having abnormally many warm days in a month is higher given dry weather conditions (i.e., monthly precipitation at or below its 25th percentile).
Figure 14.7 Time-varying joint and conditional probability assessed with the time-varying Gaussian copula.
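The joint and conditional probabilities of the hot and dry condition can be sketched numerically for a Gaussian copula. With u the fitted marginal probability of precipitation at its 25th percentile and v that of the NWDs at their 75th percentile, the joint probability is u – C(u, v; θ) and the conditional probability is this value divided by u. The code below evaluates C by Simpson integration of the conditional normal distribution. The function names are ours, and the inputs u = 0.25 and v = 0.75 are placeholders, since the fitted marginal values at the percentiles are not reported in the text.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()


def gaussian_copula_cdf(u, v, rho, steps=2000):
    """C(u, v; rho) for 0 < u, v < 1 by Simpson's rule (steps must be even)."""
    upper = _STD.inv_cdf(u)
    b = _STD.inv_cdf(v)
    lower = -8.0  # effectively minus infinity for the standard normal
    if upper <= lower:
        return 0.0
    h = (upper - lower) / steps
    scale = sqrt(1.0 - rho * rho)

    def integrand(x):
        # density of the first variable times P(second <= b | first = x)
        return _STD.pdf(x) * _STD.cdf((b - rho * x) / scale)

    total = integrand(lower) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    return total * h / 3.0


def hot_dry_risk(u_dry, v_warm, rho):
    """Return (joint, conditional) probability of warm days and dry conditions."""
    joint = u_dry - gaussian_copula_cdf(u_dry, v_warm, rho)
    return joint, joint / u_dry


# Placeholder marginal probabilities with the pre-2002 copula parameter.
joint_prob, conditional_prob = hot_dry_risk(0.25, 0.75, -0.513)
```

The wet and cold case follows the same pattern, with the joint probability given by v – C(u, v; θ), where u is now evaluated at the 75th percentile of precipitation and v at the 25th percentile of the NWDs, and with 1 – u as the conditioning probability.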
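Wet and cold conditions are another compound risk in which one may be interested, especially in the case of a long cold and wet winter, using the following:

$$P\left(X > x_{75},\ Y \le y_{25}\right) \tag{14.6a}$$

$$P\left(Y \le y_{25} \mid X > x_{75}\right) \tag{14.6b}$$

in which $X$ denotes monthly precipitation with 75th percentile $x_{75}$, and $Y$ denotes the NWDs with 25th percentile $y_{25}$. Similar to the hot and dry conditions, Equations (14.6a)–(14.6b) may be rewritten as follows:

$$P\left(X > x_{75},\ Y \le y_{25}\right) = v_{25} - C\left(u_{75}, v_{25}; \theta_t\right) \tag{14.7a}$$

$$P\left(Y \le y_{25} \mid X > x_{75}\right) = \frac{v_{25} - C\left(u_{75}, v_{25}; \theta_t\right)}{1 - u_{75}} \tag{14.7b}$$

in which $u_{75} = F_X(x_{75})$ and $v_{25} = F_Y(y_{25})$ are the values of the fitted marginal distributions at the two percentiles.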
Similar to the risk analysis for the dry and hot condition, before June 2002 the joint probability using Equation (14.7a) is computed as P = 0.121, and the conditional probability using Equation (14.7b) is computed as P = 0.158, with the use of the stationary meta-Gaussian copula ($\theta = -0.513$). After June 2002, the joint and conditional probabilities are computed for each moving window using the time-varying meta-Gaussian copula and plotted in Figure 14.8.
Figure 14.8 Probability of compound risk for wet and cold.
In the case of the compound risk of wet and cold, Figure 14.8 shows that the joint probabilities are within the range of [0.129, 0.184] with an average of 0.156, and the conditional probabilities are within the range of [0.489, 0.700] with an average of 0.592. Again, the joint probabilities are more stable than the conditional probabilities. The probability of having fewer warm days in a month, given that monthly precipitation exceeds its 75th percentile, is higher than the joint probability of the two conditions occurring concurrently.
14.6 Summary
In this chapter, we have applied copula theory to compound risk analysis. Throughout the study, the NWDs and monthly precipitation are used to assess the following compound risks:
1. Hot and dry conditions, which may be considered as a compounding factor for severe drought
2. Cold and wet conditions, which may be considered as a compounding factor for cold winter
The application shows that the joint probabilities for both wet/cold and dry/warm conditions are smaller than the marginal exceedance probabilities, while the conditional probabilities for warm conditions given dry conditions and for cold conditions given wet conditions fall between the marginal exceedance probabilities. In addition, the study of compound risk may help to better investigate extreme events such as droughts and winter storms.