## Abstract

In this chapter, copula modeling is applied to analyze compound extremes. The number of warm days (NWDs) and monthly precipitation are used for the case study. A time-varying generalized extreme value (GEV) distribution with a linear trend in the location parameter is applied to model the NWDs after the change. A time-varying copula is applied to model the compound risk of hot and dry, as well as wet and cold, days.

### 14.1 Introduction

Extreme events (e.g., peak flow and heat waves) have conventionally been analyzed as univariate variables using distributions such as the generalized extreme value (GEV) distribution. These events have also been analyzed in bivariate (multivariate) frameworks that consider their intrinsic characteristics (e.g., peak discharge, flood volume, and flood duration in flood frequency analysis; drought severity, duration, and interarrival time in drought frequency analysis). Such a multivariate framework uses the intrinsic properties to better represent the risk induced by the events. However, other variables (factors) may either increase or decrease the risk of occurrence of extreme events. For example, a heat wave (or high temperature) in general increases drought severity, stresses plant growth, increases evapotranspiration, and impacts bacterial or viral activity. When more than one variable (or extremes of different types) is analyzed, the analysis of extremes is called compound (or concurrent) analysis. In what follows, we first briefly review recent studies.

Hypothesizing that flood and sea surge are more likely to occur concurrently on the east coast of Britain than on the north coast, Svensson and Jones (2002) proposed the *χ* empirical dependence measure to evaluate the spatial dependence of flood, surge, or precipitation among different stations, as well as the cross-variable dependence, under the assumption that flood, surge, and precipitation are independent and identically distributed (i.i.d.) random variables. The *χ* dependence measure may be applied to investigate the concurrence of extremes, i.e., the probability of one variable being extreme given that the other one is extreme.

Hao et al. (2013) evaluated the occurrence of compound monthly precipitation and temperature extremes using the data from the Climate Research Unit, the University of Delaware, and the simulations from CMIP5 models. For precipitation and temperature, four combinations were evaluated: wet/warm (P75/T75); dry/warm (P25/T75); wet/cold (P75/T25); and dry/cold (P25/T25). Their investigation found increasing occurrences of wet/warm and dry/warm extremes in some regions of the world and decreasing occurrences of wet/cold and dry/cold extremes for a majority of the world.

Wahl et al. (2015) studied the compound flooding risk from storm surge and heavy rainfall for major coastal cities in the United States. Using rank-based correlation, their study revealed that the compounding flood risk was higher at the Atlantic/Gulf coast than at the Pacific coast. Additionally, the number of events increased due to the long-term sea level rise in the past century (Wahl et al., 2015). Using the copula theory, Miao et al. (2016) studied the stochastic relation of precipitation and temperature in the Loess Plateau in China.

Sedlmeier et al. (2016) investigated compound extremes under climate change. In their study, heavy precipitation and low temperature in winter, and high temperature and dry days in summer, were applied for compound extreme analysis using the Markov Chain method. Through the study, they were able to identify three regions that may be more likely to be impacted due to the future change in terms of heavy precipitation and low temperature in the winter. They also identified one region likely to be impacted by the future change of dry and hot summer. In this chapter, we will focus on applying the copula theory to analyze compound extremes.

### 14.2 Dataset

To illustrate the analysis, daily maximum temperature and daily precipitation were collected from NOAA for station USC00411720 (Choke Canyon Dam, Texas). The data range from water year 1983 (starting October 1, 1983) to April 7, 2017. In the data collected from NOAA, there were five months of missing data, as listed in Table 14.1.

Table 14.1 Months with missing data at USC00411720.

| Jan. 1985 | Oct. 1986 | Aug. 1988 | Dec. 1989 | Oct. 2003 |
|---|---|---|---|---|

To obtain the complete time series, the nearby station, i.e., USC00411337 (Calliham, Texas), close to USC00411720, is chosen to fill the missing precipitation and temperature. By replacing the missing precipitation and temperature with those at USC00411337, we see that the missing precipitation is successfully replaced. However, the missing temperature cannot be successfully filled for the months listed in Table 14.1 except for October 2003. Thus, to keep the continuity of daily precipitation and temperature, daily information starting from the calendar year of 1990 is applied for analysis.

Besides the missing values listed in Table 14.1, Table 14.2 lists the days with missing precipitation (and/or temperature) as well as the replaced values. These missing values are filled, with the rules as follows:

i. Replacing the missing precipitation (and/or temperature) with the available observation at USC00411337 on the same day;

ii. Otherwise, replacing the missing precipitation (and/or temperature) with the average of the values one day before and one day after at both stations. Using February 4, 2011, as an example, the missing temperature of that day is filled using the temperatures of February 3, 2011, and February 5, 2011, at both stations USC00411720 and USC00411337.
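Rules (i) and (ii) above can be sketched in Python. This is only an illustration under stated assumptions: the function name `fill_missing`, the use of `None` to mark a missing day, and the choice to average whichever of the four neighboring values are available are all hypothetical and are not taken from the original analysis.

```python
def fill_missing(primary, neighbor):
    """Fill gaps in a daily series using a nearby station.

    primary, neighbor: equal-length lists of daily values, with None
    marking a missing observation (an assumed convention).
    """
    filled = list(primary)
    for k, value in enumerate(primary):
        if value is not None:
            continue
        # Rule (i): use the nearby station's observation on the same day.
        if neighbor[k] is not None:
            filled[k] = neighbor[k]
            continue
        # Rule (ii): average the day-before and day-after values at both
        # stations (here: whichever of those four values exist).
        candidates = []
        for series in (primary, neighbor):
            for idx in (k - 1, k + 1):
                if 0 <= idx < len(series) and series[idx] is not None:
                    candidates.append(series[idx])
        if candidates:
            filled[k] = sum(candidates) / len(candidates)
    return filled


# Day 2 is missing at both stations (rule ii); day 4 only at the primary (rule i).
primary = [10.0, None, 14.0, None, 20.0]
neighbor = [11.0, None, 13.0, 18.0, 21.0]
print(fill_missing(primary, neighbor))
```

Days that cannot be filled by either rule are left as `None`, so a remaining gap stays visible instead of being silently imputed.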

Table 14.2 Days with missing precipitation and/or temperature and the replaced values.

| Date | Precipitation^{a} (mm/day) | Date | Precipitation^{a} (mm/day) | Date | Precipitation^{a} (mm/day) | Date | Temperature^{b} (°C) |
|---|---|---|---|---|---|---|---|
| 01/13/1997 | 0 | 02/13/2012 | 0.5 | 03/29/2012 | 39.1 | 02/04/2011 | 2.2 |
| 09/18/2011 | 0 | 03/09/2012 | 5.1 | 07/11/2012 | 5.1 | 05/05/2011 | 27.8 |
| 12/11/2011 | 14.5 | 03/10/2012 | 5.1 | 09/14/2012 | 44.5 | 05/25/2014 | 30.6 |
| 01/25/2012 | 9.1 | 03/11/2012 | 17.8 | 09/29/2012 | 81.3 | 04/05/2015 | 20 |

*Note:* ^{a} Applied rule (i); ^{b} applied rule (ii).

With missing daily precipitation and maximum temperature data filled, we may compute monthly precipitation and the number of warm days (NWD) for each month. The NWD is computed as follows:

in which $i$ and $j$ represent the year and month of observation, $n_j$ represents the number of days in month $j$, and $\bar{T}_j$ represents the sample average monthly maximum temperature computed from the entire dataset.
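One reading consistent with these definitions is $\mathrm{NWD}_{i,j} = \sum_{k=1}^{n_j} \mathbf{1}\left(T_{i,j,k} > \bar{T}_j\right)$, where $T_{i,j,k}$ is the daily maximum temperature on day $k$. The sketch below follows that reading. The strict inequality, the day index $k$, the function name `warm_day_counts`, and the computation of $\bar{T}_j$ as the mean daily maximum temperature over all days in calendar month $j$ are assumptions made for illustration.

```python
def warm_day_counts(records):
    """Count warm days per (year, month).

    records: list of (year, month, tmax) tuples, one per day.
    A day is counted as warm when its tmax exceeds the long-term mean
    tmax of its calendar month (assumed threshold definition).
    """
    # Long-term mean of daily maximum temperature for each calendar month.
    totals = {}
    for _, month, tmax in records:
        total, count = totals.get(month, (0.0, 0))
        totals[month] = (total + tmax, count + 1)
    monthly_mean = {m: total / count for m, (total, count) in totals.items()}

    # Number of days above the monthly threshold, per (year, month).
    counts = {}
    for year, month, tmax in records:
        key = (year, month)
        counts[key] = counts.get(key, 0) + (1 if tmax > monthly_mean[month] else 0)
    return counts


# Two Januaries with two days each; the long-term January mean is 25.
sample = [(2000, 1, 10.0), (2000, 1, 20.0), (2001, 1, 30.0), (2001, 1, 40.0)]
print(warm_day_counts(sample))
```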

Figure 14.1 plots the individual time series and the scatter plot. The scatter plot indicates a negative relation between monthly precipitation and NWDs. The negative relation is supported by the rank-based sample Kendall’s tau coefficient of correlation, $\tau_N \approx -0.38$. To assess the stationarity of the time series, the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) and Mann–Kendall tests are performed.
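A rank correlation of this kind can be computed directly from pairs of observations. The sketch below uses the tie-adjusted tau-b form, which is an assumption: the text does not state which variant of Kendall's tau was used, and count data such as NWDs usually contain ties. The function name `kendall_tau_b` is illustrative.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences (O(n^2) pair scan)."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[j] - x[i]
            dy = y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both counts
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif (dx > 0) == (dy > 0):
                concordant += 1
            else:
                discordant += 1
    # Pairs untied in x and pairs untied in y form the normalizing terms.
    untied_x = concordant + discordant + tied_y_only
    untied_y = concordant + discordant + tied_x_only
    return (concordant - discordant) / math.sqrt(untied_x * untied_y)


# A perfectly decreasing relation gives tau = -1.
print(kendall_tau_b([1, 2, 3, 4], [4, 3, 2, 1]))
```

Applied to the monthly precipitation and NWD series, a negative value indicates that wetter months tend to have fewer warm days.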

Figure 14.1 Time series of monthly precipitation and NWD.

The null hypothesis of the KPSS test is that the time series is trend stationary (or level stationary, i.e., no trend). The alternative hypothesis of the KPSS test is that the time series is a unit-root process. To perform the KPSS test, the time series $\{X_t : t = 1, 2, \ldots, n\}$ is expressed as a sum of three components, a deterministic trend, a random walk, and a stationary residual, as follows:

$$
X_t = \alpha t + r_t + e_{1t}, \qquad r_t = r_{t-1} + e_{2t} \tag{14.1}
$$

In Equation (14.1), $\alpha$ is the coefficient of the deterministic trend, with $\alpha = 0$ for the test of level stationarity; $r_t$ represents the random walk; $e_{1t}$ represents the stationary process; and $e_{2t} \sim \text{i.i.d.}(0, \sigma^2)$. With Equation (14.1), the null hypothesis may be rewritten as follows:

$$
H_0: \begin{cases} \alpha \neq 0,\ \sigma^2 = 0, & \text{for trend stationary;} \\ \alpha = 0,\ \sigma^2 = 0, & \text{for level stationary} \end{cases}
$$

To assess the stationarity of a univariate time series, we can directly apply the KPSS test function in MATLAB: `[h, pValue, stat, cValue] = kpsstest(X, 'lags', a, 'trend', true/false, 'alpha', alpha)`, where *X* is the time series tested; *a* is the number of lags considered; `'trend'` set to true tests trend stationarity (default) and set to false tests level stationarity; and `'alpha'` is the significance level (default = 0.05). The outputs are the test decision *h*, the *P*-value, the test statistic, and the critical value.
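The statistic behind the test can also be sketched without MATLAB. The Python function below follows the standard KPSS construction: residuals from the sample mean (level case) or from a fitted linear trend (trend case), their partial sums, and a Bartlett-weighted long-run variance. The name `kpss_statistic` and its arguments are illustrative and do not mirror the MATLAB interface. The function returns only the statistic, which may be compared with the commonly tabulated 5% asymptotic critical values of 0.463 (level) and 0.146 (trend).

```python
def kpss_statistic(x, lags=0, trend=False):
    """KPSS test statistic for a sequence x (sketch, no p-value)."""
    n = len(x)
    t = list(range(1, n + 1))
    x_bar = sum(x) / n
    if trend:
        # Residuals from an ordinary least-squares fit of x on (1, t).
        t_bar = sum(t) / n
        sxx = sum((ti - t_bar) ** 2 for ti in t)
        slope = sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x)) / sxx
        resid = [xi - x_bar - slope * (ti - t_bar) for ti, xi in zip(t, x)]
    else:
        # Residuals from the sample mean (level stationarity).
        resid = [xi - x_bar for xi in x]

    # Partial sums of the residuals.
    partial, running = [], 0.0
    for e in resid:
        running += e
        partial.append(running)

    # Long-run variance estimate with Bartlett weights.
    s2 = sum(e * e for e in resid) / n
    for s in range(1, lags + 1):
        weight = 1.0 - s / (lags + 1.0)
        gamma = sum(resid[k] * resid[k - s] for k in range(s, n)) / n
        s2 += 2.0 * weight * gamma

    return sum(p * p for p in partial) / (n * n * s2)


# A steadily increasing series is far from level stationary.
ramp = list(range(1, 101))
print(kpss_statistic(ramp, lags=0, trend=False) > 0.463)
```

A series that is an exact linear function of time has zero residuals in the trend case, so the long-run variance is zero and the statistic is undefined; real data with noise do not hit this edge case.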

Originally proposed by Mann (1945) and Kendall (1970), the nonparametric Mann–Kendall test evaluates whether there exists a monotonic trend in the dataset. The null hypothesis is that the data are i.i.d. random variables, with the alternative hypothesis that a monotonic trend exists in the dataset. The Mann–Kendall test statistic is computed using the *S*-score as follows:
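In its standard form, the *S*-score sums the signs of all pairwise differences taken in time order, $S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sgn}(x_j - x_i)$, where $\operatorname{sgn}$ equals 1, 0, or −1 for positive, zero, or negative arguments. A minimal Python sketch of this form follows; the function name `mann_kendall_s` is illustrative, and tied values simply contribute zero.

```python
def mann_kendall_s(x):
    """Mann-Kendall S-score: sum of signs of all later-minus-earlier differences."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)  # sign of the difference: 1, 0, or -1
    return s


# A strictly increasing series of length 4 has all 6 pairs positive.
print(mann_kendall_s([1, 2, 3, 4]))
```

A large positive score points to an increasing trend and a large negative score to a decreasing one; a score near zero is consistent with the null hypothesis.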

The *S-*score in Equation (14.2a) has the following statistics: