Abstract
In previous chapters, we have discussed the Archimedean and non-Archimedean copula families. In this chapter, we will introduce entropic copulas. To be more specific, we will concentrate on the entropic copulas (i.e., the most entropic canonical copulas) for the bivariate case. With proper constraints (e.g., pairwise rank-based correlation coefficients), the bivariate entropic copula may be easily extended to higher dimensions.
8.1 Entropy Theory and Its Application
Entropy theory has been widely applied to univariate frequency analysis for obtaining the most probable probability distribution of a random variable, i.e., the so-called maximum entropy (MaxEnt)–based distribution. The MaxEnt-based distribution is derived with the use of the principle of maximum entropy (Jaynes, 1957a, 1957b), subject to given constraints for the random variable, e.g., the first moment, second moment, first moment in the logarithm domain, etc. The univariate MaxEnt-based distribution is capable of capturing the shape, mode, as well as the tail of the univariate random variable, since the first four noncentral moments of the random variable almost fully approximate its probability density function. In a similar vein, entropy theory can also be employed for multivariate hydrological frequency analysis. Conventionally, the MaxEnt-based joint distributions are constructed with the use of covariance (or Pearson’s linear correlation coefficient) as constraints (Singh and Krstanovic, 1987; Krstanovic and Singh, 1993a, b; Hao and Singh, 2011; Singh et al., 2012; Singh, 2013, 2015). With copulas gaining popularity in bivariate/multivariate frequency analysis in hydrology and water resources engineering (Favre, 2004; De Michele et al., 2005; Kao and Govindaraju, 2007; Vandenberghe et al., 2011; Zhang and Singh, 2012), entropy theory has been introduced to copula-based bivariate/multivariate frequency analysis. Entropy-based copula modeling may be grouped into two general approaches:
1. The marginal distributions are derived with the use of the maximum entropy principle (i.e., MaxEnt-based marginals), and the dependence structure is studied with the use of parametric copulas (e.g., Hao and Singh, 2012; Zhang and Singh, 2012).
2. The dependence function (i.e., copula function) is also derived from the entropy theory (e.g., Chu, 2011).
In the following sections, we will first briefly introduce the Shannon entropy (Shannon, 1948) followed by the derivation of entropic copula.
8.2 Shannon Entropy
In general, entropy is a measure of the uncertainty or information content of a random variable or its underlying probability distribution, and Shannon entropy (Shannon, 1948) is one such measure. In concert with the principle of maximum entropy, the MaxEnt-based distribution, derived by maximizing the Shannon entropy subject to given constraints, is the least biased and most probable distribution. The Shannon entropy for a continuous univariate random variable X can be written as follows:
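Written with the natural logarithm, this takes the standard form

$$H(X) = -\int_{-\infty}^{\infty} f(x)\,\ln f(x)\,dx \qquad (8.1)$$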
where H denotes the Shannon entropy, and f(x) denotes the probability density function of random variable X.
The commonly applied constraints to derive the MaxEnt-based distribution from Equation (8.1) may be the following:
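For instance, in addition to the total probability condition, the noncentral moments (and, for some distributions, the first moment in the logarithm domain) may be imposed; a typical set is

$$\int f(x)\,dx = 1,\qquad \int x^{r} f(x)\,dx = E\!\left(x^{r}\right),\ r = 1,2,\ldots,\qquad \int \ln x\, f(x)\,dx = E(\ln x) \qquad (8.1a)$$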
Similarly, the Shannon entropy for the continuous bivariate variables X and Y can be written as follows:
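Analogously to Equation (8.1),

$$H(X,Y) = -\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f(x,y)\,\ln f(x,y)\,dx\,dy \qquad (8.2)$$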
Besides the constraints defined in Equation (8.1a) for a continuous univariate random variable, the other common constraints to derive the MaxEnt-based joint density function f(x, y) are as follows:
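In particular, the total probability of the joint density and the product moment may be imposed:

$$\int\!\!\int f(x,y)\,dx\,dy = 1,\qquad \int\!\!\int xy\, f(x,y)\,dx\,dy = E(xy) \qquad (8.2a)$$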
E(xy) in Equation (8.2a) can be written through the covariance (i.e., dependence) between random variables X and Y as follows:
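That is, using the definition of covariance,

$$E(xy) = \operatorname{cov}(X,Y) + E(x)\,E(y)$$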
One may refer to Singh (1998, 2013, 2015) in regard to its classical application and parameter estimation. In the section that follows, we will focus on the entropy application to copulas.
8.3 Entropy and Copula
In the previous chapters, we have shown that the joint probability density function may be expressed through the copula density function (i.e., c(u, v)) as follows:
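With u = FX(x) and v = FY(y) denoting the marginal nonexceedance probabilities, this factorization reads

$$f(x,y) = c\big(F_X(x),F_Y(y)\big)\,f_X(x)\,f_Y(y) \qquad (8.3)$$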
where fX, fY and FX, FY represent, respectively, the probability density functions (pdf) and distribution functions (cdf) of random variables X and Y; f(x, y) denotes the joint probability density function (jpdf) of random variables X and Y; and c(u, v) denotes the copula density of random variables X and Y.
Equation (8.3) shows that the dependence function and marginal distributions of bivariate random variables can be investigated separately. The Shannon entropy of the copula function may be written as follows:
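Defined on the unit square, it is

$$H(c) = -\int_{0}^{1}\!\!\int_{0}^{1} c(u,v)\,\ln c(u,v)\,du\,dv \qquad (8.4)$$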
Substituting Equation (8.3) into Equation (8.4), we can show that the Shannon entropy of the copula (i.e., Equation (8.4)) is equivalent to the negative mutual information of random variables X and Y as follows:
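Using Equation (8.3) together with the change of variables u = FX(x), v = FY(y),

$$H(c) = -\int\!\!\int f(x,y)\,\ln\frac{f(x,y)}{f_X(x)\,f_Y(y)}\,dx\,dy = -I(X;Y) \qquad (8.5)$$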
The entropic copula can be derived by maximizing the Shannon entropy of the copula (i.e., Equation (8.4)) subject to appropriate constraints. The common constraints for deriving the most entropic copulas are the constraint of total probability, the moment constraints of the marginals (i.e., for variables uniformly distributed on [0, 1]), and measures of dependence (also called association):
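Writing the first m noncentral moments of the uniform marginals and k dependence measures aj(u, v) with target values Θj (the notation used in what follows), these constraints may be expressed as

$$\int_{0}^{1}\!\!\int_{0}^{1} c(u,v)\,du\,dv = 1 \qquad (8.6a)$$

$$\int_{0}^{1}\!\!\int_{0}^{1} u^{r}\,c(u,v)\,du\,dv = \frac{1}{r+1},\quad r = 1,\ldots,m \qquad (8.6b)$$

$$\int_{0}^{1}\!\!\int_{0}^{1} v^{r}\,c(u,v)\,du\,dv = \frac{1}{r+1},\quad r = 1,\ldots,m \qquad (8.6c)$$

$$\int_{0}^{1}\!\!\int_{0}^{1} a_{j}(u,v)\,c(u,v)\,du\,dv = \Theta_{j},\quad j = 1,\ldots,k \qquad (8.6d)$$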
In Equation (8.6d), Spearman’s rho can be applied as the constraint to measure the dependence if aj(u, v) = uv with Θj = (ρs + 3)/12. From Equation (3.69), it is clear that with aj(u, v) = uv, we have ∫₀¹∫₀¹ uv c(u, v) du dv = (ρs + 3)/12. One can also apply other dependence measures, such as Blest’s measure and Gini’s gamma, discussed in Nelsen (2006) and Chu (2011). Additionally, Equations (8.6b) and (8.6c) indicate that we do not need to know the true underlying marginal distributions to solve for the multipliers of the constraints regarding the marginal variables, since the CDF of any marginal distribution follows the uniform distribution on [0, 1].
Using the constraints (Equations (8.6a)–(8.6d)), the Lagrangian function for the most entropic canonical copula (MECC) can be written as follows:
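With the constraint targets as in Equations (8.6a)–(8.6d), one common way of writing the Lagrangian (sign conventions vary between authors) is

$$L = -\int_{0}^{1}\!\!\int_{0}^{1} c\,\ln c\,du\,dv - (\lambda_{0}-1)\!\left(\int_{0}^{1}\!\!\int_{0}^{1} c\,du\,dv - 1\right) - \sum_{r=1}^{m}\lambda_{r}\!\left(\int_{0}^{1}\!\!\int_{0}^{1} u^{r}c\,du\,dv - \frac{1}{r+1}\right) - \sum_{r=1}^{m}\gamma_{r}\!\left(\int_{0}^{1}\!\!\int_{0}^{1} v^{r}c\,du\,dv - \frac{1}{r+1}\right) - \sum_{j=1}^{k}\lambda_{m+j}\!\left(\int_{0}^{1}\!\!\int_{0}^{1} a_{j}(u,v)\,c\,du\,dv - \Theta_{j}\right) \qquad (8.7)$$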
where λ0, …, λm, γ1, …, γm, λm+1, …, λm+k are the Lagrange multipliers. λU = [λ1, …, λm] and γV = [γ1, …, γm] are the Lagrange multipliers for the first m noncentral moments of the uniformly distributed (on [0, 1]) random variables U and V, respectively. More specifically, for the MECC, λU = γV, i.e., λr = γr, r = 1, …, m. λm+1, …, λm+k are the Lagrange multipliers pertaining to the constraints of the rank-based dependence measures.
Differentiating Equation (8.7) with respect to c(u, v) and setting the derivative equal to zero, we have the following:
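Solving ∂L/∂c = 0 for c(u, v) yields the exponential form (with the sign convention of the Lagrangian above)

$$c(u,v) = \exp\!\left(-\lambda_{0} - \sum_{r=1}^{m}\lambda_{r}u^{r} - \sum_{r=1}^{m}\gamma_{r}v^{r} - \sum_{j=1}^{k}\lambda_{m+j}\,a_{j}(u,v)\right) \qquad (8.8)$$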
Similar to the univariate MaxEnt-based distribution, the partition (also called potential) function of the entropic copula can be written as follows:
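Substituting Equation (8.8) into the total probability constraint (Equation (8.6a)) gives

$$\exp(\lambda_{0}) = \int_{0}^{1}\!\!\int_{0}^{1}\exp\!\left(-\sum_{r=1}^{m}\lambda_{r}u^{r} - \sum_{r=1}^{m}\gamma_{r}v^{r} - \sum_{j=1}^{k}\lambda_{m+j}\,a_{j}(u,v)\right)du\,dv \qquad (8.9a)$$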
or equivalently
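in logarithmic form,

$$\lambda_{0} = \ln\int_{0}^{1}\!\!\int_{0}^{1}\exp\!\left(-\sum_{r=1}^{m}\lambda_{r}u^{r} - \sum_{r=1}^{m}\gamma_{r}v^{r} - \sum_{j=1}^{k}\lambda_{m+j}\,a_{j}(u,v)\right)du\,dv \qquad (8.9b)$$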
In Equations (8.9a) and (8.9b), λ0 is expressed as a function of the remaining Lagrange multipliers. To this end, the Lagrange multipliers may be estimated by minimizing the partition function given as Equations (8.9a)–(8.9b).
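As an illustration of this estimation step, the following is a minimal numerical sketch in Python (using NumPy/SciPy, not the MATLAB implementation referred to later in the example). It discretizes the unit square, assumes the sign convention of Equation (8.8), and uses an illustrative constraint set (the first two noncentral moments of U and V plus E(UV), with an assumed Spearman's rho of 0.75); the Nelder–Mead routine plays the role of fminsearch.

```python
# A sketch of estimating the Lagrange multipliers of the entropic copula by
# minimizing the dual (partition) function on a discretized unit square.
# Constraint functions g_i(u, v) and target values theta_i are illustrative.

import numpy as np
from scipy.optimize import minimize

# Midpoint-rule quadrature grid on [0, 1]^2
n = 100
u = (np.arange(n) + 0.5) / n
U, V = np.meshgrid(u, u, indexing="ij")
dA = 1.0 / n**2

# Constraints: E(U), E(U^2), E(V), E(V^2), E(UV)
g = [U, U**2, V, V**2, U * V]
rho_s = 0.75                                   # assumed sample Spearman's rho
theta = np.array([1/2, 1/3, 1/2, 1/3, (rho_s + 3.0) / 12.0])

def dual(lam):
    """Dual objective: lambda_0(lam) + sum_i lam_i * theta_i (convex in lam)."""
    expo = -sum(l * gi for l, gi in zip(lam, g))
    lam0 = np.log(np.sum(np.exp(expo)) * dA)   # enforces total probability = 1
    return lam0 + float(np.dot(lam, theta))

res = minimize(dual, x0=np.zeros(len(g)), method="Nelder-Mead",
               options={"maxiter": 20000, "maxfev": 20000,
                        "xatol": 1e-8, "fatol": 1e-12})
lam = res.x
lam0 = np.log(np.sum(np.exp(-sum(l * gi for l, gi in zip(lam, g)))) * dA)

# Copula density on the grid and a check that the constraints are reproduced
c = np.exp(-lam0 - sum(l * gi for l, gi in zip(lam, g)))
fitted = [float(np.sum(gi * c) * dA) for gi in g]
print("lambda_0:", lam0, "multipliers:", lam)
print("fitted moments:", fitted, "targets:", theta)
```

At its minimum, the partial derivative of the dual objective with respect to each multiplier equals the target moment minus the moment under the fitted density, so the estimated multipliers reproduce the imposed constraints; this mirrors the estimation of the Lagrange multipliers by minimizing the partition function described above.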
So far, we have derived the MECC. The MECC may be generalized to most entropic copula (MEC) with respect to a given parametric copula (Chu, 2011). In the case of MEC, Equations (8.8), (8.9a) and (8.9b) can be rewritten as follows:
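One way of writing these relations, consistent with the description that follows (b a generic constant and c̃(u, v) the given copula), is

$$c(u,v) = \left[\tilde{c}(u,v)\right]^{b}\exp\!\left(-\lambda_{0} - \sum_{r=1}^{m}\lambda_{r}u^{r} - \sum_{r=1}^{m}\gamma_{r}v^{r} - \sum_{j=1}^{k}\lambda_{m+j}\,a_{j}(u,v)\right) \qquad (8.10)$$

$$\lambda_{0} = \ln\int_{0}^{1}\!\!\int_{0}^{1}\left[\tilde{c}(u,v)\right]^{b}\exp\!\left(-\sum_{r=1}^{m}\lambda_{r}u^{r} - \sum_{r=1}^{m}\gamma_{r}v^{r} - \sum_{j=1}^{k}\lambda_{m+j}\,a_{j}(u,v)\right)du\,dv \qquad (8.11a)$$

$$\lambda_{0} = \ln\int_{0}^{1}\!\!\int_{0}^{1}\exp\!\left(-\sum_{r=1}^{m}\lambda_{r}u^{r} - \sum_{r=1}^{m}\gamma_{r}v^{r} - \sum_{j=1}^{k}\lambda_{m+j}\,a_{j}(u,v)\right)du\,dv,\quad b = 0 \qquad (8.11b)$$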
In Equations (8.10) and (8.11a), b is a generic constant, and c̃(u, v) is the given copula. It is seen that the MECC is obtained by setting b = 0 (i.e., Equation (8.11b)). In what follows, we will provide examples to illustrate applications of MECC for bivariate cases.
The true copula modeling the dependence of random variables X and Y is the Gumbel–Hougaard copula with parameter θ = 2.5:
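In its standard form,

$$C(u,v) = \exp\!\left\{-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right\},\qquad \theta = 2.5$$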
i. Construct MECC using empirical marginals.
ii. Construct MECC using MaxEnt-based marginals.
iii. Construct MECC using the true underlying populations: X~Gamma(3, 4), Y~Gaussian(5, 3²).
iv. Compare the constructed MECC with the underlying copula function (i.e., the Gumbel–Hougaard copula with parameter θ = 2.5).
X | Y | X | Y | X | Y | X | Y |
---|---|---|---|---|---|---|---|
22.73 | 10.53 | 4.20 | 4.37 | 4.42 | 1.33 | 16.80 | 6.38 |
8.46 | 1.78 | 17.27 | 8.12 | 26.97 | 10.26 | 12.73 | 6.43 |
18.68 | 8.37 | 17.18 | 7.41 | 19.05 | 5.89 | 8.77 | 0.79 |
11.41 | 4.85 | 14.50 | 7.73 | 8.80 | 3.33 | 5.45 | 4.26 |
13.73 | 5.56 | 8.11 | 1.39 | 11.63 | 1.24 | 11.04 | 4.61 |
11.74 | 4.55 | 26.87 | 11.63 | 13.37 | 5.58 | 13.68 | 6.74 |
3.90 | 0.15 | 8.62 | 1.00 | 2.46 | −0.20 | 12.40 | 7.07 |
14.77 | 6.12 | 20.14 | 4.85 | 5.73 | 1.81 | 19.56 | 8.62 |
12.09 | 5.48 | 19.97 | 7.84 | 4.20 | −0.05 | 9.56 | 2.44 |
8.17 | 3.51 | 24.13 | 10.92 | 26.37 | 10.13 | 13.00 | 6.02 |
16.60 | 3.30 | 11.79 | 5.13 | 14.04 | 6.83 | 9.92 | 5.37 |
16.70 | 7.21 | 3.05 | 0.17 | 25.73 | 9.96 | 9.05 | 2.16 |
12.12 | 6.63 | 14.30 | 5.11 | 15.90 | 4.24 | 11.11 | 2.18 |
7.73 | 4.13 | 12.45 | 7.83 | 8.93 | 2.51 | 25.19 | 9.11 |
13.16 | 5.71 | 4.83 | 0.12 | 7.34 | 4.30 | 6.90 | 4.31 |
13.45 | 2.15 | 17.13 | 8.02 | 11.90 | 5.78 | 6.35 | 4.98 |
10.96 | 1.88 | 22.03 | 8.55 | 7.81 | 5.46 | 29.83 | 10.39 |
6.67 | 3.24 | 15.66 | 6.50 | 4.39 | −2.38 | 14.50 | 5.69 |
19.41 | 8.85 | 7.35 | 5.74 | 10.07 | 4.45 | 11.18 | 1.80 |
7.54 | 2.92 | 9.00 | 4.34 | 9.90 | 4.13 | 5.27 | 0.02 |
7.54 | 4.00 | 3.07 | −1.52 | 10.60 | 5.43 | 24.82 | 10.28 |
10.79 | 3.15 | 7.58 | 2.90 | 14.09 | 3.80 | 6.67 | 3.19 |
14.57 | 3.08 | 12.08 | 5.17 | 9.59 | 2.29 | 19.74 | 9.12 |
11.03 | 5.02 | 8.57 | 6.03 | 4.47 | 3.02 | 15.10 | 4.63 |
23.81 | 9.98 | 7.31 | 5.73 | 2.52 | 7.36 | 14.24 | 6.09 |
Solution: Before we proceed to build the MECC, we first plot in Figure 8.1 the histograms together with the frequencies computed from the true populations and from the MaxEnt-based probability distributions. The MaxEnt-based univariate distributions (plotted in Figure 8.1) will be further explained in later sections. One purpose of applying the empirical, true-population, and MaxEnt-based univariate distributions is to evaluate the impact of the marginals on the derived copula function.
Figure 8.1 Histograms and underlying true probability density functions.
Furthermore, throughout the example, the first two noncentral moments of the marginals (Equations (8.12a) and (8.12b)) and E(UV), which is one-to-one related to the rank-based correlation coefficient, Spearman’s rho (Equation (8.12c)), will be applied as the constraints for the MECC as follows:
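Following Equations (8.6a)–(8.6d) with m = 2 and a1(u, v) = uv, these constraints are

$$\int_{0}^{1}\!\!\int_{0}^{1} u^{r}\,c(u,v)\,du\,dv = \frac{1}{r+1},\quad r = 1,2 \qquad (8.12a)$$

$$\int_{0}^{1}\!\!\int_{0}^{1} v^{r}\,c(u,v)\,du\,dv = \frac{1}{r+1},\quad r = 1,2 \qquad (8.12b)$$

$$\int_{0}^{1}\!\!\int_{0}^{1} uv\,c(u,v)\,du\,dv = E(UV) = \frac{\hat{\rho}_{s}+3}{12} \qquad (8.12c)$$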
In Equation (8.12c), sample ρ̂s is computed using Equation (3.70).
In what follows, we will proceed with constructing MECC with different marginal distributions.
Construct MECC using empirical distribution
The empirical probabilities are computed with the use of the Weibull plotting-position formula (Equation (3.103)) and are partially listed in Table 8.2. Minimizing the partition (i.e., objective) function (Equation (8.9a)) using the MATLAB optimization toolbox (e.g., the GA or fminsearch functions), we obtain the Lagrange multipliers listed in Table 8.3, together with the relative differences between the moment constraints computed from the MECC (Equation (8.12)) and the corresponding sample moments. Figure 8.2 compares the constructed MECC with the empirical copula.
No. | Empirical (Weibull): X | Empirical (Weibull): Y | MaxEnt-based: X | MaxEnt-based: Y | Population: X~Gamma(3, 4) | Population: Y~Gaussian(5, 3²)
---|---|---|---|---|---|---
1 | 0.901 | 0.970 | 0.928 | 0.972 | 0.922 | 0.967 |
2 | 0.287 | 0.139 | 0.308 | 0.155 | 0.354 | 0.141 |
3 | 0.822 | 0.851 | 0.823 | 0.861 | 0.845 | 0.869 |
4 | 0.485 | 0.485 | 0.479 | 0.483 | 0.543 | 0.480 |
5 | 0.644 | 0.584 | 0.607 | 0.571 | 0.667 | 0.574 |
6 | 0.505 | 0.446 | 0.498 | 0.446 | 0.562 | 0.441 |
7 | 0.050 | 0.069 | 0.064 | 0.059 | 0.076 | 0.053 |
8 | 0.723 | 0.693 | 0.660 | 0.640 | 0.713 | 0.646 |
9 | 0.545 | 0.574 | 0.518 | 0.561 | 0.582 | 0.563 |
10 | 0.277 | 0.327 | 0.291 | 0.321 | 0.335 | 0.310 |
11 | 0.762 | 0.307 | 0.744 | 0.298 | 0.783 | 0.286 |
12 | 0.772 | 0.772 | 0.748 | 0.759 | 0.786 | 0.769 |
13 | 0.554 | 0.733 | 0.520 | 0.698 | 0.583 | 0.706 |
14 | 0.248 | 0.366 | 0.266 | 0.394 | 0.305 | 0.386 |
15 | 0.604 | 0.614 | 0.577 | 0.590 | 0.639 | 0.594 |
… | … | … | … | … | … | … |
… | … | … | … | … | … | … |
… | … | … | … | … | … | … |
86 | 0.396 | 0.545 | 0.393 | 0.547 | 0.451 | 0.549 |
87 | 0.356 | 0.188 | 0.342 | 0.186 | 0.394 | 0.172 |
88 | 0.465 | 0.198 | 0.462 | 0.188 | 0.525 | 0.174 |
89 | 0.941 | 0.891 | 0.965 | 0.911 | 0.950 | 0.915 |
90 | 0.178 | 0.406 | 0.219 | 0.416 | 0.250 | 0.409 |
91 | 0.149 | 0.495 | 0.189 | 0.499 | 0.214 | 0.498 |
92 | 0.990 | 0.960 | 0.999 | 0.968 | 0.979 | 0.964 |
93 | 0.693 | 0.604 | 0.647 | 0.587 | 0.702 | 0.590 |
94 | 0.475 | 0.149 | 0.466 | 0.156 | 0.530 | 0.143 |
95 | 0.119 | 0.050 | 0.132 | 0.054 | 0.147 | 0.048 |
96 | 0.931 | 0.950 | 0.961 | 0.964 | 0.947 | 0.961 |
97 | 0.168 | 0.287 | 0.206 | 0.286 | 0.234 | 0.273 |
98 | 0.861 | 0.901 | 0.857 | 0.911 | 0.870 | 0.915 |
99 | 0.733 | 0.465 | 0.676 | 0.455 | 0.727 | 0.451 |
100 | 0.673 | 0.683 | 0.633 | 0.636 | 0.690 | 0.642 |
Variable | | λ0 | λ1 | λ2 | λ3
---|---|---|---|---|---
X | Parameters | −0.134 | −2.666 | 5.116 | 0.000
 | Relative diff. | 3.11E−07 | 3.73E−07 | 2.61E−03 | 
Y | Parameters | 1.945 | −9.615 | 9.189 | 0.000
 | Relative diff. | 2.10E−07 | 3.68E−07 | 8.35E−04 | 
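To make the computation of the sample-based constraint values concrete, the following is a minimal sketch (in Python rather than the MATLAB toolbox used above) of step (i): the Weibull plotting-position probabilities and the Spearman's rho–based target E(UV) of Equation (8.12c). The array xy stands for the 100 (X, Y) pairs tabulated earlier; only the first few pairs are reproduced.

```python
# Empirical (Weibull) marginal probabilities and MECC constraint targets
# computed from the paired sample; illustrative, with the sample truncated.

import numpy as np
from scipy.stats import rankdata, spearmanr

xy = np.array([
    [22.73, 10.53], [8.46, 1.78], [18.68, 8.37], [11.41, 4.85],
    [13.73, 5.56], [11.74, 4.55], [3.90, 0.15], [14.77, 6.12],
    # ... remaining (X, Y) pairs from the sample ...
])
x, y = xy[:, 0], xy[:, 1]
n = len(xy)

# Weibull plotting position: rank / (n + 1), cf. the empirical columns of Table 8.2
u = rankdata(x) / (n + 1)
v = rankdata(y) / (n + 1)

# Sample Spearman's rho and the dependence constraint of Equation (8.12c)
rho_s, _ = spearmanr(x, y)
targets = {
    "E(U)": 0.5, "E(U^2)": 1.0 / 3.0,        # Equation (8.12a)
    "E(V)": 0.5, "E(V^2)": 1.0 / 3.0,        # Equation (8.12b)
    "E(UV)": (rho_s + 3.0) / 12.0,           # Equation (8.12c)
}
print(targets)
```

These targets can then be passed to a partition-function minimization of the kind sketched in Section 8.3.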