3 – Copulas and Their Properties




Abstract




The term copula is derived from the Latin verb copulare, meaning “to join together.” In the statistics literature, the idea of a copula can be traced back to the nineteenth century in the modeling of multivariate non-Gaussian distributions. By formulating a theorem, now called Sklar’s theorem, Sklar (1959) laid the theoretical foundation for modern copula theory. In general, copulas couple multivariate distribution functions to their one-dimensional marginal distribution functions, which are uniformly distributed in [0, 1]. In other words, copula functions enable us to represent a multivariate distribution with the use of univariate probability distributions (sometimes simply called marginals, or margins), regardless of their forms or types. In this chapter, we will discuss the general concepts of copulas, including their definition, properties, composition and construction, dependence structure, and tail dependence.









3.1 Definition of Copulas


Based on Sklar’s theorem (Sklar, 1959), a copula has two or more dimensions. Let $d$ be the dimension of a copula. Then, a $d$-dimensional copula can be defined as a mapping $[0, 1]^d \to [0, 1]$, i.e., a multivariate cumulative distribution function defined on $[0, 1]^d$ with standard uniform univariate margins.


A copula has the following properties:




  1. Let $\mathbf{u} = [u_1, \ldots, u_d]$ with $u_i = F_i(x_i) \in [0, 1]$. If $u_i = 0$ for any $i \le d$ (at least one coordinate of $\mathbf{u}$ equals 0), then

    $$C(u_1, \ldots, u_d) = 0 \tag{3.1}$$



  2. $C(\mathbf{u}) = u_i$ if all the coordinates are equal to 1 except $u_i$, i.e.,

    $$C(1, 1, \ldots, u_i, \ldots, 1, 1) = u_i, \quad \forall\, i \in \{1, 2, \ldots, d\},\ u_i \in [0, 1] \tag{3.2}$$



  3. $C(u_1, \ldots, u_d)$ is bounded, i.e., $0 \le C(u_1, \ldots, u_d) \le 1$. This property reflects the fact that a joint cumulative distribution function takes values in the range $[0, 1]$.



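  4. $C(u_1, \ldots, u_d)$ is $d$-increasing. This means that the volume of any $d$-dimensional interval is nonnegative: for all $(a_1, \ldots, a_d), (b_1, \ldots, b_d) \in [0, 1]^d$ with $a_i \le b_i$,

    $$\sum_{i_1=1}^{2}\sum_{i_2=1}^{2}\cdots\sum_{i_d=1}^{2}(-1)^{i_1+i_2+\cdots+i_d}\, C(x_{1i_1}, x_{2i_2}, \ldots, x_{di_d}) \ge 0 \tag{3.3}$$

    where $x_{j1} = a_j$ and $x_{j2} = b_j$ for $j = 1, \ldots, d$. This property is the multivariate counterpart of the monotone increasing property of the cumulative probability distribution.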




  5. For every copula $C(u_1, \ldots, u_d)$ and every $(u_1, \ldots, u_d)$ in $[0, 1]^d$, the following version of the Fréchet–Hoeffding bounds holds:

    $$W(u_1, \ldots, u_d) \le C(u_1, \ldots, u_d) \le M(u_1, \ldots, u_d); \quad d \ge 2 \tag{3.4}$$

where $W(u_1, \ldots, u_d) = \max\left(1 - d + \sum_{i=1}^{d} u_i,\ 0\right)$ represents perfectly negatively dependent random variables, and $M(u_1, \ldots, u_d) = \min(u_1, \ldots, u_d)$ represents perfectly positively dependent random variables.


Here, we first explain the first two properties using bivariate flood variables, i.e., peak discharge ($Q$) and flood volume ($V$), as an example. Let $Q \sim F_Q(q)$ and $V \sim F_V(v)$, in which $F_Q \equiv u_1$ and $F_V \equiv u_2$ represent the probability distribution functions of $\{Q : Q \ge Q_{\min}\}$ and $\{V : V \ge V_{\min}\}$, respectively.


To explain property (1), we set $u_1 = F_Q(q)$ with $q > Q_{\min}$ and $u_2 = F_V(v \le V_{\min}) = 0$. We have $C(u_1, 0) = H(Q \le q, V \le V_{\min})$. Because the joint distribution is nondecreasing, the volume of the interval spanned by the points $(Q_{\min}, V_{\min})$ and $(q, V_{\min})$, i.e., by $(0, 0)$ and $(u_1, 0)$ in the copula domain, is identically zero. This means that when the flood volume does not exceed the minimum flood volume, the joint distribution is $H(Q \le q, V \le V_{\min}) = C(u_1, 0) \equiv 0$. Similarly, we have the following:

$$H(Q \le Q_{\min}, V < v) = C(0, u_2) \equiv 0.$$

To explain property (2), we will again use the bivariate flood variable (i.e., peak discharge and flood volume) as an example. Based on the probability theory, we have the following:



$$C(u_1, 1) = H(Q \le q, V < +\infty) = F_Q(q) \equiv u_1$$

and

$$C(1, u_2) = H(Q < +\infty, V \le v) = F_V(v) \equiv u_2$$



Example 3.1 Explain and prove the first three copula properties.


Solution: Proof of properties (1) and (2).


Properties (1) and (2) may be explained directly using the Fréchet–Hoeffding bounds.




  a. $C(u_1, \ldots, 0, \ldots, u_d) = 0$ if $u_i = 0$.

Since the copula $C(u_1, \ldots, u_d)$ represents the joint cumulative probability distribution of random variables $\{X_1, \ldots, X_d\}$, from Equation (3.4), we have the following:

$$W(u_1, \ldots, 0, \ldots, u_d) \le C(u_1, \ldots, 0, \ldots, u_d) \le M(u_1, \ldots, 0, \ldots, u_d)$$

For the lower bound,

$$W(u_1, \ldots, 0, \ldots, u_d) = \max\left(1 - d + \sum_{j=1}^{d} u_j,\ 0\right) = \max\left(1 - d + u_1 + \cdots + u_{i-1} + u_{i+1} + \cdots + u_d,\ 0\right)$$

Because every $u_j \in [0, 1]$, we have

$$u_1 + \cdots + u_{i-1} + u_{i+1} + \cdots + u_d \le d - 1$$

and therefore

$$1 - d + u_1 + \cdots + u_{i-1} + u_{i+1} + \cdots + u_d \le 1 - d + d - 1 = 0 \ \Rightarrow\ W(u_1, \ldots, 0, \ldots, u_d) = 0$$

For the upper bound,

$$M(u_1, \ldots, 0, \ldots, u_d) = \min(u_1, \ldots, 0, \ldots, u_d) = 0$$

Since both bounds equal zero, $C(u_1, \ldots, u_d) = 0$ whenever $u_i = 0$ for some $i \in \{1, \ldots, d\}$. This proves property (1) for a single coordinate equal to zero. The same argument shows that property (1) holds when more than one coordinate equals zero.




  b. $C(u_1, \ldots, u_d) = u_i$ if $u_j = 1$ for all $j \in \{1, \ldots, d\}$ with $j \ne i$.

Applying the Fréchet–Hoeffding bounds, we have the following:

$$W(u_1, \ldots, u_d) = \max\left(1 - d + (d - 1) + u_i,\ 0\right) = u_i, \qquad M(u_1, \ldots, u_d) = \min(1, \ldots, u_i, \ldots, 1) = u_i$$

Thus, we have $C(1, \ldots, 1, u_i, 1, \ldots, 1) = u_i$. This proves property (2).


Proof of property (3): Because the copula represents the joint cumulative probability distribution of $d$-dimensional random variables, its values must lie in $[0, 1]$. Property (5), i.e., the Fréchet–Hoeffding bounds, further ensures property (3), because $0 \le W \le C \le M \le 1$.




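Example 3.2 Illustrate a case for $d = 2$ in Equation (3.3) of property (4).


Solution: For $d = 2$, we have $(a_1, a_2), (b_1, b_2) \in [0, 1]^2$ with $a_1 \le b_1$ and $a_2 \le b_2$, as shown in Figure 3.1(a):

$$\sum_{i_1=1}^{2}\sum_{i_2=1}^{2}(-1)^{i_1+i_2}\, C(x_{1i_1}, x_{2i_2}) \ge 0 \tag{3.5}$$

Expanding the double sum gives

$$\begin{aligned}
\sum_{i_1=1}^{2}\sum_{i_2=1}^{2}(-1)^{i_1+i_2}\, C(x_{1i_1}, x_{2i_2})
&= \sum_{i_1=1}^{2}\left[(-1)^{i_1+1} C(x_{1i_1}, x_{21}) + (-1)^{i_1+2} C(x_{1i_1}, x_{22})\right] \\
&= (-1)^2 C(x_{11}, x_{21}) + (-1)^3 C(x_{11}, x_{22}) + (-1)^3 C(x_{12}, x_{21}) + (-1)^4 C(x_{12}, x_{22}) \\
&= C(x_{11}, x_{21}) - C(x_{11}, x_{22}) - C(x_{12}, x_{21}) + C(x_{12}, x_{22})
\end{aligned}$$

Therefore, with $x_{11} = a_1$, $x_{12} = b_1$, $x_{21} = a_2$, and $x_{22} = b_2$, Equation (3.5) becomes:

$$C(a_1, a_2) - C(a_1, b_2) - C(b_1, a_2) + C(b_1, b_2) \ge 0 \tag{3.6}$$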




Figure 3.1 Schematic plots: (a) Example 3.2 and (b) Example 3.3.




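Example 3.3 Illustrate a case for $d = 3$ in Equation (3.3) of property (4).


Solution: For $d = 3$ with $x_1, x_2, y_1, y_2, z_1, z_2 \in [0, 1]$, where $x_1 \le x_2$, $y_1 \le y_2$, and $z_1 \le z_2$ as shown in Figure 3.1(b),

$$\sum_{i_1=1}^{2}\sum_{i_2=1}^{2}\sum_{i_3=1}^{2}(-1)^{i_1+i_2+i_3}\, C(x_{1i_1}, x_{2i_2}, x_{3i_3}) \ge 0 \tag{3.7}$$

and

$$\begin{aligned}
\sum_{i_1=1}^{2}\sum_{i_2=1}^{2}\sum_{i_3=1}^{2}(-1)^{i_1+i_2+i_3}\, C(x_{1i_1}, x_{2i_2}, x_{3i_3})
={}& C(x_{12}, x_{22}, x_{32}) - C(x_{12}, x_{22}, x_{31}) - C(x_{12}, x_{21}, x_{32}) - C(x_{11}, x_{22}, x_{32}) \\
& + C(x_{12}, x_{21}, x_{31}) + C(x_{11}, x_{22}, x_{31}) + C(x_{11}, x_{21}, x_{32}) - C(x_{11}, x_{21}, x_{31}).
\end{aligned}$$

Using the notation in Figure 3.1(b) in Equation (3.7), i.e., $x_{11} = a_1$, $x_{12} = a_2$, $x_{21} = b_1$, $x_{22} = b_2$, $x_{31} = c_1$, and $x_{32} = c_2$, we have the following:

$$\begin{aligned}
& C(a_2, b_2, c_2) - C(a_2, b_2, c_1) - C(a_2, b_1, c_2) - C(a_1, b_2, c_2) \\
& \quad + C(a_2, b_1, c_1) + C(a_1, b_2, c_1) + C(a_1, b_1, c_2) - C(a_1, b_1, c_1) \ge 0; \quad (a_1, a_2), (b_1, b_2), (c_1, c_2) \in [0, 1]^2
\end{aligned} \tag{3.8}$$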

As introduced previously, copulas are multivariate distribution functions, and each copula induces a probability measure on $[0, 1]^d$. In the bivariate case, $C(a_1, a_2)$ can be expressed as the joint probability of the rectangle $[0, a_1] \times [0, a_2]$. Thus, for this rectangle, Equation (3.6) can be interpreted as follows:

$$C(a_1, a_2) - C(a_1, 0) - C(0, a_2) + C(0, 0) \ge 0 \tag{3.9}$$

Similarly, in the trivariate case, $C(a_1, a_2, a_3)$ can be expressed as the joint probability measure of the cube $[0, a_1] \times [0, a_2] \times [0, a_3]$. For this cube, Equation (3.8) can be interpreted as follows:

$$\begin{aligned}
& C(a_1, a_2, a_3) - C(a_1, a_2, 0) - C(a_1, 0, a_3) - C(0, a_2, a_3) \\
& \quad + C(a_1, 0, 0) + C(0, a_2, 0) + C(0, 0, a_3) - C(0, 0, 0) = C(a_1, a_2, a_3) \ge 0
\end{aligned} \tag{3.10}$$
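The inclusion-exclusion sum of Equation (3.3) can be evaluated mechanically for any dimension. The sketch below is only an illustration, not part of the original derivation: the helper names (`c_volume`, `product_copula`, `upper_bound`, `lower_bound`) and the test boxes are assumptions chosen for this example. It also evaluates the sum for $W$ with $d = 3$, which is known to fail the $d$-increasing condition for $d \ge 3$, so the lower bound is not itself a copula in that case.

```python
from itertools import product


def c_volume(copula, lower, upper):
    """Left-hand side of Eq. (3.3): signed sum of the copula over the box vertices."""
    d = len(lower)
    total = 0.0
    # Index 1 selects the lower end a_j, index 2 selects the upper end b_j.
    for picks in product((1, 2), repeat=d):
        vertex = [lower[j] if i == 1 else upper[j] for j, i in enumerate(picks)]
        total += (-1) ** sum(picks) * copula(vertex)
    return total


def product_copula(u):
    """Independence copula, Eq. (3.11)."""
    out = 1.0
    for ui in u:
        out *= ui
    return out


def upper_bound(u):
    """Frechet-Hoeffding upper bound M."""
    return min(u)


def lower_bound(u):
    """Frechet-Hoeffding lower bound W."""
    return max(1 - len(u) + sum(u), 0.0)


# Hypothetical boxes, chosen only for illustration.
vol2 = c_volume(product_copula, [0.2, 0.3], [0.6, 0.9])            # equals 0.4 * 0.6
vol3 = c_volume(product_copula, [0.1, 0.2, 0.3], [0.5, 0.6, 0.7])  # equals 0.4 ** 3
vol2_m = c_volume(upper_bound, [0.2, 0.3], [0.6, 0.9])
vol3_w = c_volume(lower_bound, [0.5, 0.5, 0.5], [1.0, 1.0, 1.0])   # negative

print(vol2, vol3, vol2_m, vol3_w)
```

For the product copula the volume is simply the product of the side lengths, which gives a quick sanity check on the sign convention $(-1)^{i_1 + \cdots + i_d}$.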



  6. Let $X_1, \ldots, X_d$ be random variables with margins $F_1, \ldots, F_d$ and joint distribution function $F(x_1, \ldots, x_d)$, and let $u_i = F_i(x_i)$, $i = 1, \ldots, d$. $X_1, \ldots, X_d$ are mutually independent if and only if $F(x_1, \ldots, x_d) = \prod_{i=1}^{d} F_i(x_i)$. In this case, the copula $C(u_1, \ldots, u_d)$ is called the independent or product copula and is defined as follows:

    $$C(u_1, \ldots, u_d) = \prod_{i=1}^{d} u_i \tag{3.11}$$


According to Sklar’s theorem, there exists a copula $C$ such that for all $x_i \in \mathbb{R} = (-\infty, +\infty)$, the relation between the joint cumulative distribution function $F(x_1, \ldots, x_d)$ and the copula $C(u_1, \ldots, u_d)$ can be expressed as follows:

$$F(x_1, \ldots, x_d) = P(X_1 \le x_1, \ldots, X_d \le x_d) = C(F_1(x_1), \ldots, F_d(x_d)) = C(u_1, \ldots, u_d) \tag{3.12}$$

where $u_i = F_i(x_i) = P(X_i \le x_i)$, $i = 1, \ldots, d$, and $u_i \sim U(0, 1)$ if $F_i$ is continuous. Another way to think about the copula is as follows:

$$C(u_1, \ldots, u_d) = F\left(F_1^{-1}(u_1), \ldots, F_d^{-1}(u_d)\right); \quad (u_1, \ldots, u_d) \in [0, 1]^d \tag{3.13}$$

where $x_i = F_i^{-1}(u_i)$ if $X_i$ is continuous.




Example 3.4 Illustrate Equation (3.12) using the Farlie–Gumbel–Morgenstern (FGM) model.


The FGM model is as follows:

$$f(x, y) = f_X(x) f_Y(y)\left[1 + \eta\,(2F_X(x) - 1)(2F_Y(y) - 1)\right]$$

Solution: The joint CDF (JCDF) of the FGM model above can be expressed as follows:

$$F(x, y) = F_X(x) F_Y(y)\left\{1 + \eta\,[1 - F_X(x)][1 - F_Y(y)]\right\}, \quad |\eta| \le 1$$

Let $u_1 = F_X(x)$ and $u_2 = F_Y(y)$; then we have the following:

$$F(x, y) = C(u_1, u_2) = u_1 u_2\left[1 + \eta\,(1 - u_1)(1 - u_2)\right], \quad |\eta| \le 1.$$

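The copula captures the essential features of the dependence of bivariate (multivariate) random variables. $C$ is essentially a function that connects the multivariate probability distribution to its marginals. The problem of determining $H$ (i.e., the joint cumulative distribution of correlated random variables) then reduces to one of determining $C$. Let $c(u_1, \ldots, u_d)$ denote the density function of copula $C(u_1, \ldots, u_d)$ as follows:

$$c(u_1, \ldots, u_d) = \frac{\partial^d C(u_1, \ldots, u_d)}{\partial u_1 \cdots \partial u_d} \tag{3.14a}$$

The mathematical relation between the copula density function $c(u_1, \ldots, u_d)$ and the joint density function $f(x_1, \ldots, x_d)$ can be expressed as follows:

$$\begin{aligned}
f(x_1, \ldots, x_d) &= \frac{\partial^d F(x_1, \ldots, x_d)}{\partial x_1 \cdots \partial x_d} = \frac{\partial^d C(F_1(x_1), \ldots, F_d(x_d))}{\partial x_1 \cdots \partial x_d} \\
&= \frac{\partial^d C(F_1(x_1), \ldots, F_d(x_d))}{\partial F_1(x_1) \cdots \partial F_d(x_d)}\, \frac{\partial F_1(x_1)}{\partial x_1} \cdots \frac{\partial F_d(x_d)}{\partial x_d} \\
&= \frac{\partial^d C(u_1, \ldots, u_d)}{\partial u_1 \cdots \partial u_d} \prod_{i=1}^{d} f_i(x_i) = c(u_1, \ldots, u_d) \prod_{i=1}^{d} f_i(x_i)
\end{aligned} \tag{3.14b}$$

where $f_i$ and $F_i$ are, respectively, the probability density function and the probability distribution function of random variable $X_i$.


Equation (3.14b) may be rewritten as follows:

$$c(u_1, \ldots, u_d) = \frac{f(x_1, \ldots, x_d)}{\prod_{i=1}^{d} f_i(x_i)} \tag{3.14c}$$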



Example 3.5 Using the FGM model in Example 3.4, derive the copula density function and its relation to joint density function.


Solution: From Example 3.4, the FGM model may be represented through the copula function as follows:


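$C(u_1, u_2) = u_1 u_2\left(1 + \eta(1 - u_1)(1 - u_2)\right)$. Then the copula density function can be derived using Equation (3.14a) as follows:

$$c(u_1, u_2) = \frac{\partial^2 C(u_1, u_2)}{\partial u_1 \partial u_2} = 1 + \eta(1 + 4u_1 u_2 - 2u_1 - 2u_2) = 1 + \eta(2u_1 - 1)(2u_2 - 1), \quad |\eta| \le 1 \tag{3.15a}$$

The relation between the copula density function $c(u_1, u_2)$ and the joint probability density function of the FGM model described in Example 3.4 can be expressed as follows:

$$f(x_1, x_2) = c(u_1, u_2) f_1(x_1) f_2(x_2) = f_1(x_1) f_2(x_2)\left[1 + \eta(2u_1 - 1)(2u_2 - 1)\right] \tag{3.15b}$$

where $u_i = F_i(x_i)$, $i = 1, 2$.


As an illustrative example, let $X_1 \sim \exp(\lambda)$ and $X_2 \sim \mathrm{gamma}(\alpha, \beta)$, with $\lambda$ and $\beta$ taken as rate parameters. The joint probability density function $f(x_1, x_2)$ may then be rewritten as follows:

$$f(x_1, x_2) = f_1(x_1) f_2(x_2)\left[1 + \eta(2u_1 - 1)(2u_2 - 1)\right] = \lambda e^{-\lambda x_1}\, \frac{\beta^{\alpha} x_2^{\alpha - 1}}{\Gamma(\alpha)}\, e^{-\beta x_2}\left[1 + \eta\left(1 - 2e^{-\lambda x_1}\right)\left(\frac{2\gamma(\alpha, \beta x_2)}{\Gamma(\alpha)} - 1\right)\right] \tag{3.15c}$$

where $\gamma(\cdot, \cdot)$ is the lower incomplete gamma function, so that $u_1 = 1 - e^{-\lambda x_1}$ and $u_2 = \gamma(\alpha, \beta x_2)/\Gamma(\alpha)$.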
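The step from the FGM copula to its density in Equation (3.15a) can be checked numerically. The following sketch is an illustration only: it compares the closed-form density with a central finite-difference estimate of the mixed derivative in Equation (3.14a). The function names and the values of $\eta$, $u_1$, and $u_2$ are assumptions chosen for this example.

```python
def fgm_cdf(u1, u2, eta):
    """FGM copula C(u1, u2) from Example 3.4."""
    return u1 * u2 * (1 + eta * (1 - u1) * (1 - u2))


def fgm_density(u1, u2, eta):
    """Closed-form FGM copula density, Eq. (3.15a)."""
    return 1 + eta * (2 * u1 - 1) * (2 * u2 - 1)


def mixed_difference(copula, u1, u2, h=1e-4):
    """Central finite-difference estimate of d^2 C / (du1 du2), cf. Eq. (3.14a)."""
    return (copula(u1 + h, u2 + h) - copula(u1 + h, u2 - h)
            - copula(u1 - h, u2 + h) + copula(u1 - h, u2 - h)) / (4 * h * h)


# Hypothetical parameter and evaluation point.
eta = 0.6
u1, u2 = 0.3, 0.8

numeric = mixed_difference(lambda a, b: fgm_cdf(a, b, eta), u1, u2)
exact = fgm_density(u1, u2, eta)
print(numeric, exact)
```

Because the FGM copula is a polynomial of degree two in each argument, the central difference agrees with the closed form up to floating-point rounding.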


3.1.1 Bivariate Copula


A bivariate copula $C(u_1, u_2)$ is a function from $[0, 1] \times [0, 1]$ into $[0, 1]$ that represents the joint cumulative probability distribution function of bivariate random variables. The following properties are deduced directly from the earlier discussion.


For every $u_1, u_2$ in $[0, 1]$:

$$C(u_1, 0) = C(0, u_2) = 0; \quad C(u_1, 1) = u_1, \quad C(1, u_2) = u_2 \tag{3.16}$$



  1. For every $u_{11} \le u_{12}$ and $u_{21} \le u_{22}$ in $[0, 1]$:

    $$C(u_{12}, u_{22}) - C(u_{12}, u_{21}) - C(u_{11}, u_{22}) + C(u_{11}, u_{21}) \ge 0 \tag{3.17}$$

Equation (3.17) represents the volume:

$$V_C(B) = \Delta_{u_{11}}^{u_{12}} \Delta_{u_{21}}^{u_{22}}\, C(u_1, u_2) \ge 0 \tag{3.18}$$

Equation (3.18) is the finite-difference counterpart of the second-order mixed derivative of the function $C(u_1, u_2)$ (Nelsen, 2006). Because $C$ represents the joint distribution of random variables $X$ and $Y$, i.e., $C(u_1, u_2) \equiv C(F_X(x), F_Y(y)) \equiv H(x, y)$, the second-order mixed derivative of $C(u_1, u_2)$ is the copula density function of the bivariate random variable, $c(u_1, u_2) = \dfrac{f(x, y)}{f_X(x) f_Y(y)} \ge 0$. This further explains Equations (3.17) and (3.18).




  2. When random variables $X_1$ and $X_2$ are independent, one obtains the so-called product copula:

    $$C(u_1, u_2) = H(x_1, x_2) = u_1 u_2, \quad u_i = F_i(x_i) \tag{3.19}$$

  3. For every $u_1, u_2$ in $[0, 1]$ with the corresponding copula $C(u_1, u_2)$, the following Fréchet–Hoeffding bounds hold:

    $$\max(u_1 + u_2 - 1, 0) \le C(u_1, u_2) \le \min(u_1, u_2) \tag{3.20}$$




Example 3.6 Express the bivariate Gaussian copula and its density function.


Solution: The bivariate Gaussian copula is a distribution over the unit square $[0, 1]^2$, which is constructed from the bivariate normal distribution through the probability integral transform.


For a given correlation matrix $R$, the bivariate Gaussian copula can be given as follows:

$$C_R^{GAU}(\mathbf{u}) = \Phi_R\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2)\right), \quad \mathbf{u} = (u_1, u_2) \tag{3.21}$$

where $\Phi^{-1}$ denotes the inverse cumulative distribution function of the standard normal distribution, and $\Phi_R$ denotes the joint cumulative distribution function of the bivariate normal distribution with a mean vector of zero and a covariance matrix of $R$.


The density function of the bivariate Gaussian copula, i.e., the bivariate normal density with correlation $\rho$ divided by the product of the two standard normal densities, can be given as follows:

$$c_R^{GAU}(\mathbf{u}) = \frac{1}{\sqrt{1 - \rho^2}} \exp\left(-\frac{\rho^2 x^{*2} - 2\rho x^{*} y^{*} + \rho^2 y^{*2}}{2(1 - \rho^2)}\right) \tag{3.22}$$

where $x^{*}, y^{*}$ are the transformed variables $x^{*} = \Phi^{-1}(u_1)$ and $y^{*} = \Phi^{-1}(u_2)$, and $\rho$ denotes the correlation coefficient of the bivariate random variable, which may be expressed through the Kendall correlation coefficient $\tau$ as follows:

$$\rho = \sin\left(\frac{\pi \tau}{2}\right) \tag{3.22a}$$

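It is worth noting that the Gaussian copula may also be called the meta-Gaussian distribution, with no constraints on the type of marginal distributions. In what follows, we further illustrate the bivariate Gaussian copula with two different marginal distributions:

$$X \sim N(\mu, \sigma^2), \quad Y \sim \exp(\lambda).$$

Let $u_1 = F_X(x) = N(x; \mu, \sigma^2)$ and $u_2 = F_Y(y) = 1 - \exp(-\lambda y)$. We have

$$x^{*} = \Phi^{-1}(u_1) = \Phi^{-1}\left(N(x; \mu, \sigma^2)\right), \quad y^{*} = \Phi^{-1}(u_2) = \Phi^{-1}\left(1 - \exp(-\lambda y)\right), \quad \rho = \sin\left(\frac{\pi \tau_{XY}}{2}\right).$$

Finally, we obtain the bivariate Gaussian copula and its density function as follows:

$$C^{GAU}(\mathbf{u}) = \Phi_R\left(\Phi^{-1}\left(N(x; \mu, \sigma^2)\right),\ \Phi^{-1}\left(1 - \exp(-\lambda y)\right)\right), \quad R = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$

$$c^{GAU}(\mathbf{u}) = \frac{1}{\sqrt{1 - \rho^2}} \exp\left(-\frac{\rho^2 x^{*2} - 2\rho x^{*} y^{*} + \rho^2 y^{*2}}{2(1 - \rho^2)}\right)$$

with $x^{*}$, $y^{*}$, and $\rho$ as given above.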

Consider a simple numerical example with the sample values $x = 2.5$ and $y = 4$ drawn from the probability distributions $X \sim N(0, 2^2)$ and $Y \sim \exp(0.5)$. The rank-based Kendall correlation coefficient of $X$ and $Y$ is $\tau_{XY} = 0.7$.


Applying Equation (3.22a), we may compute the Pearson correlation coefficient as follows: $\rho = \sin(\pi \tau / 2) = \sin(0.7\pi / 2) = 0.891$.


From the parent normal and exponential distributions, we can compute the transformed variables:

$$X \sim N(0, 2^2) \Rightarrow F_X(2.5) = N(2.5; 0, 2^2) = 0.8944 \Rightarrow x^{*} = \Phi^{-1}(F_X(2.5); 0, 1) = \Phi^{-1}(0.8944; 0, 1) = 1.25$$

$$Y \sim \exp(0.5) \Rightarrow F_Y(4) = 1 - \exp(-0.5 \times 4) = 0.8647 \Rightarrow y^{*} = \Phi^{-1}(F_Y(4); 0, 1) = \Phi^{-1}(0.8647; 0, 1) = 1.1015$$

Substituting $x^{*} = 1.25$, $y^{*} = 1.1015$, and $\rho = 0.891$ into the bivariate Gaussian copula and the corresponding density function, we have the following:

$$C^{GAU}(0.8944, 0.8647; 0.891) = 0.8406$$

$$c^{GAU}(0.8944, 0.8647; 0.891) = 4.0396$$
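The numerical example above can be reproduced with the Python standard library alone. This is a minimal sketch under stated assumptions rather than a reference implementation. The copula density is written as the bivariate normal density divided by the product of the two standard normal densities. The copula CDF is obtained by integrating $\varphi(x)\,\Phi\left((y^{*} - \rho x)/\sqrt{1 - \rho^2}\right)$ over $x \le x^{*}$ with Simpson's rule. The function names, the lower integration limit of $-8$, and the number of Simpson intervals are choices made for this illustration.

```python
from math import exp, pi, sin, sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def gaussian_copula_density(u1, u2, rho):
    """Bivariate normal density divided by the two standard normal densities."""
    x, y = STD.inv_cdf(u1), STD.inv_cdf(u2)
    q = (rho**2 * x**2 - 2 * rho * x * y + rho**2 * y**2) / (2 * (1 - rho**2))
    return exp(-q) / sqrt(1 - rho**2)


def gaussian_copula_cdf(u1, u2, rho, n=4000):
    """Phi_R(x*, y*) via Simpson's rule on phi(x) * Phi((y* - rho x) / s); n must be even."""
    x_star, y_star = STD.inv_cdf(u1), STD.inv_cdf(u2)
    s = sqrt(1 - rho**2)
    lo = -8.0  # assumption: normal mass below -8 is negligible
    h = (x_star - lo) / n

    def integrand(x):
        return STD.pdf(x) * STD.cdf((y_star - rho * x) / s)

    total = integrand(lo) + integrand(x_star)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(lo + k * h)
    return total * h / 3


tau = 0.7
rho = sin(pi * tau / 2)                  # Eq. (3.22a)
u1 = NormalDist(mu=0, sigma=2).cdf(2.5)  # F_X(2.5) for X ~ N(0, 2^2)
u2 = 1 - exp(-0.5 * 4)                   # F_Y(4) for Y ~ exp(0.5)

cdf_value = gaussian_copula_cdf(u1, u2, rho)
density_value = gaussian_copula_density(u1, u2, rho)
print(round(rho, 3), round(u1, 4), round(u2, 4), cdf_value, density_value)
```

The printed CDF and density should agree with the values reported in the text (about 0.84 and 4.04) to within rounding of the intermediate quantities.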


3.1.2 Trivariate Copula


A trivariate copula $C(u, v, w)$ is a function from $[0, 1]^3$ into $[0, 1]$. It should again satisfy all the properties discussed in the definition of the copula, so that the trivariate copula represents the joint cumulative probability distribution of trivariate random variables.

  1. For every $u, v, w$ in $[0, 1]$:

    $$C(0, v, w) = C(u, 0, w) = C(u, v, 0) = 0 \tag{3.23}$$

    $$C(u, 1, 1) = u, \quad C(1, v, 1) = v, \quad C(1, 1, w) = w \tag{3.24}$$

  2. For every $u_1 \le u_2$, $v_1 \le v_2$, and $w_1 \le w_2$ in $[0, 1]$:

    $$\begin{aligned}
    & C(u_2, v_2, w_2) - C(u_1, v_2, w_2) - C(u_2, v_1, w_2) - C(u_2, v_2, w_1) \\
    & \quad + C(u_1, v_1, w_2) + C(u_1, v_2, w_1) + C(u_2, v_1, w_1) - C(u_1, v_1, w_1) \ge 0
    \end{aligned} \tag{3.25}$$

Similar to the bivariate case, Equation (3.25) represents the volume:

$$V_C(B) = \Delta_{u_1}^{u_2} \Delta_{v_1}^{v_2} \Delta_{w_1}^{w_2}\, C(u, v, w) \ge 0 \tag{3.26}$$

For the function $C(u, v, w)$ to represent a trivariate joint distribution function, Equations (3.25) and (3.26) must hold as a necessary condition; that is, the copula density is nonnegative.




  3. When random variables $\{X_1, X_2, X_3\}$ are independent with $u = F_1(x_1)$, $v = F_2(x_2)$, and $w = F_3(x_3)$, one obtains the so-called product copula:

    $$C(u, v, w) = uvw \tag{3.27}$$

  4. For every $u, v, w$ in $[0, 1]$ with the copula function $C(u, v, w)$, the following Fréchet–Hoeffding bounds hold:

    $$\max(u + v + w - 2, 0) \le C(u, v, w) \le \min(u, v, w) \tag{3.28}$$

The CDF and PDF of the trivariate copula can be written as follows:

$$C(u, v, w) = F(x_1, x_2, x_3) \tag{3.29}$$

$$c(u, v, w) = \frac{\partial^3 C(u, v, w)}{\partial u\, \partial v\, \partial w} = \frac{f(x_1, x_2, x_3)}{f_1(x_1) f_2(x_2) f_3(x_3)} \tag{3.30}$$

Again, using trivariate flood variables, i.e., peak discharge ($Q$), flood volume ($V$), and flood duration ($D$), we may further illustrate these properties by setting the following:

$$u_1 = F_Q(q), \quad u_2 = F_V(v), \quad u_3 = F_D(d)$$

and

$$u_1 = 0 = F_Q(Q \le q_{\min}), \quad u_2 = 0 = F_V(V \le v_{\min}), \quad u_3 = 0 = F_D(D \le d_{\min}).$$

For property (1), we may evaluate $C(u_1, u_2, 0) = H(Q \le q, V \le v, D \le d_{\min})$ and $C(u_1, 1, 1) = H(Q \le q, V < +\infty, D < +\infty)$ as examples:

$$H(Q \le q, V \le v, D \le d_{\min}) = P(D \le d_{\min} \mid Q \le q, V \le v)\, P(Q \le q, V \le v) \tag{3.31}$$

With the assumption for flood variables, i.e., $\{(Q, V, D) \mid Q \ge q_{\min}, V \ge v_{\min}, D \ge d_{\min}\}$, we have $P(D \le d_{\min} \mid Q \le q, V \le v) = 0$ and $0 \le P(Q \le q, V \le v) < 1$, so that $H(Q \le q, V \le v, D \le d_{\min}) = 0 = C(u_1, u_2, 0)$.


From probability theory, $H(Q \le q, V < +\infty, D < +\infty)$ reduces to the marginal probability distribution of peak discharge, i.e., $F_Q(q)$. Thus, we obtain the following:

$$C(u_1, 1, 1) = u_1.$$

In the same way as for the bivariate case, property (2) may be explained through the copula density function. Equation (3.26) is the finite-difference counterpart of the third-order mixed derivative of the copula function $C(u_1, u_2, u_3)$, i.e., $c(u_1, u_2, u_3)$. Relating this density to the joint probability density function through Equations (3.14a)–(3.14c), it is clear that the left-hand sides of Equations (3.25) and (3.26) are nonnegative.




Example 3.7 Express the trivariate Gaussian copula and its density function.


Solution: The trivariate Gaussian copula is a distribution over the unit cube $[0, 1]^3$, which is constructed from the trivariate normal distribution through the probability integral transform.


For a given correlation matrix $R$, the trivariate Gaussian copula can be given as follows:

$$C_R^{GAU}(\mathbf{u}) = \Phi_R\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \Phi^{-1}(u_3)\right), \quad \mathbf{u} = (u_1, u_2, u_3) \tag{3.32}$$

where $\Phi^{-1}$ denotes the inverse cumulative distribution function of the standard normal distribution, and $\Phi_R$ denotes the joint cumulative distribution function of the trivariate normal distribution with a mean vector of zero and a covariance matrix of $R$.


The density function of the trivariate Gaussian copula can be given as follows:

$$c_R^{GAU}(\mathbf{u}) = \frac{1}{\sqrt{|R|}} \exp\left(-\frac{1}{2} \begin{pmatrix} \Phi^{-1}(u_1) \\ \Phi^{-1}(u_2) \\ \Phi^{-1}(u_3) \end{pmatrix}^{T} \left(R^{-1} - I\right) \begin{pmatrix} \Phi^{-1}(u_1) \\ \Phi^{-1}(u_2) \\ \Phi^{-1}(u_3) \end{pmatrix}\right) \tag{3.33}$$

where the mean vector is $[0, 0, 0]$, $R$ denotes the covariance matrix of the transformed random variables, and $I$ is the three-by-three identity matrix.


Similar to the bivariate Gaussian copula example (i.e., Example 3.6), there is no restriction in regard to the marginal distribution that the random variables may follow. More examples will be given in the chapter focused on meta-elliptical copulas.
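The matrix form of the Gaussian copula density in Equation (3.33) can be evaluated without a linear-algebra library. The sketch below is an illustration under stated assumptions: it solves $R\,\mathbf{y} = \mathbf{z}$ by Gaussian elimination, with $\mathbf{z}$ the vector of $\Phi^{-1}(u_i)$, instead of forming $R^{-1}$ explicitly, and it takes the density to be $|R|^{-1/2}\exp\left(-\tfrac{1}{2}\mathbf{z}^{T}(R^{-1} - I)\mathbf{z}\right)$. The helper `solve_and_det` and the correlation matrix `R3` are hypothetical and serve only this example.

```python
from math import exp, sqrt
from statistics import NormalDist


def solve_and_det(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting; also return det(A)."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented copy
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    y = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        acc = M[i][n] - sum(M[i][j] * y[j] for j in range(i + 1, n))
        y[i] = acc / M[i][i]
    return y, det


def gaussian_copula_density(u, R):
    """Gaussian copula density for a correlation matrix R given as a list of rows."""
    z = [NormalDist().inv_cdf(ui) for ui in u]
    y, det = solve_and_det(R, z)  # y = R^{-1} z
    quad = sum(zi * yi for zi, yi in zip(z, y)) - sum(zi * zi for zi in z)
    return exp(-0.5 * quad) / sqrt(det)


# Hypothetical positive-definite correlation matrix (determinant 0.62).
R3 = [[1.0, 0.5, 0.3],
      [0.5, 1.0, 0.4],
      [0.3, 0.4, 1.0]]
identity3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

c_center = gaussian_copula_density([0.5, 0.5, 0.5], R3)         # z = 0, so 1 / sqrt(|R|)
c_indep = gaussian_copula_density([0.2, 0.7, 0.9], identity3)   # independence gives 1
c_biv = gaussian_copula_density([0.8944, 0.8647], [[1.0, 0.891], [0.891, 1.0]])
print(c_center, c_indep, c_biv)
```

With $R = I$ the density is identically one, and with a two-by-two matrix the function reduces to the bivariate case of Example 3.6, which gives two quick consistency checks.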



3.2 Construction of Copulas


Copulas may be constructed using different methods, e.g., the inversion method, the geometric method, and the algebraic method. Nelsen (2006) discussed how to use these methods to construct copulas. In this section, these methods are briefly introduced.



3.2.1 Inversion Method


As the name of this method suggests, a copula is obtained by inverting the continuous marginals of a joint distribution function $F$. Taking a two-dimensional copula as an example, the copula obtained by the inversion method can be expressed as follows:

$$C(u, v) = F\left(F_1^{-1}(u), F_2^{-1}(v)\right) \tag{3.34}$$

where $u = F_1(x_1)$ and $v = F_2(x_2)$. The inversion method can be applied only if one knows the joint distribution of random variables $X_1$ and $X_2$.




Example 3.8 Construct a copula using the Gumbel mixed distribution as joint distribution and the Gumbel distributions as marginals.


Solution: Suppose that random variables $X_1$ and $X_2$ each follow the Gumbel distribution: $X_1 \sim \mathrm{Gumbel}(a_1, b_1)$ and $X_2 \sim \mathrm{Gumbel}(a_2, b_2)$. Their joint distribution follows the Gumbel mixed distribution. In this example, the univariate Gumbel distribution can be expressed as follows:

$$F(x) = \exp\left(-\exp\left(-\frac{x - b}{a}\right)\right) \tag{3.35}$$

and the bivariate Gumbel mixed distribution can be expressed as follows:

$$F(x_1, x_2) = F_1(x_1) F_2(x_2) \exp\left(-\alpha\left[\frac{1}{\ln F_1(x_1)} + \frac{1}{\ln F_2(x_2)}\right]^{-1}\right); \quad \alpha \in [0, 1] \tag{3.36}$$

Again, let $u = F_1(x_1)$ and $v = F_2(x_2)$, with $F_1(x_1)$ and $F_2(x_2)$ each following the Gumbel distribution given by Equation (3.35). Then, we have

$$C(u, v) = C(F_1(x_1), F_2(x_2)) = uv \exp\left(-\alpha\left((\ln u)^{-1} + (\ln v)^{-1}\right)^{-1}\right) \tag{3.37}$$

where $\alpha$ is the parameter of the copula.


The copula function derived as Equation (3.37) is actually the Gumbel mixed model. Thus, Equation (3.37) can represent the joint distribution if and only if the random variables are positively correlated with a correlation coefficient not exceeding 2/3. This follows from the properties of the Gumbel mixed model. As given by Oliveria (1982), the parameter of the Gumbel mixed model is related to the Pearson correlation coefficient as follows:

$$\alpha = 2\left[1 - \cos\left(\pi \sqrt{\frac{\rho}{6}}\right)\right]; \quad \alpha = 0 \Rightarrow \rho = 0; \quad \alpha = 1 \Rightarrow \rho = \frac{2}{3} \tag{3.38}$$



Example 3.9 Construct a copula from bivariate exponential distribution with exponential marginals.


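Solution: Suppose that random variables $X_1$ and $X_2$ follow exponential distributions, $X_1 \sim \exp(\theta_1)$ and $X_2 \sim \exp(\theta_2)$, and that their joint distribution $F(x_1, x_2)$ follows the bivariate exponential distribution presented by Singh and Singh (1991) as follows:

$$F(x_1, x_2) = \left(1 - e^{-x_1/\theta_1}\right)\left(1 - e^{-x_2/\theta_2}\right)\left(1 + \delta\, e^{-x_1/\theta_1 - x_2/\theta_2}\right) \tag{3.39}$$

Let $u = F_1(x_1) = 1 - e^{-x_1/\theta_1}$ and $v = F_2(x_2) = 1 - e^{-x_2/\theta_2}$; then we have

$$C(u, v) = C(F_1(x_1), F_2(x_2)) = uv\left(1 + \delta(1 - u)(1 - v)\right) \tag{3.40}$$

where $\delta$ is the parameter of the joint distribution in Equation (3.39) and of the resulting copula.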


In Equations (3.39) and (3.40), $|\delta| \le 1$. In the case of the bivariate exponential distribution in this example, the correlation of the bivariate random variables is in the range of $[-0.25, 0.25]$ to guarantee that the bivariate distribution so derived is valid. In addition, the FGM copula is also expressed as Equation (3.40). In the case of the FGM copula, the correlation of the bivariate random variables needs to be in the range of $[-1/3, 1/3]$ (Schucany et al., 1978).
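The inversion method of Equation (3.34) can be traced numerically for Example 3.9. The sketch below is illustrative only and assumes that $\theta_1$ and $\theta_2$ act as scale parameters, i.e., $F_i(x) = 1 - e^{-x/\theta_i}$. It evaluates the joint distribution at the marginal quantiles and compares the result with the closed form of Equation (3.40). All function names and parameter values are assumptions made for this example.

```python
from math import exp, log


def joint_cdf(x1, x2, theta1, theta2, delta):
    """Bivariate exponential CDF of Eq. (3.39), with theta taken as a scale parameter."""
    e1, e2 = exp(-x1 / theta1), exp(-x2 / theta2)
    return (1 - e1) * (1 - e2) * (1 + delta * e1 * e2)


def exp_quantile(u, theta):
    """Inverse of the marginal CDF F(x) = 1 - exp(-x / theta)."""
    return -theta * log(1 - u)


def copula_by_inversion(u, v, theta1, theta2, delta):
    """Eq. (3.34): C(u, v) = F(F1^{-1}(u), F2^{-1}(v))."""
    return joint_cdf(exp_quantile(u, theta1), exp_quantile(v, theta2),
                     theta1, theta2, delta)


def copula_closed_form(u, v, delta):
    """Eq. (3.40)."""
    return u * v * (1 + delta * (1 - u) * (1 - v))


# Hypothetical parameter values; the copula should not depend on theta1 or theta2.
delta = -0.7
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
max_gap = max(
    abs(copula_by_inversion(u, v, theta1, theta2, delta) - copula_closed_form(u, v, delta))
    for u in grid for v in grid
    for theta1, theta2 in [(1.0, 1.0), (2.5, 0.4)]
)
print(max_gap)
```

The agreement for two different pairs of marginal parameters illustrates that the copula retains only the dependence parameter $\delta$ and none of the marginal information.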



3.2.2 Geometric Method


Rather than deriving the copula functions by inverting the joint distribution functions based on Sklar’s theorem, the geometric method derives the copula directly from the definition of copulas, e.g., the bivariate copula is 2-increasing and bounded. The geometric method does not require knowledge of either the distribution functions or the random variables. As the name suggests, it instead requires knowledge of the geometric nature or support region of the random variables (Nelsen, 2006). In what follows, two bivariate copula examples borrowed from exercise problems (Nelsen, 2006) are used to illustrate the method.




Example 3.10 Singular copula with prescribed support.


Let $(\alpha, \beta)$ be a point in $I^2$ such that $\alpha > 0$, $\beta > 0$, and $\alpha + \beta < 1$. Suppose that the probability mass $\alpha$ is uniformly distributed on the line segment joining $(\alpha, \beta)$ and $(0, 1)$, the probability mass $\beta$ is uniformly distributed on the line segment joining $(\alpha, \beta)$ and $(1, 0)$, and the probability mass $1 - \alpha - \beta$ is uniformly distributed on the line segment joining $(\alpha, \beta)$ and $(1, 1)$. Determine the copula function with this support.


Solution: Based on the problem statement, Figure 3.2(a) graphs the prescribed support (depicted by the solid lines). It is seen from Figure 3.2(a) that $(u, v)$ may reside either in the upper triangle (Figure 3.2(b)) or in the lower triangle (Figure 3.2(c)). In addition, we also check what happens if $(u, v)$ falls outside both triangles (i.e., beneath them). To determine the copula function with the prescribed support, we therefore look at three cases: (a) $(u, v)$ is in the upper triangular support region (Figure 3.2(b)); (b) $(u, v)$ is in the lower triangular support region (Figure 3.2(c)); and (c) $(u, v)$ does not fall into either triangular region.




  1. If $(u, v)$ falls into the upper triangular region with vertices $(\alpha, \beta)$, $(0, 1)$, and $(1, 1)$, as shown in Figure 3.2(b), then according to the definition of the copula, Figure 3.2(b) clearly shows the following:

    $$V_C([0, u] \times [v, 1]) = V_C([0, u_0] \times [v, 1]), \quad u_0 = \frac{\alpha(1 - v)}{1 - \beta} \tag{3.41}$$

    where $u_0$ is the abscissa, at height $v$, of the segment joining $(\alpha, \beta)$ and $(0, 1)$. The two volumes are

    $$V_C([0, u] \times [v, 1]) = C(u, 1) - C(u, v) - C(0, 1) + C(0, v) = u - C(u, v) \tag{3.42}$$

    $$V_C([0, u_0] \times [v, 1]) = C(u_0, 1) - C(u_0, v) - C(0, 1) + C(0, v) = \frac{\alpha(1 - v)}{1 - \beta} - C\left(\frac{\alpha(1 - v)}{1 - \beta}, v\right) \tag{3.43}$$

    The rectangle $[0, u_0] \times [0, v]$ carries no probability mass, so $C(u_0, v) = 0$. Equating Equation (3.42) to Equation (3.43), we get the following:

    $$C(u, v) = u - \frac{\alpha(1 - v)}{1 - \beta} \tag{3.44a}$$

    In order to determine the copula function in this region, we can also look at the rectangle $[u_0, u] \times [v, 1]$. This rectangle does not intersect any support line segment; thus the $C$-volume is zero, as follows:

    $$V_C([u_0, u] \times [v, 1]) = C(u, 1) - C(u, v) - C(u_0, 1) + C(u_0, v) = 0 \ \Rightarrow\ C(u, v) = u - \frac{\alpha(1 - v)}{1 - \beta} \tag{3.44b}$$

  2. Similarly, if $(u, v)$ falls into the lower triangular region with vertices $(\alpha, \beta)$, $(1, 0)$, and $(1, 1)$, as shown in Figure 3.2(c), then we can use the same approach to find the following:

    $$C(u, v) = v - \frac{\beta(1 - u)}{1 - \alpha} \tag{3.45}$$

  3. If $(u, v)$ does not fall into either of the two triangles bounded by the support segments, then the $C$-volume of $[0, u] \times [0, v]$ is zero and $C(u, v)$ can be found as follows:

    $$V_C([0, u] \times [0, v]) = C(u, v) - C(0, v) - C(u, 0) + C(0, 0) = 0 \ \Rightarrow\ C(u, v) = 0 \tag{3.46}$$


Note the following for the limiting cases:




  1. If $\alpha = \beta = 0$, the support line segment is the main diagonal of $I^2$. Nelsen (2006) proved that in this case $C(u, v)$ is the Fréchet–Hoeffding upper bound, i.e., $C(u, v) = M(u, v) = \min(u, v)$.

  2. If $\beta = 1 - \alpha$, Equation (3.44a) and Equation (3.45) reduce to the following:

    $$C(u, v) = u - \frac{\alpha(1 - v)}{1 - \beta} = u - (1 - v) = u + v - 1 \tag{3.47a}$$

    and

    $$C(u, v) = v - \frac{\beta(1 - u)}{1 - \alpha} = v - (1 - u) = u + v - 1 \tag{3.47b}$$

    Together with Equation (3.46), Equations (3.47a) and (3.47b) represent the Fréchet–Hoeffding lower bound, i.e.,

    $$C(u, v) = W(u, v) = \max(u + v - 1, 0). \tag{3.47}$$
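The three cases of Example 3.10 and the two limiting cases can be collected in a single function. The sketch below is one possible arrangement, not the only one: it decides which triangle $(u, v)$ belongs to from the sign of a cross product relative to the segment joining $(\alpha, \beta)$ and $(1, 1)$. The function name, this region test, and the parameter values are assumptions made for illustration.

```python
def singular_copula(u, v, alpha, beta):
    """Copula of Example 3.10 with support on three segments meeting at (alpha, beta)."""
    upper = u - alpha * (1 - v) / (1 - beta)   # Eq. (3.44a)
    lower = v - beta * (1 - u) / (1 - alpha)   # Eq. (3.45)
    # Positive when (u, v) lies above the segment joining (alpha, beta) and (1, 1).
    side = (v - beta) * (1 - alpha) - (u - alpha) * (1 - beta)
    if side >= 0 and upper >= 0:
        return upper   # upper triangle
    if side <= 0 and lower >= 0:
        return lower   # lower triangle
    return 0.0         # outside both triangles, Eq. (3.46)


grid = [i / 10 for i in range(11)]

# Hypothetical interior point (alpha, beta) = (0.3, 0.2).
c_upper = singular_copula(0.5, 0.9, 0.3, 0.2)   # point in the upper triangle
c_lower = singular_copula(0.9, 0.5, 0.3, 0.2)   # point in the lower triangle

# Limiting case alpha = beta = 0: Frechet-Hoeffding upper bound M.
gap_m = max(abs(singular_copula(u, v, 0.0, 0.0) - min(u, v)) for u in grid for v in grid)

# Limiting case beta = 1 - alpha: Frechet-Hoeffding lower bound W.
gap_w = max(abs(singular_copula(u, v, 0.4, 0.6) - max(u + v - 1, 0.0))
            for u in grid for v in grid)

print(c_upper, c_lower, gap_m, gap_w)
```

The value at $(0.5, 0.9)$ can be cross-checked by adding the mass that each of the three segments places in $[0, 0.5] \times [0, 0.9]$, which gives $0.2625 + 0.0571 + 0.1429 = 0.4625$.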





Figure 3.2 Schematic of singular copulas with prescribed support.




Example 3.11 Copulas with prescribed horizontal or vertical support.


Show that, for each of the following choices of the function $\Psi$, the function $C$ given as

$$C(u, v) = uv - \Psi(v)\, u(1 - u) \tag{3.48}$$

is a copula:

  a. $\Psi(v) = \dfrac{\theta}{\pi} \sin(\pi v)$; $\theta \in [-1, 1]$

  b. $\Psi(v) = \theta[\zeta(v) + \zeta(1 - v)]$, $\theta \in [-1, 1]$; $\zeta$ is the piecewise linear function whose graph connects $(0, 0)$ to $(1/4, 1/4)$ to $(1/2, 0)$ to $(1, 0)$.


Solution: According to Nelsen (2006), if Equation (3.48) is a copula, it is a copula with quadratic sections in u.




  a. $\Psi(v) = \dfrac{\theta}{\pi} \sin(\pi v)$, $\theta \in [-1, 1]$

Corollary 3.2.5 (from Nelsen, 2006) can be applied to prove that the function $C$ with the function $\Psi$ so defined is a copula. Corollary 3.2.5 states the necessary and sufficient conditions for the function $C$ to be a copula:

  1. $\Psi(v)$ is absolutely continuous on $I$.

  2. $|\Psi'(v)| \le 1$ almost everywhere on $I$.

  3. $|\Psi(v)| \le \min(v, 1 - v)$.

Based on Corollary 3.2.5, we conclude the following:

  1. $\Psi(v)$ is absolutely continuous on $I$ because the sine function is absolutely continuous.

  2. Differentiating gives $\Psi'(v) = \theta \cos(\pi v)$, so for $\theta \in [-1, 1]$ we have $|\Psi'(v)| = |\theta \cos(\pi v)| \le |\theta| \le 1$.

  3. For $\Psi(v) = \dfrac{\theta}{\pi} \sin(\pi v)$, $v \in I$, we have the following:

    $$0 \le \pi v \le \pi, \quad \sin(\pi v) \le \pi v \ \Rightarrow\ |\Psi(v)| = \left|\frac{\theta}{\pi} \sin(\pi v)\right| \le \left|\frac{\theta}{\pi}\right| \pi v = |\theta|\, v \le v \tag{3.49}$$

    Similarly,

    $$\sin(\pi v) = \sin(\pi - \pi v) = \sin(\pi(1 - v)) \le \pi(1 - v) \ \Rightarrow\ |\Psi(v)| = \left|\frac{\theta}{\pi} \sin(\pi(1 - v))\right| \le |\theta|(1 - v) \le 1 - v \tag{3.50}$$

    From Equations (3.49) and (3.50), we have $|\Psi(v)| \le \min(v, 1 - v)$ for $v \in I$.

Now all the conditions are satisfied, and the function $C$ with $\Psi(v)$ defined in (a) is a copula.




  b. $\Psi(v) = \theta[\zeta(v) + \zeta(1 - v)]$, $\theta \in [-1, 1]$; $\zeta$ is the piecewise linear function whose graph connects $(0, 0)$ to $(1/4, 1/4)$ to $(1/2, 0)$ to $(1, 0)$.

Theorem 3.2.4 in Nelsen (2006) can be applied to prove that the function $C$ is a copula. Theorem 3.2.4 states the necessary and sufficient conditions for $C$ to be a copula as follows:

  1. $\Psi(0) = \Psi(1) = 0$

  2. $\Psi(v)$ satisfies the Lipschitz condition: $|\Psi(v_2) - \Psi(v_1)| \le |v_2 - v_1|$; $v_1, v_2 \in I$

  3. $C$ is absolutely continuous.


The schematic plot for the piecewise linear function is given in Figure 3.3(a).





Figure 3.3 Plots of functions $\zeta(v)$ and $\Psi(v)$.


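The $\Psi(v)$ function can be written as follows:

$$\Psi(v) = \begin{cases} \theta v, & v \in [0, 1/4] \\ \theta(1/2 - v), & v \in [1/4, 3/4] \\ \theta(v - 1), & v \in [3/4, 1] \end{cases} \tag{3.51}$$

  1. For $\theta \in [-1, 1]$, we have the following:

    $$\Psi(0) = 0; \quad \Psi(1) = \theta(1 - 1) = 0$$

  2. Prove the Lipschitz condition with $v_1 \le v_2$ and $\theta \in [-1, 1]$.

    i. If $v_1 \in [0, 1/4]$, we have the following:

    $$|\Psi(v_2) - \Psi(v_1)| = \begin{cases} |\theta(v_2 - v_1)| \le v_2 - v_1, & v_2 \in [0, 1/4] \\ |\theta(1/2 - v_2) - \theta v_1| = |\theta|\,|1/2 - (v_1 + v_2)| \le |\theta|(v_2 - v_1) \le v_2 - v_1, & v_2 \in [1/4, 3/4] \\ |\theta(v_2 - 1) - \theta v_1| = |\theta|\,|v_2 - v_1 - 1| \le v_2 - v_1, & v_2 \in [3/4, 1] \end{cases} \tag{3.52}$$

    ii. If $v_1 \in [1/4, 3/4]$, we have the following:

    $$|\Psi(v_2) - \Psi(v_1)| = \begin{cases} |\theta(1/2 - v_2) - \theta(1/2 - v_1)| = |\theta|(v_2 - v_1) \le v_2 - v_1, & v_2 \in [1/4, 3/4] \\ |\theta(v_2 - 1) - \theta(1/2 - v_1)| = |\theta|\,|v_2 + v_1 - 3/2| \le |\theta|(v_2 - v_1) \le v_2 - v_1, & v_2 \in [3/4, 1] \end{cases} \tag{3.53}$$

    iii. Similarly, it can be easily shown that the Lipschitz condition is also satisfied for $v_1 \in [3/4, 1]$.

  3. Following Nelsen (2006), the absolute continuity of $C$ follows from the absolute continuity of $\Psi(v)$ together with the second condition. Figure 3.3(b) plots the $\Psi(v)$ function with $\theta = 0.8$ as an example; it shows that there is no discontinuity in domain $I$. The derivative is

    $$\Psi'(v) = \begin{cases} \theta, & v \in (0, 1/4) \\ -\theta, & v \in (1/4, 3/4) \\ \theta, & v \in (3/4, 1) \end{cases} \tag{3.54}$$

    With $\theta \in [-1, 1]$, we have proved that $|\Psi'(v)| \le 1$ almost everywhere in domain $I$.

Now all the conditions are satisfied, and the function $C$ with the $\Psi(v)$ function defined in (b) is a copula.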


It is worth noting that the copula defined as Equation (3.48) is a copula with quadratic sections in u. The reader can refer to Nelsen (2006) for more complete details of the geometric method and other types of geometric support to construct copulas.
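The analytical conditions used in Example 3.11 can be complemented by a direct numerical check of the 2-increasing property. The sketch below is an illustration only: it evaluates the rectangle volume of Equation (3.17) on every cell of a regular grid for $C(u, v) = uv - \Psi(v)\,u(1 - u)$. The grid size, the tolerance, the function names, and the out-of-range value $\theta = 1.5$ (used to show a violation) are assumptions made for this example. A grid check of this kind can reveal violations but does not prove the property.

```python
from math import pi, sin


def psi_sine(v, theta):
    """Choice (a): Psi(v) = (theta / pi) * sin(pi * v)."""
    return theta / pi * sin(pi * v)


def psi_piecewise(v, theta):
    """Piecewise linear Psi(v) as written in Eq. (3.51)."""
    if v <= 0.25:
        return theta * v
    if v <= 0.75:
        return theta * (0.5 - v)
    return theta * (v - 1)


def copula_quadratic_section(u, v, psi):
    """C(u, v) = u v - Psi(v) u (1 - u), Eq. (3.48)."""
    return u * v - psi(v) * u * (1 - u)


def min_grid_volume(psi, n=50):
    """Smallest C-volume, Eq. (3.17), over the cells of an n-by-n grid on the unit square."""
    def C(u, v):
        return copula_quadratic_section(u, v, psi)

    smallest = float("inf")
    for i in range(n):
        for j in range(n):
            u1, u2, v1, v2 = i / n, (i + 1) / n, j / n, (j + 1) / n
            vol = C(u2, v2) - C(u2, v1) - C(u1, v2) + C(u1, v1)
            smallest = min(smallest, vol)
    return smallest


min_sine = min_grid_volume(lambda v: psi_sine(v, 1.0))        # theta within [-1, 1]
min_piece = min_grid_volume(lambda v: psi_piecewise(v, 1.0))  # theta within [-1, 1]
min_bad = min_grid_volume(lambda v: psi_sine(v, 1.5))         # theta outside [-1, 1]
print(min_sine, min_piece, min_bad)
```

For $\theta = 1.5$ the cell at the origin has a negative volume, which is consistent with the derivative condition $|\Psi'(v)| \le 1$ failing near $v = 0$.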



3.2.3 Algebraic Method


Copulas may be constructed using the algebraic relationship between the joint distribution and the univariate distributions of random variables $X_1$ and $X_2$; this is called the algebraic method. Nelsen (2006) introduced this approach by constructing the Plackett and Ali–Mikhail–Haq copulas through an odds ratio: the Plackett copula is constructed by measuring the dependence in two-by-two contingency tables, and the Ali–Mikhail–Haq copula is constructed by using the survival odds ratio. To illustrate the method, the Ali–Mikhail–Haq copula construction example presented in Nelsen (2006) is used here.


The survival odds ratio for a univariate random variable $X$ with $X \sim F(x)$ can be expressed as follows:

$$\frac{P(X > x)}{P(X \le x)} = \frac{1 - F(x)}{F(x)} = \frac{\bar{F}(x)}{F(x)} \tag{3.55}$$

Similarly, the survival odds ratio for bivariate random variables $X_1$ and $X_2$ with joint distribution $F(x_1, x_2)$ and marginals $F_1(x_1)$ and $F_2(x_2)$ can be expressed as follows:

$$\frac{P(X_1 > x_1 \text{ or } X_2 > x_2)}{P(X_1 \le x_1 \text{ and } X_2 \le x_2)} = \frac{1 - F(x_1, x_2)}{F(x_1, x_2)} = \frac{\bar{F}(x_1, x_2)}{F(x_1, x_2)} \tag{3.56}$$



Example 3.12 The Ali–Mikhail–Haq copula.


The Ali–Mikhail–Haq copula (Ali et al., 1978) can be expressed as follows:


Cuv=uv1−θ1−u1−v(3.57)

Construct the copula by using the algebraic method.


Solution: Ali et al. (1978) proposed that Ali–Mikhail–Haq copula belongs to the bivariate logistic distribution family with the standard bivariate logistic distribution and standard logistic marginals.


The standard bivariate logistic distribution can be given as follows:



$$F(x_1, x_2) = \left(1 + e^{-x_1} + e^{-x_2}\right)^{-1} \tag{3.58a}$$

The standard logistic marginal can be given as follows:



$$F(x) = \left(1 + e^{-x}\right)^{-1} \tag{3.58b}$$

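The survival odds ratio of Equation (3.58a) is


$$\frac{1 - F(x_1, x_2)}{F(x_1, x_2)} = \frac{1 - \left(1 + e^{-x_1} + e^{-x_2}\right)^{-1}}{\left(1 + e^{-x_1} + e^{-x_2}\right)^{-1}} = e^{-x_1} + e^{-x_2} \tag{3.59}$$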

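From Equation (3.59), it is seen that the survival odds ratio of the standard bivariate logistic distribution can be rewritten as follows:


$$\frac{1 - F(x_1, x_2)}{F(x_1, x_2)} = e^{-x_1} + e^{-x_2} = \frac{1 - \left(1 + e^{-x_1}\right)^{-1}}{\left(1 + e^{-x_1}\right)^{-1}} + \frac{1 - \left(1 + e^{-x_2}\right)^{-1}}{\left(1 + e^{-x_2}\right)^{-1}} \tag{3.60a}$$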

Substituting Equation (3.58b) into Equation (3.60a), we have the following:


$$\frac{1 - F(x_1, x_2)}{F(x_1, x_2)} = \frac{1 - F_1(x_1)}{F_1(x_1)} + \frac{1 - F_2(x_2)}{F_2(x_2)} \tag{3.60b}$$

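Ali et al. (1978) considered the bivariate distributions whose survival odds ratio satisfies the following relation:


$$\frac{1 - F(x_1, x_2)}{F(x_1, x_2)} = \frac{1 - F_1(x_1)}{F_1(x_1)} + \frac{1 - F_2(x_2)}{F_2(x_2)} + (1 - \theta)\,\frac{1 - F_1(x_1)}{F_1(x_1)} \cdot \frac{1 - F_2(x_2)}{F_2(x_2)} \tag{3.61}$$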

It follows from Equation (3.61) that $\theta = 1$ implies that the joint distribution $F(x_1, x_2)$ of random variables $X_1$ and $X_2$ is the standard bivariate logistic distribution, since Equation (3.61) then reduces to Equation (3.60b); $\theta = 0$ implies that $X_1$ and $X_2$ are independent, with the proof given in Example 3.19 in Nelsen (2006).
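The two limiting cases can be checked numerically. The following is a minimal sketch, assuming Equation (3.61) is solved for the joint distribution as F = 1 / (1 + odds); the function names and the evaluation point are illustrative choices, not part of the original text.

```python
import math


def logistic_cdf(x):
    """Standard logistic marginal, Equation (3.58b)."""
    return 1.0 / (1.0 + math.exp(-x))


def bivariate_logistic_cdf(x1, x2):
    """Standard bivariate logistic distribution, Equation (3.58a)."""
    return 1.0 / (1.0 + math.exp(-x1) + math.exp(-x2))


def joint_cdf_from_odds(f1, f2, theta):
    """Solve Equation (3.61) for F(x1, x2), given the marginal values f1 and f2."""
    a = (1.0 - f1) / f1  # survival odds ratio of X1
    b = (1.0 - f2) / f2  # survival odds ratio of X2
    odds = a + b + (1.0 - theta) * a * b  # right-hand side of Equation (3.61)
    return 1.0 / (1.0 + odds)


# Arbitrary evaluation point, chosen only for illustration.
x1, x2 = 0.3, -1.2
f1, f2 = logistic_cdf(x1), logistic_cdf(x2)

# theta = 1 should reproduce the standard bivariate logistic distribution.
logistic_gap = abs(joint_cdf_from_odds(f1, f2, 1.0) - bivariate_logistic_cdf(x1, x2))

# theta = 0 should reproduce the product of the marginals (independence).
independence_gap = abs(joint_cdf_from_odds(f1, f2, 0.0) - f1 * f2)

print(logistic_gap, independence_gap)
```

Both gaps should be at the level of floating-point rounding error.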


Applying Sklar’s theorem and letting


$$C(u, v) = F(x_1, x_2), \quad u = F_1(x_1), \quad v = F_2(x_2),$$

Equation (3.61) can be rewritten as follows:


$$\frac{1 - C(u, v)}{C(u, v)} = \frac{1 - u}{u} + \frac{1 - v}{v} + (1 - \theta)\,\frac{1 - u}{u} \cdot \frac{1 - v}{v} \tag{3.62}$$

With simple algebra, we have


$$C(u, v) = \frac{uv}{1 - \theta(1 - u)(1 - v)} \tag{3.63}$$

where θ is the parameter of the Ali–Mikhail–Haq copula.
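The algebra between Equations (3.62) and (3.63) can be spelled out as follows. This is a sketch of one possible route; the shorthand a and b for the two marginal odds ratios is introduced here only for brevity and does not appear in the original derivation.

```latex
\begin{align*}
\frac{1 - C}{C} &= a + b + (1 - \theta)\,ab,
  \qquad a = \frac{1 - u}{u}, \quad b = \frac{1 - v}{v} \\
\frac{1}{C} &= 1 + a + b + ab - \theta\,ab = (1 + a)(1 + b) - \theta\,ab \\
  &= \frac{1}{uv} - \theta\,\frac{(1 - u)(1 - v)}{uv}
   = \frac{1 - \theta(1 - u)(1 - v)}{uv} \\
C(u, v) &= \frac{uv}{1 - \theta(1 - u)(1 - v)}
\end{align*}
```

The second line adds 1 to both sides, and the third line uses 1 + a = 1/u and 1 + b = 1/v.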



3.3 Families of Copula


There is a multitude of copulas. Generally speaking, copulas may be grouped into Archimedean copulas, meta-elliptical copulas, and copulas with prescribed geometric support (e.g., copulas with quadratic or cubic sections). According to their exchangeability, copulas may also be classified as symmetric or asymmetric. For example, one-parameter Archimedean copulas are symmetric copulas, whereas periodic copulas (Alfonsi and Brigo, 2005) and mixed copulas (Hu, 2006) are asymmetric copulas. Here we discuss only the general concepts of each copula family. The copula functions pertaining to a given copula family will be discussed in detail in subsequent chapters.



3.3.1 Archimedean Copulas


Archimedean copulas are widely applied in finance, water resources engineering, and hydrology due to their simple form, dependence structure, and other “nice” properties. Chapters 4 and 5 discuss the symmetric and asymmetric Archimedean copulas.



3.3.2 Plackett Copula


The Plackett copula has been applied in recent years. It will be discussed in Chapter 6.



3.3.3 Meta-elliptical Copulas


Meta-elliptical copulas are a flexible tool for modeling multivariate data in hydrology. They will be further discussed in Chapter 7.



3.3.4 Entropic Copula


Similar to the entropy-based univariate probability distributions, entropy theory (e.g., Shannon entropy) may be applied to derive entropic copulas with the use of constraints on the total probability, the properties of the marginals (i.e., $E(U^i) = 1/(i + 1)$), and the dependence measure (e.g., the Spearman rank-based correlation coefficient). The entropic copula will be further discussed in Chapter 8.



3.3.5 Mixed Copulas


Parametric copulas place restrictions on the dependence parameter. When data are heterogeneous, it is desirable to have additional flexibility to model the dependence structure (Trivedi and Zimmer, 2007). A mixture model, proposed by Hu (2006), is able to capture dependence structures that do not belong to the aforementioned copula families. By choosing suitable component copulas, a mixture can be constructed that is simple yet flexible enough to generate most dependence patterns found in practical data. This also facilitates the separation of the degree of dependence from the structure of dependence. These two concepts are embodied, respectively, in two different groups of parameters: the association parameters and the weight parameters (Hu, 2006). For example, given bivariate data may be modeled as a finite mixture of three bivariate copulas $C_{I}(u, v)$, $C_{II}(u, v)$, and $C_{III}(u, v)$; the mixture model is defined as follows:



$$C_{mix}(u, v; \theta_1, \theta_2, \theta_3; w_1, w_2, w_3) = w_1 C_{I}(u, v; \theta_1) + w_2 C_{II}(u, v; \theta_2) + w_3 C_{III}(u, v; \theta_3) \tag{3.64}$$

where $C_{mix}(u, v; \theta_1, \theta_2, \theta_3; w_1, w_2, w_3)$ denotes the mixed copula; $C_{I}(u, v; \theta_1)$, $C_{II}(u, v; \theta_2)$, and $C_{III}(u, v; \theta_3)$ are the three bivariate copulas, with $\theta_1, \theta_2, \theta_3$ as the corresponding copula parameters; and $w_1, w_2, w_3$ may be interpreted as the weights of the component copulas, such that $0 < w_j < 1$ for $j = 1, 2, 3$ and $\sum_{j=1}^{3} w_j = 1$.
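As an illustration of Equation (3.64), the following minimal sketch evaluates a three-component mixture. The component families (Clayton, Gumbel, and Frank), the parameter values, and all function names are assumptions made here for illustration only; they are not the components used by Hu (2006).

```python
import math


def clayton(u, v, theta):
    """Clayton copula, theta > 0, for u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel(u, v, theta):
    """Gumbel-Hougaard copula, theta >= 1, for u, v in (0, 1]."""
    # abs(log(u)) equals -ln(u) on (0, 1] and avoids a negative zero at u = 1.
    s = abs(math.log(u)) ** theta + abs(math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def frank(u, v, theta):
    """Frank copula, theta != 0."""
    num = (math.exp(-theta * u) - 1.0) * (math.exp(-theta * v) - 1.0)
    return -math.log(1.0 + num / (math.exp(-theta) - 1.0)) / theta


def mixed_copula(u, v, thetas, weights):
    """Weighted sum of the three component copulas, Equation (3.64)."""
    if any(w <= 0.0 or w >= 1.0 for w in weights):
        raise ValueError("each weight must lie strictly between 0 and 1")
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    components = (clayton, gumbel, frank)
    return sum(w * c(u, v, t) for w, c, t in zip(weights, components, thetas))


# Hypothetical association parameters and weights.
thetas = (2.0, 1.5, 5.0)
weights = (0.5, 0.3, 0.2)
print(mixed_copula(0.4, 0.6, thetas, weights))
```

Because each component satisfies C(u, 1) = u and the weights sum to one, the mixture inherits the same boundary condition; the association parameters control the strength of dependence within each component, while the weights control how much each dependence structure contributes.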



3.3.6 Empirical Copula


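Sometimes, we analyze data with an unknown underlying distribution. The empirical data distribution can be transformed into what is called an “empirical copula” by warping it such that the marginal distributions become uniform. Let $x_1$ and $x_2$ be two samples, each of size $n$, with paired observations $(x_{1k}, x_{2k})$, $k = 1, \ldots, n$. The empirical copula can be computed at each grid point $(i/n, j/n)$ by


$$C_n\left(\frac{i}{n}, \frac{j}{n}\right) = \frac{1}{n}\sum_{k=1}^{n} \mathbf{1}\left(x_{1k} \le x_{1(i)} \text{ and } x_{2k} \le x_{2(j)}\right) \tag{3.65}$$

where $x_{1(i)}$ and $x_{2(j)}$, $1 \le i, j \le n$, represent, respectively, the $i$th and $j$th order statistics of $x_1$ and $x_2$, and $\mathbf{1}(\cdot)$ is the indicator function.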
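The counting rule in Equation (3.65) can be sketched in a few lines. This is a minimal, unoptimized illustration; the function name, the convention of storing a zero row and column for i = 0 or j = 0, and the three-point sample are assumptions made here and are not taken from the original text.

```python
def empirical_copula(x1, x2):
    """Return a grid with grid[i][j] = C_n(i/n, j/n) for 0 <= i, j <= n.

    Row 0 and column 0 are left at zero by convention.
    """
    n = len(x1)
    if n == 0 or len(x2) != n:
        raise ValueError("x1 and x2 must be non-empty and of equal length")
    s1, s2 = sorted(x1), sorted(x2)  # order statistics of each sample
    grid = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # Count pairs lying at or below the i-th and j-th order statistics.
            count = sum(
                1 for a, b in zip(x1, x2) if a <= s1[i - 1] and b <= s2[j - 1]
            )
            grid[i][j] = count / n
    return grid


# Hypothetical three-point sample, used only to exercise the function.
sample_x1 = [3.0, 1.0, 2.0]
sample_x2 = [30.0, 20.0, 10.0]
grid = empirical_copula(sample_x1, sample_x2)
for row in grid[1:]:
    print(row[1:])
```

The triple loop costs on the order of n³ operations, which is acceptable for short records; for long records a rank-based implementation would be preferable.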




Example 3.13 Using the peak discharge (Q: m³/s) and flood volume (V: m³) given in Table 3.1, calculate the empirical copula with the use of Equation (3.65).


Table 3.1. Peak discharge and flood volume data (from Yue, 2001).

| Pair | Year | V (m³) | Q (m³/s) | Pair | Year | V (m³) | Q (m³/s) |
|---|---|---|---|---|---|---|---|
| 1 | 1942 | 8,704 | 371 | 28 | 1969 | 11,272 | 416 |
| 2 | 1943 | 6,907 | 245 | 29 | 1970 | 8,640 | 246 |
| 3 | 1944 | 4,189 | 189 | 30 | 1971 | 6,989 | 248 |
| 4 | 1945 | 8,637 | 229 | 31 | 1972 | 9,352 | 297 |
| 5 | 1946 | 8,409 | 240 | 32 | 1973 | 12,825 | 371 |
| 6 | 1947 | 13,602 | 331 | 33 | 1974 | 13,608 | 442 |
| 7 | 1948 | 8,788 | 206 | 34 | 1975 | 8,949 | 260 |
| 8 | 1949 | 5,002 | 157 | 35 | 1976 | 12,577 | 236 |
| 9 | 1950 | 5,167 | 184 | 36 | 1977 | 11,437 | 334 |
| 10 | 1951 | 10,128 | 275 | 37 | 1978 | 9,266 | 310 |
| 11 | 1952 | 12,035 | 286 | 38 | 1979 | 14,559 | 383 |
| 12 | 1953 | 10,828 | 230 | 39 | 1980 | 5,057 | 151 |
| 13 | 1954 | 8,923 | 233 | 40 | 1981 | 9,645 | 197 |
| 14 | 1955 | 11,401 | 351 | 41 | 1982 | 7,241 | 283 |
| 15 | 1956 | 6,620 | 156 | 42 | 1983 | 13,543 | 390 |
| 16 | 1957 | 3,826 | 168 | 43 | 1984 | 15,003 | 405 |
| 17 | 1958 | 8,192 | 343 | 44 | 1985 | 6,460 | 176 |
| 18 | 1959 | 6,414 | 214 | 45 | 1986 | 7,502 | 181 |
| 19 | 1960 | 8,900 | 303 | 46 | 1987 | 5,650 | 233 |
| 20 | 1961 | 9,406 | 300 | 47 | 1988 | 7,350 | 187 |
| 21 | 1962 | 7,235 | 143 | 48 | 1989 | 9,506 | 216 |
| 22 | 1963 | 8,177 | 232 | 49 | 1990 | 6,728 | 196 |
| 23 | 1964 | 7,684 | 182 | 50 | 1991 | 13,315 | 424 |
| 24 | 1965 | 3,306 | 121 | 51 | 1992 | 8,041 | 255 |
| 25 | 1966 | 8,026 | 186 | 52 | 1993 | 10,174 | 257 |
| 26 | 1967 | 4,892 | 173 | 53 | 1994 | 14,769 | 232 |
| 27 | 1968 | 8,692 | 292 | 54 | 1995 | 8,711 | 286 |


Posted on Oct 12, 2020, in Water and Sewage.