6 – Plackett Copula




Abstract




Like Archimedean copulas, non-Archimedean copulas can be classified as one-parameter bivariate copulas, two-parameter bivariate copulas, and multivariate ($d \ge 3$) copulas. In recent years, successful applications of non-Archimedean copulas, such as meta-elliptical copulas and Plackett copulas, have been reported in hydrology and water resources management. This chapter focuses on Plackett copulas, specifically the bivariate and trivariate Plackett copulas.





6 Plackett Copula




6.1 Bivariate Plackett Copula


This section introduces the definition of the bivariate Plackett copula, the simulation of random variates from it, and the estimation of its parameter.



6.1.1 Definition of Bivariate Plackett Copula


As discussed in Chapter 3, the Plackett copula is constructed using the algebraic method. The cross-product ratio $\theta$, or odds ratio, is a measure of “association” or “dependence” in $2 \times 2$ contingency tables. Here, we label the categories of each variable as “low” and “high,” which gives the four cells in Table 6.1, where a, b, c, and d represent the observed counts in the four cells, respectively. From Table 6.1, the cross-product ratio $\theta$ ($\theta > 0$) is defined as $\theta = \dfrac{ad}{bc}$. Following Palaro and Hotta (2006), the dependence may be explained through $\theta$ as follows:




  1. $0 < \theta < 1$ corresponds to negative dependence, i.e., observations are more concentrated in the “low-high” and “high-low” cells.

  2. $\theta = 1$ corresponds to independence: each “observed” entry equals its “expected value” under independence; for example, $a = \dfrac{(a+b)(a+c)}{a+b+c+d}$.

  3. $\theta > 1$ corresponds to positive dependence, i.e., observations are more concentrated in the “low-low” and “high-high” cells.
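The three cases above can be checked numerically. The following is a minimal Python sketch; the function names `cross_product_ratio` and `expected_low_low` and the cell counts are hypothetical and chosen only for illustration.

```python
def cross_product_ratio(a, b, c, d):
    """Cross-product (odds) ratio theta = (a*d)/(b*c) of a 2x2 table."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    return (a * d) / (b * c)


def expected_low_low(a, b, c, d):
    """Expected count of the low-low cell under independence."""
    return (a + b) * (a + c) / (a + b + c + d)


# Hypothetical counts (not taken from the text)
print(cross_product_ratio(40, 10, 10, 40))   # 16.0   -> positive dependence
print(cross_product_ratio(10, 40, 40, 10))   # 0.0625 -> negative dependence
print(cross_product_ratio(20, 30, 20, 30))   # 1.0    -> independence
print(expected_low_low(20, 30, 20, 30))      # 20.0, equal to the observed a
```

In the independent table, the observed low-low count (20) coincides with its expected value, which is the condition stated in item 2.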




Table 6.1. Two-by-two contingency table.

                     Column variable
Row variable         Low (X ≤ x)    High (X > x)    Total
Low (Y ≤ y)          a              b               a + b
High (Y > y)         c              d               c + d
Total                a + c          b + d           a + b + c + d

Using the $2 \times 2$ contingency table, Plackett (1965) developed what is now called the Plackett copula for bivariate continuous random variables. Let the continuous random variables X and Y have marginal distributions $F_X$ and $F_Y$ and the joint distribution function $H(x, y) = P(X \le x, Y \le y)$. The “low” and “high” categories of the column and row variables are then replaced by the events $X \le x$, $X > x$ and $Y \le y$, $Y > y$, respectively. In the cross-product ratio $\theta = \dfrac{ad}{bc}$, the quantities a, b, c, and d now denote the probabilities $P(X \le x, Y \le y)$, $P(X > x, Y \le y)$, $P(X \le x, Y > y)$, and $P(X > x, Y > y)$, respectively.


Now, based on the bivariate probability relation discussed in Chapter 3, we have the following:



$$a = P(X \le x, Y \le y) = H(x, y) \tag{6.1a}$$

$$b = F_Y(y) - H(x, y) \tag{6.1b}$$

$$c = F_X(x) - H(x, y) \tag{6.1c}$$

$$d = 1 - F_X(x) - F_Y(y) + H(x, y) \tag{6.1d}$$

Replacing the values of a, b, c, and d, we obtain the expression of parameter θ as follows:


$$\theta = \frac{H(x, y)\,[1 - F_X(x) - F_Y(y) + H(x, y)]}{[F_X(x) - H(x, y)]\,[F_Y(y) - H(x, y)]} \tag{6.1e}$$

Let $u = F_X(x)$ and $v = F_Y(y)$. By Sklar’s theorem, Equation (6.1e) may be written in copula form as follows:


$$\theta = \frac{C(u, v)\,[1 - u - v + C(u, v)]}{[u - C(u, v)]\,[v - C(u, v)]} \tag{6.2}$$

Solving for C in Equation (6.2), we obtain the Plackett copula:


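$$C(u, v; \theta) = \frac{[1 + (\theta - 1)(u + v)] - \sqrt{[1 + (\theta - 1)(u + v)]^2 - 4\theta(\theta - 1)uv}}{2(\theta - 1)}; \quad \theta > 0,\ \theta \ne 1 \tag{6.3a}$$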



$$C(u, v; \theta) = uv; \quad \theta = 1 \tag{6.3b}$$
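As a quick consistency check, the copula of Equation (6.3) can be substituted back into Equation (6.2), which should return the same θ at every point (u, v). The sketch below assumes Python; the function names `plackett_cdf` and `implied_theta` are illustrative and not from the text.

```python
import math


def plackett_cdf(u, v, theta):
    """Plackett copula C(u, v; theta), Equations (6.3a) and (6.3b)."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if abs(theta - 1.0) < 1e-12:
        return u * v  # independence case, Equation (6.3b)
    s = 1.0 + (theta - 1.0) * (u + v)
    root = math.sqrt(s * s - 4.0 * theta * (theta - 1.0) * u * v)
    return (s - root) / (2.0 * (theta - 1.0))


def implied_theta(u, v, C):
    """Cross-product ratio of Equation (6.2) for a given copula value C."""
    return C * (1.0 - u - v + C) / ((u - C) * (v - C))


# The ratio should not depend on the evaluation point (u, v).
for theta in (0.5, 20.0):
    for u, v in [(0.2, 0.7), (0.5, 0.5), (0.9, 0.3)]:
        C = plackett_cdf(u, v, theta)
        print(theta, (u, v), round(C, 6), round(implied_theta(u, v, C), 6))
```

The constant cross-product ratio over the whole unit square is the defining property that distinguishes the Plackett family.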

Taking the partial derivatives with respect to u and v, its copula density function can be written as follows:


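$$c(u, v; \theta) = \frac{\partial^2 C(u, v; \theta)}{\partial u\,\partial v} = \frac{\theta\,[1 + (\theta - 1)(u + v - 2uv)]}{\left\{[1 + (\theta - 1)(u + v)]^2 - 4\theta(\theta - 1)uv\right\}^{1.5}} \tag{6.4}$$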
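The density in Equation (6.4) can be verified against a numerical mixed derivative of the copula in Equation (6.3). The following Python sketch does this with a central finite difference; the function names and the step size `h` are assumptions made for illustration.

```python
import math


def plackett_cdf(u, v, theta):
    """Plackett copula, Equations (6.3a) and (6.3b)."""
    if abs(theta - 1.0) < 1e-12:
        return u * v
    s = 1.0 + (theta - 1.0) * (u + v)
    root = math.sqrt(s * s - 4.0 * theta * (theta - 1.0) * u * v)
    return (s - root) / (2.0 * (theta - 1.0))


def plackett_pdf(u, v, theta):
    """Plackett copula density, Equation (6.4)."""
    s = 1.0 + (theta - 1.0) * (u + v)
    disc = s * s - 4.0 * theta * (theta - 1.0) * u * v
    return theta * (1.0 + (theta - 1.0) * (u + v - 2.0 * u * v)) / disc ** 1.5


def mixed_difference(u, v, theta, h=1e-4):
    """Central finite-difference approximation of d2C/(du dv)."""
    C = plackett_cdf
    return (C(u + h, v + h, theta) - C(u + h, v - h, theta)
            - C(u - h, v + h, theta) + C(u - h, v - h, theta)) / (4.0 * h * h)


for theta in (0.5, 20.0):
    exact = plackett_pdf(0.3, 0.6, theta)
    approx = mixed_difference(0.3, 0.6, theta)
    print(theta, exact, approx)  # the two values should agree closely
```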

Taking the partial derivative of equation (6.3a) with respect to u or v, the conditional probability distributions can be obtained as follows:


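$$C(V \le v \mid U = u) = P(Y \le y \mid X = x) = \frac{\partial C(u, v; \theta)}{\partial u} = \frac{1}{2} + \frac{-1 + u + v - u\theta + v\theta}{2\sqrt{[1 + (\theta - 1)(u + v)]^2 - 4\theta(\theta - 1)uv}} \tag{6.5}$$

$$C(U \le u \mid V = v) = P(X \le x \mid Y = y) = \frac{\partial C(u, v; \theta)}{\partial v} = \frac{1}{2} + \frac{-1 + u + v + u\theta - v\theta}{2\sqrt{[1 + (\theta - 1)(u + v)]^2 - 4\theta(\theta - 1)uv}} \tag{6.6}$$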
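Equations (6.5) and (6.6) differ only by interchanging u and v, so a single routine covers both. The following Python sketch (function name `plackett_cond_cdf` is an assumption) evaluates the conditional distribution and checks that it runs from 0 to 1 as the conditioned variable runs from 0 to 1.

```python
import math


def plackett_cond_cdf(v, u, theta):
    """P(V <= v | U = u) from Equation (6.5).

    By symmetry, P(U <= u | V = v) of Equation (6.6) is
    plackett_cond_cdf(u, v, theta).
    """
    s = 1.0 + (theta - 1.0) * (u + v)
    root = math.sqrt(s * s - 4.0 * theta * (theta - 1.0) * u * v)
    return 0.5 + (-1.0 + u + v - u * theta + v * theta) / (2.0 * root)


theta, u = 5.0, 0.3
for v in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(v, plackett_cond_cdf(v, u, theta))  # increases from 0 to 1
```

For θ = 1 the expression reduces to v, as expected for independent variables.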



Example 6.1 Graph the Plackett copula function and its density function with $\theta = 20$, $\theta = 1$, and $\theta = 0.5$.


Solution: Using Equations (6.3) and (6.4), we graph the Plackett copula function and its density function in Figure 6.1 for $u, v \in [0, 1]$. The density plots in Figure 6.1 show that (i) for $\theta = 20$, the density is higher when u and v are both small or both large, i.e., high follows high and low follows low, which represents positive dependence; (ii) for $\theta = 1$, the density is constant and equal to 1, which corresponds to independent random variables; and (iii) for $\theta = 0.5$, the dependence is negative: the density is higher when u is small and v is large, and vice versa.
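The three observations in the solution can also be read from a coarse grid of density values instead of a plot. The following Python sketch tabulates Equation (6.4) at a few points; the grid points and the function name `plackett_pdf` are chosen here for illustration only.

```python
def plackett_pdf(u, v, theta):
    """Plackett copula density, Equation (6.4)."""
    s = 1.0 + (theta - 1.0) * (u + v)
    disc = s * s - 4.0 * theta * (theta - 1.0) * u * v
    return theta * (1.0 + (theta - 1.0) * (u + v - 2.0 * u * v)) / disc ** 1.5


grid = (0.1, 0.5, 0.9)
for theta in (20.0, 1.0, 0.5):
    print(f"theta = {theta}")
    for v in reversed(grid):  # print the largest v on the top row
        row = [plackett_pdf(u, v, theta) for u in grid]
        print("  v = %.1f: " % v + "  ".join("%7.4f" % c for c in row))
```

For θ = 20 the largest values sit at the (0.1, 0.1) and (0.9, 0.9) corners, for θ = 0.5 at the (0.1, 0.9) and (0.9, 0.1) corners, and for θ = 1 every entry equals 1.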





Figure 6.1 Plackett copula function and its density function for $\theta = 20$, $\theta = 1$, and $\theta = 0.5$.



6.1.2 Simulation of Bivariate Plackett Copula


Following the Rosenblatt transform (Rosenblatt, 1952), random variates can be simulated from the Plackett copula as follows:




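  1. Simulate two independent random variables $(w_1, w_2)$ from the uniform distribution $U(0, 1)$.

  2. Set $u = w_1$.

  3. Using Equation (6.5), set $w_2 = C(v \mid u)$, i.e.,

    $$w_2 = \frac{\partial C(u, v; \theta)}{\partial u} = \frac{1}{2} + \frac{-1 + u + v - u\theta + v\theta}{2\sqrt{[1 + (\theta - 1)(u + v)]^2 - 4\theta(\theta - 1)uv}} \tag{6.7}$$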


After some algebraic manipulation, Equation (6.7) can be solved for $v$ as follows:


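$$v = \frac{c - (1 - 2w_2)\,d}{2b} \tag{6.8}$$

where b, c, and d are auxiliary quantities (not the cell probabilities of Table 6.1) given by

$$b = \theta + S(\theta - 1)^2; \quad c = 2S(u\theta^2 + 1 - u) + \theta(1 - 2S); \quad d = \theta^{0.5}\,[\theta + 4Su(1 - u)(1 - \theta)^2]^{0.5}; \quad S = w_2(1 - w_2).$$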
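The simulation steps and Equation (6.8) can be sketched in Python as follows. The function names and the seed are assumptions made for illustration; the round-trip check substitutes the simulated v back into Equation (6.5) to confirm that it returns w2.

```python
import math
import random


def plackett_cond_cdf(v, u, theta):
    """P(V <= v | U = u), Equation (6.5)."""
    s = 1.0 + (theta - 1.0) * (u + v)
    root = math.sqrt(s * s - 4.0 * theta * (theta - 1.0) * u * v)
    return 0.5 + (-1.0 + u + v - u * theta + v * theta) / (2.0 * root)


def plackett_v_given_u(u, w2, theta):
    """Solve w2 = C(v | u) for v with the closed form of Equation (6.8)."""
    S = w2 * (1.0 - w2)
    b = theta + S * (theta - 1.0) ** 2
    c = 2.0 * S * (u * theta ** 2 + 1.0 - u) + theta * (1.0 - 2.0 * S)
    d = math.sqrt(theta) * math.sqrt(
        theta + 4.0 * S * u * (1.0 - u) * (1.0 - theta) ** 2)
    return (c - (1.0 - 2.0 * w2) * d) / (2.0 * b)


def plackett_sample(n, theta, seed=None):
    """Draw n pairs (u, v) following steps 1 to 3 above."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        w1, w2 = rng.random(), rng.random()   # step 1
        u = w1                                # step 2
        pairs.append((u, plackett_v_given_u(u, w2, theta)))  # step 3
    return pairs


# Round trip: the simulated v reproduces w2 through Equation (6.5).
u, w2, theta = 0.3, 0.8, 5.0
v = plackett_v_given_u(u, w2, theta)
print(v, plackett_cond_cdf(v, u, theta))
```

Because the inverse is available in closed form, no root finding is needed, and for θ = 1 the expression reduces to v = w2.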



Example 6.2 Generate the random variables from the Plackett copula function.


To generate the variables, use the following information:




  1. Simulate a pair of Plackett random variates from the independent uniform random variables $w_1 = 0.1645$ and $w_2 = 0.9629$ with $\theta = 50$.

  2. Given $\theta = 50$, $\theta = 2.5$, and $\theta = 0.1$, graph the random variables generated from the Plackett copula with a sample size of 100.


Solution: We can use the procedure discussed in Section 6.1.2 to generate the random variables from the Plackett copula:




  1. $w_1 = 0.1645$, $w_2 = 0.9629$, and $\theta = 50$.

    Set $u = w_1 = 0.1645$. We may then compute the random variate $v$ from $w_2 = C(v \mid U = u; \theta)$.

    Evaluating the quantities in Equation (6.8), we have the following:

    $$S = 0.0357; \quad b = 135.7723; \quad c = 75.8700; \quad d = 69.6972.$$

    Then we have the following:

    $$v = \frac{c - (1 - 2w_2)\,d}{2b} = \frac{75.8700 - [1 - 2(0.9629)](69.6972)}{2(135.7723)} = 0.5170$$

    Thus, the generated pair of random variates is $(u, v) = (0.1645, 0.5170)$.



  2. Set $\theta = 50$, $\theta = 2.5$, and $\theta = 0.1$ with a sample size of 100.

    Using the same procedure as in step 1, we graph the simulated random variables with a sample size of 100 in Figure 6.2. Figure 6.2 shows that (i) the random variables generated with $\theta = 50$ are positively dependent; (ii) the random variables generated with $\theta = 0.1$ are negatively dependent; and (iii) the random variables generated with $\theta = 2.5$ are more scattered within $[0, 1]^2$, i.e., closer to independence.
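Both parts of the example can be reproduced with a short Python sketch. The seed, the function names, and the use of the sample Kendall tau (concordant minus discordant pairs) as a numerical summary in place of the scatter plot are assumptions made here; the simulated samples will differ from those in Figure 6.2.

```python
import math
import random


def plackett_v_given_u(u, w2, theta):
    """Closed-form solution of Equation (6.8)."""
    S = w2 * (1.0 - w2)
    b = theta + S * (theta - 1.0) ** 2
    c = 2.0 * S * (u * theta ** 2 + 1.0 - u) + theta * (1.0 - 2.0 * S)
    d = math.sqrt(theta) * math.sqrt(
        theta + 4.0 * S * u * (1.0 - u) * (1.0 - theta) ** 2)
    return (c - (1.0 - 2.0 * w2) * d) / (2.0 * b)


def kendall_tau(pairs):
    """Sample Kendall tau: (concordant - discordant) / total pairs."""
    n, score = len(pairs), 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))


# Part 1: the single pair from the worked example
v = plackett_v_given_u(0.1645, 0.9629, 50.0)
print(f"v = {v:.4f}")  # v = 0.5170

# Part 2: samples of size 100 for the three parameter values
rng = random.Random(2020)  # arbitrary seed
taus = {}
for theta in (50.0, 2.5, 0.1):
    sample = []
    for _ in range(100):
        u, w2 = rng.random(), rng.random()
        sample.append((u, plackett_v_given_u(u, w2, theta)))
    taus[theta] = kendall_tau(sample)
    print(theta, round(taus[theta], 3))
```

The sign and size of the sample Kendall tau mirror the three patterns described for Figure 6.2: strongly positive for θ = 50, negative for θ = 0.1, and weaker for θ = 2.5.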





Figure 6.2 Scatter plot of simulated random variables from the Plackett copula.



6.1.3 Parameter Estimation for Bivariate Plackett Copulas


As discussed in Section 3.6, the full ML, IFM, and semiparametric (pseudo-ML) methods may be applied to estimate the parameter numerically for the Plackett copula function. Here, without further discussion, we will give one example to illustrate the procedure of parameter estimation.




Example 6.3 Using the sample data in Table 6.2 and assuming that (a) random variables X and Y are sampled from the normal distribution and the gamma distribution, respectively, and (b) the joint distribution may be modeled using the Plackett copula, estimate the parameters using the full ML, IFM, and semiparametric methods.




Table 6.2. Sample data for Example 6.3.

No. X Y No. X Y
1 11.276 5.049 26 12.793 12.942
2 19.570 12.015 27 16.772 4.140
3 10.864 3.691 28 12.215 4.522
4 14.517 9.233 29 24.909 7.689
5 17.512 6.862 30 17.580 12.331
6 14.312 5.343 31 17.200 7.060
7 17.785 12.689 32 10.621 5.583
8 9.457 8.182 33 10.310 19.026
9 13.290 8.531 34 8.957 3.648
10 15.470 31.129 35 18.735 7.534
11 18.392 20.848 36 11.536 7.519
12 9.411 8.567 37 16.264 10.727
13 18.883 15.874 38 21.382 21.947
14 11.749 12.142 39 19.153 11.813
15 14.173 10.224 40 17.355 7.988
16 14.044 6.223 41 17.877 12.159
17 13.032 7.594 42 14.799 9.622
18 18.374 14.827 43 11.457 11.147
19 17.979 14.283 44 18.601 14.626
20 7.656 4.639 45 11.636 4.732
21 14.642 10.039 46 11.427 6.263
22 19.871 16.856 47 15.067 11.378
23 7.769 17.575 48 16.328 14.778
24 12.870 7.763 49 21.471 29.678
25 14.119 6.964 50 15.327 9.639

Solution: With X assumed to follow the normal distribution, as stated in the example (Equation (2.10)), and Y assumed to follow the gamma distribution (Equation (2.8)), we first estimate the parameters of random variables X and Y by MLE as follows:




  • Random variable X: $\mu_X = 14.9358$; $\sigma_X = 3.8484$.

  • Random variable Y: $\alpha_Y = 4.0031$; $\beta_Y = 0.3668$.

In addition, using Equation (3.72), we can compute the sample Kendall correlation coefficient as $\tau_n = 0.3690$.
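The normal parameters and the sample Kendall tau can be recomputed from Table 6.2 with the Python sketch below. Two assumptions are made: Kendall's tau is taken in its usual concordant-minus-discordant form (Equation (3.72) is not reproduced here), and both the n − 1 and the n divisor are shown for the standard deviation, since the reported $\sigma_X = 3.8484$ appears to correspond to the n − 1 (sample) version.

```python
import statistics

# Table 6.2, observations No. 1 to 50
X = [11.276, 19.570, 10.864, 14.517, 17.512, 14.312, 17.785, 9.457, 13.290, 15.470,
     18.392, 9.411, 18.883, 11.749, 14.173, 14.044, 13.032, 18.374, 17.979, 7.656,
     14.642, 19.871, 7.769, 12.870, 14.119, 12.793, 16.772, 12.215, 24.909, 17.580,
     17.200, 10.621, 10.310, 8.957, 18.735, 11.536, 16.264, 21.382, 19.153, 17.355,
     17.877, 14.799, 11.457, 18.601, 11.636, 11.427, 15.067, 16.328, 21.471, 15.327]
Y = [5.049, 12.015, 3.691, 9.233, 6.862, 5.343, 12.689, 8.182, 8.531, 31.129,
     20.848, 8.567, 15.874, 12.142, 10.224, 6.223, 7.594, 14.827, 14.283, 4.639,
     10.039, 16.856, 17.575, 7.763, 6.964, 12.942, 4.140, 4.522, 7.689, 12.331,
     7.060, 5.583, 19.026, 3.648, 7.534, 7.519, 10.727, 21.947, 11.813, 7.988,
     12.159, 9.622, 11.147, 14.626, 4.732, 6.263, 11.378, 14.778, 29.678, 9.639]


def kendall_tau(x, y):
    """Sample Kendall tau: (concordant - discordant) / total pairs."""
    n, score = len(x), 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))


mu_x = statistics.mean(X)
sigma_x_sample = statistics.stdev(X)   # divisor n - 1
sigma_x_ml = statistics.pstdev(X)      # divisor n (strict ML estimator)
tau_n = kendall_tau(X, Y)

print(round(mu_x, 4), round(sigma_x_sample, 4), round(sigma_x_ml, 4))
print(round(tau_n, 4))
```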




  1. 1. Full ML Method:


    As discussed in Section 3.6.1, we will need to estimate the parameters of marginal distributions and copula function simultaneously with the full log-likelihood function given as follows:


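    $$LL = \sum_i \ln c_{\text{Plackett}}\!\left(F_X^{\text{Normal}}(x_i; \mu_X, \sigma_X),\ F_Y^{\text{Gamma}}(y_i; \alpha_Y, \beta_Y);\ \theta\right) + \sum_i \ln f_X^{\text{Normal}}(x_i; \mu_X, \sigma_X) + \sum_i \ln f_Y^{\text{Gamma}}(y_i; \alpha_Y, \beta_Y)$$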


    Using the parameters initially estimated for the marginal distributions and an initial estimate of $\theta = 10$ for the Plackett copula parameter, we can use the Optimization Toolbox in MATLAB to estimate the full set of parameters. The cumulative probabilities computed from the fitted marginal distributions are listed in Table 6.3, and the estimated parameters are listed in Table 6.4.



  2. 2. IFM Method:


    As discussed in Section 3.6.2, the IFM method estimates the parameters of the marginal distributions and of the copula separately. We first compute the cumulative probabilities, listed in Table 6.3, using the parameters initially estimated for the marginal distributions. Then, treating the computed cumulative probabilities as the variates, we estimate the parameter of the Plackett copula using the ML method (the Optimization Toolbox in MATLAB) as follows:


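    $$LL = \sum_i \ln c_{\text{Plackett}}\!\left(\hat{F}_X(x_i; \hat{\mu}_X, \hat{\sigma}_X),\ \hat{F}_Y(y_i; \hat{\alpha}_Y, \hat{\beta}_Y);\ \theta\right)$$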


    The estimated copula parameter is listed in Table 6.4.



  3. 3. Semiparametric Method:


    As discussed in Section 3.6.3, the semiparametric method is also called the pseudo-ML method. The marginal distributions are estimated nonparametrically using the Weibull plotting-position formula (Equation (3.92)) as listed in Table 6.3. Now with the use of the probability estimated nonparametrically, the pseudo-log-likelihood function can be written as follows:


    $$LL = \sum_i \ln c_{\text{Plackett}}\!\left(\hat{F}_n(x_i),\ \hat{F}_n(y_i);\ \theta\right)$$


    The parameter is again estimated using the Optimization Toolbox in MATLAB and is listed in Table 6.4. Table 6.4 shows minimal difference between the marginal parameters estimated separately from the copula (IFM method) and those estimated simultaneously with it (full ML method). Figure 6.3 further illustrates this similarity through a comparison of the univariate probability densities. Figure 6.4 compares the observed variates with the variates simulated from the fitted copula function and shows that the copulas perform very similarly with parameters estimated using the three different techniques.
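The IFM and semiparametric estimates of θ can be sketched in Python as follows. This is not the MATLAB procedure used in the text: a golden-section search over ln θ is swapped in for the Optimization Toolbox, the gamma distribution is assumed to be parameterized with $\beta_Y$ as a rate (so that $\alpha_Y / \beta_Y$ matches the sample mean of Y), the gamma CDF is evaluated with a simple series, and all function names are illustrative. The resulting values are not claimed to reproduce Table 6.4.

```python
import math

# Table 6.2, observations No. 1 to 50
X = [11.276, 19.570, 10.864, 14.517, 17.512, 14.312, 17.785, 9.457, 13.290, 15.470,
     18.392, 9.411, 18.883, 11.749, 14.173, 14.044, 13.032, 18.374, 17.979, 7.656,
     14.642, 19.871, 7.769, 12.870, 14.119, 12.793, 16.772, 12.215, 24.909, 17.580,
     17.200, 10.621, 10.310, 8.957, 18.735, 11.536, 16.264, 21.382, 19.153, 17.355,
     17.877, 14.799, 11.457, 18.601, 11.636, 11.427, 15.067, 16.328, 21.471, 15.327]
Y = [5.049, 12.015, 3.691, 9.233, 6.862, 5.343, 12.689, 8.182, 8.531, 31.129,
     20.848, 8.567, 15.874, 12.142, 10.224, 6.223, 7.594, 14.827, 14.283, 4.639,
     10.039, 16.856, 17.575, 7.763, 6.964, 12.942, 4.140, 4.522, 7.689, 12.331,
     7.060, 5.583, 19.026, 3.648, 7.534, 7.519, 10.727, 21.947, 11.813, 7.988,
     12.159, 9.622, 11.147, 14.626, 4.732, 6.263, 11.378, 14.778, 29.678, 9.639]


def plackett_logpdf(u, v, theta):
    """Log of the Plackett copula density, Equation (6.4)."""
    s = 1.0 + (theta - 1.0) * (u + v)
    disc = s * s - 4.0 * theta * (theta - 1.0) * u * v
    num = theta * (1.0 + (theta - 1.0) * (u + v - 2.0 * u * v))
    return math.log(num) - 1.5 * math.log(disc)


def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def gamma_cdf(y, alpha, rate):
    """Gamma CDF from the series of the lower incomplete gamma function."""
    x = rate * y
    if x <= 0.0:
        return 0.0
    term = total = 1.0 / alpha
    for n in range(1, 500):
        term *= x / (alpha + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(alpha * math.log(x) - x - math.lgamma(alpha))


def weibull_positions(data):
    """Weibull plotting positions rank / (n + 1), in the original order."""
    n = len(data)
    probs = [0.0] * n
    for rank, i in enumerate(sorted(range(n), key=lambda k: data[k]), start=1):
        probs[i] = rank / (n + 1.0)
    return probs


def fit_theta(us, vs, lo=1e-2, hi=1e3, tol=1e-8):
    """Maximize the copula log-likelihood over ln(theta) by golden section."""
    def loglik(t):
        return sum(plackett_logpdf(u, v, math.exp(t)) for u, v in zip(us, vs))
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(lo), math.log(hi)
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = loglik(c), loglik(d)
    while b - a > tol:
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = loglik(c)
        else:
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = loglik(d)
    return math.exp(0.5 * (a + b))


# IFM: marginal parameters fixed at the initial estimates given in the text
theta_ifm = fit_theta([norm_cdf(x, 14.9358, 3.8484) for x in X],
                      [gamma_cdf(y, 4.0031, 0.3668) for y in Y])
# Semiparametric (pseudo-ML): empirical marginals
theta_pml = fit_theta(weibull_positions(X), weibull_positions(Y))
print(theta_ifm, theta_pml)
```

Searching over ln θ keeps θ positive without explicit constraints. The full ML method would additionally vary the four marginal parameters, which requires a multivariate optimizer and is not sketched here.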




Table 6.3. Cumulative probability computed using the fitted normal and gamma distributions and Weibull probability plotting-position formula.


Oct 12, 2020 | Posted by in Water and Sewage | Comments Off on 6 – Plackett Copula