Transitive comparison of random variables

Bernard De Baets, ... Bart De Schuymer, in Logical, Algebraic, Analytic and Probabilistic Aspects of Triangular Norms, 2005

15.5.3 Continuous random variables

We first consider a continuous random vector $(X_1, X_2, \ldots, X_m)$ with arbitrary marginal distributions pairwisely coupled by $T_M$. We have demonstrated in [8] that:

15.5.1 Proposition

Let $(X_1, X_2, \ldots, X_m)$ be a continuous random vector. Then the probabilistic relation $Q^M = [q^M_{ij}]$ can be computed as:

(15.13) $q^M_{ij} = \displaystyle\int_{\{x\,:\,F_{X_i}(x) < F_{X_j}(x)\}} f_{X_i}(x)\,dx + \frac{1}{2}\int_{\{x\,:\,F_{X_i}(x) = F_{X_j}(x)\}} f_{X_i}(x)\,dx.$

Note that if $X_i$ and $X_j$ are identically distributed, i.e. $F_{X_i} \equiv F_{X_j}$, then $q^M_{ij} = \frac{1}{2}$, as expected. In Figure 15.1 we give a graphical interpretation of formula (15.13). The two curves correspond to the marginal cumulative distribution functions $F_X$ and $F_Y$.

Figure 15.1. Comparison of two continuous r.v. coupled by T M .

According to (15.13), we have to distinguish between three domains: the domain where $F_X$ lies beneath $F_Y$, the domain where $F_X$ lies above $F_Y$, and the domain where $F_X$ and $F_Y$ coincide. The value of $q^M_{XY}$ is computed as the sum of the increment of $F_X$ over the first domain and half of the increment of $F_X$ (or $F_Y$) over the third domain. With the notations shown on the figure, we obtain for the example of Figure 15.1:

$q^M_{XY} = t_1 + t_3 + \frac{1}{2}\,t_2.$

This example also illustrates that the probabilistic relation associated with a random vector pairwisely coupled by $T_M$ can still be regarded as a graded version of the concept of stochastic dominance.
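To make (15.13) concrete, here is a small numerical sketch in Python; the two normal marginals and the crossing point $x = -0.5$ are our own illustrative assumptions, not an example from [8]. For these choices the coincidence domain has measure zero, so the $\frac{1}{2}$-weighted term vanishes.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Hypothetical marginals coupled by T_M: X ~ N(0, 1), Y ~ N(0.5, 2^2).
# Their cdfs cross exactly once, at x = -0.5.
fX = stats.norm(0, 1).pdf
FX, FY = stats.norm(0, 1).cdf, stats.norm(0.5, 2).cdf

# Integrate f_X over the domain where F_X lies beneath F_Y, as in (15.13)
integrand = lambda x: fX(x) * (FX(x) < FY(x))
qM_XY, _ = quad(integrand, -10.0, 10.0, points=[-0.5])

print(qM_XY, stats.norm.cdf(-0.5))  # both approximately 0.3085
```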

Note that the case of coupling by means of $T_P$ has been treated before as the case of independent random variables, and the computation of $Q^P = [q^P_{ij}]$ according to (15.9) can be concisely written as

$q^P_{ij} = \mathbb{E}_{X_i}\!\left[F_{X_j}\right],$

where $\mathbb{E}_X$ denotes the expected value of the function between square brackets w.r.t. the distribution of $X$.
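For the same two hypothetical normal marginals as above, now treated as independent, this expectation can be evaluated by quadrature and checked against the closed form for the difference of independent normals:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Hypothetical independent pair: X ~ N(0, 1), Y ~ N(0.5, 2^2)
fX = stats.norm(0, 1).pdf
FY = stats.norm(0.5, 2).cdf

# q^P = E_X[F_Y(X)] = P(Y < X)
qP_XY, _ = quad(lambda x: fX(x) * FY(x), -np.inf, np.inf)

# Closed-form check: P(X > Y) = Phi((mu_X - mu_Y) / sqrt(var_X + var_Y))
print(qP_XY, stats.norm.cdf((0.0 - 0.5) / np.sqrt(1.0 + 4.0)))  # both ~ 0.4115
```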

Next, we consider a continuous random vector $(X_1, X_2, \ldots, X_m)$ with arbitrary marginal distributions pairwisely coupled by $T_L$. In [8], we have also shown that:

15.5.2 Proposition

Let $(X_1, X_2, \ldots, X_m)$ be a continuous random vector. Then the probabilistic relation $Q^L = [q^L_{ij}]$ can be computed as:

$q^L_{ij} = \displaystyle\int_{\{x\,:\,F_{X_i}(x) + F_{X_j}(x) \geq 1\}} f_{X_i}(x)\,dx,$

or, equivalently:

(15.14) $q^L_{ij} = F_{X_j}(u), \quad\text{with } u \text{ such that } F_{X_i}(u) + F_{X_j}(u) = 1.$

Note that $u$ in (15.14) may not be unique, in which case any $u$ fulfilling the right equality may be considered. Then $q^L_{ij}$ is simply the height of $F_{X_j}$ at $u$. This is illustrated in Figure 15.2, where $q^L_{XY} = F_Y(u) = t_1$, since $t_1 + t_2 = 1$.

Figure 15.2. Comparison of two continuous r.v. coupled by T L .

One can again easily verify that $q^L_{XY} = \frac{1}{2}$ when $F_X = F_Y$. Also, the probabilistic relation associated with a random vector pairwisely coupled by $T_L$ can again be regarded as a graded version of stochastic dominance.
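Formula (15.14) reduces the computation to a one-dimensional root-finding problem. The sketch below does this for the same hypothetical normal marginals as before; since $F_X + F_Y$ increases from 0 to 2, a bracketing root finder applies:

```python
from scipy import stats
from scipy.optimize import brentq

# Hypothetical marginals coupled by T_L: X ~ N(0, 1), Y ~ N(0.5, 2^2)
FX, FY = stats.norm(0, 1).cdf, stats.norm(0.5, 2).cdf

# Solve F_X(u) + F_Y(u) = 1, then q^L_XY = F_Y(u), as in (15.14)
u = brentq(lambda x: FX(x) + FY(x) - 1.0, -10.0, 10.0)
qL_XY = FY(u)  # the height of F_Y at u
print(u, qL_XY)
```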

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B978044451814950015X

Multivariate stochastic orders

Félix Belzunce, ... Julio Mulero, in An Introduction to Stochastic Orders, 2016

3.5 The multivariate likelihood ratio order

In a similar way to the univariate case, it is possible to check the multivariate stochastic and hazard rate orders in terms of a property of the joint density functions. This property leads to the definition of the multivariate likelihood ratio order, which is a generalization of the univariate likelihood ratio order. In this section, the results are given in the continuous case but they can be restated for the general case. The main intention of the multivariate likelihood ratio order is to provide a sufficient condition for the multivariate hazard rate order.

Definition 3.5.1

Given two continuous random vectors X = (X 1,…,X n ) and Y = (Y 1,…,Y n ) with joint density functions f and g, respectively, we say that X is smaller than Y in the multivariate likelihood ratio order, denoted by $X \leq_{\mathrm{lr}} Y$, if

$f(\mathbf{x})\,g(\mathbf{y}) \leq f(\mathbf{x} \wedge \mathbf{y})\,g(\mathbf{x} \vee \mathbf{y}), \quad \text{for all } \mathbf{x}, \mathbf{y} \in \mathbb{R}^n.$

Clearly, this is a generalization of the likelihood ratio order in the univariate case. However, the multivariate likelihood ratio order is not necessarily reflexive. When a random vector X satisfies $X \leq_{\mathrm{lr}} X$, we have the MTP2 property introduced in Section 1.3.
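As a concrete illustration of Definition 3.5.1, the following Python sketch spot-checks the defining inequality at randomly drawn pairs of points for two hypothetical bivariate normal densities with common covariance and shifted mean, a standard example where the order holds (compare Theorem 3.5.2 below):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical densities: same covariance, mean shifted upward for g
f = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.3], [0.3, 1.0]]).pdf
g = multivariate_normal(mean=[1.0, 1.0], cov=[[1.0, 0.3], [0.3, 1.0]]).pdf

rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 4.0, size=(5000, 2))
y = rng.uniform(-3.0, 4.0, size=(5000, 2))
lo, hi = np.minimum(x, y), np.maximum(x, y)  # componentwise meet and join

# Check f(x) g(y) <= f(min(x,y)) g(max(x,y)) up to floating-point slack
print(np.all(f(x) * g(y) <= f(lo) * g(hi) + 1e-12))  # True: consistent with X <=_lr Y
```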

The following result provides a set of sufficient conditions for the multivariate likelihood ratio order.

Theorem 3.5.2

Let X = (X 1,…,X n ) and Y = (Y 1,…,Y n ) be two continuous random vectors with joint density functions f and g, respectively. If X or Y (or both) are MTP2, and

(3.13) $f(\mathbf{y})\,g(\mathbf{x}) \leq f(\mathbf{x})\,g(\mathbf{y}), \quad\text{for all } \mathbf{x}, \mathbf{y} \in \mathbb{R}^n \text{ such that } \mathbf{x} \leq \mathbf{y},$

then

$X \leq_{\mathrm{lr}} Y.$

Proof

Let us assume that X is MTP2 (in the other case the proof is similar); then we have the following inequalities:

$f(\mathbf{x})\,g(\mathbf{y}) \leq \dfrac{f(\mathbf{x}\wedge\mathbf{y})\,f(\mathbf{x}\vee\mathbf{y})}{f(\mathbf{y})}\,g(\mathbf{y}) \leq f(\mathbf{x}\wedge\mathbf{y})\,g(\mathbf{x}\vee\mathbf{y}),$

for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, where the first inequality follows from the MTP2 property and the second one from (3.13) applied to the pair $\mathbf{y} \leq \mathbf{x} \vee \mathbf{y}$. Therefore, $X \leq_{\mathrm{lr}} Y$.

The multivariate likelihood ratio order is preserved under conditioning on sublattices, as we see next. Recall that a subset $A \subseteq \mathbb{R}^n$ is called a sublattice if $\mathbf{x}, \mathbf{y} \in A$ implies $\mathbf{x} \wedge \mathbf{y} \in A$ and $\mathbf{x} \vee \mathbf{y} \in A$. This result will be used to show the relationship between the multivariate likelihood ratio order and the multivariate dynamic hazard rate order. The proof is obvious from the definition.

Theorem 3.5.3

Let X = (X 1,…,X n ) and Y = (Y 1,…,Y n ) be two continuous random vectors. If $X \leq_{\mathrm{lr}} Y$, then

$[X \mid X \in A] \leq_{\mathrm{lr}} [Y \mid Y \in A], \quad\text{for every sublattice } A \subseteq \mathbb{R}^n.$

In particular, from the previous theorem, the multivariate likelihood ratio order is preserved under marginalization. This result is useful because, in some cases, it is easier to prove a result in the multivariate case rather than in the univariate case.

Theorem 3.5.4

Let X = (X 1,…,X n ) and Y = (Y 1,…,Y n ) be two continuous random vectors. If $X \leq_{\mathrm{lr}} Y$, then

$X_I \leq_{\mathrm{lr}} Y_I, \quad\text{for all } I \subseteq \{1, \ldots, n\}.$

Next, it is shown that the multivariate likelihood ratio order is stronger than the multivariate hazard rate order.

Theorem 3.5.5

Let X = (X 1,…,X n ) and Y = (Y 1,…,Y n ) be two continuous random vectors. If $X \leq_{\mathrm{lr}} Y$, then

$X \leq_{\mathrm{hr}} Y.$

Proof

Let us check the conditions in the definition of the multivariate dynamic hazard rate order. In particular, denoting by η and λ the multivariate dynamic hazard rates of X and Y, respectively, let us see that

$\eta_k(t \mid h_t) \geq \lambda_k(t \mid h'_t), \quad\text{for all } t \geq 0,$

where

$h_t = \{X_{I\cup J} = \mathbf{x}_{I\cup J},\; X_{\overline{I\cup J}} > t\mathbf{e}\},$

and

$h'_t = \{Y_I = \mathbf{y}_I,\; Y_{\overline{I}} > t\mathbf{e}\},$

whenever $I \cap J = \emptyset$, $\mathbf{0} \leq \mathbf{x}_I \leq \mathbf{y}_I \leq t\mathbf{e}$, $\mathbf{0} \leq \mathbf{x}_J \leq t\mathbf{e}$, and for all $k \in \overline{I\cup J}$.

As we shall see, the result will follow by proving that

(3.14) $\left[X_{\overline{I\cup J}} \mid h_t\right] \leq_{\mathrm{lr}} \left[Y_{\overline{I\cup J}} \mid h'_t\right].$

Condition (3.14) will follow if

(3.15) $\left[X_{\overline{I\cup J}} \,\middle|\, X_{I\cup J} = \mathbf{x}_{I\cup J}\right] \leq_{\mathrm{lr}} \left[Y_{\overline{I\cup J}} \,\middle|\, Y_I = \mathbf{y}_I,\, Y_J > t\mathbf{e}\right]$

holds.

Let us see that (3.15) follows if $X \leq_{\mathrm{lr}} Y$ holds. Denoting by $f$ and $g$ the joint density functions of $(X_I, X_J, X_{\overline{I\cup J}})$ and $(Y_I, Y_J, Y_{\overline{I\cup J}})$, respectively, and by $f_{I,J}$ and $g_{I,J}$ the joint densities of $X_{I,J}$ and $Y_{I,J}$, respectively, we see that the joint density function of $X_{\overline{I\cup J}} \mid X_{I\cup J} = \mathbf{x}_{I\cup J}$ is given by

$f_{\overline{I\cup J}}(\mathbf{x}_{\overline{I\cup J}}) = \dfrac{f(\mathbf{x}_I, \mathbf{x}_J, \mathbf{x}_{\overline{I\cup J}})}{f_{I,J}(\mathbf{x}_I, \mathbf{x}_J)},$

and the joint density function of $Y_{\overline{I\cup J}} \mid Y_I = \mathbf{y}_I,\, Y_J > t\mathbf{e}$ is given by

$g_{\overline{I\cup J}}(\mathbf{x}_{\overline{I\cup J}}) = \dfrac{\int_{\mathbf{y}_J > t\mathbf{e}} g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{x}_{\overline{I\cup J}})\, d\mathbf{y}_J}{\int_{\mathbf{y}_J > t\mathbf{e}} g_{I,J}(\mathbf{y}_I, \mathbf{y}_J)\, d\mathbf{y}_J}.$

Given $\mathbf{y}_J > t\mathbf{e}$, we see that $\mathbf{x}_J \leq t\mathbf{e} < \mathbf{y}_J$ and, analogously, $\mathbf{x}_I \leq \mathbf{y}_I$. Since $X \leq_{\mathrm{lr}} Y$, given $\mathbf{y}_{\overline{I\cup J}}$, we see that

$f(\mathbf{x}_I, \mathbf{x}_J, \mathbf{x}_{\overline{I\cup J}})\, g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{y}_{\overline{I\cup J}}) \leq f(\mathbf{x}_I, \mathbf{x}_J, \mathbf{x}_{\overline{I\cup J}} \wedge \mathbf{y}_{\overline{I\cup J}})\, g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{x}_{\overline{I\cup J}} \vee \mathbf{y}_{\overline{I\cup J}}),$

which, upon integration, yields

$f_{\overline{I\cup J}}(\mathbf{x}_{\overline{I\cup J}})\, g_{\overline{I\cup J}}(\mathbf{y}_{\overline{I\cup J}}) \leq f_{\overline{I\cup J}}(\mathbf{x}_{\overline{I\cup J}} \wedge \mathbf{y}_{\overline{I\cup J}})\, g_{\overline{I\cup J}}(\mathbf{x}_{\overline{I\cup J}} \vee \mathbf{y}_{\overline{I\cup J}}).$

Therefore, from the previous inequality, we see that (3.15) holds. Now, from Theorem 3.5.3, we see that (3.14) also holds. In particular, $[X_k \mid h_t] \leq_{\mathrm{lr}} [Y_k \mid h'_t]$ holds for all $k \in \overline{I\cup J}$. Now, by (2.25), we see that the hazard rates of $[X_k \mid h_t]$ and $[Y_k \mid h'_t]$ are ordered and, consequently, we see that

$\eta_k(t \mid h_t) \geq \lambda_k(t \mid h'_t).$

Finally, a result for the multivariate likelihood ratio order among random vectors with conditionally independent components is provided. In particular, we consider the same background as that in Theorem 3.3.8.

Theorem 3.5.6

Assume that

(i)

$X_i(\theta) =_{\mathrm{st}} Y_i(\theta)$, for all $\theta$ and for all $i = 1, \ldots, n$,

(ii)

$X_i(\theta) \leq_{\mathrm{lr}} Y_j(\theta')$, for all $\theta \leq \theta'$ and for all $1 \leq i \leq j \leq n$, and

(iii)

$\theta_1 \leq_{\mathrm{lr}} \theta_2$.

Then,

$(X_1, \ldots, X_n) \leq_{\mathrm{lr}} (Y_1, \ldots, Y_n).$

Proof

Let $f_i(x \mid \theta)$ be the density function of $X_i(\theta)$. From condition (ii), we see that

$\prod_{i=1}^{n} f_i(x_i \mid \theta_i)$ is TP2 in $(x_1, \ldots, x_n, \theta_1, \ldots, \theta_n)$.

Furthermore, condition (iii) is equivalent to the fact that $h_i(\theta_1, \ldots, \theta_n)$ is TP2 in $(\theta_1, \ldots, \theta_n, i) \in S \times \{1,2\}$. Due to these facts and from Theorem 1.2.3, we see that $\int f(x_1, \ldots, x_n \mid \theta_1, \ldots, \theta_n)\, dH_i(\boldsymbol{\theta})$ is TP2 in $(x_1, \ldots, x_n, i)$, and we get the result.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B978012803768300003X

Truth, Possibility and Probability

In North-Holland Mathematics Studies, 1991

Uniform distributions

Let a, b be real numbers with a < b. Then the standard density function

$f(x) = \begin{cases} \dfrac{1}{b-a}, & \text{if } a \leq x \leq b, \\[4pt] 0, & \text{otherwise} \end{cases}$

is called a uniform density over [a, b]. It is clear that f is a density, since

$\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = \int_a^b f(x)\,dx = 1.$

We now show a probability distribution that is approximately uniform on the interval [0, 1], i.e., with a uniform density over [0, 1]. We take the interval [0, 1] just for simplicity. Let Ω be an equally spaced hyperfinite approximation of the unit interval [0, 1]. That is, $\Omega = \{t_0, t_1, \ldots, t_\nu\}$, where ν is an infinite natural number, $t_0 = 0$, $t_\nu = 1$, and we also assume that $dt = t_{n+1} - t_n = 1/\nu$ for every $n < \nu$. It is clear that $\#\Omega = \nu + 1$. Thus, we are assuming that $\Omega = \{0,\, dt,\, 2\,dt,\, 3\,dt,\, \ldots,\, \nu\,dt\}$. We define the distribution pr on Ω by setting, for any $n \leq \nu$,

$\mathrm{pr}(t_n) = \dfrac{1}{\nu+1}.$

Thus, for any internal subset A of Ω, we have the probability distribution

$\Pr A = \dfrac{\#A}{\#\Omega}.$

The probability measure Pr is called the counting measure on Ω. It is clear that 〈Ω, pr〉 is an internal probability space. We now show that Pr has a uniform density.

The partition that we need in order to apply Theorem X.4 is formed by cubes, which in this case are the intervals $[t_0, t_1),\, \ldots,\, [t_i, t_{i+1}),\, \ldots,\, [t_{\nu-1}, t_\nu)$ of length $dt = 1/\nu$. Then

$\dfrac{\Pr([t_i, t_{i+1}) \cap \Omega)}{dt} = \dfrac{\Pr(\{t_i\})}{dt} = \dfrac{1}{(\nu+1)\,dt} \approx 1.$

Since the uniform density for [0, 1] is f(x) = 1, for 0 ≤ x ≤ 1, by Theorem X.4, we get that Pr has this uniform density.

Thus, if c, d ∈ [0, 1] with c < d, we have

$\Pr(\Omega \cap [c, d]) \approx \displaystyle\int_c^d dx = d - c.$

We can also see that, in a sense, part 2 of Theorem X.4 cannot be improved: it is not true that for any partition with infinitesimal intervals, any interval r in the partition and any x ∈ r satisfy

$\dfrac{\Pr(r \cap \Omega)}{V(r)} \approx 1.$

This formula is true, as asserted in the theorem, only for partitions into cubes of volume ≥ 1/μ, for μ ≤ ν, for a certain fixed ν. In order to find a counterexample, take a partition with subintervals of length $1/(2\nu)$. Then there are some subintervals which do not contain any point of Ω, and thus $\Pr(r \cap \Omega) = 0$.
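The counting measure is easy to imitate with a large but standard ν. The following sketch (ν = 10⁶ and the endpoints c, d are our own choices) shows Pr(Ω ∩ [c, d]) landing infinitely close to d − c in the finite analogue:

```python
import numpy as np

# Finite analogue of the equally spaced approximation of [0, 1]
nu = 10**6
omega = np.arange(nu + 1) / nu     # t_n = n * dt with dt = 1/nu
c, d = 0.25, 0.8

Pr = np.count_nonzero((omega >= c) & (omega <= d)) / (nu + 1)  # #A / #Omega
print(Pr, d - c)  # 0.55000... versus 0.55
```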

The same construction can be carried out in the unit circle C. As in Example VIII.4, let

$C = \{\, e^{ik\,d\theta} \mid 0 \leq k \leq \nu \,\},$

where $d\theta = 2\pi/\nu$ and ν is an infinite natural number, be a hyperfinite approximation of C. Again, let pr be the counting measure on C. Then, as above, if A is an arc of C, we have that $\Pr(C \cap A)$ is infinitely close to the length of A divided by $2\pi$.

We now discuss Bertrand's mixture paradox (see page 61) from an infinitesimal standpoint. The uniform distribution of the ratio of water to wine between 1 and 2 is obtained by a near interval, $T_1 = \{t_0, t_1, \ldots, t_\nu\}$, for [1, 2]. We must assume that $t_{i+1} - t_i = dt = 1/\nu$. Every point in $T_1$ is equally probable, that is, $\mathrm{pr}_1(t_i) = 1/(\nu+1)$.

On the other hand, the uniform distribution of the ratio of wine to water between 1/2 and 1 is obtained by an equally spaced near interval for [1/2, 1]. If we take the reciprocals of the elements of $T_1$, we do get a near interval, $T_2 = \{u_0, u_1, \ldots, u_\nu\}$, for [1/2, 1], but its points are not equally spaced. Thus, we do not get in this way a uniform distribution. Suppose that we assign the same probability to each point, that is,

$\mathrm{pr}_2(u_i) = \mathrm{pr}_1(t_i) = \dfrac{1}{\nu+1}.$

We have that

$du_i = u_{i+1} - u_i = \dfrac{1}{t_{i+1}} - \dfrac{1}{t_i} = \dfrac{t_i - t_{i+1}}{t_i\,t_{i+1}}.$

Then

$\dfrac{\Pr([u_{i+1}, u_i) \cap T_2)}{|du_i|} = \dfrac{1}{\nu+1}\,\dfrac{t_i\,t_{i+1}}{t_{i+1} - t_i} = \dfrac{\nu}{\nu+1}\,t_i\,t_{i+1} \approx t_i^2 = \dfrac{1}{u_i^2}.$

Thus, the density is, in this case, $f(x) = 1/x^2$, which is clearly not uniform.
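A quick standard-analysis simulation reproduces this conclusion; the sample size and bin count below are our own choices for illustration:

```python
import numpy as np

# If the water-to-wine ratio T is uniform on [1, 2], the wine-to-water
# ratio U = 1/T has density 1/x^2 on [1/2, 1], not the uniform density.
rng = np.random.default_rng(0)
u = 1.0 / rng.uniform(1.0, 2.0, size=1_000_000)

hist, edges = np.histogram(u, bins=20, range=(0.5, 1.0), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - 1.0 / mid**2)))  # small deviation from f(x) = 1/x^2
```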

We conclude this section with a corollary to the theorem on change of variables (Theorem IX.18). Notice that if R is a region and T is a continuously differentiable transformation defined on R, then T(R) is also a region.

Theorem X.7

Let $\mathbf{X}$ be a continuous random vector with density $f_{\mathbf{X}}$ over the region R, and let T be a continuously differentiable one-one transformation of R. Then the random vector $\mathbf{Y} = T(\mathbf{X})$ is also continuous with density

$f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}\bigl(T^{-1}(\mathbf{y})\bigr)\,\bigl|J_{T^{-1}}(\mathbf{y})\bigr|,$

for any $\mathbf{y} \in T(R)$.

Proof

Let $\mathbf{X}$ be defined over 〈Ω, pr〉, and let $R_\nu$ and $\mathcal{P}$ be as in (2) of Theorem X.4, with $\mathcal{P}$ composed of infinitesimal cubes of volume 1/μ. Thus, we have, for any $r \in \mathcal{P}$ and $\mathbf{x} \in r$,

$\Pr[\mathbf{X} \in r \cap \Omega] = \Pr_{\mathbf{X}}(r \cap \Omega) \approx f(\mathbf{x})\,V(r).$

Let $\mathcal{Q}$ be the partition of T(R) obtained by applying T to the elements of $\mathcal{P}$. Then $\mathcal{Q}$ is an infinitesimal partition of T(R) and, by Lemma IX.16,

$\dfrac{V(T(r))}{V(r)} \approx \bigl|J_T(\mathbf{x})\bigr|.$

We have, for any s in $\mathcal{Q}$, since $f(T^{-1}(\mathbf{x}))$ and $J_T(\mathbf{x})$ are finite,

$\dfrac{\Pr_{\mathbf{Y}}(s \cap \Omega)}{V(s)} = \dfrac{\Pr[T(\mathbf{X}) \in s]}{V(s)} = \dfrac{\Pr[\mathbf{X} \in T^{-1}(s)]}{V(s)} = \dfrac{\Pr[\mathbf{X} \in T^{-1}(s)]}{V(T^{-1}(s))}\,\dfrac{V(T^{-1}(s))}{V(s)} \approx f\bigl(T^{-1}(\mathbf{x})\bigr)\,\bigl|J_{T^{-1}}(\mathbf{x})\bigr|.$

Since $(f \circ T^{-1})\,\bigl|J_{T^{-1}}\bigr|$ is continuous and, by Theorem IX.18, integrable over T(R), by Theorem X.4 (1) it is a density of $\mathbf{Y}$.

Recall that T is called an affine transformation of a region R if there exist an n × n matrix A and a vector $\mathbf{c}$ such that $T(\mathbf{x}) = \mathbf{x}A + \mathbf{c}$, for $\mathbf{x} \in R$. If $\mathbf{c} = \mathbf{0}$, then T is called linear. The function T is one-one if A is nonsingular, and then

$T^{-1}(\mathbf{y}) = (\mathbf{y} - \mathbf{c})A^{-1},$

for $\mathbf{y} \in T(R)$. We have the following corollary.

Corollary X.8

Suppose that $\mathbf{X}$ is a continuous random vector with density $f_{\mathbf{X}}$ and that T is a one-one affine transformation, as defined above. Then $\mathbf{Y} = T(\mathbf{X})$ is continuous with density

$f_{\mathbf{Y}}(\mathbf{y}) = \bigl|\det A\bigr|^{-1}\, f_{\mathbf{X}}\bigl((\mathbf{y} - \mathbf{c})A^{-1}\bigr),$

for every $\mathbf{y} \in {}^{*}\mathbb{R}^n$, where det A is the determinant of A.

Proof

The corollary follows from the preceding theorem and

$J_T\bigl(T^{-1}(\mathbf{y})\bigr) = \det A.$
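A Monte Carlo sketch makes the corollary tangible; the particular matrix A, shift c, and uniform source below are our own illustrative assumptions:

```python
import numpy as np

# X uniform on the unit square, T(x) = xA + c; Corollary X.8 gives
# f_Y(y) = |det A|^{-1} f_X((y - c) A^{-1}) = 1/|det A| on T(R).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])          # nonsingular, det A = 6
c = np.array([1.0, -1.0])

rng = np.random.default_rng(0)
x = rng.uniform(size=(1_000_000, 2))
y = x @ A + c

# Estimate the density of Y near the image of the interior point (0.5, 0.5)
y0 = np.array([0.5, 0.5]) @ A + c
eps = 0.05
inside = np.all(np.abs(y - y0) < eps, axis=1)
print(inside.mean() / (2 * eps) ** 2, 1.0 / abs(np.linalg.det(A)))  # both ~ 0.167
```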

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/S0304020808724216

Lattice Vector Quantization for Wavelet-Based Image Coding

Mikhail Shnaider, Andrew P. Papliński, in Advances in Imaging and Electron Physics, 1999

B Distortion Measure and Quantization Regions

Let us consider an input stream of continuous random vectors $\mathbf{x} = (x_1, x_2, \ldots, x_N)$ with probability density function $p(\mathbf{x})$. Let us also consider an output stream $\mathbf{y} = (y_1, y_2, \ldots, y_N)$ related to the input stream $\mathbf{x}$ by the conditional probability density function $q(\mathbf{y}|\mathbf{x})$. Assuming that the stream $\mathbf{x}$ is the input to a quantizer, the corresponding values of the output stream $\mathbf{y}$ are obtained by reference to the quantizer codebook.

According to Shannon (1948), the amount of uncertainty $R(\mathbf{x}, \mathbf{y})$ about a value of $\mathbf{x}$ that is resolved when we receive its quantized counterpart $\mathbf{y}$ from a transmission channel is given by

(17) $R(\mathbf{x}, \mathbf{y}) = h(\mathbf{x}) - h(\mathbf{x}|\mathbf{y}),$

where $h(\mathbf{x}) = -\int p(\mathbf{x}) \log p(\mathbf{x})\, d\mathbf{x}$ is the differential entropy of the variable $\mathbf{x}$. The value of $R(\mathbf{x}, \mathbf{y})$ is effectively the actual transmission rate. With this in mind, we can pose the problem of optimal quantization in terms of minimization of the amount of uncertainty, that is,

(18) $R(D) = \min_{q \in Q_D} R(\mathbf{x}, \mathbf{y}),$

with $Q_D$ being the set specified by the conditional probability $q(\mathbf{y}|\mathbf{x})$ so that

(19) $Q_D = \{\, q(\mathbf{y}|\mathbf{x}) : E[d(\mathbf{x}, \mathbf{y})] \leq D \,\},$

where d(·,·) is a distance function (to be defined). The substitution of Eq. (17) in Eq. (18) leads to

(20) $R(D) = \min_{q \in Q_D}\bigl(h(\mathbf{x}) - h(\mathbf{x}|\mathbf{y})\bigr).$

As the entropy $h(\mathbf{x})$ is independent of the conditional probability $q(\mathbf{y}|\mathbf{x})$, and $h(\mathbf{x}|\mathbf{y}) = h(\mathbf{x} - \mathbf{y}|\mathbf{y}) \leq h(\mathbf{x} - \mathbf{y})$, we can further modify Eq. (20) in the following way:

(21) $R(D) = h(\mathbf{x}) - \max_{q \in Q_D} h(\mathbf{x}|\mathbf{y}) = h(\mathbf{x}) - \max_{q \in Q_D} h(\mathbf{x} - \mathbf{y}|\mathbf{y}) \geq h(\mathbf{x}) - \max_{q \in Q_D} h(\mathbf{x} - \mathbf{y}).$

In information theory the last expression is known as the Shannon lower bound (Berger, 1971; Sakrison, 1979; Gibson and Sayood, 1988).

Now let us return to the process of quantization. We collect the input vectors $\mathbf{x}_j = (x_1, x_2, \ldots, x_N)$ into blocks $X = (\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_L)$ and, for each input block, the quantizer finds a corresponding output block $Y = (\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_L)$ consisting of vectors $\mathbf{y}_j = (y_1, y_2, \ldots, y_N)$. The average bit rate per vector for a vector quantizer can be calculated as

(22) $h_Q(Y) = \frac{1}{L}\, h(Y) = -\frac{1}{L} \sum_j q(Y_j) \log q(Y_j).$

Assuming that the distribution of $\mathbf{x}$ is uniform within each quantization region V and that all regions are simple translations of each other, we have the following approximation for the average bit rate per vector (Gibson and Sayood, 1988):

(23) $h_Q(Y) \approx \frac{1}{L}\, h(X) - \frac{1}{L} \log V.$

If the input vectors $\mathbf{x}$ are independent, we have that

$h(\mathbf{x}) = \frac{1}{L}\, h(X) \quad\text{and}\quad h(\mathbf{x} - \mathbf{y}) = \frac{1}{L}\, h(X - Y).$

Furthermore, it can be shown (Berger, 1971; Sakrison, 1979) that for sufficiently high dimensionality of vectors the following equality is satisfied:

(24) $h(X - Y) = \log V.$

Thus, we can conclude that the performance of a uniform vector quantizer asymptotically achieves the Shannon lower bound for large dimensions.
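The entropy estimate (22) and the high-rate approximation (23) are easy to reproduce numerically in the simplest case N = L = 1; the Gaussian source and step size below are our own illustrative choices:

```python
import numpy as np

# Scalar uniform quantizer with cell volume (step) V applied to a Gaussian source
rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
V = 0.1

# Quantize and estimate the output entropy h_Q(Y) = -sum_j q(Y_j) log q(Y_j)
idx = np.floor(x / V).astype(int)
_, counts = np.unique(idx, return_counts=True)
q = counts / counts.sum()
hQ = -np.sum(q * np.log2(q))

h_x = 0.5 * np.log2(2 * np.pi * np.e)  # differential entropy of N(0, 1), in bits
print(hQ, h_x - np.log2(V))            # nearly equal at this (high) rate
```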

An average distortion of a quantizer can be defined in the following way:

(25) $D_Q^{(m)} = \displaystyle\int d^{(m)}(\mathbf{x}, \mathbf{y})\, p(\mathbf{x})\, d\mathbf{x},$

where for every input $\mathbf{x}$ with the PDF $p(\mathbf{x})$ the quantizer produces an output $\mathbf{y}$. The mth-power distortion $d^{(m)}$ introduced by this operation is defined by

(26) $d^{(m)}(\mathbf{x}, \mathbf{y}) = \|\mathbf{y} - \mathbf{x}\|^m.$

By varying m, different distortion measures can be obtained. Closely related to the distortion measure is the optimal shape of the quantization region. It is easy to see that for a uniform source the optimal shape of the quantization region is a pyramid if m is selected to be 1, as depicted in Fig. 11. For m equal to 2, Eq. (26) gives rise to the well-known mean squared error (MSE) distortion measure, the optimal quantization region for a uniform source being a sphere (Conway and Sloane, 1982). Furthermore, the radius of the sphere is equal to $\sqrt{LD}$, where D and L are the target distortion and the dimensionality of the quantization space, respectively.

Figure 11. A pyramidal quantization region for m = 1.

From the foregoing considerations we can conjecture that, assuming the MSE distortion measure, an optimal quantizer for an L-dimensional uniform source constitutes a set of uniformly distributed spheres with code vectors as centroids of the spheres.
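The average distortion (25) with the mth-power measure (26) can likewise be estimated by Monte Carlo for the scalar uniform quantizer sketched above (again our own illustrative setup, not the authors' experiment):

```python
import numpy as np

# Scalar uniform quantizer, step V, codewords at the cell centroids
rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
V = 0.1

y = (np.floor(x / V) + 0.5) * V       # nearest-centroid reconstruction
for m in (1, 2):
    print(m, np.mean(np.abs(y - x) ** m))   # D_Q^(m) = E|y - x|^m
# For m = 2 the error is nearly uniform on each cell, so D ~ V**2 / 12 = 8.3e-4
```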

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/S1076567008701988

Reliability Theory

N. Unnikrishnan Nair, ... N. Balakrishnan, in Reliability Modelling and Analysis in Discrete Time, 2018

Blomqvist's β

The Blomqvist measure is also based on concordance, but with the difference that, instead of two random vectors, here we choose a random vector and a fixed point $(x_0, y_0)$, with $x_0$ the median of $X_1$ and $y_0$ the median of $X_2$. Thus, the measure in the case of a continuous random vector $(X_1, X_2)$ becomes

$\beta = P[(X_1 - x_0)(X_2 - y_0) > 0] - P[(X_1 - x_0)(X_2 - y_0) < 0] = 2\bigl\{P[X_1 < x_0,\, X_2 < y_0] + P[X_1 > x_0,\, X_2 > y_0]\bigr\} - 1.$

In the discrete case, this modifies to

(1.67) $\beta = 4P[X_1 < x_0,\, X_2 < y_0] - 1 + P[X_1 = x_0 \text{ or } X_2 = y_0].$

The medians are chosen to be the nearest integer values.
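For intuition, the continuous-case formula is easy to estimate from a sample; the common-factor model below is a hypothetical example (a bivariate normal pair with correlation 0.5, for which the population value is $\beta = \frac{2}{\pi}\arcsin(0.5) = \frac{1}{3}$):

```python
import numpy as np

# Sample version of Blomqvist's beta: sign concordance about the two medians
rng = np.random.default_rng(0)
z = rng.normal(size=100_000)
x1 = z + rng.normal(size=100_000)   # common factor induces correlation 0.5
x2 = z + rng.normal(size=100_000)

s = (x1 - np.median(x1)) * (x2 - np.median(x2))
beta = np.mean(s > 0) - np.mean(s < 0)
print(beta)  # close to 1/3
```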

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128019139000014

Preliminaries

Félix Belzunce, ... Julio Mulero, in An Introduction to Stochastic Orders, 2016

1.3.1 The standard construction

The problem of defining a suitable multivariate extension of the univariate quantile function has a long history in statistics and probability. Next, the definition of the multivariate quantile transform is recalled. This transformation, also known as the standard construction, was introduced by Arjas and Lehtonen [28], O'Brien [29], Rosenblatt [30], and Rüschendorf [31]. Given a continuous random vector X = (X 1,…,X n ), its standard construction is recursively defined as

(1.11) $\begin{aligned} Q_{X,1}(p_1) &= F_{X_1}^{-1}(p_1), \\ Q_{X,2}(p_1, p_2) &= F^{-1}_{[X_2 \mid X_1 = Q_{X,1}(p_1)]}(p_2), \\ &\;\;\vdots \\ Q_{X,n}(p_1, \ldots, p_n) &= F^{-1}_{\left[X_n \,\middle|\, \bigcap_{j=1}^{n-1}\{X_j = Q_{X,j}(p_1, \ldots, p_j)\}\right]}(p_n), \end{aligned}$

for all $(p_1, p_2, \ldots, p_n) \in (0,1)^n$, where $F_{X_1}^{-1}$ denotes the quantile function of $X_1$ and

$F^{-1}_{\left[X_i \,\middle|\, \bigcap_{j=1}^{i-1}\{X_j = Q_{X,j}(p_1, \ldots, p_j)\}\right]}, \quad\text{for all } i = 2, \ldots, n,$

denote the quantile functions of the univariate conditional random variables given by

$\left[X_i \,\middle|\, \bigcap_{j=1}^{i-1}\{X_j = Q_{X,j}(p_1, \ldots, p_j)\}\right], \quad\text{for all } i = 2, \ldots, n.$

It is worth mentioning that this well-known transform is widely used in simulation theory, and it plays the role of the quantile function in the multivariate case. Hence, given n independent random variables U 1,…,U n , uniformly distributed on the interval (0,1), we have

(1.12) $(X_1, \ldots, X_n) =_{\mathrm{st}} Q_X(U_1, \ldots, U_n),$

where

$Q_X(p_1, \ldots, p_n) = \bigl(Q_{X,1}(p_1),\, Q_{X,2}(p_1, p_2),\, \ldots,\, Q_{X,n}(p_1, \ldots, p_n)\bigr),$

for all $(p_1, \ldots, p_n) \in (0,1)^n$. As we shall see later, another transform of interest is the multivariate distributional transform. Given a continuous random vector X = (X 1,…,X n ), its multivariate distributional transform is recursively defined as

(1.13) $\begin{aligned} D_{X,1}(x_1) &= F_{X_1}(x_1), \\ D_{X,2}(x_1, x_2) &= F_{[X_2 \mid X_1 = x_1]}(x_2), \\ &\;\;\vdots \\ D_{X,n}(x_1, \ldots, x_n) &= F_{\left[X_n \,\middle|\, \bigcap_{j=1}^{n-1}\{X_j = x_j\}\right]}(x_n), \end{aligned}$

for all $(x_1, \ldots, x_n)$ in the support of X, where $F_{X_1}$ denotes the distribution function of $X_1$ and

$F_{\left[X_i \,\middle|\, \bigcap_{j=1}^{i-1}\{X_j = x_j\}\right]}, \quad\text{for all } i = 2, \ldots, n,$

denote the distribution functions of the conditional random variables given by

$\left[X_i \,\middle|\, \bigcap_{j=1}^{i-1}\{X_j = x_j\}\right], \quad\text{for all } i = 2, \ldots, n.$

According to the previous notation, we find that

(1.14) $(U_1, \ldots, U_n) =_{\mathrm{st}} D_X(X_1, \ldots, X_n),$

where

$D_X(x_1, \ldots, x_n) = \bigl(D_{X,1}(x_1),\, D_{X,2}(x_1, x_2),\, \ldots,\, D_{X,n}(x_1, \ldots, x_n)\bigr),$

for all $(x_1, \ldots, x_n) \in \mathbb{R}^n$. Given two continuous random vectors X = (X 1,…,X n ) and Y = (Y 1,…,Y n ), we can consider the transformation

(1.15) $\Phi(x_1, \ldots, x_n) = Q_Y\bigl(D_X(x_1, \ldots, x_n)\bigr),$

defined for all (x 1,…,x n ) in the support of X. Observe that

$\begin{aligned} \Phi_1(x_1) &= F_{Y_1}^{-1}\bigl(F_{X_1}(x_1)\bigr), \\ \Phi_2(x_1, x_2) &= F^{-1}_{[Y_2 \mid Y_1 = \Phi_1(x_1)]}\bigl(F_{[X_2 \mid X_1 = x_1]}(x_2)\bigr), \\ &\;\;\vdots \\ \Phi_n(x_1, \ldots, x_n) &= F^{-1}_{\left[Y_n \,\middle|\, \bigcap_{j=1}^{n-1}\{Y_j = \Phi_j(x_1, \ldots, x_j)\}\right]}\Bigl(F_{\left[X_n \,\middle|\, \bigcap_{j=1}^{n-1}\{X_j = x_j\}\right]}(x_n)\Bigr), \end{aligned}$

for all $(x_1, \ldots, x_n)$ in the support of X. Since the distribution and quantile functions are increasing, $\Phi_i(x_1, \ldots, x_i)$ is also increasing in $x_i$, for all $i = 1, \ldots, n$. In fact, in cases of differentiability, the Jacobian matrix of Φ is always a lower triangular matrix with strictly positive diagonal elements.

The most relevant property of Φ is that, from (1.12) and (1.14), we have

$Y =_{\mathrm{st}} \Phi(X),$

and, consequently, the function Φ maps the random vector X onto Y.

In addition, Fernández-Ponce and Suárez-Llorens [32] prove in their Theorem 3.1 that if we take a function $k: \mathbb{R}^n \to \mathbb{R}^n$ such that $Y =_{\mathrm{st}} k(X)$ and k has a lower triangular Jacobian matrix with strictly positive diagonal elements, then k necessarily has the form of the function Φ given in (1.15).
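The standard construction is short to code for a case where the conditional quantiles are explicit. The sketch below assumes a hypothetical bivariate normal X with standard margins and correlation ρ, for which $[X_2 \mid X_1 = x_1] \sim N(\rho x_1,\, 1 - \rho^2)$, and verifies (1.12) empirically:

```python
import numpy as np
from scipy import stats

rho = 0.6  # illustrative correlation

def Q_X(p1, p2):
    x1 = stats.norm.ppf(p1)                              # Q_{X,1}(p1)
    x2 = stats.norm.ppf(p2, loc=rho * x1,                # Q_{X,2}(p1, p2)
                        scale=np.sqrt(1.0 - rho**2))
    return x1, x2

# (1.12): plugging independent uniforms into Q_X reproduces the law of X
rng = np.random.default_rng(0)
u1, u2 = rng.uniform(size=10_000), rng.uniform(size=10_000)
x1, x2 = Q_X(u1, u2)
print(np.corrcoef(x1, x2)[0, 1])  # close to rho = 0.6
```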

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128037683000016

Fundamental of predictive filtering

Lu Cao, ... Bing Xiao, in Predictive Filtering for Microsatellite Control System, 2021

2.2.1 Random vector

In probability theory, a random vector is defined as $\mathbf{X} = [X_1\; X_2\; \cdots\; X_n]^T$, where $X_i$ is a random variable, $i = 1, 2, \ldots, n$. Its joint cumulative distribution function is defined as the probability of an n-dimensional semi-infinite rectangle associated with the sample point $\mathbf{x} = [x_1\; x_2\; \cdots\; x_n]^T$. It is given by

(2.1) $F_{\mathbf{X}}(\mathbf{x}) = F_{\mathbf{X}}(x_1, x_2, \ldots, x_n) = P(X_1 \leq x_1,\, X_2 \leq x_2,\, \ldots,\, X_n \leq x_n)$

If $\mathbf{X}$ is a continuous random vector, then its joint probability density function is defined as

(2.2) $f_{\mathbf{X}}(\mathbf{x}) = \dfrac{\partial^n}{\partial x_1 \cdots \partial x_n}\, F_{\mathbf{X}}(x_1, x_2, \ldots, x_n)$

For a discrete vector $\mathbf{X}$, its joint probability mass function is defined as

(2.3) $P_{\mathbf{X}}(\mathbf{x}) = P(X_1 = x_1,\, X_2 = x_2,\, \ldots,\, X_n = x_n)$

The joint probability density function and the joint probability mass function satisfy

(2.4) $\displaystyle\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{\mathbf{X}}(\mathbf{x})\, dx_1 \cdots dx_n = 1$

(2.5) $\displaystyle\sum_{x_1} \cdots \sum_{x_n} P_{\mathbf{X}}(\mathbf{x}) = 1$

Moreover, for all continuous random vectors, $f_{\mathbf{X}}(\mathbf{x})$ is such that

(2.6) $F_{\mathbf{X}}(\mathbf{x}) = \displaystyle\int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \cdots \int_{-\infty}^{x_n} f_{\mathbf{X}}(\tau_1, \tau_2, \ldots, \tau_n)\, d\tau_1\, d\tau_2 \cdots d\tau_n$

Let $\mathbf{X} = [X_1\; X_2\; \cdots\; X_n]^T$ and $\mathbf{Y} = [Y_1\; Y_2\; \cdots\; Y_m]^T$ be two random vectors having n and m components, respectively. The joint cumulative distribution function of $\mathbf{X}$ and $\mathbf{Y}$ at $\mathbf{x} = [x_1\; x_2\; \cdots\; x_n]^T$ and $\mathbf{y} = [y_1\; y_2\; \cdots\; y_m]^T$ is defined as

(2.7) $F_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y}) = P(X_1 \leq x_1,\, \ldots,\, X_n \leq x_n,\, Y_1 \leq y_1,\, \ldots,\, Y_m \leq y_m)$

When X and Y are discrete, we define the joint probability mass function of X and Y as

(2.8) $P_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y}) = P(X_1 = x_1,\, \ldots,\, X_n = x_n,\, Y_1 = y_1,\, \ldots,\, Y_m = y_m)$

and their marginal probability functions as

(2.9) $P_{\mathbf{X}}(\mathbf{x}) = \displaystyle\sum_{y_1} \cdots \sum_{y_m} P_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y})$

(2.10) $P_{\mathbf{Y}}(\mathbf{y}) = \displaystyle\sum_{x_1} \cdots \sum_{x_n} P_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y})$

For continuous X and Y , we define their joint probability density function as

(2.11) $f_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y}) = \dfrac{\partial^{n+m}}{\partial x_1 \cdots \partial x_n\, \partial y_1 \cdots \partial y_m}\, F_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y})$

and their marginal probability functions as

(2.12) $f_{\mathbf{X}}(\mathbf{x}) = \displaystyle\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y})\, dy_1\, dy_2 \cdots dy_m$

(2.13) $f_{\mathbf{Y}}(\mathbf{y}) = \displaystyle\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y})\, dx_1\, dx_2 \cdots dx_n$

Then, the random vectors X and Y are independent if and only if

(2.14) $\begin{cases} P_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y}) = P_{\mathbf{X}}(\mathbf{x})\, P_{\mathbf{Y}}(\mathbf{y}), & \text{if } \mathbf{X} \text{ and } \mathbf{Y} \text{ are discrete}, \\ f_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y}) = f_{\mathbf{X}}(\mathbf{x})\, f_{\mathbf{Y}}(\mathbf{y}), & \text{if } \mathbf{X} \text{ and } \mathbf{Y} \text{ are continuous}. \end{cases}$
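For the discrete case, marginalization (2.9)-(2.10) and the independence test (2.14) amount to simple array operations; the joint pmf table below is a hypothetical example:

```python
import numpy as np

# Joint pmf table P[i, j] = P(X = x_i, Y = y_j) for two scalar discrete variables
P = np.array([[0.1, 0.2],
              [0.3, 0.4]])

P_X = P.sum(axis=1)  # marginal of X, summing over y as in (2.9)
P_Y = P.sum(axis=0)  # marginal of Y, summing over x as in (2.10)

# Independence (2.14): the joint pmf must factor into the product of marginals
print(P_X, P_Y, np.allclose(P, np.outer(P_X, P_Y)))  # False: not independent
```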

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128218655000152

The Analysis of Structural Equation Model with Ranking Data using Mx

Wai-Yin Poon, in Handbook of Latent Variable and Related Models, 2007

1 Introduction

Ranking data are obtained when subjects are asked to rank p objects from 1 (most preferred) to p (least preferred). A number of approaches have been developed to model and analyze ranking data. Among others, research on using Thurstonian models (Thurstone, 1927) has remained active for more than 70 years (see, e.g., Critchlow and Fligner, 1991; Böckenholt, 1992, 1996, 2001; Chan and Bentler, 1998; Maydeu-Olivares, 1999), confirming their usefulness in a wide range of disciplines. Thurstonian models postulate that the rankings of the p objects are determined by a p × 1 latent continuous random vector Y which is distributed as multivariate normal with mean μ and covariance matrix Σ. Different models are obtained by putting different restrictions on the elements of Σ. In particular, the use of structural equation models, imposing structures on Σ, substantially enriches the horizon of modeling ranking data and hence its practical applicability (see, e.g., Currim, 1982; Böckenholt, 1992; Elrod and Keane, 1995).

In effect, structural equation modeling remains an extremely active area of research. The interest derives both from the practical applicability of the approach in addressing and verifying substantive theories and from the technical difficulties involved in modeling various types of data, together with the estimation of the resulting model. User-friendly structural equation modeling packages are widely available, such as PRELIS and LISREL (Jöreskog and Sörbom, 1996a, 1996b), EQS (Bentler and Wu, 1993), Mx (Neale et al., 1999), as well as Mplus (Muthén and Muthén, 1998). The capabilities of these packages continue to be extended, and nowadays they all provide options for analyzing ordinal categorical data in a convenient manner. Although ranking data and ordinal categorical data are different in nature and require different modeling and estimation techniques, the approaches adopted by popular packages for analyzing ordinal categorical data and the Thurstonian approach for analyzing ranking data both operate on the assumption that the observed variables are associated with some underlying continuous variables distributed as multivariate normal. With reference to this similarity, the current study establishes a relationship between the analysis of ordinal categorical data and that of ranking data, with a view to making use of readily available structural equation modeling procedures for ordinal categorical data to analyze ranking data. Specifically, the implementation of the analysis of Thurstonian models using options designated for analyzing ordinal categorical data in the widely available software program Mx (Neale et al., 1999) is addressed. Some initial effort along a similar direction has been made by Maydeu-Olivares (1999), who developed a procedure for analyzing ranking data using the package MECOSA (Arminger et al., 1996).

A description of the Thurstonian model for ranking data, the multivariate normal model for analyzing ordinal categorical data, and their similarity is given in Section 2. A summary of the Mx program together with its option for analyzing ordinal categorical data, and the procedure for using this option to implement Thurstonian models, are given in Section 3. It will be seen that the procedure is very flexible: different identification and across-parameter constraints, together with various structures on the mean vector and on the covariance/correlation matrix, can be incorporated in an efficient and easy manner. Some examples available in the literature are used for illustration. In effect, the procedure can also be applied to analyze models for partial ranking data (Böckenholt, 1992). Two applications are discussed in Section 4. The first application relates to an analysis of rankings of compound objects; the objective is to examine whether or not the mean score for a compound choice alternative consisting of two objects can be predicted by an additive combination of the mean scores obtained for each of the two objects separately. The second application relates to the analysis of a set of 8 soft drinks, where the rankings of the soft drinks are obtained via a balanced incomplete design. The paper is concluded with a discussion in Section 5.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780444520449500124

Elicitation of multivariate prior distributions: A nonparametric Bayesian approach

Fernando A. Moala, Anthony O'Hagan, in Journal of Statistical Planning and Inference, 2010

We assume that the expert's prior density is a relatively smooth function $f(\cdot): \mathbb{R}^k \to \mathbb{R}$ of an unknown continuous random vector $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_k)$. We will model a nonparametric prior for $f(\cdot)$ hierarchically in terms of a vector $\boldsymbol{\alpha}$ of hyperparameters. An appropriate choice to represent the analyst's prior beliefs about $f(\cdot)$ is a Gaussian process, which we denote by

(1) $f(\cdot) \mid \boldsymbol{\alpha} \sim \mathrm{GP}\bigl(g(\cdot),\, \sigma^2 C(\cdot, \cdot)\bigr),$

where the mean and covariance functions

(2) $E\{f(\boldsymbol{\theta}) \mid \boldsymbol{\alpha}\} = g(\boldsymbol{\theta}), \qquad \mathrm{cov}[f(\boldsymbol{\theta}), f(\boldsymbol{\phi}) \mid \boldsymbol{\alpha}] = \sigma^2 C(\boldsymbol{\theta}, \boldsymbol{\phi})$

depend on the hyperparameters $\boldsymbol{\alpha}$.
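On a finite grid, the prior (1) reduces to a multivariate normal over function values, which makes it easy to draw sample paths. The sketch below takes k = 1 with a constant mean g and a squared-exponential covariance C; these are illustrative assumptions, not the authors' specification:

```python
import numpy as np

# Hypothetical squared-exponential correlation function C(theta, phi)
def C(t1, t2, length=1.0):
    return np.exp(-0.5 * ((t1 - t2) / length) ** 2)

theta = np.linspace(-3.0, 3.0, 50)     # grid of theta values (k = 1)
g = np.full(theta.shape, 0.2)          # hypothetical prior mean g(theta)
sigma2 = 0.01

# Draw f(theta) | alpha ~ GP(g, sigma^2 C) restricted to the grid
K = sigma2 * C(theta[:, None], theta[None, :])
rng = np.random.default_rng(1)
samples = rng.multivariate_normal(g, K + 1e-10 * np.eye(len(theta)), size=3)
print(samples.shape)  # three sample paths of the prior for the expert's density
```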

Read full article

URL:

https://www.sciencedirect.com/science/article/pii/S0378375810000157