Data Science in Theory and Practice. Maria Cristina Mariani


      A vector $\mathbf{X}$ is said to have a $k$-dimensional multivariate normal distribution (denoted $\mathrm{MVN}_k(\mu, \Sigma)$, where $N_k$ stands for the $k$-dimensional normal distribution) with mean vector $\mu = (\mu_1, \ldots, \mu_k)$ and covariance matrix $\Sigma = (\sigma_{ij})_{i,j \in \{1, \ldots, k\}}$ if its density can be written as

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{k/2} \det(\Sigma)^{1/2}} \, e^{-\frac{1}{2} (\mathbf{x} - \mu)^{T} \Sigma^{-1} (\mathbf{x} - \mu)},$$

      where we used the usual notations for the determinant, transpose, and inverse of a matrix. The vector of means $\mu$ may have any elements in $\mathbb{R}$, but, just as in the one-dimensional case, the standard deviation has to be positive. In the multivariate case, the covariance matrix $\Sigma$ has to be symmetric and positive definite.
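The density above can be evaluated numerically. The following is a minimal sketch using NumPy; the function name `mvn_pdf` and its argument names are illustrative choices, not something taken from the book. It checks symmetry directly and uses a Cholesky factorization as one possible test of positive definiteness.

```python
import numpy as np


def mvn_pdf(x, mu, sigma):
    """Evaluate the k-dimensional normal density f(x) (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    k = mu.shape[0]

    # The covariance matrix has to be symmetric ...
    if not np.allclose(sigma, sigma.T):
        raise ValueError("covariance matrix must be symmetric")
    # ... and positive definite; Cholesky fails otherwise.
    try:
        np.linalg.cholesky(sigma)
    except np.linalg.LinAlgError as exc:
        raise ValueError("covariance matrix must be positive definite") from exc

    diff = x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), without forming the inverse.
    quad = diff @ np.linalg.solve(sigma, diff)
    norm_const = (2.0 * np.pi) ** (k / 2.0) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm_const)
```

For $k = 1$ with $\mu = 0$ and $\Sigma = (1)$, the function reduces to the standard normal density, so `mvn_pdf([0.0], [0.0], [[1.0]])` should agree with $1/\sqrt{2\pi}$ up to rounding.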

      The multivariate normal distribution defined this way has several useful properties. The basic one is that the one-dimensional marginal distributions are all normal, that is, $X_i \sim N(\mu_i, \sigma_{ii})$ and $\operatorname{Cov}(X_i, X_j) = \sigma_{ij}$. The same holds for any marginal. For example, if $(X_r, \ldots, X_k)$ are the last $k - r + 1$ coordinates, then

$$\begin{pmatrix} X_r \\ X_{r+1} \\ \vdots \\ X_k \end{pmatrix} \sim \mathrm{MVN}_{k-r+1}\left( \begin{pmatrix} \mu_r \\ \mu_{r+1} \\ \vdots \\ \mu_k \end{pmatrix}, \begin{pmatrix} \sigma_{r,r} & \sigma_{r,r+1} & \cdots & \sigma_{r,k} \\ \sigma_{r+1,r} & \sigma_{r+1,r+1} & \cdots & \sigma_{r+1,k} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{k,r} & \sigma_{k,r+1} & \cdots & \sigma_{k,k} \end{pmatrix} \right).$$

      So any sub-vector of components is itself multivariate normal.
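In practice, the marginal property means that the parameters of any sub-vector are obtained by selecting the matching entries of the mean vector and the matching rows and columns of the covariance matrix. A short NumPy sketch follows; the helper name `marginal_params` and the example numbers are made up for illustration, and indices are 0-based as usual in Python.

```python
import numpy as np


def marginal_params(mu, sigma, idx):
    """Mean and covariance of the sub-vector of X picked out by idx (hypothetical helper)."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    idx = np.asarray(idx, dtype=int)
    # np.ix_ selects the rows AND columns listed in idx.
    return mu[idx], sigma[np.ix_(idx, idx)]


# Illustrative 3-dimensional example (values are arbitrary assumptions).
mu = np.array([1.0, 2.0, 3.0])
sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])

# Parameters of the last two coordinates (0-based indices 1 and 2).
mu_m, sigma_m = marginal_params(mu, sigma, [1, 2])
```

No integration is needed here: the marginal distribution is read off directly from the blocks of $\mu$ and $\Sigma$.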

      The conditional distribution of a multivariate normal is also a multivariate normal. Suppose that $\mathbf{X}$ is $\mathrm{MVN}_k(\mu, \Sigma)$ and, using the vector notation above, let $\mathbf{X}_1 = (X_1, \ldots, X_r)$ and $\mathbf{X}_2 = (X_{r+1}, \ldots, X_k)$. Then we can write the vector $\mu$ and the matrix $\Sigma$ as

$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$$

      where the dimensions are accordingly chosen to match the two vectors ($r$ and $k - r$). Thus, the conditional distribution of $\mathbf{X}_1$ given $\mathbf{X}_2 = \mathbf{a}$, for some vector $\mathbf{a}$, is

$$\mathbf{X}_1 \mid \mathbf{X}_2 = \mathbf{a} \sim \mathrm{MVN}_r\left( \mu_1 - \Sigma_{12} \Sigma_{22}^{-1} (\mu_2 - \mathbf{a}),\; \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \right).$$
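The conditional mean and covariance above can be computed from the four blocks of $\Sigma$. The sketch below uses NumPy and assumes the first $r$ coordinates form $\mathbf{X}_1$; the function name `conditional_params` and the bivariate example values are assumptions made for illustration.

```python
import numpy as np


def conditional_params(mu, sigma, r, a):
    """Parameters of X1 | X2 = a, where X1 is the first r coordinates (hypothetical helper)."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    a = np.asarray(a, dtype=float)

    # Partition the mean vector and the covariance matrix into blocks.
    mu1, mu2 = mu[:r], mu[r:]
    s11 = sigma[:r, :r]
    s21 = sigma[r:, :r]
    s22 = sigma[r:, r:]

    # Sigma12 Sigma22^{-1}, obtained by solving Sigma22 W = Sigma21 and transposing
    # (valid because Sigma is symmetric, so Sigma12 = Sigma21^T).
    gain = np.linalg.solve(s22, s21).T

    cond_mean = mu1 - gain @ (mu2 - a)
    cond_cov = s11 - gain @ s21
    return cond_mean, cond_cov


# Illustrative bivariate example: unit variances, covariance 0.5, observe X2 = 2.
cond_mean, cond_cov = conditional_params(
    mu=[0.0, 0.0],
    sigma=[[1.0, 0.5], [0.5, 1.0]],
    r=1,
    a=[2.0],
)
```

In this example the formula gives a conditional mean of $0 - 0.5 \cdot (0 - 2) = 1$ and a conditional variance of $1 - 0.5 \cdot 0.5 = 0.75$. Note that the conditional covariance does not depend on the observed value $\mathbf{a}$.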

      Furthermore, the vectors $\mathbf{X}_2$ and $\mathbf{X}_1 - \Sigma_{12} \Sigma_{22}^{-1} \mathbf{X}_2$ are independent. Finally, any affine transformation $A\mathbf{X} + b$, where $A$ is a $k \times k$ matrix and $b$ is a $k$-dimensional constant vector, is also a multivariate normal with mean vector $A\mu + b$ and covariance matrix $A\Sigma$