Read online - Data Science in Theory and Practice. Maria Cristina Mariani. Mathematics. LiveLib


Data Science in Theory and Practice - Maria Cristina Mariani


$$\begin{cases} \cdots\,(1 - x)^{\beta - 1}, & \text{if } 0 < x < 1, \\ 0, & \text{otherwise,} \end{cases}$$

where and .

The Dirichlet distribution $\mathrm{Dir}(\boldsymbol{\alpha})$, named after Johann Peter Gustav Lejeune Dirichlet (1805–1859), is a multivariate distribution parameterized by a vector $\boldsymbol{\alpha} = (\alpha_1, \dots, \alpha_n)$ of positive parameters.

Specifically, the joint density of an $n$-dimensional random vector $\mathbf{X} = (X_1, \dots, X_n) \sim \mathrm{Dir}(\boldsymbol{\alpha})$ is defined as:

$$f(x_1, \dots, x_n) = \frac{1}{\mathbf{B}(\boldsymbol{\alpha})} \left( \prod_{i=1}^{n} x_i^{\alpha_i - 1} \mathbf{1}_{\{x_i > 0\}} \right) \mathbf{1}_{\{x_1 + \cdots + x_n = 1\}},$$

where $\mathbf{1}$ is an indicator function.

Definition 2.23 (Indicator function) The indicator function of a subset $A$ of a set $X$ is a function

$$1_A : X \to \{0, 1\}$$

defined as

$$1_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases}$$

The components of the random vector $\mathbf{X}$ are thus always positive and satisfy $X_1 + \cdots + X_n = 1$. The normalizing constant $\mathbf{B}(\boldsymbol{\alpha})$ is the multinomial beta function, defined as:

$$\mathbf{B}(\boldsymbol{\alpha}) = \frac{\prod_{i=1}^{n} \Gamma(\alpha_i)}{\Gamma\left(\sum_{i=1}^{n} \alpha_i\right)} = \frac{\prod_{i=1}^{n} \Gamma(\alpha_i)}{\Gamma(\alpha_0)},$$

where we used the notation $\alpha_0 = \sum_{i=1}^{n} \alpha_i$ and $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t} \, dt$ for the Gamma function.
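As a minimal sketch of how the normalizing constant and the density can be evaluated numerically, the Python below uses only the standard library. The function names `log_multinomial_beta` and `dirichlet_pdf` and the tolerance `tol` are illustrative choices, not taken from the text. Working with the log-gamma function rather than the Gamma function itself is an assumption made here to avoid overflow for large parameters.

```python
import math


def log_multinomial_beta(alpha):
    """Return log B(alpha) = sum_i log Gamma(alpha_i) - log Gamma(alpha_0)."""
    alpha_0 = sum(alpha)
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(alpha_0)


def dirichlet_pdf(x, alpha, tol=1e-9):
    """Evaluate the Dirichlet density at the point x for parameters alpha.

    `tol` (a hypothetical parameter) is the slack allowed when checking
    that the coordinates of x sum to 1 in floating point.
    """
    if len(x) != len(alpha):
        raise ValueError("x and alpha must have the same length")
    if any(a <= 0 for a in alpha):
        raise ValueError("all parameters alpha_i must be positive")
    # The two indicator functions: every x_i > 0 and x_1 + ... + x_n = 1.
    if any(xi <= 0 for xi in x) or abs(sum(x) - 1.0) > tol:
        return 0.0
    # log of prod_i x_i^(alpha_i - 1), then divide by B(alpha) in log space.
    log_kernel = sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))
    return math.exp(log_kernel - log_multinomial_beta(alpha))
```

For example, with all parameters equal to 1 and $n = 3$ the constant is $\Gamma(1)^3 / \Gamma(3) = 1/2$, so the density is flat at 2 on the simplex and 0 elsewhere.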

Because every draw from the Dirichlet distribution is a vector of positive numbers that sum to 1, it is a natural way to generate candidate probabilities for a finite set of possible outcomes. It is closely related to the multinomial distribution, whose parameters are exactly such a vector of probabilities summing to 1. The multinomial distribution is defined in Section 2.3.2.
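One way to see this in practice is to draw a Dirichlet vector and use it as the outcome probabilities of a categorical experiment. The sketch below relies on a standard construction that the text does not state: normalizing independent Gamma($\alpha_i$, 1) draws gives a Dir($\boldsymbol{\alpha}$) vector. The function name `sample_dirichlet`, the seed, the parameter values, and the outcome labels are all illustrative.

```python
import random


def sample_dirichlet(alpha, rng=random):
    """Draw one vector from Dir(alpha).

    Uses the Gamma construction (an assumption, not from the text):
    if G_i ~ Gamma(alpha_i, 1) independently, then G / sum(G) ~ Dir(alpha).
    """
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


rng = random.Random(0)  # fixed seed so that the run is reproducible
probs = sample_dirichlet([2.0, 3.0, 5.0], rng)

# The draw is a valid probability vector: positive entries summing to 1
# (up to floating-point rounding).
assert all(p > 0 for p in probs)
assert abs(sum(probs) - 1.0) < 1e-12

# Use the drawn vector as outcome probabilities for 10 categorical trials.
trials = rng.choices(["a", "b", "c"], weights=probs, k=10)
```

For very small parameters the Gamma draws can underflow to zero in floating point, so a production sampler would need more care than this sketch takes.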

With the notation above and $\alpha_0$ as the sum of all parameters, we can calculate the moments of the distribution. The first moment vector has coordinates:

$$E[X_i] = \frac{\alpha_i}{\alpha_0}.$$

The covariance matrix has elements:

$$\operatorname{Var}(X_i) = \frac{\alpha_i (\alpha_0 - \alpha_i)}{\alpha_0^2 (\alpha_0 + 1)},$$

and when $i \neq j$

$$\operatorname{Cov}(X_i, X_j) = \frac{-\alpha_i \alpha_j}{\alpha_0^2 (\alpha_0 + 1)}.$$

The covariance matrix is singular (its determinant is zero).
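The singularity can be read off the formulas: in row $i$ the off-diagonal entries sum to $-\alpha_i(\alpha_0 - \alpha_i) / (\alpha_0^2(\alpha_0 + 1))$, which cancels the diagonal entry, so every row sums to zero. This matches the constraint that the components sum to 1. The short sketch below checks this with exact rational arithmetic; the function name `dirichlet_covariance` and the parameter vector (2, 3, 5) are illustrative choices.

```python
from fractions import Fraction


def dirichlet_covariance(alpha):
    """Build the covariance matrix of Dir(alpha) with exact fractions."""
    alpha = [Fraction(a) for a in alpha]
    alpha_0 = sum(alpha)
    denom = alpha_0 ** 2 * (alpha_0 + 1)
    n = len(alpha)
    return [
        [
            # Diagonal: Var(X_i); off-diagonal: Cov(X_i, X_j).
            alpha[i] * (alpha_0 - alpha[i]) / denom
            if i == j
            else -alpha[i] * alpha[j] / denom
            for j in range(n)
        ]
        for i in range(n)
    ]


cov = dirichlet_covariance([2, 3, 5])

# Each row sums to exactly zero, so the vector (1, ..., 1) lies in the
# null space of the matrix and the determinant must be zero.
row_sums = [sum(row) for row in cov]
assert all(s == 0 for s in row_sums)
```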

Finally, the univariate marginal distributions are all beta: $X_i \sim \mathrm{Beta}(\alpha_i, \alpha_0 - \alpha_i)$. These results can be found in Balakrishnan and Nevzorov (2004).

Please refer to Lin (2016) for the proof of the properties of the Dirichlet distribution.

2.3.2 Multinomial Distribution

We begin with a definition of the binomial distribution.

Definition 2.24 (Binomial distribution) A random variable
