ML Estimate:

Provides an estimate of the r.v. X, given that Y = y_j, in terms of the maximum likelihood:

x̂_ML = argmax_x P(Y = y_j | X = x)
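As a concrete illustration, here is a minimal Python sketch of the ML estimate, assuming a small likelihood table (the table values and outcome names are hypothetical, purely for illustration):

# Hypothetical likelihood table: likelihood[x][y] = P(Y = y | X = x).
likelihood = {
    "x1": {"y1": 0.7, "y2": 0.3},
    "x2": {"y1": 0.2, "y2": 0.8},
}

def ml_estimate(y_obs, likelihood):
    # Return the x maximizing the likelihood P(Y = y_obs | X = x).
    return max(likelihood, key=lambda x: likelihood[x][y_obs])

print(ml_estimate("y2", likelihood))  # -> 'x2', since 0.8 > 0.3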
2.6 Emergent Distributions and Series
In this section we consider a r.v., X, with specific examples where the outcomes are fully enumerated (such as the 0 or 1 outcomes corresponding to a coin flip). We review a series of observations of the r.v. X to arrive at the Law of Large Numbers (LLN). The emergent structure describing a r.v. from a series of observations is often expressed in terms of probability distributions, the most famous being the Gaussian distribution (a.k.a. the Normal, or Bell curve).
2.6.1 The Law of Large Numbers (LLN)
The LLN will now be derived in the classic “weak” form. The “strong” form is derived in the modern mathematical context of Martingales in Appendix C.1.
Let X_k, for k = 1, 2, …, be independent identically distributed (iid) copies of X, where X has the real numbers as its "alphabet." Let μ = E(X), σ² = Var(X), and denote the sample mean by S_N = (X_1 + X_2 + … + X_N)/N, so that E(S_N) = μ and Var(S_N) = σ²/N.

From Chebyshev: P(|S_N − μ| ≥ ε) ≤ σ²/(Nε²), for any ε > 0.

As N → ∞ we get the LLN (weak):

If X_k are iid copies of X, for k = 1, 2, …, and X has a real and finite alphabet, with μ = E(X) and σ² = Var(X), then for any ε > 0: P(|S_N − μ| ≥ ε) → 0 as N → ∞.
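The weak LLN is easy to see numerically. The following Python sketch (sample sizes chosen purely for illustration) simulates fair-coin flips, for which μ = 0.5, and shows the sample mean S_N settling toward μ as N grows:

import random

random.seed(0)
mu = 0.5  # E(X) for a fair coin with outcomes {0, 1}
for N in [10, 100, 1000, 10000, 100000]:
    flips = [random.randint(0, 1) for _ in range(N)]
    S_N = sum(flips) / N  # sample mean of N iid copies of X
    print(N, S_N, abs(S_N - mu))

The deviation |S_N − μ| shrinks as N grows, in accord with the Chebyshev bound σ²/(Nε²).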
2.6.2 Distributions
2.6.2.1 The Geometric Distribution (Emergent Via Maxent)
Here, we consider the probability of seeing an event for the first time on the kth try, when the probability of seeing that event on each try is p. If the event is first seen on the kth try, then the first (k − 1) tries were nonevents (each with probability (1 − p)), and the final observation then occurs with probability p, giving rise to the classic formula for the geometric distribution:

P(X = k) = (1 − p)^(k−1) p, for k = 1, 2, …
Figure 2.3 The geometric distribution, P(X = k) = (1 − p)^(k−1) p, with p = 0.8.
As for normalization, i.e. checking that all outcomes sum to one, we have:

Total Probability = ∑_{k=1}^{∞} (1 − p)^(k−1) p = p[1 + (1 − p) + (1 − p)² + (1 − p)³ + …] = p[1/(1 − (1 − p))] = 1
So the total probability already sums to one, with no further normalization needed. Figure 2.3 shows the geometric distribution for the case p = 0.8.
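The geometric distribution is just as easy to check by simulation. Below is a minimal Python sketch (with p = 0.8 to match Figure 2.3) comparing empirical waiting-time frequencies against (1 − p)^(k−1) p:

import random

random.seed(0)
p = 0.8
trials = 100000
counts = {}
for _ in range(trials):
    k = 1
    while random.random() >= p:  # each try fails with probability (1 - p)
        k += 1
    counts[k] = counts.get(k, 0) + 1  # first success on try k

for k in sorted(counts)[:5]:
    empirical = counts[k] / trials
    theoretical = (1 - p) ** (k - 1) * p
    print(k, round(empirical, 4), round(theoretical, 4))

The empirical and theoretical columns agree to a few decimal places, as expected.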
2.6.2.2 The Gaussian (aka Normal) Distribution (Emergent Via LLN Relation and Maxent)
For the Normal distribution the normalization is easiest to get via complex integration (so we will skip that). With mean zero and variance equal to one (Figure 2.4) we get:

P(x) = (1/√(2π)) e^(−x²/2)
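Although we skip the normalization proof, it is easy to confirm numerically. Here is a short Python sketch (the integration range and step size are illustrative choices) that sums the standard normal density over a fine grid:

import math

def phi(x):
    # Standard normal density: mean 0, variance 1.
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Riemann sum over [-10, 10]; the tails beyond this range are negligible.
dx = 0.001
total = sum(phi(-10 + i * dx) * dx for i in range(int(20 / dx)))
print(total)  # approximately 1.0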
2.6.2.3 Significant Distributions That Are Not Gaussian or Geometric
Nongeometric duration distributions occur in many familiar settings, such as the lengths of spoken words in phone conversations, as well as in other areas of voice recognition. Although the Gaussian distribution occurs in many scientific fields (as an observed embodiment of the LLN, among other things), there are a huge number of significant (observed) skewed distributions, such as heavy‐tailed (or long‐tailed) distributions, multimodal distributions, etc.
Heavy‐tailed distributions are widespread in describing phenomena across the sciences. The log‐normal and Pareto distributions are heavy‐tailed distributions that are almost as common as the normal and geometric distributions in descriptions of physical or man‐made phenomena. The Pareto distribution was originally used to describe the allocation of wealth in society, known as the famous 80–20 rule: about 80% of the wealth was owned by a small number of people (roughly 20%), while "the tail," the large remaining part of the population, held only a small share.
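A Pareto sample can be drawn by inverse-transform sampling from its CDF, F(x) = 1 − (x_m/x)^α. The Python sketch below (using x_m = 1 and α ≈ 1.16, the shape value that reproduces roughly the 80–20 split; both are illustrative choices) estimates the wealth share held by the richest 20% of the sample:

import random

random.seed(0)
alpha, x_m = 1.16, 1.0  # shape ~1.16 corresponds to the 80-20 rule
n = 100000
# Inverse transform: x = x_m * u^(-1/alpha), with u uniform on (0, 1].
wealth = sorted(x_m * (1.0 - random.random()) ** (-1.0 / alpha)
                for _ in range(n))
top20 = wealth[int(0.8 * n):]  # the richest 20% of the sample
print(sum(top20) / sum(wealth))  # roughly 0.8 (noisy, since the tail is heavy)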