Informatics and Machine Learning. Stephen Winters-Hilt

       MAP Estimate:

      Provides an estimate of r.v. X given that Y = yj in terms of the posterior probability:

$\hat{X}_{\mathrm{MAP}} = \mathrm{argmax}_{x \in X}\; p(x \mid y_j)$

       ML Estimate:

      Provides an estimate of r.v. X given that Y = yj in terms of the maximum likelihood:

$\hat{X}_{\mathrm{ML}} = \mathrm{argmax}_{x \in X}\; p(y_j \mid x)$
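      As a quick numerical contrast between the two estimators, here is a minimal sketch (in Python) that computes both from a small, fully enumerated joint distribution p(x, y); the table values and the observed index j = 1 are invented for illustration and are not from the text.

    import numpy as np

    # Hypothetical joint distribution p(x, y): rows index x in {0, 1}, columns index y_j.
    # The numbers are illustrative only (they sum to 1); they are not from the text.
    p_xy = np.array([[0.35, 0.35, 0.10],   # x = 0
                     [0.02, 0.10, 0.08]])  # x = 1
    x_values = [0, 1]
    j = 1                                  # suppose we observe Y = y_1

    # MAP: maximize the posterior p(x | y_j), i.e. the y_j column renormalized.
    posterior = p_xy[:, j] / p_xy[:, j].sum()
    x_map = x_values[int(np.argmax(posterior))]

    # ML: maximize the likelihood p(y_j | x) = p(x, y_j) / p(x).
    likelihood = p_xy[:, j] / p_xy.sum(axis=1)
    x_ml = x_values[int(np.argmax(likelihood))]

    print("posterior p(x|y_j):", posterior, "-> MAP estimate:", x_map)
    print("likelihood p(y_j|x):", likelihood, "-> ML estimate:", x_ml)

      With these particular (made-up) numbers the two estimates disagree: MAP picks x = 0 while ML picks x = 1, because the large prior p(x = 0) = 0.8 dominates the posterior even though y_1 is relatively more likely under x = 1.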

      In this section we consider a r.v., X, with specific examples where its outcomes are fully enumerated (such as the 0 or 1 outcomes of a coin flip). We review a series of observations of the r.v., X, to arrive at the LLN. The emergent structure describing a r.v. from a series of observations is often expressed in terms of probability distributions, the most famous being the Gaussian Distribution (a.k.a. the Normal, or Bell curve).

      2.6.1 The Law of Large Numbers (LLN)

      The LLN will now be derived in the classic “weak” form. The “strong” form is derived in the modern mathematical context of Martingales in Appendix C.1.

      Let Xk be independent identically distributed (iid) copies of X, and let X be the real number “alphabet.” Let μ = E(X), σ² = Var(X), and denote:

$\bar{x}_N = \frac{1}{N}\sum_{k=1}^{N} X_k, \qquad E(\bar{x}_N) = \mu, \qquad \mathrm{Var}(\bar{x}_N) = \frac{1}{N^2}\sum_{k=1}^{N}\mathrm{Var}(X_k) = \frac{1}{N}\,\sigma^2$

      From Chebyshev: $P(|\bar{x}_N - \mu| > k) \le \mathrm{Var}(\bar{x}_N)/k^2 = \frac{1}{Nk^2}\,\sigma^2$

      As N → ∞ we get the LLN (weak):

      If Xk are iid copies of X, for k = 1, 2, …, and X is a real and finite alphabet, and μ = E(X), σ² = Var(X), then: P(|x̄N − μ| > k) → 0, for any k > 0. Thus, the arithmetic mean of a sequence of iid r.v.s converges to their common expectation. The weak form has convergence “in probability,” while the strong form has convergence “with probability one.”
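      The weak-LLN convergence, and the Chebyshev bound used above, can be checked by simulation. The following is a minimal sketch assuming a fair coin (so μ = 0.5, σ² = 0.25), a deviation threshold k = 0.05, and 20 000 repeated N-flip experiments; all of these choices are illustrative rather than from the text.

    import numpy as np

    rng = np.random.default_rng(0)
    p = 0.5                          # fair coin: outcomes in {0, 1}, so mu = 0.5, sigma^2 = 0.25
    mu, sigma2 = p, p * (1 - p)
    k = 0.05                         # deviation threshold in P(|x_bar_N - mu| > k)
    reps = 20000                     # number of repeated N-flip experiments

    for N in (10, 100, 1000, 10000):
        # One sample mean x_bar_N per experiment: heads count in N flips, divided by N.
        x_bar = rng.binomial(N, p, size=reps) / N
        freq = np.mean(np.abs(x_bar - mu) > k)   # empirical P(|x_bar_N - mu| > k)
        bound = sigma2 / (N * k ** 2)            # Chebyshev upper bound from above
        print(f"N = {N:6d}   empirical {freq:.4f}   Chebyshev bound {min(bound, 1.0):.4f}")

      The empirical deviation frequency falls toward zero as N grows, in line with the 1/N behavior of the (loose) Chebyshev bound.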

      2.6.2 Distributions

      2.6.2.1 The Geometric Distribution (Emergent Via Maxent)

      Here, we talk of the probability of seeing something after k tries when the probability of seeing that event at each try is “p.” Suppose we see an event for the first time after k tries, that means the first (k − 1) tries were nonevents (with probability (1 − p) for each try), and the final observation then occurs with probability p, giving rise to the classic formula for the geometric distribution:

$P(X = k) = (1 - p)^{(k-1)}\, p$

[Figure: Schematic illustration of the geometric distribution, P(X = k) = (1 − p)^(k−1) p, with p = 0.8.]

      Total Probability $= \sum_{k=1}^{\infty} (1 - p)^{(k-1)}\, p = p\left[1 + (1 - p) + (1 - p)^2 + (1 - p)^3 + \cdots\right] = p\left[\frac{1}{1 - (1 - p)}\right] = 1$
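      The geometric-series argument above is easy to check numerically. The sketch below uses the p = 0.8 value from the figure; the truncation at k = 50 and the sample size are arbitrary choices.

    import numpy as np

    p = 0.8                                   # per-try success probability (as in the figure)
    k = np.arange(1, 51)                      # tries until first success, truncated at k = 50
    pmf = (1 - p) ** (k - 1) * p              # P(X = k) = (1 - p)^(k-1) p
    print("sum of P(X = k), k = 1..50:", pmf.sum())   # geometric series -> essentially 1

    # Sampling check: numpy's geometric draws "trials until first success" (support 1, 2, ...).
    rng = np.random.default_rng(1)
    samples = rng.geometric(p, size=100000)
    print("empirical P(X = 1):", np.mean(samples == 1), "  formula:", p)
    print("empirical P(X = 2):", np.mean(samples == 2), "  formula:", (1 - p) * p)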

      2.6.2.2 The Gaussian (aka Normal) Distribution (Emergent Via LLN Relation and Maxent)

$N_x(\mu, \sigma^2) = \exp\!\left(-(x - \mu)^2 / (2\sigma^2)\right) / \left(2\pi\sigma^2\right)^{1/2}$
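      As a small sanity check on this formula, the sketch below codes the density directly and confirms numerically that it integrates to 1; the grid range and the example parameters (μ = 0, σ² = 1.5) are arbitrary, not from the text.

    import numpy as np

    def gaussian_pdf(x, mu, sigma2):
        """N_x(mu, sigma^2) = exp(-(x - mu)^2 / (2 sigma^2)) / (2 pi sigma^2)^(1/2)."""
        return np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

    # Riemann-sum check that the density integrates to 1 (grid and parameters are arbitrary).
    x = np.linspace(-10.0, 10.0, 100001)
    dx = x[1] - x[0]
    print("integral over [-10, 10]:", (gaussian_pdf(x, mu=0.0, sigma2=1.5) * dx).sum())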

      2.6.2.3 Significant Distributions That Are Not Gaussian or Geometric

      Nongeometric duration distributions occur in many familiar areas, such as the lengths of spoken words in phone conversations, as well as in other areas of voice recognition. Although the Gaussian distribution occurs in many scientific fields (as an observed embodiment of the LLN, among other things), there are a huge number of significant (observed) skewed distributions, such as heavy‐tailed (or long‐tailed) distributions, multimodal distributions, etc.
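      To make the heavy-tailed contrast concrete, here is a brief sketch comparing tail probabilities of a standard Gaussian against a Student-t distribution with 2 degrees of freedom; the choice of t(2) and the thresholds are illustrative assumptions, not examples taken from the text.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200000

    # Standard Gaussian vs. a heavy-tailed Student-t with 2 degrees of freedom
    # (the t(2) choice and the thresholds below are illustrative, not from the text).
    gauss = rng.normal(0.0, 1.0, size=n)
    heavy = rng.standard_t(df=2, size=n)

    for t in (3.0, 5.0, 10.0):
        print(f"P(|X| > {t:4.1f}):  Gaussian {np.mean(np.abs(gauss) > t):.5f}"
              f"   t(2) {np.mean(np.abs(heavy) > t):.5f}")

      The Gaussian tail probability collapses rapidly past a few standard deviations, while the heavy-tailed sample still produces a noticeable fraction of extreme values.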
