href="#fb3_img_img_88a4c458-1aef-51a7-b963-ce9fab490ff8.png" alt="upper X"/> is said to have a binomial distribution with parameters $n$ and $p$ if it has the pmf
$$p_X(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \dots, n,$$
where $p$ is the probability of success on an individual trial and $n$ is the number of trials in the binomial experiment.
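As a minimal sketch, the binomial pmf $\binom{n}{x} p^x (1-p)^{n-x}$ can be evaluated directly with the Python standard library. The function name `binomial_pmf` and the values $n = 10$, $p = 0.3$ are illustrative assumptions, not taken from the text.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Return P(X = x) for X ~ Bin(n, p); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative parameters (assumed for this example only).
n, p = 10, 0.3

# Tabulate the pmf over its whole support 0, 1, ..., n.
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                              # should be 1 up to rounding
mean = sum(x * f for x, f in enumerate(pmf))  # should be n * p up to rounding

print(f"sum of pmf = {total:.12f}, mean = {mean:.6f}, n*p = {n * p:.6f}")
```

Summing the tabulated values is a quick sanity check that the formula defines a proper distribution, and the computed mean can be compared with $np$.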
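The multinomial distribution is a generalization of the binomial distribution. Specifically, assume that $n$ independent trials may each result in one of the $k$ outcomes generically labeled $1, 2, \dots, k$, with corresponding probabilities $p_1, \dots, p_k$, where $p_1 + \dots + p_k = 1$. Now define a vector $\mathbf{X} = (X_1, \dots, X_k)$, where each $X_i$ counts the number of outcomes $i$ in the resulting sample of size $n$. The joint distribution of the vector $\mathbf{X}$ is
$$P(X_1 = x_1, \dots, X_k = x_k) = \frac{n!}{x_1! \cdots x_k!}\, p_1^{x_1} \cdots p_k^{x_k}, \quad x_1 + \dots + x_k = n.$$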
In the same way as the binomial probabilities appear as coefficients in the binomial expansion of $(p + (1-p))^n$, the multinomial probabilities are the coefficients in the multinomial expansion of $(p_1 + \dots + p_k)^n$, so they sum to 1. This expansion in fact gives the name of the distribution.
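The claim that the multinomial probabilities sum to 1 can be checked numerically for a small case. The sketch below enumerates every count vector $(x_1, \dots, x_k)$ with $x_1 + \dots + x_k = n$ and adds up $\frac{n!}{x_1! \cdots x_k!} p_1^{x_1} \cdots p_k^{x_k}$. The helper names `multinomial_pmf` and `count_vectors`, and the values $n = 5$ and probabilities $(0.2, 0.3, 0.5)$, are assumptions made for illustration.

```python
from itertools import product
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Return n!/(x_1!...x_k!) * p_1^x_1 ... p_k^x_k with n = sum(counts)."""
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)  # stays an exact integer at every step
    return coef * prod(p**x for p, x in zip(probs, counts))


def count_vectors(n, k):
    """Yield every k-tuple of nonnegative integers that sums to n."""
    for head in product(range(n + 1), repeat=k - 1):
        if sum(head) <= n:
            yield head + (n - sum(head),)


# Illustrative parameters (assumed for this example only).
n, probs = 5, (0.2, 0.3, 0.5)
k = len(probs)

# Sum of all terms of the expansion of (p_1 + ... + p_k)^n.
total = sum(multinomial_pmf(x, probs) for x in count_vectors(n, k))
print(f"sum over all count vectors = {total:.12f}")
```

Brute-force enumeration is only practical for small $n$ and $k$, but it mirrors the multinomial expansion term by term.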
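If we label the outcome $i$ as a success and everything else as a failure, then $X_i$ simply counts successes in $n$ independent trials and thus $X_i \sim \mathrm{Bin}(n, p_i)$. Thus, the first moment of the random vector and the diagonal elements in the covariance matrix are easy to calculate as $E(X_i) = np_i$ and $\mathrm{Var}(X_i) = np_i(1 - p_i)$, respectively. The off‐diagonal elements (covariances) are not that complicated to calculate either: $\mathrm{Cov}(X_i, X_j) = -np_ip_j$ for $i \neq j$. For multinomial random vectors, then, the first two moments are straightforward to compute.
The one‐dimensional marginal distributions are binomial; however, the joint distribution of $(X_1, \dots, X_r)$, the first $r$ components, is not multinomial, because for $r < k$ these counts do not sum to a fixed total. Instead, suppose we group the first $r$ categories into one and let $Y = X_1 + \dots + X_r$. Because the categories are linked, that is, $X_1 + \dots + X_k = n$, we also have that $Y = n - X_{r+1} - \dots - X_k$. We can easily verify that the vector $(Y, X_{r+1}, \dots, X_k)$, or equivalently $(n - X_{r+1} - \dots - X_k, X_{r+1}, \dots, X_k)$, will have a multinomial distribution with associated probabilities $(p_1 + \dots + p_r, p_{r+1}, \dots, p_k)$.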
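The moment formulas and the grouping property can both be verified by exact enumeration for a small example. The sketch below assumes a multinomial vector with $n = 4$ trials and category probabilities $(0.1, 0.2, 0.3, 0.4)$, and groups the first $r = 2$ categories into one. All names and numerical values are illustrative assumptions. It checks that the mean of each count is $n$ times its probability, that the variance and covariance match the standard binomial and multinomial formulas, and that the grouped vector is again multinomial.

```python
from itertools import product
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Return n!/(x_1!...x_k!) * p_1^x_1 ... p_k^x_k with n = sum(counts)."""
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)
    return coef * prod(p**x for p, x in zip(probs, counts))


def count_vectors(n, k):
    """Yield every k-tuple of nonnegative integers that sums to n."""
    for head in product(range(n + 1), repeat=k - 1):
        if sum(head) <= n:
            yield head + (n - sum(head),)


# Illustrative parameters (assumed for this example only).
n, probs, r = 4, (0.1, 0.2, 0.3, 0.4), 2
k = len(probs)

# Full joint pmf as a table: count vector -> probability.
table = {x: multinomial_pmf(x, probs) for x in count_vectors(n, k)}


def expect(g):
    """Exact expectation of g(X) under the tabulated joint pmf."""
    return sum(g(x) * f for x, f in table.items())


mean = [expect(lambda x, i=i: x[i]) for i in range(k)]
var0 = expect(lambda x: x[0] ** 2) - mean[0] ** 2
cov01 = expect(lambda x: x[0] * x[1]) - mean[0] * mean[1]

# Group the first r categories: Y = X_1 + ... + X_r, keep the rest as is.
grouped = {}
for x, f in table.items():
    key = (sum(x[:r]),) + x[r:]
    grouped[key] = grouped.get(key, 0.0) + f

# Compare with the multinomial pmf on the merged probabilities.
grouped_probs = (sum(probs[:r]),) + probs[r:]
max_gap = max(
    abs(f - multinomial_pmf(key, grouped_probs)) for key, f in grouped.items()
)

print("means:", [round(m, 6) for m in mean])
print("Var(X_1):", round(var0, 6), " Cov(X_1, X_2):", round(cov01, 6))
print("largest gap after grouping:", max_gap)
```

The negative covariance reflects the constraint that the counts share a fixed total: a larger count in one category leaves fewer trials for the others.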
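Next consider the conditional distribution of the first $r$ components given the last $k - r$ components. That is, the distribution of
$$X_1, \dots, X_r \mid X_{r+1} = x_{r+1}, \dots, X_k = x_k.$$
This distribution is also multinomial with the number of trials $n - x_{r+1} - \dots - x_k$ and probabilities $(p_1', \dots, p_r')$, where $p_i' = p_i / (p_1 + \dots + p_r)$.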
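The conditional result can be checked the same way: divide the joint pmf by the marginal probability of the observed last components and compare with a multinomial pmf on the remaining trials with renormalized probabilities. In the sketch below, the values $n = 6$, probabilities $(0.1, 0.2, 0.3, 0.4)$, $r = 2$, and the observed tail counts $(1, 2)$ are assumptions chosen for illustration, as are the helper names.

```python
from itertools import product
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Return n!/(x_1!...x_k!) * p_1^x_1 ... p_k^x_k with n = sum(counts)."""
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)
    return coef * prod(p**x for p, x in zip(probs, counts))


def count_vectors(n, k):
    """Yield every k-tuple of nonnegative integers that sums to n."""
    for head in product(range(n + 1), repeat=k - 1):
        if sum(head) <= n:
            yield head + (n - sum(head),)


# Illustrative parameters (assumed for this example only).
n, probs, r = 6, (0.1, 0.2, 0.3, 0.4), 2
tail = (1, 2)        # observed values of the last k - r components
m = n - sum(tail)    # trials left over for the first r components

# Joint probabilities of every head vector compatible with the observed tail.
joint = {h: multinomial_pmf(h + tail, probs) for h in count_vectors(m, r)}

# Marginal probability of the tail, then the conditional pmf of the head.
marginal = sum(joint.values())
conditional = {h: f / marginal for h, f in joint.items()}

# Claimed form: multinomial on m trials with renormalized probabilities.
head_total = sum(probs[:r])
scaled = tuple(p / head_total for p in probs[:r])
max_gap = max(
    abs(conditional[h] - multinomial_pmf(h, scaled)) for h in conditional
)

print("remaining trials:", m)
print("largest gap to the claimed conditional pmf:", max_gap)
```

Only the ratios among the first $r$ probabilities matter once the tail is fixed, which is why they are rescaled to sum to 1.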