What is the distribution of the variable $X$ given $$ X = Y + Z, $$where $Y \sim $ Binomial($n$, $P_Y$) and $Z\sim$ Binomial($n$, $P_Z$)?
For the special case $P_Y = P_Z = P$, I think that $X \sim$ Binomial($2n$, $P$) is correct. If $P_Y \neq P_Z$, the distribution might eventually just be Binomial$\left(2n, \frac{P_Y + P_Z}{2}\right)$, but I can't prove it.
If the problem is more complicated than I expect and we can't derive the whole distribution, can we tell something about the mean and the variance of $X$?
It will be a special case of the Poisson binomial distribution.
See the binomial sum variance inequality. Here is an excerpt from the Wikipedia page:
In probability theory and statistics, the sum of independent binomial random variables is itself a binomial random variable if all the component variables share the same success probability. If success probabilities differ, the probability distribution of the sum is not binomial.
Assuming $Y$ and $Z$ are independent, $X=Y+Z$ has mean $E[Y]+E[Z] = n P_Y + n P_Z$ and variance $\text{Var}(Y) + \text{Var}(Z) = n P_Y (1-P_Y) + n P_Z (1 - P_Z)$. The characteristic function is $$ \left( P_Y e^{it} + 1 - P_Y \right)^{n} \left( P_Z e^{it} + 1 - P_Z \right)^{n}$$ But unless $P_Y = P_Z$, there is no special name for the distribution of $X$.
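As a quick numerical sanity check of the mean and variance above, here is a minimal sketch that builds the exact PMF of $X$ by convolving the two binomial PMFs. It assumes $Y$ and $Z$ are independent; the helper names (`binom_pmf`, `sum_pmf`, `mean_var`) and the values $n=10$, $P_Y=0.2$, $P_Z=0.7$ are illustrative choices, not part of the original post.

```python
from math import comb


def binom_pmf(n, p):
    """PMF of Binomial(n, p) as a list indexed by the outcome 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def sum_pmf(n, p_y, p_z):
    """PMF of X = Y + Z, assuming Y ~ Bin(n, p_y) and Z ~ Bin(n, p_z) are independent."""
    pmf_y, pmf_z = binom_pmf(n, p_y), binom_pmf(n, p_z)
    pmf_x = [0.0] * (2 * n + 1)
    # Discrete convolution: add up P(Y = j) * P(Z = k) over all pairs with j + k = x.
    for j, prob_y in enumerate(pmf_y):
        for k, prob_z in enumerate(pmf_z):
            pmf_x[j + k] += prob_y * prob_z
    return pmf_x


def mean_var(pmf):
    """Mean and variance of a distribution on 0, 1, ..., len(pmf) - 1."""
    mean = sum(x * p for x, p in enumerate(pmf))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))
    return mean, var


# Illustrative parameters (assumed for this example only).
n, p_y, p_z = 10, 0.2, 0.7
pmf_x = sum_pmf(n, p_y, p_z)
mean, var = mean_var(pmf_x)

print("mean from PMF:", mean, " formula:", n * p_y + n * p_z)
print("var from PMF: ", var, " formula:", n * p_y * (1 - p_y) + n * p_z * (1 - p_z))

# For comparison: the variance of Binomial(2n, (p_y + p_z) / 2).
p_bar = (p_y + p_z) / 2
print("var of Binomial(2n, average p):", 2 * n * p_bar * (1 - p_bar))
```

With these values the moments computed from the PMF should agree with the formulas up to floating-point error, while the variance of Binomial$(2n, (P_Y+P_Z)/2)$ comes out larger, which is one way to see that $X$ is not that binomial when $P_Y \neq P_Z$.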
EDIT: Maple does come up with a closed form for the probability mass function involving the associated Legendre function of the first kind:
$$\mathbb P(X=x) = \begin{cases} \dfrac{n!}{x!}\, P_n^{x-n}\left(\dfrac{2 P_Y P_Z - P_Y - P_Z}{P_Y - P_Z}\right) (P_Z - P_Y)^n \left(\dfrac{(1-P_Z)(1-P_Y)}{P_Z P_Y}\right)^{(n-x)/2} & \text{if } 0 \le x \le n \\[2ex] \dfrac{n!}{(2n-x)!}\, P_n^{n-x}\left(\dfrac{2 P_Y P_Z - P_Y - P_Z}{P_Y - P_Z}\right) (P_Z - P_Y)^n \left(\dfrac{(1-P_Z)(1-P_Y)}{P_Z P_Y}\right)^{(n-x)/2} & \text{if } n \le x \le 2n \end{cases}$$
EDIT: In response to Shakil's request, here is the Maple code:
> sum(binomial(n,k)*P[Z]^k*(1-P[Z])^(n-k)* binomial(n,x-k)*P[Y]^(x-k)*(1-P[Y])^(n-(x-k)),k=0..x) assuming x>=0,x<=n;
> simplify(%);
> sum(binomial(n,k)*P[Z]^k*(1-P[Z])^(n-k)* binomial(n,x-k)*P[Y]^(x-k)*(1-P[Y])^(n-(x-k)),k=x-n..n) assuming x>=n,x<=2*n;
> simplify(%);
In the limit as $n \to \infty$, each of your binomials becomes approximately Gaussian, and since it seems you are implicitly assuming your two binomials are independent, their sum is approximately Gaussian too, with mean and variance given by the sums of the two means and the two variances. Writing $P_A$ and $P_B$ for the two success probabilities, that is mean $n(P_A + P_B)$ and variance $nP_A(1-P_A) + nP_B(1-P_B)$. Binomial$(2n, (P_A+P_B)/2)$ has the same mean, but its variance is larger by $n(P_A - P_B)^2/2$, so the two distributions agree in the limit only when $P_A = P_B$.
Also you are right, if $P_A = P_B = P$ and you assume independence, then the distribution is precisely Binomial$(2n,P)$.
However if $P_A \neq P_B$ and you assume independence, then the exact distribution is different from Binomial$(2n,(P_A + P_B)/2)$. If you let $X = X_A + X_B$ be the random variable which is the sum of your two binomials, then $P(X = k)$ is the summation over all the ways that you get $X_A = k_A$ and $X_B = k_B$ where $k_A + k_B = k$. It is easy to write down this summation formula if you know the formulas for binomial distribution, and summation notation. However I'm inclined to believe there is no closed form formula for it, unless it's something crazy like hypergeometric.
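For reference, the summation described above can be written out as follows. This is a sketch that assumes $X_A \sim$ Binomial$(n, P_A)$ and $X_B \sim$ Binomial$(n, P_B)$ are independent; the index $j$ (the value taken by $X_A$) is notation introduced here. The limits just keep both $j$ and $k - j$ between $0$ and $n$:

$$P(X = k) = \sum_{j=\max(0,\,k-n)}^{\min(n,\,k)} \binom{n}{j} P_A^{\,j} (1-P_A)^{n-j} \binom{n}{k-j} P_B^{\,k-j} (1-P_B)^{n-(k-j)}, \qquad 0 \le k \le 2n.$$

When $P_A = P_B = P$, the powers of $P$ and $1-P$ factor out of the sum, and Vandermonde's identity $\sum_j \binom{n}{j}\binom{n}{k-j} = \binom{2n}{k}$ recovers the Binomial$(2n, P)$ probabilities.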