Why Sampling distribution not skewed when np < 10 and nq < 10?

$\begingroup$

My aim is to study "Sample distributions" via simulations convincing myself of the outcomes.

After struggling with questions here, here and here, finally I hope am somewhat getting my head around it.

I tried to simulate a population of bernoulli random variable Y, with a given probability for Y=1. I tried to sample from it, each time sample size being 'n', for 'N' experiments. The sampling distribution turned out to be normal as expected.

I tried to then show, the problem would not be a good fit for normal approximation, when np < 10 or nq < 10 because the distribution would be skewed. But when I simulate in python, I do not observe much skewness. The "extreme" curves were as good as some of intermediate curves, so I do not get it.

I tried increasing sample size, no of experiments, but no improvement. I tried using bar instead of hist, but in vain. I tried varying width of them but no use (in fact changing width affects size for another purpose - to maintain total height as 1 due to density=True)

Kindly check and let me know if there is any issue in the math I am doing.

Code:

import matplotlib.pyplot as plt
import matplotlib.animation as animation
from SDSPSM import create_bernoulli_population, sample_for_SDSP
import numpy as np
from math import sqrt, pi
# control inputs
T = 10000
p_list = np.arange(0.1,1,0.1) # varying probability
N = 2000 # no of experiments
n = 50 # sample size
fig, ax = plt.subplots(1,1,figsize=(12,4))
plt.close()
def animate(i): ax.clear() p = round(p_list[i],4) # create population to be sampled from - Note pop has to be re created every time due to changing p pops = create_bernoulli_population(T, p) _, Y_mean_list = sample_for_SDSP(pops, N,n) # plot discrete density _, bins,_ = ax.hist(Y_mean_list, density=True) # normal approx mu, var, sigma = get_metrics(Y_mean_list) X = np.linspace(min(bins),max(bins),10*len(bins)) if sigma>0: Cp = 1/(sigma*sqrt(2*pi)) Ep = -1/2*((X-mu)/sigma)**2 G = Cp*np.exp(Ep) ax.plot(X, G, color='red') metrics_text = '$\mu_x:{}$ \n$\sigma_x:{}$'.format(mu, sigma) ax.text(0.97, 0.98,metrics_text,ha='right', va='top',transform = ax.transAxes,fontsize=10,color='red') ax.set_ylim([0,15]) ax.set_xlim([0,1]) np_factor = round(n*p,2) nq_factor = round(n*(1-p),2) metrics_text_2 = '$p:{}$ \n$np:{}$ \n$nq:{}$'.format(p, np_factor, nq_factor) ax.text(0.01, 0.98,metrics_text_2,ha='left', va='top',transform = ax.transAxes,fontsize=10,color='red')
plt.show()
ani1 = animation.FuncAnimation(fig, animate, frames = range(0,len(p_list)), interval=1000)
from IPython.display import HTML
def display_animation(anim): plt.close(anim._fig) return HTML(anim.to_html5_video())
display_animation(ani1)

Helper file: SDSPSM

Output:
enter image description here

$\endgroup$ 12 Reset to default

Know someone who can answer? Share a link to this question via email, Twitter, or Facebook.

Your Answer

Sign up or log in

Sign up using Google Sign up using Facebook Sign up using Email and Password

Post as a guest

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

You Might Also Like