Distributions and sampling
by RS  admin@robinsnyder.com : 1024 x 640


1. Distributions and sampling
One meaning of the term distribute is to spread around as in divide and share. Some international relief programs distribute food to those who need it. A distribution system is a way or process of distributing something. Many companies have distribution warehouses as part of their distribution system. In statistics, a distribution is the way in which data values are put into buckets in order to summarize the data in a meaningful way.

[Figure: Statistical distribution]

Distributions can be discrete (i.e., counts) or continuous (i.e., measures).

If there are enough values, the discrete counts can be approximated by a continuous measure.

2. Uses
Statistical distributions have many uses. Generating test data can often involve creating sample text.

3. Chebyshev's theorem
Pafnuty Chebyshev, a Russian mathematician, developed a theorem that says that, for any set of data values, the proportion of values that lie within k standard deviations of the mean is at least 1 - 1/k², where k is any constant greater than 1. For example, with k = 2, at least 75% of the values lie within 2 standard deviations of the mean.

Russian name: Пафну́тий Чебышёв

[Figure: Chebyshev's theorem formula]
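The bound can be checked against data. The following is a minimal sketch, not part of the original page: the skewed example data (exponential draws), the seed, and the function names `fraction_within` and `chebyshev_bound` are all assumptions chosen for illustration.

```python
import random
import statistics

def fraction_within(values, k):
    """Return the fraction of values within k standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation of the data
    return sum(abs(v - mean) <= k * sd for v in values) / len(values)

def chebyshev_bound(k):
    """Minimum proportion guaranteed by Chebyshev's theorem for k > 1."""
    return 1 - 1 / k**2

rng = random.Random(1)  # fixed seed so the run is repeatable
data = [rng.expovariate(1.0) for _ in range(10_000)]  # assumed, deliberately skewed data

for k in (1.5, 2, 3):
    # The observed fraction should never fall below the bound.
    print(k, round(chebyshev_bound(k), 4), round(fraction_within(data, k), 4))
```

The observed fractions are typically well above the bound, since the theorem has to hold for every possible set of data values.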

4. Bell shape
What is a bell shape? The Liberty Bell is an example of a bell-shaped object. Many frequency distributions are bell-shaped, or can be approximated by a bell-shaped distribution.

5. Confidence intervals
For a symmetrical, bell-shaped, frequency distribution, the following are true.

[Figure: Confidence intervals]
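For the normal distribution, which is the standard bell-shaped case, the proportion of values within a given number of standard deviations of the mean can be computed directly. This is a minimal sketch: the choice of 1, 2, and 3 standard deviations and the function name `normal_fraction_within` are assumptions for illustration, not taken from the page.

```python
import math

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(normal_fraction_within(k), 4))
# prints:
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

These proportions are much higher than the Chebyshev lower bounds because they use the additional assumption that the distribution is normal.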


6. Six Sigma
Six Sigma is a quality control technique based on controlling variance measurements.

7. Central limit theorem
The central limit theorem describes the distribution of sample means.

The central limit theorem states that when many random samples of a given size are taken from a population, the distribution of the sample means can be approximated by a normal distribution. The larger the sample size, the better the approximation. Taking more samples makes the observed distribution of the sample means smoother.

8. Simulation
Simulation techniques can be used to simulate many samples from a given distribution.
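One way to set up such a simulation is sketched below. The helper name `sample_means`, its parameters, and the standard uniform example population are assumptions for illustration, not the method used to produce the charts on this page.

```python
import random
import statistics

def sample_means(draw, sample_size, num_samples, seed=0):
    """Return num_samples sample means, each averaged over sample_size draws.

    draw is a function that takes a random.Random and returns one value.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

# Example: 1000 samples of size 16 from a standard uniform population.
means = sample_means(lambda rng: rng.random(), sample_size=16, num_samples=1000)
print(len(means), round(statistics.fmean(means), 3))
```

Passing the population as a `draw` function keeps the sampling loop the same for any distribution, so only the one-line generator changes between experiments.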

9. Normal distribution
What happens if samples are taken from a population that is normally distributed?

[Figure: Normal distribution]

10. Sampling from a normal distribution
Here is the result of sampling from a normal distribution.

[Figure: Normal distribution sampling]

The distribution of the sample means appears to be normally distributed.

This is to be expected. Notice that the distribution of the sample means is narrower and taller than the population distribution. Since the heights of all boxes (in this discrete approximation) sum to 1.0, a narrower distribution must also be taller, and the standard deviation of the sample means must be smaller than the standard deviation of the population.
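The amount of narrowing is predictable: for samples of size n, the standard deviation of the sample means is σ/√n. The sketch below checks this by simulation. The population parameters (mean 50, standard deviation 15), the sample size of 16, and the 2000 samples are assumed values for illustration.

```python
import math
import random
import statistics

MU, SIGMA = 50.0, 15.0               # assumed population mean and standard deviation
SAMPLE_SIZE, NUM_SAMPLES = 16, 2000  # assumed sampling plan

rng = random.Random(42)  # fixed seed so the run is repeatable
means = [
    statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

observed = statistics.pstdev(means)         # spread of the simulated sample means
expected = SIGMA / math.sqrt(SAMPLE_SIZE)   # sigma / sqrt(n) = 15 / 4 = 3.75
print(round(observed, 2), expected)
```

With a sample size of 16, the sample means are about one quarter as spread out as the population.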


11. Uniform distribution
Suppose that a real-world process is modeled by a uniform distribution in the range 30.0 to 60.0.

[Figure: Uniform distribution]

One example of a uniform distribution is a truly random number generation process within the range of interest, here 30.0 to 60.0. Suppose that 1000 samples are to be taken, where each sample consists of 16 values selected at random from the distribution and averaged to get a sample mean. Of course, each sample mean of 16 values contains random error, but the overall pattern becomes clear when many samples are taken, which is why 1000 samples are taken. Then create a bar chart of the distribution of the means of the 1000 samples. What is the distribution of the sample means?

12. Sampling from uniform distribution
Here is the result of sampling from a uniform distribution.

[Figure: Sampling from uniform distribution]

The sample means appear to have a bell-shaped distribution. This distribution is called the normal distribution. The central limit theorem states that the sampling distribution of the mean can be approximated by a normal distribution for most populations, even if the population distribution (in this case, the uniform distribution) is not normally distributed. Thus, the central limit theorem has great importance since it means that the normal distribution has useful applications in practice. What is the importance of the central limit theorem?
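The experiment described for the uniform distribution (1000 samples of 16 values each from the range 30.0 to 60.0) can be reproduced with a short script. This is a minimal sketch: the seed, the one-unit bucket width, and the text bar chart are assumptions for illustration, not the tool used to produce the chart on this page.

```python
import random
import statistics
from collections import Counter

LOW, HIGH = 30.0, 60.0               # range of the uniform population
SAMPLE_SIZE, NUM_SAMPLES = 16, 1000  # sampling plan described in the text

rng = random.Random(7)  # fixed seed so the run is repeatable
means = [
    statistics.fmean([rng.uniform(LOW, HIGH) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

# Count the sample means in buckets one unit wide and draw a text bar chart.
buckets = Counter(int(m) for m in means)
for b in sorted(buckets):
    print(f"{b:2d}-{b + 1:2d} {'#' * (buckets[b] // 5)}")

print("mean of means:", round(statistics.fmean(means), 2))
print("std of means: ", round(statistics.pstdev(means), 2))
```

The bars should pile up around 45, the midpoint of the range, and thin out symmetrically on both sides, even though every individual value was equally likely to fall anywhere between 30.0 and 60.0.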


13. Exponential distribution
Consider the following exponential distribution.

[Figure: Exponential distribution]

What happens if samples are taken from this exponential distribution?

14. Sampling from exponential distribution
Here is a chart of the results of 2000 samples of size 16 from an exponential distribution.

[Figure: Sampling from exponential distribution]

The distribution of the sample means appears to be normally distributed.
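One way to quantify "appears to be normally distributed" is skewness, which is 0 for a normal distribution and 2 for an exponential distribution. The sketch below compares the two. The rate parameter of 1.0, the seed, and the helper name `skewness` are assumptions for illustration; the page does not state the parameters of its exponential distribution.

```python
import random
import statistics

def skewness(values):
    """Population skewness: the mean cubed z-score of the values."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

RATE = 1.0                           # assumed rate of the exponential population
SAMPLE_SIZE, NUM_SAMPLES = 16, 2000  # sampling plan described in the text

rng = random.Random(3)  # fixed seed so the run is repeatable
population_draws = [rng.expovariate(RATE) for _ in range(20_000)]
means = [
    statistics.fmean([rng.expovariate(RATE) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

print("skewness of individual values:", round(skewness(population_draws), 2))
print("skewness of sample means:     ", round(skewness(means), 2))
```

The sample means are far less skewed than the individual values, but with a sample size of only 16 some right skew remains. A larger sample size would reduce it further.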

15. Discrete distribution
Consider the following discrete distribution.

[Figure: Discrete distribution]

What happens if samples are taken from this discrete distribution?

16. Sampling from a discrete distribution
Here is the chart of the results of 2000 samples of size 32 from the very non-normal discrete distribution.

[Figure: Sampling from discrete distribution]

The distribution of the sample means appears to be normally distributed. The standard deviation of the population would be quite large, as the values are only at the extremes (low and high) of the possible values.

Thus, the standard deviation of the sample means appears to be less than the standard deviation of the population. If the middle values with 0.0 probability are omitted, an intuitive analogy is to compare this distribution with the binomial distribution of a biased coin.
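The biased coin analogy can be sketched with a two-valued population. The values (0 and 10) and their probabilities (0.3 and 0.7) are assumptions for illustration; the distribution shown on this page may use different values.

```python
import math
import random
import statistics

VALUES = [0, 10]                     # assumed low and high extremes
WEIGHTS = [0.3, 0.7]                 # assumed probabilities (a biased coin)
SAMPLE_SIZE, NUM_SAMPLES = 32, 2000  # sampling plan described in the text

# Exact mean and standard deviation of the two-valued population.
pop_mean = sum(v * w for v, w in zip(VALUES, WEIGHTS))
pop_sd = math.sqrt(sum(w * (v - pop_mean) ** 2 for v, w in zip(VALUES, WEIGHTS)))

rng = random.Random(11)  # fixed seed so the run is repeatable
means = [
    statistics.fmean(rng.choices(VALUES, weights=WEIGHTS, k=SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

print("population std:     ", round(pop_sd, 3))
print("std of sample means:", round(statistics.pstdev(means), 3))
print("pop std / sqrt(n):  ", round(pop_sd / math.sqrt(SAMPLE_SIZE), 3))
```

Each sample mean here is a scaled count of "heads" in 32 tosses, which is why the sample means follow a scaled binomial distribution and look bell-shaped.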

17. Central limit theorem
As can be seen, the central limit theorem is important in that it shows that normal distributions can be used to model the distribution of the sample means.

18. Statistical animations
Here are some animations of some statistical concepts that are relevant to a data science course.

19. Change in mean
[Animation: Normal distribution (/QM.XLS/norm-01.xls)]

As the mean changes, the distribution keeps the same shape (i.e., the same standard deviation).

In this animation, the mean μ ranges from 40 to 60 while the standard deviation σ is 15.

20. Change in standard deviation
[Animation: Normal distribution]

As the standard deviation σ (sigma) changes, the mean μ (mu) stays the same, but the shape of the curve changes. Since the area under the curve is always 1.0, the curve gets wider and flatter as the standard deviation increases and taller and narrower as the standard deviation decreases.

21. Normal curve shapes
The area under a probability curve is 1.0.

As the standard deviation σ (sigma) increases, the curve gets wider and shorter.

As the standard deviation σ (sigma) decreases, the curve gets narrower and taller.

As the mean μ (mu) increases, the curve shifts right.

As the mean μ (mu) decreases, the curve shifts left.
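These properties can be checked numerically from the normal density formula. This is a minimal sketch: the mean of 50, the three standard deviations tried, and the function names `normal_pdf` and `area_under_curve` are assumptions for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def area_under_curve(mu, sigma, steps=20_000):
    """Approximate the area over mu ± 8 sigma using the midpoint rule."""
    lo, hi = mu - 8 * sigma, mu + 8 * sigma
    width = (hi - lo) / steps
    return sum(normal_pdf(lo + (i + 0.5) * width, mu, sigma) for i in range(steps)) * width

MU = 50.0  # assumed mean
for sigma in (5.0, 15.0, 30.0):
    peak = normal_pdf(MU, MU, sigma)  # the curve is tallest at the mean
    print(sigma, round(peak, 4), round(area_under_curve(MU, sigma), 4))
```

The peak height is 1/(σ√(2π)), so doubling σ halves the height while the area stays at 1.0.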


22. Poisson distribution
Here is an animation of how the Poisson distribution varies as μ (mu) is changed.

[Animation: Poisson distribution for mu=20]

Notice that as μ (mu) increases, the Poisson distribution increasingly resembles both the discrete binomial distribution and the continuous normal distribution.
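The resemblance to the normal distribution can be measured by comparing the Poisson probabilities with a normal curve that has the same mean μ and the same standard deviation √μ. This is a minimal sketch: the values of μ tried and the function names `poisson_pmf`, `normal_pdf`, and `max_gap` are assumptions for illustration.

```python
import math

def poisson_pmf(k, mu):
    """Probability of exactly k events when the mean is mu (computed via logs)."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def max_gap(mu):
    """Largest absolute difference between the Poisson bars and the normal curve."""
    sigma = math.sqrt(mu)  # a Poisson distribution has variance equal to its mean
    upper = int(mu + 10 * sigma) + 1
    return max(abs(poisson_pmf(k, mu) - normal_pdf(k, mu, sigma)) for k in range(upper))

for mu in (2, 20, 200):
    print(mu, round(max_gap(mu), 5))
```

The largest gap shrinks as μ grows, which matches what the animation shows.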

23. Chi-square distribution
Here is an animation to show how the χ² (chi-squared) distribution approaches the normal distribution as n, the number of degrees of freedom, gets large.

[Animation: Chi-square distribution]
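A simulation can show the same trend. A χ² value with n degrees of freedom is the sum of n squared standard normal values, and its skewness shrinks toward 0 (the skewness of a normal distribution) as n grows. This is a minimal sketch: the values of n, the 4000 draws, the seed, and the helper names `chi_square_draw` and `skewness` are assumptions for illustration.

```python
import random
import statistics

def chi_square_draw(rng, n):
    """One chi-square value with n degrees of freedom: a sum of n squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

def skewness(values):
    """Population skewness: the mean cubed z-score of the values."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

rng = random.Random(5)  # fixed seed so the run is repeatable
results = {}
for n in (2, 8, 64):
    draws = [chi_square_draw(rng, n) for _ in range(4000)]
    results[n] = (statistics.fmean(draws), statistics.pvariance(draws), skewness(draws))
    # The mean should be near n and the variance near 2n.
    print(n, [round(x, 2) for x in results[n]])
```

The skewness falls steadily as n increases, so the curve becomes more symmetric and more like a normal curve with mean n and variance 2n.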
