Chebyshev's Theorem: Concept, Formula, Example

Chebyshev’s theorem, also known as Chebyshev’s inequality, is a fundamental concept in probability theory and statistics.

It provides a way to estimate the proportion of data that falls within a certain range around the mean, regardless of the shape of the probability distribution.

The Concept of Chebyshev’s Theorem

The theorem states that for any given dataset, regardless of its probability distribution, at least a certain proportion of the data must lie within a specific number of standard deviations from the mean.

This theorem is useful when you know the mean and standard deviation of your data and need a guaranteed minimum for the proportion of data that lies within a given number of standard deviations of the mean, for example within plus or minus two standard deviations.

If your data follow a normal distribution, you can apply the empirical rule (68-95-99.7): about 68% of the data falls within 1 standard deviation of the mean, about 95% falls within 2 standard deviations, and about 99.7% falls within 3 standard deviations.
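As a quick check of those percentages, here is a minimal sketch that uses Python’s standard-library statistics.NormalDist to compute the share of a normal distribution within 1, 2, and 3 standard deviations of the mean. The helper name share_within is only illustrative and is not part of the original article.

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    # Area under the curve between -k and +k standard deviations.
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The computed values come out at roughly 68.3%, 95.4%, and 99.7%, which is where the rounded 68-95-99.7 figures come from.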

But the empirical rule isn’t much help when the shape of your data’s distribution is not normal or is unknown. In that case, you can use Chebyshev’s theorem.

The Formula of Chebyshev’s Theorem

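For any value k greater than 1, at least 1 - 1/k^2 of the data falls within k standard deviations of the mean. Here k is the number of standard deviations on either side of the mean. It does not have to be a whole number, but it must be greater than 1, because for k ≤ 1 the bound is zero or negative and tells you nothing.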

[Figure: plot illustrating Chebyshev’s theorem]

When k = 2, then at least 3/4 (or 75%) of the data falls within 2 standard deviations of the mean:

1 - 1/2^2 = 1 - 1/4 = 3/4 = 0.75, or 75%
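The same arithmetic can be wrapped in a small helper. This is a minimal sketch: the function name chebyshev_lower_bound is an illustrative choice, and rejecting k ≤ 1 with an error is an assumption about how you might want to handle inputs for which the bound is uninformative.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the formula gives zero or a negative number, which says nothing.
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2


print(chebyshev_lower_bound(2))  # 0.75
```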

In mathematical terms, if X is a random variable with mean μ and standard deviation σ, Chebyshev’s theorem can be expressed as:

P(|X - μ| < kσ) ≥ 1 - 1/k^2

where P(|X - μ| < kσ) represents the probability that X falls within k standard deviations of the mean.
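To see the inequality at work on data that is clearly not normal, the sketch below draws a right-skewed sample and compares the observed share within k standard deviations against 1 - 1/k^2. The exponential sample, the seed, and the helper name proportion_within are all illustrative assumptions. The population standard deviation (pstdev) is used so that the inequality applies directly to the sample itself.

```python
import random
from statistics import fmean, pstdev


def proportion_within(data, k):
    """Share of values strictly within k population standard deviations of the mean."""
    mu = fmean(data)
    sigma = pstdev(data)
    inside = sum(1 for x in data if abs(x - mu) < k * sigma)
    return inside / len(data)


random.seed(0)  # fixed seed so the run is repeatable
sample = [random.expovariate(1.0) for _ in range(10_000)]  # right-skewed data

for k in (1.5, 2, 3):
    observed = proportion_within(sample, k)
    bound = 1 - 1 / k**2
    print(f"k={k}: observed {observed:.3f}, Chebyshev bound {bound:.3f}")
```

Whatever sample you substitute, the observed share should never drop below the bound.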

Here’s a quick look at the proportions of data according to the theorem:

Standard deviations (k)   Min % within   Max % outside
1.5                       56             44
2                         75             25
3                         89             11
4                         94             6
5                         96             4
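These figures follow directly from the formula: the maximum share outside is 100/k^2 percent, and the minimum share within is the remainder. A minimal sketch that reproduces them, rounded to whole percents (the function name chebyshev_table is illustrative):

```python
def chebyshev_table(ks=(1.5, 2, 3, 4, 5)):
    """Return rows of (k, min % within, max % outside), rounded to whole percents."""
    rows = []
    for k in ks:
        outside = 100 / k**2  # at most this percentage lies outside k SDs
        rows.append((k, round(100 - outside), round(outside)))
    return rows


for k, inside, outside in chebyshev_table():
    print(f"k={k}: at least {inside}% within, at most {outside}% outside")
```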

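Unlike the empirical rule, which gives approximate proportions for normal data, Chebyshev’s theorem gives only guaranteed minimums: the actual proportion within k standard deviations may be higher.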

Chebyshev’s Theorem Examples

Let’s work through an example that applies the theorem. Suppose your data has a mean of 10 and a standard deviation of 2.

For k = 2:

Lower bound: 10 - 2(2) = 6

Upper bound: 10 + 2(2) = 14

Calculate the proportion with Chebyshev’s theorem:

1 - 1/2^2 = 1 - 1/4 = 3/4 = 75%

Result: At least 75% of the data falls between 6 and 14.

For k = 3:

Lower bound: 10 - 3(2) = 4

Upper bound: 10 + 3(2) = 16

Calculate the proportion with Chebyshev’s theorem:

1 - 1/3^2 = 1 - 1/9 = 8/9 ≈ 88.9%

Result: At least 88.9% of the data falls between 4 and 16.

For k = 4:

Lower bound: 10 - 4(2) = 2

Upper bound: 10 + 4(2) = 18

Calculate the proportion with Chebyshev’s theorem:

1 - 1/4^2 = 1 - 1/16 = 15/16 ≈ 93.8%

Result: At least 93.8% of the data falls between 2 and 18.
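The three calculations follow the same pattern, so they can be sketched as one helper that returns the interval and the guaranteed minimum proportion. The function name chebyshev_interval is illustrative. The mean of 10 and standard deviation of 2 are the values from the example.

```python
def chebyshev_interval(mean, sd, k):
    """Return (lower bound, upper bound, minimum proportion) for k standard deviations."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    lower = mean - k * sd
    upper = mean + k * sd
    return lower, upper, 1 - 1 / k**2


for k in (2, 3, 4):
    lower, upper, bound = chebyshev_interval(10, 2, k)
    print(f"k={k}: at least {bound:.1%} of the data falls between {lower} and {upper}")
```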

Note that Chebyshev’s theorem gives only a lower bound, and that bound is generally not very tight.

In many cases, the actual proportion of data within a certain range can be much higher than the lower bound predicted by the theorem.
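To get a feel for how loose the bound can be, here is a sketch that compares it with the exact proportion for one specific non-normal distribution. The choice of an exponential distribution with rate 1 is an assumption made purely for illustration. That distribution has mean 1 and standard deviation 1, so for k ≥ 1 the share within k standard deviations is 1 - e^-(1 + k).

```python
import math


def exponential_share_within(k: float) -> float:
    """Exact share of an Exponential(rate=1) distribution within k SDs of its mean.

    With mean 1 and SD 1, the interval (1 - k, 1 + k) reduces to [0, 1 + k)
    when k >= 1, because the distribution has no mass below zero.
    """
    if k < 1:
        raise ValueError("this closed form assumes k >= 1")
    return 1 - math.exp(-(1 + k))


for k in (2, 3, 4):
    bound = 1 - 1 / k**2
    actual = exponential_share_within(k)
    print(f"k={k}: Chebyshev bound {bound:.3f}, actual {actual:.3f}")
```

For k = 2 the actual share is about 95%, well above the guaranteed 75%.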

Conclusion

Chebyshev’s theorem is a valuable tool in probability theory and is widely used in statistical analysis to make general statements about the spread of data.

Chebyshev’s Theorem applies to any probability distribution that has a finite mean and standard deviation, while the Empirical Rule applies only to the normal distribution.

Also, note that the Empirical Rule gives approximate proportions that hold only for normal data, while Chebyshev’s Theorem gives a guaranteed minimum that is often well below the actual proportion.

If your data follows the normal distribution, use the Empirical Rule. Otherwise, Chebyshev’s Theorem will be your best companion.
