Difference between Standard Deviation and Variance.

In: Mathematics

5 Answers

Anonymous 0 Comments

One important characteristic of a set of data is its dispersion – basically, how much its values are spread out. So how do we attach a number to the dispersion of a set of data? Let's say that X is my set of data, and μ is my mean. Your first thought might be to calculate the average of the deviations of X from the mean. In other words, you'd calculate the mean of X-μ (if that makes sense). However, this immediately runs into a problem. The negative deviations exactly cancel the positive deviations, so this method always gives you zero, no matter what the dispersion is. For the slightly more advanced math to prove it:

E(X-μ)=E(X)-μ=μ-μ=0, with E(X) being the expected value of X. If you don’t understand this and want me to explain, I can.
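As a quick numerical sketch (plain Python, with a made-up data set purely for illustration), the raw deviations really do average out to zero:

```python
# Made-up example data; any numbers give the same result
data = [2, 4, 4, 4, 5, 5, 7, 9]
mu = sum(data) / len(data)  # the mean, μ

# Average of the raw deviations X - μ
mean_deviation = sum(x - mu for x in data) / len(data)
print(mean_deviation)  # 0.0 (up to floating-point rounding)
```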

So this isn't going to work. At this point you may think, "OK, so why don't we just make all the deviations positive?" In other words, instead of using X-μ, why don't we use |X-μ|, the absolute value of X-μ? This turns out to work, and sometimes people do use it (it's called the mean absolute deviation), but absolute values are really annoying mathematically. First, the absolute value doesn't have a simple arithmetic formula, so it's always a bit clunky to express. Second, and perhaps more importantly, the absolute value function isn't differentiable at zero, which gets in the way of the calculus tools that are really useful in analyzing data. These two things together make the absolute value trick less appealing. So instead, we can square the deviations, which solves all of our problems (except one, which will be shown).
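For comparison, here is the same made-up data run through the absolute-value idea (the mean absolute deviation). It gives a perfectly usable spread number; it's just less convenient to work with algebraically:

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]
mu = sum(data) / len(data)

# Mean absolute deviation: average of |X - μ|
mad = sum(abs(x - mu) for x in data) / len(data)
print(mad)  # 1.5 for this data – a non-zero measure of spread
```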

Thus, the variance is the average of the squared deviations from the mean. In other words, it is the mean of (X-μ)².

Now, if you've been following along, you may have noticed a problem. The units of variance are the square of the units of my original data. So if my data are measured in inches, my variance, my measure of dispersion, will awkwardly be in square inches. This is an obvious problem, since it makes the variance hard to interpret alongside the original data. Thus, in most of applied statistics, the solution is simply to take the square root of the variance. That value is what we call the standard deviation.
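Putting the whole recipe together as a short Python sketch (the data here are made up, imagined as measurements in inches): square the deviations, average them, then take the square root to get back to the original units:

```python
import math

inches = [2, 4, 4, 4, 5, 5, 7, 9]  # pretend measurements, in inches
mu = sum(inches) / len(inches)

variance = sum((x - mu) ** 2 for x in inches) / len(inches)  # units: square inches
std_dev = math.sqrt(variance)                                # back to inches

print(variance, std_dev)  # 4.0 (in²) and 2.0 (in) for this data
```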

Anonymous 0 Comments

Variance = the average of the squared distances from the mean.

Standard Deviation = the square root of that: roughly, the distance from the mean that includes most of your data.

Anonymous 0 Comments

In a nutshell, variance is the average of the squared differences from the mean. If your data are very spread out around the mean, the variance will be big. Taking the square root of the variance puts the measure back into the original units – and tells you how much a typical value in the group differs from the mean.

I think the easy way to think about this is the length of snakes (my teacher used snakes because no one in the class had any idea how long an average snake is); their lengths are 160, 180, 300, 54 and 120 cm. Which of these snakes is "long"? The mean is 162.8. That's not far from 160, so does that mean 180 is a particularly long snake? The variance is 6559.4 – it just means there's a lot of difference in the snakes' lengths. The standard deviation is sqrt(6559.4) ≈ 81. So, assuming that all snakes are represented by the population above, all snakes between about 82 cm and 244 cm (162.8 ± 81) are within the range of "standard" lengths, and there's only one really long and one really short snake in your sample.
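If you want to check the snake numbers yourself, a quick sketch using Python's standard statistics module reproduces them (pvariance and pstdev treat the five snakes as the whole population, i.e. they divide by n):

```python
import statistics

lengths = [160, 180, 300, 54, 120]  # snake lengths in cm

print(statistics.mean(lengths))       # 162.8
print(statistics.pvariance(lengths))  # 6559.36 – population variance
print(statistics.pstdev(lengths))     # ~80.99  – population standard deviation
```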

Anonymous 0 Comments

Variance is defined as the average of the squared deviations from the mean for a population — e.g. for data *x_i* with mean *μ*, the difference between each *x_i* and *μ* is squared and summed, and the result is then divided by the number of data points (or that number minus one for samples).

This gives an idea of how much variation there is in the data, with the square eliminating negatives to give an idea of the variance in absolute terms.
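The "divided by the number of data points, or that number minus one for samples" distinction looks like this in a small Python sketch; the standard statistics module exposes both versions:

```python
import statistics

data = [160, 180, 300, 54, 120]

# Population variance: sum of squared deviations divided by n
print(statistics.pvariance(data))  # 6559.36

# Sample variance: divided by n - 1 instead (slightly larger)
print(statistics.variance(data))   # 8199.2
```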

Standard Deviation is the square root of the variance. This works out to give us a measure of *about how far from the mean* we can go and still contain a certain proportion of the data, based on the underlying probability distribution (the *normal* or Gaussian distribution, here). For normally-distributed data, we can expect around 68% of observations to be within one standard deviation of the mean in both directions.
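A small simulation sketch illustrates that roughly-68% figure for normally distributed data (random.gauss draws from a normal distribution; the exact fraction will vary a little from run to run):

```python
import random
import statistics

random.seed(0)
data = [random.gauss(0, 1) for _ in range(100_000)]  # standard normal draws

mu = statistics.mean(data)
sd = statistics.pstdev(data)

# Fraction of observations within one standard deviation of the mean
within_one_sd = sum(1 for x in data if abs(x - mu) <= sd) / len(data)
print(within_one_sd)  # ≈ 0.68 for normally distributed data
```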

Anonymous 0 Comments

Since this is an ELI5 question, I’m going to spare you all of the complex formulas and whatnot.

Variance is the square of the Standard Deviation.

Standard Deviation represents how far a typical observation is from the average value, known as the mean.

Mathematicians square the deviations (giving the Variance) in order to eliminate negatives since, as we all know, squaring a negative number eliminates the negative sign. This is mostly a matter of convenience.
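As a quick sanity check (with made-up numbers), the two really are just a square apart:

```python
import math
import statistics

data = [3, -1, 4, 1, 5]
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

print(math.isclose(std_dev ** 2, variance))  # True – variance is the standard deviation squared
```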