Learning Outcomes

Recognize, describe, and calculate the procedures of the spread of data: variance, standard deviation, and also range.

You are watching: The sum of the deviations from the mean will always be equal to ______________.


An important characteristic of any set of data is the sport in the data. In some data sets, the data values room concentrated closely near the mean; in various other data sets, the data worths are much more widely spread out from the mean. The most usual measure that variation, or spread, is the conventional deviation. The standard deviation is a number that measures how far data values are from your mean.

The standard deviation provides a numerical measure up of the as whole amount of sport in a data set, and also can be offered to identify whether a certain data value is close come or far from the mean.

The traditional deviation offers a measure up of the as whole variation in a data set.

The conventional deviation is always positive or zero. The standard deviation is tiny when the data room all focused close come the mean, exhibiting small variation or spread. The traditional deviation is bigger when the data worths are an ext spread out from the mean, exhibiting much more variation.

Suppose the we space studying the lot of time customers wait in heat at the checkout in ~ supermarket A and supermarket B. The mean wait time in ~ both supermarkets is five minutes. In ~ supermarket A, the standard deviation because that the wait time is two minutes; at supermarket B the conventional deviation because that the wait time is four minutes.

Because supermarket B has a greater standard deviation, we understand that over there is much more variation in the wait times at supermarket B. Overall, wait time at supermarket B are more spread the end from the average; wait times at supermarket A are much more concentrated near the average.

The typical deviation have the right to be offered to determine whether a data worth is close come or much from the mean.

Suppose the Rosa and also Binh both shop at supermarket A. Rosa waits at the checkout counter for seven minutes and Binh waits because that one minute. At supermarket A, the typical waiting time is 5 minutes and the typical deviation is two minutes. The traditional deviation deserve to be supplied to recognize whether a data value is close to or much from the mean.

Rosa waits for 7 minutes:

Seven is two minutes longer than the typical of five; 2 minutes is equal to one typical deviation.Rosa’s wait time of seven minutes is two minutes longer than the average of five minutes.Rosa’s wait time of seven minutes is one traditional deviation above the median of 5 minutes.

Binh waits because that one minute.

One is 4 minutes much less than the median of five; 4 minutes is same to two typical deviations.Binh’s wait time the one minute is four minutes much less than the average of five minutes.Binh’s wait time the one minute is two conventional deviations below the average of five minutes.

A data value that is two traditional deviations indigenous the median is just on the borderline for what countless statisticians would take into consideration to be much from the average. Considering data to be much from the median if that is more than two standard deviations away is more of an approximate “rule the thumb” 보다 a strict rule. In general, the shape of the distribution of the data affects how much the the data is more away 보다 two standard deviations. (You will learn much more about this in later chapters.)

The number line may assist you know standard deviation. If we were to put five and also seven top top a number line, 7 is come the appropriate of five. Us say, then, that seven isone standard deviation come the right the five since 5 + (1)(2) = 7.

If one to be also part of the data set, climate one is two typical deviations to the left the five due to the fact that 5 + (–2)(2) = 1.


*

In general, a value = mean + (#ofSTDEV)(standard deviation)where #ofSTDEVs = the number of standard deviations#ofSTDEV does not have to be one integerOne is two standard deviations much less than the mean of 5 because: 1 = 5 + (–2)(2).

The equation value = mean + (#ofSTDEVs)(standard deviation) have the right to be expressed for a sample and also for a population.

Sample: \displaystylex=\overlinex+(# the STDEV)(s)Population: \displaystylex=\mu+(# the STDEV)(\sigma)

The lower case letter s to represent the sample traditional deviation and also the Greek letter σ (sigma, lower case) represents the population standard deviation.

The prize \displaystyle\overlinex is the sample mean and also the Greek prize μ is the population mean.

Calculating the traditional Deviation

If x is a number, climate the difference “x – mean” is dubbed its deviation. In a data set, there space as many deviations as there room items in the data set. The deviations are supplied to calculate the standard deviation. If the numbers belong come a population, in icons a deviation is x – μ. For sample data, in symbols a deviation is \displaystylex-\overlinex.

The procedure to calculation the standard deviation relies on even if it is the numbers space the entire populace or are data indigenous a sample. The calculations are similar, however not identical. As such the symbol used to stand for the standard deviation relies on whether it is calculated indigenous a population or a sample. The lower instance letter s to represent the sample traditional deviation and the Greek letter σ (sigma, reduced case) represents the population standard deviation. If the sample has the same qualities as the population, then s should be a an excellent estimate that σ.

To calculation the standard deviation, we need to calculate the variance first. Thevariance is the average of the squares the the deviations (the x\displaystyle\overlinex worths for a sample, or the x – μ worths for a population). The price σ^2 to represent the population variance; the populace standard deviation σ is the square root of the populace variance. The prize s^2 to represent the sample variance; the sample conventional deviation s is the square root of the sample variance. You can think the the typical deviation together a special average of the deviations.

If the numbers come native a census the the entire population and not a sample, as soon as we calculate the average of the squared deviations to uncover the variance, we divide by N, the number of items in the population. If the data space from a sample fairly than a population, when we calculation the typical of the squared deviations, we division by n – 1, one less than the variety of items in the sample.

In the following video an instance of calculating the variance and standard deviation that a set of data is presented.

Formulas for the Sample typical Deviation

\displaystyles=\sqrt\frac\sum(x-\overlinex)^2n-1\quad\textor\quads=\sqrt\frac\sumf(x-\overlinex)^2n-1

For the sample traditional deviation, the denominator is n – 1, that is the sample dimension MINUS 1.

Formulas because that the populace Standard Deviation

\displaystyle\sigma=\sqrt\frac\sum(x-\mu)^2N\quad\textor\quad\sigma=\sqrt\frac\sumf(x-\mu)^2N

For the populace standard deviation, the denominator is N, the variety of items in the population.

In these formulas, f represents the frequency with which a worth appears. For example, if a value shows up once, f is one. If a value shows up three times in the data set or population, f is three.

Sampling Variability the a Statistic

How much the statistic different from one sample to one more is recognized as the sampling variability of a statistic. You commonly measure the sampling variability the a statistic through its conventional error. The standard error the the mean is an instance of a traditional error. The is a special typical deviation and is well-known as the conventional deviation that the sampling circulation of the mean. You will cover the conventional error the the average when girlfriend learn around The central Limit theorem (not now). The notation because that the typical error of the typical is \displaystyle\frac\sigma\sqrtn wherein σ is the standard deviation the the populace and n is the size of the sample.


Note

In practice, usage a calculator or computer system software to calculate the traditional deviation. If you are using a TI-83, 83+, 84+ calculator, you need to choose the appropriate standard deviation σ_x or s_x native the summary statistics. We will concentrate top top using and interpreting the information that the standard deviation provides us. However you need to study the following step-by-step instance to aid you understand exactly how the standard deviation procedures variation indigenous the mean. (The calculator instructions appear at the finish of this example.)

Example

In a fifth grade class, the teacher to be interested in the typical age and the sample standard deviation that the periods of she students. The complying with data are the eras for a sample of n = 20 fifth grade students. The ages are rounded come the nearest half year:

\displaystyle 9; 9.5; 9.5; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5;

Solution:

\displaystyle\overlinex = 9+9.5(2)+10(4)+10.5(4)+11(6)+11.5(3)20=10.525The average period is 10.53 years, rounded to 2 places.

The variance may be calculation by using a table. Climate the standard deviation is calculated by taking the square source of the variance. We will explain the parts of the table after ~ calculating s.

DataFreq.DeviationsDeviations^2(Freq.)( Deviations^2)
xf( x\displaystyle\overlinex)( x\displaystyle\overlinex)2( f)(x\displaystyle\overlinex)2
919 – 10.525 = –1.525(–1.525)^2 = 2.3256251 × 2.325625 = 2.325625
9.529.5 – 10.525 = –1.025(–1.025)^2 = 1.0506252 × 1.050625 = 2.101250
10410 – 10.525 = –0.525(–0.525)^2 = 0.2756254 × 0.275625 = 1.1025
10.5410.5 – 10.525 = –0.025(–0.025)^2 = 0.0006254 × 0.000625 = 0.0025
11611 – 10.525 = 0.475(0.475)^2 = 0.2256256 × 0.225625 = 1.35375
11.5311.5 – 10.525 = 0.975(0.975)^2 = 0.9506253 × 0.950625 = 2.851875
The complete is 9.7375

The sample variance, \displaystyles^2, is equal to the amount of the last tower (9.7375) separated by the total variety of data worths minus one (20 – 1):s^2 =\frac9.737520-1 =0.5125

The sample traditional deviation s is equal to the square root of the sample variance: s = \sqrt0.5125 = 0.715891 which is rounded to two decimal places, s = 0.72.

Typically, you carry out the calculation for the traditional deviation on your calculator or computer. The intermediate outcomes are no rounded. This is done because that accuracy.

For the following problems, recall that value = average + (#ofSTDEVs)(standard deviation). Verify the mean and standard deviation or a calculator or computer. For a sample: x =\displaystyle\overlinex + (#ofSTDEVs)(s) because that a population: x = μ + (#ofSTDEVs)(σ) for this example, use x =\displaystyle\overlinex + (#ofSTDEVs)(s) due to the fact that the data is native a sampleVerify the mean and standard deviation on her calculator or computer.Find the worth that is one conventional deviation above the mean. Uncover (\displaystyle\overlinex+ 1s).Find the value that is two standard deviations listed below the mean. Find (\displaystyle\overlinex – 2s).Find the values that are 1.5 traditional deviations from (below and also above) the mean.
USING THE TI-83, 83+, 84, 84+ CALCULATORClear list L1 and L2. Press STAT 4:ClrList. Enter 2nd 1 for L1, the comma (,), and 2nd 2 for L2.Enter data into the perform editor. Push STAT 1:EDIT. If necessary, clean the perform by arrowing up right into the name. Push CLEAR and also arrow down.Put the data values (9, 9.5, 10, 10.5, 11, 11.5) right into list L1 and also the frequencies (1, 2, 4, 4, 6, 3) right into list L2. Use the arrow keys to relocate around.Press STAT and arrow come CALC. Press 1:1-VarStats and enter L1 (2nd 1), L2 (2nd 2). Execute not forget the comma. Push ENTER.\displaystyle\overlinex = 10.525Use Sx due to the fact that this is sample data (not a population): Sx=0.715891(\displaystyle\overlinex+ 1s) = 10.53 + (1)(0.72) = 11.25(\displaystyle\overlinex– 2s) = 10.53 – (2)(0.72) = 9.09(\displaystyle\overlinex– 1.5s) = 10.53 – (1.5)(0.72) = 9.45(\displaystyle\overlinex+ 1.5s) = 10.53 + (1.5)(0.72) = 11.61

Explanation that the standard deviation calculation presented in the table

The deviations present how spread out the data are around the mean. The data value 11.5 is farther from the average than is the data value 11 i beg your pardon is indicated by the deviations 0.97 and 0.47. A optimistic deviation occurs once the data value is greater than the mean, conversely, a negative deviation occurs as soon as the data worth is much less than the mean. The deviation is –1.525 because that the data worth nine. If you include the deviations, the sum is always zero. (For example 1, there space n = 20 deviations.) therefore you cannot simply include the deviations to gain the spread out of the data. Through squaring the deviations, you make them confident numbers, and the amount will also be positive. The variance, then, is the mean squared deviation.

The variance is a squared measure and does not have the same units together the data. Taking the square source solves the problem. The conventional deviation steps the spread out in the same units together the data.

Notice that rather of separating by n= 20, the calculation split by n – 1 = 20 – 1 = 19 because the data is a sample. For the sample variance, we division by the sample dimension minus one (n – 1). Why not divide through n? The answer has to do with the populace variance. The sample variance is an calculation of the population variance. based on the theoretical mathematics that lies behind these calculations, separating by (n – 1) provides a far better estimate the the populace variance.


Note

Your concentration should be ~ above what the typical deviation tells us around the data. The typical deviation is a number which steps how far the data are spread native the mean. Permit a calculator or computer do the arithmetic.


The typical deviation, s or σ, is one of two people zero or bigger than zero. When the typical deviation is zero, there is no spread; that is, the every the data values are equal to every other. The typical deviation is little when the data space all focused close come the mean, and also is bigger when the data worths show more variation native the mean. As soon as the typical deviation is a lot bigger than zero, the data worths are very spread out about the mean; outliers can make s or σ an extremely large.

The conventional deviation, when first presented, have the right to seem unclear. By graphing her data, friend can gain a better “feel” for the deviations and also the conventional deviation. Friend will uncover that in symmetry distributions, the conventional deviation have the right to be very helpful yet in skewed distributions, the traditional deviation may not be lot help. The reason is that the 2 sides the a skewed circulation have different spreads. In a it was crooked distribution, that is far better to look in ~ the very first quartile, the median, the 3rd quartile, the smallest value, and also the biggest value. Because numbers can be confusing, always graph her data. Screen your data in a histogram or a box plot.


Example

Use the following data (first exam scores) indigenous Susan Dean’s spring pre-calculus class:

\displaystyle 33; 42; 49; 49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94; 96; 100

Create a graph containing the data, frequencies, family member frequencies, and cumulative family member frequencies to three decimal places.Calculate the following to one decimal ar using a TI-83+ or TI-84 calculator:The sample meanThe sample standard deviationThe medianThe very first quartileThe 3rd quartileIQRConstruct a box plot and a histogram ~ above the same collection of axes. Make comments about the box plot, the histogram, and also the chart.
DataFrequencyRelative FrequencyCumulative family member Frequency
3310.0320.032
4210.0320.064
4920.0650.129
5310.0320.161
5520.0650.226
6110.0320.258
6310.0320.29
6710.0320.322
6820.0650.387
6920.0650.452
7210.0320.484
7310.0320.516
7410.0320.548
7810.0320.580
8010.0320.612
8310.0320.644
8830.0970.741
9010.0320.773
9210.0320.805
9440.1290.934
9610.0320.966
10010.0320.998 (Why isn’t this value 1?)
The sample typical = 73.5The sample typical deviation = 17.9The typical = 73The first quartile = 61The third quartile = 90IQR = 90 – 61 = 29

The long left whisker in package plot is reflected in the left side of the histogram. The spread out of the test scores in the reduced 50% is greater (73 – 33 = 40) than the spread in the upper 50% (100 – 73 = 27). The histogram, box plot, and also chart all reflect this. There are a substantial variety of A and B grades (80s, 90s, and 100). The histogram plainly shows this. The box plot shows us that the middle 50% the the exam scores (IQR = 29) are Ds, Cs, and Bs. The box plot additionally shows united state that the reduced 25% of the exam scores room Ds and Fs.


Try It

The complying with data display the different species of pets food stores in the area carry.

\displaystyle 6; 6; 6; 6; 7; 7; 7; 7; 7; 8; 9; 9; 9; 9; 10; 10; 10; 10; 10; 11; 11; 11; 11; 12; 12; 12; 12; 12; 12;Calculate the sample mean and the sample traditional deviation to one decimal ar using a TI-83+ or TI-84 calculator.


Standard Deviation of grouped Frequency Tables

Recall that for grouped data we execute not know individual data values, so we cannot describe the usual value the the data with precision. In other words, we cannot discover the exact mean, median, or mode. Us can, however, determine the ideal estimate that the actions of center by finding the mean of the group data with the formula:

Mean the Frequency Table =\displaystyle\frac\sum(fm)\sum(f)

where f = interval frequencies and m = interval midpoints.

Just as we could not uncover the precise mean, neither deserve to we uncover the precise standard deviation. Psychic that traditional deviation defines numerically the meant deviation a data value has actually from the mean. In straightforward English, the conventional deviation permits us to compare just how “unusual” separation, personal, instance data is compared to the mean.


Example

Find the traditional deviation because that the data in the table below.

ClassFrequency, fMidpoint, mm^2\displaystyle\overlinex^2fm^2Standard Deviation
0–21117.5813.5
3–564167.58963.5
6–8107497.584903.5
9–117101007.587003.5
12–140131697.5803.5
15–172162567.585123.5

For this data set, we have actually the mean, \displaystyle\overlinex = 7.58 and the typical deviation, \displaystyles_x = 3.5. This means that a randomly selected data worth would be expected to be 3.5 devices from the mean. If we look in ~ the an initial class, we view that the class midpoint is equal to one. This is practically two full standard deviations native the mean since 7.58 – 3.5 – 3.5 = 0.58. While the formula because that calculating the conventional deviation is no complicated, \displaystyles_x=\sqrt\fracf(m-\overlinex)^2n-1 whereby \displaystyles_x = sample traditional deviation, \displaystyle\overlinex = sample mean, the calculations are tedious. The is usually finest to use an innovation when performing the calculations.


Try It

Find the traditional deviation for the data indigenous the previous example

ClassFrequency, f
0–21
3–56
6–810
9–117
12–140
15–172

First, press the STAT key and select 1:Edit

Input the midpoint values right into L1 and the frequencies into L2





Comparing worths from various Data Sets

The standard deviation is useful when to compare data worths that come from different data sets. If the data sets have actually different method and traditional deviations, climate comparing the data values directly can it is in misleading.

For each data value, calculation how numerous standard deviations far from its median the worth is.Use the formula: value = average + (#ofSTDEVs)(standard deviation); settle for #ofSTDEVs.#ofSTDEVs = \fracvalue - meanstandard deviation Compare the outcomes of this calculation.

#ofSTDEVs is often called a ” z-score”; we have the right to use the symbol z. In symbols, the recipe become:

Samplex=\overlinex+zsz = \fracx - \overlinexs
Populationx = μ + zσ z = \fracx - μσ

Example

Two students, John and also Ali, from different high schools, want to uncover out who had actually the highest possible GPA when compared to his school. I beg your pardon student had actually the greatest GPA when contrasted to his school?

StudentGPASchool mean GPASchool standard Deviation
John2.853.00.7
Ali778010

For each student, recognize how countless standard deviations (#ofSTDEVs) his GPA is far from the average, because that his school. Pay cautious attention to signs when comparing and also interpreting the answer.

z = # the STDEVs = \fracvalue - meanstandard deviation = \fracx-μσ

For John, z = # ofSTDEVs = \displaystyle\frac2.85 - 3.000.7=-0.21

For Ali, z = # ofSTDEVs = \displaystyle\frac77- 8010=−0.3

John has actually the better GPA when compared to his school because his GPA is 0.21 standard deviations below his school’s average while Ali’s GPA is 0.3 typical deviations below his school’s mean.

John’s z-score the –0.21 is higher than Ali’s z-score of –0.3. Because that GPA, higher values room better, so we conclude the John has the much better GPA when contrasted to his school.


The following lists offer a couple of facts that carry out a little much more insight right into what the traditional deviation speak us around the distribution of the data.

For any data set, no matter what the distribution of the data is:

At the very least 75% that the data is in ~ two traditional deviations of the mean.At the very least 89% of the data is within 3 standard deviations that the mean.At least 95% of the data is within 4.5 typical deviations of the mean.This is known as Chebyshev’s Rule.

For data having actually a distribution that is BELL-SHAPED and also SYMMETRIC:

Approximately 68% the the data is within one typical deviation that the mean.Approximately 95% of the data is in ~ two standard deviations the the mean.More 보다 99% the the data is within 3 standard deviations the the mean.This is known as the Empirical Rule.It is crucial to keep in mind that this rule only uses when the form of the circulation of the data is bell-shaped and symmetric. We will certainly learn more about this once studying the “Normal” or “Gaussian” probability distribution in later chapters.

Concept Review

The conventional deviation can aid you calculate the spread out of data. Over there are various equations to usage if are calculating the standard deviation of a sample or of a population.

The conventional Deviation enables us to compare individual data or classes to the data collection mean numerically.\displaystyles_x=\sqrt\fracf(m-\overlinex)^2n-1 is the formula because that calculating the standard deviation that a sample.To calculation the traditional deviation the a population, we would use the populace mean, μ, and also the formula \displaystyle\sigma=\sqrt\fracf(x-\mu)^2N

Formula Review

\displaystyles_x=\sqrt\frac\sumfm^2n - x^2

where \displaystyles_x = sample conventional deviation, \displaystyle\overlinex = sample mean

References

Data from Microsoft Bookshelf.

See more: How To Breed All Gemstone Dragons In Dragonvale ? : Dragonvale

King, Bill.”Graphically Speaking.” Institutional Research, Lake Tahoe community College. Accessible online in ~ http://www.ltcc.edu/web/about/institutional-research (accessed April 3, 2013).