DESCRIPTIVE STATISTICS
A. Overview of Descriptive Statistics
B. Step-By-Step Instructions for Generating Descriptive Statistics In Excel
C. Interpreting Your Excel Output
A. Overview of Descriptive Statistics
One of the first steps in any statistical presentation or report is to describe the data set you are using in your analysis. The description of your data set should summarize your data set such that your reader or audience can easily see the important statistics of your data set. Descriptive statistics is simply the summary of the important statistics in your data set. Descriptive statistics can be provided using two different methods. The first method summarizes your data set using a graphical technique such as a histogram, line chart, pie chart, stem-and-leaf displays, or other graphical techniques. The second method provides a pure numerical summary of the relevant statistics in the data set. The step-by-step instructions for generating a numerical summary (the second method) in Excel are provided below.
B. Step-By-Step Instructions for Generating Descriptive Statistics In Excel
Suppose you are the manager of Speedy Oil Change which claims that it will change the oil in customers’ cars in less than 30 minutes. Further suppose that several complaints have been filed from customers stating that their oil change took longer than 30 minutes and upper-level management at Speedy Oil Change headquarters has requested that you investigate the complaints. To begin your investigation, you monitor 36 oil changes performed Speedy Oil Change and record the time each customer waited for the oil change to be completed. The number of minutes to complete each of the 36 oil changes is reported below.

Before you begin your analysis, you want to summarize this data set such that you can obtain a better understanding of the data and so that you can later present this information to upper-level management. Follow the steps described below to obtain a numerical summary of the above data set.
1. First input the data into a single column in an Excel spreadsheet. You can do this by typing in the data or by cutting and pasting.

2. From the Tools pull-down menu, select Data Analysis, and then select Descriptive Statistics.

3. Click OK in the Data Analysis window and the Descriptive Statistics window opens.

4. In the Input Range field, type A1:A36, or click the worksheet icon to the right of the Input Range field and click and drag the cursor over the data in column A. Next, select Summary Statistics checkbox. The Descriptive Statistics window should appear as follows.

5. Click OK and a new worksheet opens with the results.

The next section “Interpreting Your Excel Output” provides an explanation of the summary statistics shown above.
C. Interpreting Your Excel Output
Excel provides 13 summary statistics. The following provides a brief discussion of each.
Median: is the midpoint of the values after they have been ordered from the smallest to the largest, or the largest to the smallest. In other words, if the oil change data was ordered from the smallest value to the largest value (10, 11, 13, 14, 15, 17, 17 …), the number in the middle of the data is the median. In this example, the median is 24.5. Notice that 24.5 is not an actual value in the data set. When the number of observations in a data set is even, the two middle observations are found and then averaged. In this case, the two middle values are 24 and 25, and their average or mean is 24.5.
Mode: is the value of the observation that appears most frequently. In this example, the mode is 22 because a waiting period of 22 minutes occurs three times in the data set; more than any other waiting period.
(Sample) Standard Deviation: is the square root of the sample variance. The sample variance is mean of the squared deviations from the mean of the data. Standard deviation is a measure of dispersion which gives us an idea of how “spread out” the data is. The standard deviation of a data set can be compared to the standard deviation of another data set. The smaller the standard deviation, the less the data is spread out and the larger the data set, the more the data is spread out. In this example, the standard deviation is 8.409131.
Sample Variance: is mean of the squared deviations from the mean of the data and it is the square of the sample standard deviation. In this example, the sample variance is 70.71349.
Kurtosis: is a measure of how peaked or flat the data is relative to a normal distribution. Data sets with high (positive) kurtosis tend to have a sharp peak near the mean of the data and data sets with low (negative) kurtosis tend to have a flat top near the mean rather than a sharp peak. In this example, the kurtosis is -0.58947.
Skewness: describes the symmetry of the data by examining the relationship between the mean and the median of the data. If the mean is less than the median, the data is negatively skewed meaning that there is a peak in the data at a greater value than the mean. If the data is negatively skewed, the skewness statistic will be negative. If the mean is greater than the median, the data is positively skewed meaning that there is a peak in the data at a smaller value than the mean. If the data is positively skewed, the skewness statistic will be positive. A skewness value of zero indicates that the mean and the median are the same value and the data is perfectly symmetric. Notice in this example that the mean (24.97222) is slightly larger than the median (24.5) which yields a positive skewness value of 0.245788, indicating that the data is slightly positively skewed.
Range: is difference between the largest value and the smallest value. It is the simplest measure of dispersion. In this example, the largest value is 42 and the smallest value is 10 such that the range of the data is 32.
Minimum: is the smallest observation in the data set. In this case the minimum is 10. In other words, the shortest amount of time a customer had to wait for an oil change was 10 minutes.
Maximum: is the largest observation in the data set. In this case the maximum is 42. In other words, the longest amount of time a customer had to wait for an oil change was 42 minutes.
Sum: is the sum of all the observations in the data set (42 + 41 + 17 + 13 + …). In this example, the total or sum of all the time spent waiting for oil changes is 899 minutes.
Count: is the total number of observations in the data set. In this example, the count is 36.
|
|
||
| Elon University Campus Box 2700 Elon, NC 27244 (800) 334-8448 E-mail: web@elon.edu |
Last Modified:
08/20/03 |
|