Presentation: Using numerical measures to describe data. Measures of the center. Week 3 (2)


The presentation contains 43 slides.


Slide 1

BBA182 Applied Statistics
Week 3 (2) Using numerical measures to describe data
Dr Susanne Hansen Saral
Email: susanne.saral@okan.edu.tr
https://piazza.com/class/ixrj5mmox1u2t8?cid=4#
www.khanacademy.org

Slide 2

Using numerical measures to describe data
"Is the data in the sample centered or located around a specific value?"
This is the first question that business people, economists, corporate executives, etc. ask when presented with sample data.

Slide 3

Using numerical measures to describe data
The histogram gives an idea of whether the data is centered around a specific value.
It also provides a visual picture of how the data is distributed (symmetric, skewed, etc.).

Slide 4

Is the data centered around a specific value?

Slide 5

Numerical measures to describe data

Slide 6

Measures of the center of the data set

Slide 7

Mean
Population mean, μ
The mean is the most common measure of the center of a data set.
For a population of N values: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{x_1 + x_2 + \dots + x_N}{N}$

Slide 8

Mean
Sample mean, x̄
For a sample of n values: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \dots + x_n}{n}$
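
The population mean and the sample mean use the same arithmetic: add up the values and divide by how many there are. A minimal Python sketch follows; the function name `mean` and the four-value sample are illustrative assumptions, not taken from the slides.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical sample of n = 4 observations (not from the slides).
sample = [2, 4, 6, 8]
sample_mean = mean(sample)
print(sample_mean)  # (2 + 4 + 6 + 8) / 4 = 5.0
```

Whether the result is called μ or x̄ depends only on whether the values are the whole population or a sample drawn from it.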

Slide 9

The Mean: symmetric and unimodal distribution
When a distribution is symmetric and has a single mode (unimodal), the mean represents the middle value of the data set.

Slide 10

Mean
The most common measure of the center of a data set
Affected by extreme values (outliers)

Slide 11

Mean
The most common measure of the center of a data set
Affected by extreme values (outliers)

Slide 12

Skewed distribution
An outlier distorts the picture of the data.
It inflates or deflates the mean, depending on the value of the outlier.
This creates a skewed distribution.
In this case we may want to use a different measure of the center of the data.
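
To see how a single outlier pulls the mean, the sketch below compares the mean and the median of a small data set before and after one extreme value is added. The numbers are hypothetical and chosen only for illustration; the functions come from Python's standard `statistics` module.

```python
import statistics

# Hypothetical data (not from the slides): five similar values.
base = [10, 11, 12, 12, 13]
# The same data with one extreme value added.
with_outlier = base + [100]

mean_base = statistics.mean(base)
mean_outlier = statistics.mean(with_outlier)
median_base = statistics.median(base)
median_outlier = statistics.median(with_outlier)

print("mean:  ", mean_base, "->", round(mean_outlier, 2))
print("median:", median_base, "->", median_outlier)
```

In this example the mean more than doubles while the median stays at 12, which is why a different measure of the center can be preferable for skewed data.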

Slide 13

Median
In an ordered list of data, the median is the "middle" number (50% of the values above, 50% below).
Not affected by outliers

Slide 14

Finding the Median
The location of the median:
If the number of values is odd, the median is the middle number of the ordered data.
For this data set: -17  6  25  -5  13  9  33
Ordered: -17  -5  6  9  13  25  33
The median is 9, the fourth of the seven ordered values.

Slide 15

Finding the Median
The location of the median:
If the number of values is even, the median is the average of the two middle numbers (their sum divided by 2).

Slide 16

Finding the median
Determine the median of the following data set:
17  5  3  11  12  8  25  3

Slide 17

Finding the median
Determine the median of the following data set:
17  5  3  11  12  8  25  3
Ordered: 3  3  5  8  11  12  17  25
Median: (8 + 11) / 2 = 19 / 2 = 9.5
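
The odd and even cases can be written as one short routine: sort the data, then take the middle value, or the average of the two middle values. The sketch below is one way to do it in Python; the function name `median` is an assumption for illustration, and the two data sets are the ones used on the slides.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: the single middle value.
        return ordered[mid]
    # Even number of values: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_example = median([-17, 6, 25, -5, 13, 9, 33])
even_example = median([17, 5, 3, 11, 12, 8, 25, 3])
print(odd_example)   # 9
print(even_example)  # 9.5
```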

Slide 18

Mode
Value that occurs most often in the data set
Not affected by outliers
Used for either numerical or categorical data
There may be no mode
There may be one mode (unimodal), two modes (bimodal) or several modes (multimodal)
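
A small Python sketch of finding the mode or modes by counting occurrences. The function name `modes` is hypothetical, and treating a data set in which no value repeats as having no mode is a convention assumed here; the categorical example is made up for illustration.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # Assumed convention: if no value repeats, the data set has no mode.
        return []
    return [value for value, count in counts.items() if count == top]


numeric_modes = modes([17, 5, 3, 11, 12, 8, 25, 3])   # one mode: unimodal
category_modes = modes(["A", "B", "A", "B", "C"])      # two modes: bimodal
no_modes = modes([1, 2, 3])                            # no repeated value
print(numeric_modes, category_modes, no_modes)
```

The same function works for numerical and categorical data because it only counts how often each value occurs.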

Slide 19

Measures of the center: summary data
Five houses on a hill by the beach

Slide 20

Measures of the center: summary data
What is the mean house price?
What is the median house price?
What is the modal house price?

Slide 21

Mean: $3,000,000 / 5 = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent house price = $100,000

Slide 22

When is which measure of the center the "best"?
The mean is generally used, unless outliers exist. If there are outliers, the mean does not represent the center well.
The median is used instead when outliers exist in the data set.
Example: Median home prices may be reported for a region, because the median is less sensitive to outliers.

Slide 23

Shape of a Distribution
The shape of a distribution describes how the data is distributed.
The presence or absence of outliers in a data set influences the shape of a distribution.
A distribution is either symmetric or skewed.

Slide 24

Histogram of annual salaries (in $) for a sample of U.S. marketing managers:
Describe the shape of this histogram (of the distribution).
Without doing calculations: do you expect the mean salary to be higher or lower than the median salary?

Slide 25

Class exercise
Eleven economists were asked to predict the percentage growth in the Consumer Price Index over the next year.
Their forecasts were as follows:
3.6  3.1  3.9  3.7  3.5  1.0  3.7  3.4  3.0  3.7  3.4
Compute the mean, the median and the mode.
Are there any outliers in the data set that may influence the value of the mean?
If there are outliers, how do they affect the shape of the data distribution?

Slide 26

Solution to class exercise
Mean: 36/11 = 3.27, or 3.3 rounded to one decimal place
Median: 3.5
Mode: 3.7
Outlier: 1.0
How does the outlier affect the shape of the distribution?
It decreases the mean of the data set and distorts the picture of the histogram.
The shape is skewed to the left.
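
The solution can be checked with Python's standard `statistics` module. This is only a verification sketch using the eleven forecasts from the exercise; the variable names are illustrative.

```python
import statistics

# The eleven CPI growth forecasts from the class exercise.
forecasts = [3.6, 3.1, 3.9, 3.7, 3.5, 1.0, 3.7, 3.4, 3.0, 3.7, 3.4]

mean_forecast = statistics.mean(forecasts)
median_forecast = statistics.median(forecasts)
mode_forecast = statistics.mode(forecasts)

# Mean of the ten forecasts that remain when the outlier 1.0 is left out.
mean_without_outlier = statistics.mean([x for x in forecasts if x != 1.0])

print(round(mean_forecast, 2), median_forecast, mode_forecast)
print(round(mean_without_outlier, 2))
```

The mean lies below the median, which is consistent with a distribution skewed to the left; leaving out the outlier moves the mean up to the median.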

Slide 27

Measures of variability
The three measures of the center of the data do not provide a complete and sufficient description of the data.
Besides knowing how the data is located around a specific value (mean, median or mode), we need information on how far the data is spread from that value, most often from the mean.
Measures of variability provide this information.

Slide 28

Measures of Variability

Slide 29

Quartiles
Quartiles are descriptive measures that separate a large data set into four quarters.
The first quartile (Q1) separates approximately the smallest 25% of the data from the remaining largest 75% of the data.
The second quartile (Q2) is the median, which separates the data set into two equal halves.
The third quartile (Q3) separates approximately the smallest 75% of the data from the remaining largest 25% of the data.

Slide 30

Quartiles

Slide 31

How to calculate quartiles manually

Slide 32

Quartiles

Slide 33

Quartiles

Slide 34

Quartiles and Enron case
In the Enron data we had 60 data points.
There are 30 values to the left and 30 values to the right of the median (Q2):
Q1 = -$1.68 (between the 15th and 16th data points): 75% of the data is larger than -$1.68
Q2 = -$0.19, the median (between the 30th and 31st data points): 50% of the data is smaller than -$0.19 and 50% of the data is larger than -$0.19
Q3 = $2.14 (between the 45th and 46th data points): 25% of the data is larger than $2.14

Slide 35

Range
Simplest measure of variation
Difference between the largest and the smallest observations: Range = largest value - smallest value

Slide 36

Range: Example Enron case
Range = maximum value - minimum value
Enron data range = $21.06 - (-$17.75) = $38.81
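
The range calculation is short enough to sketch directly in Python. The function name `data_range` is an assumption for illustration. Only the two extreme Enron values quoted on the slide are used, since the full data set is not reproduced in this transcript and the range depends on nothing else.

```python
def data_range(values):
    """Range: largest observation minus smallest observation."""
    values = list(values)
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


# Minimum and maximum of the Enron data as quoted on the slide (in $).
enron_extremes = [-17.75, 21.06]
enron_range = data_range(enron_extremes)
print(round(enron_range, 2))  # 38.81
```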

Slide 37

Disadvantages of the Range
Ignores the way in which data is distributed

Slide 38

Disadvantages of the Range
Sensitive to outliers

Slide 39

Range: shortcomings as a measure of variability
Because the range does not provide much information about the spread of the data, it is not a very good measure of variability.

Slide 40

Interquartile Range
We can eliminate some outlier problems by using the interquartile range, which concentrates on the middle 50% of the data in the data set.
Eliminate the high- and low-valued observations and calculate the range of the middle 50% of the data.
The interquartile range spans the data from Q1 to Q3.
The interquartile range, IQR = Q3 - Q1

Slide 41

Interquartile Range
The interquartile range (IQR) measures the spread of the middle 50% of the data set.
It is defined as the difference between the observation at the third quartile and the observation at the first quartile:
IQR = Q3 - Q1

Slide 42

Interquartile Range
Raw data: 6  8  10  12  14  9  11  7  13  11   (n = 10)
Ranked data: 6  7  8  9  10  11  11  12  13  14
1st quartile: Q1 = 7.75
3rd quartile: Q3 = 12.25
IQR = Q3 - Q1 = 12.25 - 7.75 = 4.5
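
The quartiles 7.75 and 12.25 are consistent with placing Q1 at position 0.25(n + 1) and Q3 at position 0.75(n + 1) in the ranked data and interpolating between neighbouring values. That rule is inferred from the numbers, not stated in this transcript. Python's `statistics.quantiles` with `method="exclusive"` follows the same convention, so it can be used as a check. Other software may use a different quartile convention and return slightly different values.

```python
import statistics

raw = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]
ranked = sorted(raw)

# n=4 asks for the three cut points that split the data into quarters.
# method="exclusive" places them at positions p * (n + 1) in the ranked data.
q1, q2, q3 = statistics.quantiles(ranked, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q3, iqr)  # 7.75 12.25 4.5
```

For example, Q1 sits at position 0.25 × 11 = 2.75, three quarters of the way from the 2nd ranked value (7) to the 3rd (8), which gives 7.75.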

Slide 43

Enron data: Interquartile range
IQR = $2.14 - (-$1.68) = $3.82
The middle 50% of the Enron data has a spread of $3.82, compared to the range of $38.81!


