Presentation: Statistics. Data Description. Data Summarization. Numerical Measures of the Data


The presentation contains 69 slides.

Slides and text of this presentation


Slide 1

Chapter Three: 
Data Description

  Data Summarization

Numerical Measures of the Data

Slide 2

Chapter Three: Numerical Measures of the Data
Outline
Introduction
3-1  Measures of Central Tendency
3-2  Measures of Variation
3-3  Measures of Position
3-4  Exploratory Data Analysis

Slide 3

Objectives
Summarize data using measures of central tendency, such as the mean, median, mode, and midrange.
Describe data using measures of variation, such as the range, variance, and standard deviation.
Identify the position of a data value in a data set using measures of position, such as percentiles and quartiles.
Use the techniques of exploratory data analysis, including stem and leaf plots, box plots, and five-number summaries, to discover various aspects of data.

Slide 4

3-1 Measures of Central Tendency
We will compute two means: one for a sample and one for a finite population of values.
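The two means can be written out as follows. This is a sketch in standard textbook notation, which is an assumption here because the slide's own formulas were not recovered: the x_i are the data values, n is the sample size, and N is the population size.

```latex
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{(sample mean)}
\qquad
\mu = \frac{\sum_{i=1}^{N} x_i}{N} \quad \text{(population mean)}
```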

Slide 5

Example (Sample Mean)
The ages of a random sample of seven students at a certain school are 11, 10, 12, 13, 7, 9, 15.
Find the average (mean) age of this sample.
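A minimal Python sketch of this calculation, assuming the sample mean is simply the sum of the values divided by their count (the variable names are illustrative):

```python
# Ages of the seven sampled students, taken from the example above.
ages = [11, 10, 12, 13, 7, 9, 15]

# Sample mean: sum of the values divided by the number of values.
sample_mean = sum(ages) / len(ages)

print(sample_mean)  # 11.0
```

The ages add up to 77, so the mean age of the sample is 77/7 = 11 years.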

Slide 6

Slide 7

The Sample Mean for an Ungrouped Frequency Distribution
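For reference, the usual textbook form of this mean is sketched below. The notation is an assumption, since the slide's formula was not recovered: x is a distinct data value, f is its frequency, and n is the total number of observations.

```latex
\bar{x} = \frac{\sum f \cdot x}{n}, \qquad n = \sum f
```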

Slide 8

The Sample Mean for an Ungrouped Frequency Distribution: Example

Slide 9

The Sample Mean for a Grouped Frequency Distribution
The mean for a grouped frequency distribution is given by:
\[ \bar{x} = \frac{\sum f \cdot X_m}{n} \]
Here \( X_m \) is the corresponding class midpoint, \( f \) is the class frequency, and \( n = \sum f \).
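The grouped-mean rule (sum of frequency times class midpoint, divided by the total frequency) can be sketched in Python. The class limits and frequencies below are hypothetical and serve only to illustrate the arithmetic; `grouped_mean` is an illustrative helper name.

```python
# Hypothetical grouped frequency table: (lower limit, upper limit, frequency).
# These classes and counts are made up for illustration only.
classes = [(10, 19, 3), (20, 29, 7), (30, 39, 6), (40, 49, 4)]


def grouped_mean(table):
    """Approximate the mean as sum(f * midpoint) / sum(f)."""
    total_f = sum(f for _, _, f in table)
    weighted_sum = sum(f * (lo + hi) / 2 for lo, hi, f in table)
    return weighted_sum / total_f


print(grouped_mean(classes))  # 30.0
```

Because every observation in a class is represented by the class midpoint, the result is an approximation of the mean of the raw data.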

Slide 10

Important remark:
In some situations the mean may not be representative of the data.
As an example, the annual salaries of five vice presidents at AVX, LLC are $90,000, $92,000, $94,000, $98,000, and $350,000. The mean is:
\[ \bar{x} = \frac{90{,}000 + 92{,}000 + 94{,}000 + 98{,}000 + 350{,}000}{5} = \frac{724{,}000}{5} = 144{,}800 \]
Notice how the one extreme value ($350,000) pulled the mean upward. Four of the five vice presidents earned less than the mean, raising the question of whether the arithmetic mean of $144,800 is typical of the salaries of the five vice presidents.
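A short Python check of this remark, using only the five salaries quoted above. The median is included for comparison as one possible alternative summary; that comparison is an addition for illustration.

```python
from statistics import median

salaries = [90_000, 92_000, 94_000, 98_000, 350_000]

avg = sum(salaries) / len(salaries)                # pulled upward by 350,000
mid = median(salaries)                             # middle of the ordered salaries
below_mean = sum(1 for s in salaries if s < avg)   # salaries under the mean

print(avg, mid, below_mean)  # 144800.0 94000 4
```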

Slide 11

Properties of the mean
As stated, the mean is a widely used measure of central tendency. It has several important properties:
1. Every set of interval-level and ratio-level data has a mean.
2. All the data values are included in the calculation.
3. A set of data has only one mean; that is, the mean is unique.
4. The mean is a useful measure for comparing two or more populations.
5. The sum of the deviations of each value from the mean is always zero, that is, \( \sum (x_i - \bar{x}) = 0 \).
6. The mean is highly affected by extreme data values.
Note (illustrating the fifth property):
Consider the set of values 3, 8, and 4. The mean is 5, and the deviations sum to zero: (3 - 5) + (8 - 5) + (4 - 5) = 0.
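The fifth property can be checked numerically for the three values in the note. This is a minimal sketch; with other data, floating-point rounding may leave a tiny nonzero remainder rather than an exact zero.

```python
values = [3, 8, 4]

mean = sum(values) / len(values)           # 5.0
deviations = [x - mean for x in values]    # [-2.0, 3.0, -1.0]

print(sum(deviations))  # 0.0
```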

Slide 12

Median: the median splits the ordered data into two halves.
the symbol used to denote the median is

Slide 13

When there is an even number of values in the data set, the median is obtained by taking the average of the two middle numbers.
Example:
Six customers purchased the following numbers of magazines: 1, 7, 3, 2, 3, 4. Find the median.
Arrange the data in order and compute the middle point.
Data array: 1, 2, 3, 3, 4, 7.
The median = (3 + 3)/2 = 3.
Example: Find the median grade of the following sample:
62, 68, 71, 74, 77, 82, 84, 88, 90, 94
62, 68, 71, 74, 77 | 82, 84, 88, 90, 94
(5 on the left | 5 on the right)
The median = (77 + 82)/2 = 79.5.
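Both examples can be reproduced with Python's standard `statistics.median`, which sorts the data and averages the two middle values when the count is even. A minimal sketch:

```python
from statistics import median

magazines = [1, 7, 3, 2, 3, 4]
grades = [62, 68, 71, 74, 77, 82, 84, 88, 90, 94]

# median() sorts the data itself and averages the two middle values
# when the number of values is even.
print(median(magazines))  # 3.0
print(median(grades))     # 79.5
```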

Slide 14

Example
Find the median grade of the following sample of student grades:
A B A D F D F A B C C C F D A F D A A B B F D A B F C
Data array:
F F F F F F D D D D D C C C C B B B B B A A A A A A A
The median grade is C (the 14th of the 27 ordered values).
At least half of the students had a grade of C or lower.
At least half of the students had a grade of C or higher.
The median can be determined for ordinal-level data.
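For ordinal data the median needs only an ordering, not arithmetic. The sketch below ranks the letter grades from F (lowest) to A (highest), which is the ordering used in the data array above, and picks the middle value; the variable names are illustrative.

```python
# Letter grades from the example, ordered from lowest (F) to highest (A).
order = ["F", "D", "C", "B", "A"]
grades = "A B A D F D F A B C C C F D A F D A A B B F D A B F C".split()

# Sort by rank rather than alphabetically, then take the middle value.
ranked = sorted(grades, key=order.index)
middle = len(ranked) // 2          # index 13 of 27 values, i.e. the 14th value
median_grade = ranked[middle]

print(len(ranked), median_grade)   # 27 C
```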

Slide 15

Properties of the Median

The major properties of the median are:
The median is a unique value; that is, like the mean, there is only one median for a set of data.
It is not influenced by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
It can be computed for ratio-level, interval-level, and ordinal-level data.
At least fifty percent of the observations are greater than or equal to the median, and at least fifty percent are less than or equal to it.

Slide 16

Mode: the value that occurs most frequently (denoted by M).
Example: The following data represent the duration (in days) of U.S. space shuttle voyages for the years 1992-94. Find the mode.
Data set: 8, 9, 9, 14, 8, 8, 10, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11.
Ordered set: 6, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11, 11, 14, 14, 14. Mode = 8 days.
Example: Six strains of bacteria were tested to see how long they could remain alive outside their normal environment. The times, in minutes, are given below. Find the mode.
Data set: 2, 3, 5, 7, 8, 10.
There is no mode, since each data value occurs only once.

Slide 17

Example: Eleven different automobiles were tested at a speed of 15 mph for stopping distances. The distances, in feet, are given below. Find the mode.
Data set: 15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26.
There are two modes (the data set is bimodal): 18 and 24.
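The three mode examples (one mode, no mode, two modes) can be reproduced with a small helper. `modes` is an illustrative function, not a library call; it follows the convention used in these examples that a data set in which every value occurs once has no mode.

```python
from collections import Counter


def modes(data):
    """Return the most frequent values, or [] if every value occurs once."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []  # convention used in the examples: no mode
    return sorted(v for v, c in counts.items() if c == top)


shuttle = [8, 9, 9, 14, 8, 8, 10, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11]
bacteria = [2, 3, 5, 7, 8, 10]
stopping = [15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26]

print(modes(shuttle))   # [8]
print(modes(bacteria))  # []
print(modes(stopping))  # [18, 24]
```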

Slide 18

The Mode for a Grouped Frequency Distribution
The mode can be approximated by the midpoint of the modal class.
Example

Slide 19

Properties of the Mode

The mode can be found for all levels of data (nominal, ordinal, interval, and ratio).
The mode is not affected by extremely high or low values.
A set of data can have more than one mode. If it has two modes, it is said to be bimodal.
A disadvantage is that a set of data may not have a mode because no value appears more than once.

Slide 20

The weighted mean is used when the values in a data set are not all equally represented.
The weighted mean of a variable X is found by multiplying each value by its corresponding weight and dividing the sum of the products by the sum of the weights.
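Written as a formula, the verbal definition above reads as follows. The symbols are assumptions chosen for this sketch: the x_i are the values and the w_i are their weights.

```latex
\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}
          = \frac{\sum w_i x_i}{\sum w_i}
```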

Slide 21

Example: During a one-hour period on a hot Saturday afternoon a boy served fifty drinks. He sold five drinks for $0.50, fifteen for $0.75, fifteen for $0.90, and fifteen for $1.10. Compute the weighted mean of the price of the drinks.
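A minimal Python sketch of this example, treating the number of drinks sold at each price as the weight (the variable names are illustrative):

```python
prices = [0.50, 0.75, 0.90, 1.10]   # price per drink, in dollars
counts = [5, 15, 15, 15]            # drinks sold at each price (the weights)

# Weighted mean: sum of price * count divided by the total count.
total_revenue = sum(p * c for p, c in zip(prices, counts))
weighted_mean = total_revenue / sum(counts)

print(round(weighted_mean, 3))  # 0.875
```

The fifty drinks brought in $43.75 in total, so the weighted mean price is $43.75 / 50 = $0.875 per drink.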

Slide 22

Best measure of central tendency

Slide 23

Relationship between the mean, median, and mode and the shape of the distribution
Symmetric: the mean = the median = the mode (for a unimodal distribution)
Skewed left: the mean will usually be smaller than the median
Skewed right: the mean will usually be larger than the median

Slide 24

3-2 Measures of Dispersion (Variation)
Measures of dispersion describe the spread or variability in the data.
Learning objectives
The range of a variable
The variance of a variable
The standard deviation of a variable
Use the Empirical Rule
Comparing two sets of data
The measures of central tendency (mean, median, mode) capture differences between the “average” or “typical” values of two sets of data.
The measures of dispersion in this section capture differences in how far “spread out” the data values are.

Slide 25

Variability provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together.
It tells how meaningful the measures of central tendency are.
It helps to see which scores are outliers (extreme scores).
Why do we study dispersion?
A direct comparison of two sets of data based only on measures of central tendency, such as the mean and the median, can be misleading, since an average does not tell us anything about the spread of the data.
See Example 3-15 on page 128 of your textbook.
Comparison of two outdoor paints: 6 gallons of each brand have been tested, and the data show how long (in months) each brand will last before fading.
Brand A: 10  60  50  30  40  20
Brand B: 35  45  30  35  40  25
Calculate the mean for each brand:
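The comparison can be sketched in Python. The range (largest minus smallest value) is computed alongside the mean here as an illustration of spread; `mean_and_range` is an illustrative helper name.

```python
def mean_and_range(data):
    """Return (mean, range) for a list of numbers."""
    return sum(data) / len(data), max(data) - min(data)


brand_a = [10, 60, 50, 30, 40, 20]   # months before fading
brand_b = [35, 45, 30, 35, 40, 25]

print(mean_and_range(brand_a))  # (35.0, 50)
print(mean_and_range(brand_b))  # (35.0, 20)
```

Both brands last 35 months on average, but Brand A's values are far more spread out than Brand B's, which the mean alone does not reveal.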

Slide 26

Measures of dispersion are:
The range,
The interquartile range,
The variance and standard deviation,
The coefficient of variation.
The range (R) of a variable is the difference between the largest data value and the smallest data value:
R = highest value - lowest value.
Properties of the range
Only two values are used in the calculation.
It is influenced by extreme values.
It is easy to compute and understand.

Slide 27

Example
Compute the range of 6, 1, 2, 6, 11, 7, 3, 3.
The largest value is 11.
The smallest value is 1.
Subtracting the two gives 11 - 1 = 10, so the range is 10.
The relative measure of the range is called the coefficient of range.
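A short Python sketch of the range calculation follows. The slide names the coefficient of range without giving its formula, so the definition used below, (largest - smallest) / (largest + smallest), is an assumption based on the common textbook form.

```python
data = [6, 1, 2, 6, 11, 7, 3, 3]

largest, smallest = max(data), min(data)
data_range = largest - smallest

# Assumed definition (not given in the text):
# coefficient of range = (largest - smallest) / (largest + smallest)
coefficient_of_range = data_range / (largest + smallest)

print(data_range)                      # 10
print(round(coefficient_of_range, 3))  # 0.833
```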

Slide 28

The variance of a variable
The variance is based on the deviations from the mean:
\( (x_i - \mu) \) for populations
\( (x_i - \bar{x}) \) for samples
So that positive and negative deviations do not cancel each other, we square the deviations:
\( (x_i - \mu)^2 \) for populations
\( (x_i - \bar{x})^2 \) for samples

Slide 29

The population variance of a variable is the sum of the squared deviations of the data values from the mean, divided by the number of values in the population:
\[ \sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \]
where \( \mu \) is the population mean and \( N \) is the population size.
The population variance is represented by \( \sigma^2 \).
The population standard deviation \( \sigma \) is the square root of the variance, i.e. the square root of the arithmetic mean of the squared deviations from the arithmetic mean of the given distribution.
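As an illustration, the sketch below applies the verbal definition (mean of the squared deviations, then its square root) to the Brand A paint durations from the earlier paint comparison. Treating those six values as a complete population is an assumption made for this example.

```python
from math import sqrt

# Brand A paint durations (months), treated here as a whole population.
data = [10, 60, 50, 30, 40, 20]

N = len(data)
mu = sum(data) / N                                # population mean: 35.0
pop_var = sum((x - mu) ** 2 for x in data) / N    # 1750 / 6
pop_sd = sqrt(pop_var)

print(round(pop_var, 1), round(pop_sd, 1))  # 291.7 17.1
```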

Slide 30

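Slide 31

The sample variance of a variable is the sum of the squared deviations of the data values from the mean, divided by one less than the number of values in the sample:
\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \]
The sample variance is represented by \( s^2 \).
Sample standard deviation (s):
\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} \]
or
\[ s = \sqrt{\frac{n \sum x_i^2 - \left( \sum x_i \right)^2}{n (n - 1)}} \]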

Slide 32



Slide 33

Sample Variance for Grouped and Ungrouped Data
For grouped data, use the class midpoints as the observed values in the different classes.
For ungrouped data, use the same formula with the class midpoints, \( X_m \), replaced by the actual observed values \( X \).
Example:
Find the variance and standard deviation (SD) for the following data set:
2, 3, 4, 5, 2, 2, 2, 3, 2, 4, 3, 2, 5, 2, 3, 3, 4, 2, 5, 4, 4, 3, 3, 2, 5, 2
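One way to work this example is sketched below: build the ungrouped frequency table, then apply a computational form of the sample variance, [n Σf·x² - (Σf·x)²] / [n(n - 1)]. That form is algebraically equivalent to dividing the sum of squared deviations by n - 1; choosing it here is an assumption about the intended method.

```python
from collections import Counter
from math import sqrt

data = [2, 3, 4, 5, 2, 2, 2, 3, 2, 4, 3, 2, 5,
        2, 3, 3, 4, 2, 5, 4, 4, 3, 3, 2, 5, 2]

# Ungrouped frequency table: value -> number of occurrences.
freq = Counter(data)
n = sum(freq.values())
sum_fx = sum(f * x for x, f in freq.items())        # sum of f * x
sum_fx2 = sum(f * x * x for x, f in freq.items())   # sum of f * x^2

sample_var = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
sample_sd = sqrt(sample_var)

print(sorted(freq.items()))                       # [(2, 10), (3, 7), (4, 5), (5, 4)]
print(round(sample_var, 2), round(sample_sd, 2))  # 1.23 1.11
```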

Slide 34

Step one: put the data in an ungrouped frequency table.

Slide 35

Example: Find the variance and standard deviation (SD) for the frequency distribution of the data representing the number of miles that 20 runners ran during one week.

Slide 36

Slide 37

Interpretation and Uses of the Standard Deviation
The standard deviation is used to measure the spread of the data. A small standard deviation indicates that the data are clustered close to the mean, so the mean is representative of the data. A large standard deviation indicates that the data are spread out from the mean, so the mean is less representative of the data.

Slide 38

Coefficient of Variation
The relative measure of the standard deviation is the coefficient of variation, which is defined as the standard deviation divided by the mean. The result is expressed as a percentage:
\[ CV = \frac{s}{\bar{x}} \cdot 100\% \quad \text{(sample)} \qquad \text{or} \qquad CV = \frac{\sigma}{\mu} \cdot 100\% \quad \text{(population)} \]
Important note:
The coefficient of variation should only be computed for data measured on a ratio scale.
See the following example.

Slide 39

Example:
To see why the coefficient of variation should not be applied to interval-level data, compare the same set of temperatures in Celsius and Fahrenheit:
    Celsius: [0, 10, 20, 30, 40]
    Fahrenheit: [32, 50, 68, 86, 104]
The CV of the first set is 15.81/20 = 0.79. For the second set (which contains the same temperatures) it is 28.46/68 = 0.42.
So the coefficient of variation is not meaningful for data on an interval scale.
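The two CV values can be reproduced with a few lines of Python. This sketch assumes the sample standard deviation (divisor n - 1), which matches the figures 15.81 and 28.46 quoted above; `cv` is an illustrative helper name.

```python
from statistics import mean, stdev

celsius = [0, 10, 20, 30, 40]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]   # same temperatures in °F


def cv(data):
    """Coefficient of variation: sample standard deviation over the mean."""
    return stdev(data) / mean(data)


print(round(cv(celsius), 2))     # 0.79
print(round(cv(fahrenheit), 2))  # 0.42
```

The temperatures are physically identical, yet the CV changes with the unit, because the zero point of an interval scale is arbitrary.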

Slide 40



Slide 41

Example: Data on the annual salary (in thousands) and age of CEOs in a number of firms have been collected. The means and standard deviations are as follows:
[Table of means and standard deviations not recovered.]
Which distribution has more dispersion? Is direct comparison appropriate?
Salary and age are measured in different units, and the means show that there is also a significant difference in magnitude.
Direct comparison is not appropriate.
Comparing the CVs, we can see clearly that the dispersion, or variability relative to the mean, is greater for CEO annual salary than for age.

Slide 42

Measures of Position
Measures of position are used to locate the relative position of a data value within the data set.
1- Standard Scores
To compare values measured in different units, compute a z-score for each value and then compare the z-scores.
A z-score, or standard score, for a value x is obtained by
    For a sample:      z = (x - x̄) / s
    For a population:  z = (x - μ) / σ
The z-score represents the number of standard deviations that a data value falls above or below the mean.

Slide 43

Standard scores (or z-scores) specify the exact location of a score within a distribution relative to the mean.
The sign (+ or -) tells whether the score is above (+) or below (-) the mean.
The numerical value tells the distance from the mean in terms of standard deviations.
E.g., a z-score of -1.3 tells us that the raw score fell 1.3 standard deviations below the mean.

A raw score is the original, untransformed score.
To make them more meaningful, raw scores can be converted to z-scores.

Slide 44

Characteristics of Standard Scores
The shape of the distribution of standard scores is the same as the shape of the distribution of raw scores (the only thing that changes is the units on the x-axis).
The mean of a set of standard scores = 0.
The standard deviation of a set of standard scores = 1.
A standard score greater than +3 or less than -3 is an extreme score, or an outlier.
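The first two characteristics can be checked numerically. The sketch below uses a small made-up list of raw scores (hypothetical, for illustration only) and the sample standard deviation; it is a quick check rather than a proof.

```python
from statistics import mean, stdev

# Hypothetical raw scores, not taken from the slides.
raw_scores = [42, 55, 61, 48, 70, 66, 53, 59]

m = mean(raw_scores)
s = stdev(raw_scores)

# Convert every raw score to a z-score.
z_scores = [(x - m) / s for x in raw_scores]

# Up to floating-point rounding, the z-scores have mean 0 and SD 1.
print(abs(mean(z_scores)) < 1e-9)
print(abs(stdev(z_scores) - 1) < 1e-9)
```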

Slide 45

Example: A student scored 65 on a statistics exam that had a mean of 50 and a standard deviation of 10. Compute the z-score.
z = (65 - 50)/10 = 1.5.
That is, the score of 65 is 1.5 standard deviations above the mean.
Above, since the z-score is positive.
Assume that this student scored 70 on a math exam that had a mean of 80 and a standard deviation of 5.
Compute the z-score.
z = (70 - 80)/5 = -2.
That is, the score of 70 is 2 standard deviations below the mean.
Below, since the z-score is negative.

Slide 46

Example: A student scored 65 on a calculus test that had a mean of 50 and an SD of 10. She scored 30 on a statistics test with a mean of 25 and a variance of 25. Compare her relative positions on the two tests.
Calculus: z = (65 - 50)/10 = 1.5.
Statistics: the SD is the square root of the variance, √25 = 5, so z = (30 - 25)/5 = 1.0.
Since the z-score for calculus is larger, her relative position in the calculus class is higher than her relative position in the statistics class.
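The comparison can be worked through in a few lines of Python. Note that the statistics test is described by its variance, so the standard deviation must be taken as the square root first. The helper name `z_score` is illustrative.

```python
import math

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

calculus_z = z_score(65, mean=50, sd=10)
# The statistics test gives a variance of 25, so SD = sqrt(25) = 5.
statistics_z = z_score(30, mean=25, sd=math.sqrt(25))

print(calculus_z)    # 1.5
print(statistics_z)  # 1.0
print("calculus" if calculus_z > statistics_z else "statistics")  # calculus
```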

Slide 47

Quartiles divide the data set into 4 groups.
Quartiles are denoted by Q1, Q2, and Q3.
The median is the same as Q2.
Finding the Quartiles
Procedure: Let Qk be the kth quartile and n the sample size.
Step 1: Arrange the data in order.
Step 2: Compute c = ((n+1)k)/4.
Step 3: If c is not a whole number, use the value halfway between the data values in the two whole-number positions on either side of c.
Step 4: If c is a whole number, then the value in position c is the required quartile.
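A minimal Python sketch of this procedure follows. It assumes that, when c is not a whole number, the quartile is the midpoint of the values in the positions just below and just above c, which is how the worked examples in these slides treat c = 2.5 and c = 7.5. The function name `quartile` and the restriction to at least three values are choices made for this sketch.

```python
import math

def quartile(data, k):
    """k-th quartile (k = 1, 2, 3) using the position c = (n + 1) * k / 4."""
    if k not in (1, 2, 3):
        raise ValueError("k must be 1, 2 or 3")
    values = sorted(data)             # Step 1: arrange the data in order
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    c = (n + 1) * k / 4               # Step 2: 1-based position
    lower = math.floor(c)
    if c == lower:                    # Step 4: c is a whole number
        return values[lower - 1]
    # Step 3: halfway between the values in positions lower and lower + 1
    return (values[lower - 1] + values[lower]) / 2

print(quartile([2, 3, 5, 6, 8, 10, 12], 1))   # 3
print(quartile([2, 3, 5, 6, 8, 10, 12], 3))   # 10
```

Other textbooks and software packages use different quartile conventions, so results for small data sets may not match this rule exactly.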

Slide 48

Example:
For the following data set: 2, 3, 5, 6, 8, 10, 12
Find Q1 and Q3.
n = 7, so for Q1 we have c = ((7+1)·1)/4 = 2.
Hence Q1 is the 2nd value.
Thus Q1 for the data set is 3.
For Q3 we have c = ((7+1)·3)/4 = 6.
Hence Q3 is the 6th value.
Thus Q3 for the data set is 10.

Slide 49

Example: Find Q1 and Q3 for the following data set:
    2, 3, 5, 6, 8, 10, 12, 15, 18.
Note: the data set is already ordered.
n = 9, so for Q1 we have c = ((9+1)·1)/4 = 2.5.
Hence Q1 is halfway between the 2nd value and the 3rd value: Q1 = (3 + 5)/2 = 4.
For Q3 we have c = ((9+1)·3)/4 = 7.5.
Hence Q3 is halfway between the 7th value and the 8th value: Q3 = (12 + 15)/2 = 13.5.

Slide 50

Example:
For the following data set: 2, 3, 5, 6, 8, 10, 12
Find Q1 and Q3.
The median of the data is 6.
The median of the lower group (the values less than the median: 2, 3, 5) is 3.
So Q1 is the 2nd value, which means that Q1 = 3.
The median of the upper group (the values greater than the median: 8, 10, 12) is 10.
So Q3 is the 6th value, which means that Q3 = 10.
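The "median of each half" approach used in this example can be sketched as follows. For an odd number of values the overall median is left out of both halves, as in the example; for an even number of values the slides do not say what to do, so splitting the data into two equal halves is an assumption of this sketch. The function name `quartiles_by_halves` is illustrative.

```python
from statistics import median

def quartiles_by_halves(data):
    """Return (Q1, Q2, Q3), taking Q1 and Q3 as medians of the lower and upper halves."""
    values = sorted(data)
    n = len(values)                    # needs at least two values
    half = n // 2
    lower = values[:half]
    # Odd n: skip the middle value itself. Even n (assumption): use the top half as is.
    upper = values[half + 1:] if n % 2 else values[half:]
    return median(lower), median(values), median(upper)

print(quartiles_by_halves([2, 3, 5, 6, 8, 10, 12]))   # (3, 6, 10)
```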

Slide 51

[Image-only slide; no text recovered.]

Slide 52

[Image-only slide; no text recovered.]

Slide 53

The Interquartile Range (IQR)
The interquartile range: IQR = Q3 - Q1.
The interquartile range (IQR), also called the midspread, middle fifty, or inner 50% data range, is a measure of statistical dispersion (variation), equal to the difference between the third and first quartiles.

Slide 54

An outlier is an extremely high or an extremely low data value when compared with the rest of the data values.

Slide 55

Example
Given the data set 5, 6, 12, 13, 15, 18, 22, 50, can the value 50 be considered an outlier?
Q1 = 9, Q3 = 20, IQR = 11. Verify.
(1.5)(IQR) = (1.5)(11) = 16.5.
9 - 16.5 = -7.5 and 20 + 16.5 = 36.5.
The value 50 is outside the range (-7.5 to 36.5), hence 50 is an outlier.
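The 1.5 × IQR check in this example can be written as a short Python sketch. The quartiles are passed in directly (Q1 = 9 and Q3 = 20, as stated in the example) rather than computed, and the function names are illustrative.

```python
def iqr_fences(q1, q3):
    """Lower and upper fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step

def find_outliers(data, q1, q3):
    """Values lying outside the fences."""
    low, high = iqr_fences(q1, q3)
    return [x for x in data if x < low or x > high]

data = [5, 6, 12, 13, 15, 18, 22, 50]
print(iqr_fences(9, 20))            # (-7.5, 36.5)
print(find_outliers(data, 9, 20))   # [50]
```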

Slide 56

A measure of dispersion tells us about the variation of the data set.
Skewness tells us about the direction of variation of the data set.
Definition:
Skewness is a measure of symmetry, or more precisely, the lack of symmetry.
Coefficient of Skewness
A unitless number that measures the degree and direction of asymmetry of a distribution.
There are several ways of measuring skewness:
Pearson's coefficient of skewness
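The formula on this slide was not recovered. As an illustration only, the sketch below uses one commonly taught form of Pearson's coefficient, 3 × (mean - median) / standard deviation; this choice is an assumption and may not be the exact form shown on the original slide. The function name and the sample lists are hypothetical.

```python
from statistics import mean, median, stdev

def pearson_median_skewness(values):
    """Assumed form: 3 * (mean - median) / sample standard deviation."""
    return 3 * (mean(values) - median(values)) / stdev(values)

symmetric = [1, 2, 3, 4, 5]        # mean equals median
right_tailed = [1, 2, 2, 3, 10]    # a large value pulls the mean above the median
left_tailed = [1, 8, 9, 9, 10]     # a small value pulls the mean below the median

print(pearson_median_skewness(symmetric) == 0)
print(pearson_median_skewness(right_tailed) > 0)
print(pearson_median_skewness(left_tailed) < 0)
```

A positive coefficient points to a longer right tail and a negative one to a longer left tail.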

Slide 57

For any bell-shaped distribution:
Approximately 68% of the data values will fall within one standard deviation of the mean.
Approximately 95% will fall within two standard deviations of the mean.
Approximately 99.7% will fall within three standard deviations of the mean.
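These percentages are the exact values for a normal distribution, which is what the sketch below assumes; it uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `share_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

Real data that are only roughly bell-shaped will match these figures only approximately.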

Slide 58

The Empirical (Normal) Rule
[Figure not recovered.]

Slide 59

What Is a Box Plot?
To construct a box plot, first obtain the five-number summary:
{ Min, Q1, M, Q3, Max }
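A five-number summary can be obtained with the Python standard library, as in the sketch below. It relies on `statistics.quantiles` with `method="exclusive"`, which places the quartiles at positions (n + 1)k/4 but interpolates linearly between neighbouring values; that convention is an assumption here and can differ from the midpoint rule used elsewhere in these slides when the position is not a whole number. The function name is illustrative.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (Min, Q1, M, Q3, Max) for a data set with at least two values."""
    q1, m, q3 = quantiles(data, n=4, method="exclusive")
    return min(data), q1, m, q3, max(data)

print(five_number_summary([2, 3, 5, 6, 8, 10, 12]))   # (2, 3.0, 6.0, 10.0, 12)
```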

Slide 60

The box plot is useful in analyzing small data sets that do not lend themselves easily to histograms. Because a box plot is small, it is easy to display and compare several box plots in a small space.
A box plot is a good alternative or complement to a histogram and is usually better for showing several simultaneous comparisons.

Slide 61

How to use it:
Collect and arrange data. Collect the data and arrange it into an ordered set from lowest value to highest.
Calculate the median. M = median = Q2
Calculate the first quartile (Q1).
Calculate the third quartile (Q3).
Calculate the interquartile range (IQR). This range is the difference between the third and first quartile values (Q3 - Q1).
Obtain the maximum. This is the largest data value that is less than or equal to the third quartile plus 1.5 × IQR:
    Q3 + [(Q3 - Q1) × 1.5]

Slide 62

Obtain the minimum. This is the smallest data value that is greater than or equal to the first quartile minus 1.5 × IQR:
    Q1 - [(Q3 - Q1) × 1.5]
Draw and label the axes of the graph. The scale of the horizontal axis must be large enough to encompass the greatest value of the data sets.
Draw the box plots. Construct the box, insert median points, and attach maximum and minimum. Identify outliers (values outside the upper and lower fences) with asterisks.
The box plot can provide answers to the following questions:
Does the location differ between subgroups?
Does the variation differ between subgroups?
Are there any outliers?
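The "minimum" and "maximum" described in these steps are the whisker ends of the box plot: the most extreme data values that still lie inside the fences. A minimal sketch, with the quartiles supplied by the caller and an illustrative function name, might look like this:

```python
def whisker_ends(data, q1, q3):
    """Smallest and largest data values lying within the 1.5 * IQR fences."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return min(inside), max(inside)

# Data set 5, 6, 12, 13, 15, 18, 22, 50 with Q1 = 9 and Q3 = 20:
print(whisker_ends([5, 6, 12, 13, 15, 18, 22, 50], 9, 20))   # (5, 22)
```

Here the upper whisker stops at 22, and the value 50 would be drawn separately as an outlier.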

Slide 63

[Image-only slide; no text recovered.]

Slide 64

[Image-only slide; no text recovered.]

Slide 65

Now find the interquartile range (IQR). The interquartile range is the difference between the upper quartile and the lower quartile. In this case the IQR = 87 - 52 = 35. The IQR is a very useful measurement because it is less influenced by extreme values: it limits the range to the middle 50% of the values.
35 is the interquartile range.
Begin to draw the box plot.

Slide 66

Example 2
Consider two datasets:
A1 = {0.22, -0.87, -2.39, -1.79, 0.37, -1.54, 1.28, -0.31, -0.74, 1.72, 0.38, -0.17, -0.62, -1.10, 0.30, 0.15, 2.30, 0.19, -0.50, -0.09}
A2 = {-5.13, -2.19, -2.43, -3.83, 0.50, -3.25, 4.32, 1.63, 5.18, -0.43, 7.11, 4.87, -3.10, -5.81, 3.76, 6.31, 2.58, 0.07, 5.76, 3.50}
Notice that both datasets are approximately balanced around zero; evidently the mean in both cases is "near" zero. However, there is substantially more variation in A2, which ranges approximately from -6 to 7, whereas A1 ranges approximately from -2½ to 2½.
The box plots follow. Notice the difference in scales: since the box plot displays the full range of variation, the y-range must be expanded.
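The claims about the two datasets can be checked directly. The sketch below copies A1 and A2 from the slide and reports the mean, the sample standard deviation, and the smallest and largest values of each; the variable and function names are illustrative.

```python
from statistics import mean, stdev

a1 = [0.22, -0.87, -2.39, -1.79, 0.37, -1.54, 1.28, -0.31, -0.74, 1.72,
      0.38, -0.17, -0.62, -1.10, 0.30, 0.15, 2.30, 0.19, -0.50, -0.09]
a2 = [-5.13, -2.19, -2.43, -3.83, 0.50, -3.25, 4.32, 1.63, 5.18, -0.43,
      7.11, 4.87, -3.10, -5.81, 3.76, 6.31, 2.58, 0.07, 5.76, 3.50]

def describe(values):
    """Mean, sample standard deviation, minimum and maximum of a data set."""
    return mean(values), stdev(values), min(values), max(values)

for name, values in (("A1", a1), ("A2", a2)):
    m, s, lo, hi = describe(values)
    print(f"{name}: mean={m:.2f}  sd={s:.2f}  min={lo}  max={hi}")
```

The standard deviation of A2 comes out several times larger than that of A1, in line with the wider box plot described in the text.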

Slide 67

[Image-only slide; no text recovered.]

Slide 68

[Image-only slide; no text recovered.]

Slide 69

If the median is near the center of the box, the distribution is approximately symmetric.
If the median falls to the left of the center of the box, the distribution is positively skewed.
If the median falls to the right of the center of the box, the distribution is negatively skewed.
Similarly:
If the lines (whiskers) are about the same length, the distribution is approximately symmetric.
If the right line is longer than the left line, the distribution is positively skewed.
If the left line is longer than the right line, the distribution is negatively skewed.
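The rule about the position of the median inside the box can be sketched in code. The slides do not say how close counts as "near the center", so the `tolerance` parameter below (a fraction of the box width) is an assumption made purely for illustration, as is the function name.

```python
def skew_from_box(q1, med, q3, tolerance=0.1):
    """Classify skewness from where the median sits between Q1 and Q3."""
    left_part = med - q1     # box width to the left of the median
    right_part = q3 - med    # box width to the right of the median
    # "Near the center": the two parts differ by at most tolerance * box width.
    if abs(right_part - left_part) <= tolerance * (q3 - q1):
        return "approximately symmetric"
    # Median left of center means the right part of the box is longer.
    return "positively skewed" if right_part > left_part else "negatively skewed"

print(skew_from_box(1, 3, 5))   # approximately symmetric
print(skew_from_box(1, 2, 5))   # positively skewed
print(skew_from_box(1, 4, 5))   # negatively skewed
```

The same comparison could be applied to the two whisker lengths instead of the two parts of the box.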


