A descriptive statistic (in the count noun sense) is a summary statistic that quantitatively describes or summarizes features from a collection of information, while descriptive statistics (in the mass noun sense) is the process of using and analysing those statistics. Descriptive statistics is distinguished from inferential statistics (or inductive statistics) by its aim to summarize a sample, rather than to use the data to learn about the population that the sample is thought to represent. This generally means that descriptive statistics, unlike inferential statistics, is not developed on the basis of probability theory, and the measures it uses are frequently nonparametric. Even when a data analysis draws its main conclusions using inferential statistics, descriptive statistics are generally also presented. For example, papers reporting on human subjects typically include a table giving the overall sample size, the sample sizes in important subgroups (e.g., for each treatment or exposure group), and demographic or clinical characteristics such as the average age, the proportion of subjects of each sex, and the proportion of subjects with related co-morbidities.
Some measures that are commonly used to describe a data set are measures of central tendency and measures of variability or dispersion. Measures of central tendency include the mean, median and mode, while measures of variability include the standard deviation (or variance) and the minimum and maximum values of the variables. Kurtosis and skewness are also commonly reported; they describe the shape of the distribution rather than its spread.
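As a minimal sketch of the measures just listed, the following Python snippet computes them with the standard-library `statistics` module. The `ages` values are hypothetical and chosen only for illustration; the use of the sample (n - 1) forms of the variance and standard deviation is likewise an assumption, not something the text prescribes.

```python
import statistics

# Hypothetical sample: ages of nine study participants (illustrative values only).
ages = [23, 25, 25, 29, 31, 34, 35, 41, 54]

summary = {
    "n": len(ages),
    # Measures of central tendency
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    # Measures of variability (sample versions, n - 1 denominator)
    "variance": statistics.variance(ages),
    "stdev": statistics.stdev(ages),
    "min": min(ages),
    "max": max(ages),
}

for name, value in summary.items():
    print(f"{name:>8}: {value}")
```

A table of this kind is roughly what the "sample characteristics" table in a paper on human subjects reports for each numeric variable.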
Use in statistical analysis
Descriptive statistics provide simple summaries about the sample and about the observations that have been made. Such summaries may be either quantitative, i.e. summary statistics, or visual, i.e. simple-to-understand graphs. These summaries may either form the basis of the initial description of the data as part of a more extensive statistical analysis, or they may be sufficient in and of themselves for a particular investigation.
For example, the shooting percentage in basketball is a descriptive statistic that summarizes the performance of a player or a team. It is the number of shots made divided by the number of shots taken, so a player who shoots 33% is making approximately one shot in every three. The percentage summarizes or describes multiple discrete events. The grade point average is another example: this single number describes the general performance of a student across the range of their course experiences.
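The shooting-percentage arithmetic can be sketched in a few lines of Python. The function name `shooting_percentage` and the `shots` game log are hypothetical, introduced only to show how many discrete events collapse into one number.

```python
def shooting_percentage(made: int, attempted: int) -> float:
    """Return shots made divided by shots taken, expressed as a percentage."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return 100.0 * made / attempted


# Hypothetical game log: True = shot made, False = shot missed.
shots = [True, False, False, True, False, False, True, False, False]

# sum() counts the True values, i.e. the shots made.
pct = shooting_percentage(sum(shots), len(shots))
print(f"{pct:.1f}%")  # prints "33.3%"
```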
The use of descriptive and summary statistics has an extensive history and, indeed, the simple tabulation of populations and of economic data was the first form in which the topic of statistics appeared. More recently, a collection of summarisation techniques has been formulated under the heading of exploratory data analysis: an example of such a technique is the box plot.
In the business world, descriptive statistics provides a useful summary of many types of data. For example, investors and brokers may analyse the historical return behaviour of their investments in order to make better investing decisions in the future.
Univariate analysis
Univariate analysis involves describing the distribution of a single variable, including its central tendency (the mean, median, and mode) and dispersion (the range and quartiles of the data set, and measures of spread such as the variance and standard deviation). The shape of the distribution may also be described via indices such as skewness and kurtosis. Characteristics of a variable's distribution may also be depicted in graphical or tabular format, including histograms and stem-and-leaf displays.
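The range, quartiles, and shape indices can be illustrated with a short Python sketch. It assumes the simple moment-based definitions of skewness and excess kurtosis (several other estimators are in use), and the helper names and `values` data are hypothetical. `statistics.quantiles` uses its default "exclusive" method here; other quartile conventions give slightly different cut points.

```python
import statistics


def central_moment(data, k):
    """k-th central moment, dividing by n (population form)."""
    mean = statistics.fmean(data)
    return sum((x - mean) ** k for x in data) / len(data)


def skewness(data):
    """Moment-based skewness: m3 / m2**1.5 (one of several conventions)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0


# Hypothetical data with one large value, so the distribution is right-skewed.
values = [2, 3, 3, 4, 4, 4, 5, 5, 6, 15]

data_range = max(values) - min(values)
q1, q2, q3 = statistics.quantiles(values, n=4)  # quartile cut points

print("range:", data_range)
print("quartiles:", q1, q2, q3)
print("skewness:", round(skewness(values), 3))
print("excess kurtosis:", round(excess_kurtosis(values), 3))
```

The positive skewness reflects the long right tail created by the single large observation.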
Bivariate and multivariate analysis
When a sample consists of more than one variable, descriptive statistics may be used to describe the relationship between pairs of variables. In this case, descriptive statistics include:
- Cross-tabulations and contingency tables
- Graphical representation via scatterplots
- Quantitative measures of dependence
- Descriptions of conditional distributions
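As an illustrative sketch of the first and last items in this list, the snippet below builds a cross-tabulation from paired observations and then derives the conditional distribution of one variable given the other. The `records` data and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical records: (group, outcome) for ten subjects.
records = [
    ("treatment", "improved"), ("treatment", "improved"),
    ("treatment", "improved"), ("treatment", "not improved"),
    ("treatment", "improved"),
    ("control", "improved"), ("control", "not improved"),
    ("control", "not improved"), ("control", "improved"),
    ("control", "not improved"),
]

counts = Counter(records)
groups = sorted({g for g, _ in records})
outcomes = sorted({o for _, o in records})

# Contingency table as nested dicts: table[group][outcome] -> count.
table = {g: {o: counts[(g, o)] for o in outcomes} for g in groups}

# Conditional distribution of outcome given group (row proportions).
conditional = {
    g: {o: table[g][o] / sum(table[g].values()) for o in outcomes}
    for g in groups
}

for g in groups:
    print(g, table[g], conditional[g])
```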
The main reason for differentiating univariate and bivariate analysis is that bivariate analysis goes beyond simple description of each variable: it also describes the relationship between two different variables. Quantitative measures of dependence include correlation (such as Pearson's r when both variables are continuous, or Spearman's rho if one or both are not) and covariance (which reflects the scale the variables are measured on). The slope, in regression analysis, also reflects the relationship between variables. The unstandardised slope indicates the unit change in the criterion variable for a one-unit change in the predictor. The standardised slope indicates this change in standardised (z-score) units.
Highly skewed data are often transformed by taking logarithms. The use of logarithms makes graphs more symmetrical and more similar in appearance to the normal distribution, making them easier to interpret intuitively.
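The dependence measures described above can be sketched in plain Python. This is a minimal illustration assuming sample (n - 1) covariance and a simple regression with one predictor; the function names and the `hours`/`score` data are hypothetical. It also shows that covariance changes when a variable is rescaled, whereas Pearson's r does not.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def slope(xs, ys):
    """Unstandardised least-squares slope of ys regressed on xs."""
    return covariance(xs, ys) / covariance(xs, xs)


# Hypothetical data: hours studied (predictor) and test score (criterion).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

b = slope(hours, score)  # score units per additional hour
sd_x = math.sqrt(covariance(hours, hours))
sd_y = math.sqrt(covariance(score, score))
beta = b * sd_x / sd_y   # standardised slope, in z-score units

# Rescaling the predictor (hours -> minutes) changes covariance but not r.
minutes = [60 * h for h in hours]

print("covariance (hours):  ", covariance(hours, score))
print("covariance (minutes):", covariance(minutes, score))
print("Pearson r:           ", round(pearson_r(hours, score), 4))
print("unstandardised slope:", b)
print("standardised slope:  ", round(beta, 4))
```

With a single predictor, the standardised slope equals Pearson's r, which is why the last two correlation-type figures printed agree.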
References
- Mann, Prem S. (1995). Introductory Statistics (2nd ed.). Wiley. ISBN 0-471-31009-3.
- Christopher, Andrew N. (2017), "Drawing Conclusions From Data: Descriptive Statistics, Inferential Statistics, and Hypothesis Testing", Interpreting and Using Statistics in Psychological Research, Thousand Oaks, CA: SAGE Publications, Inc, pp. 145–183, doi:10.4135/9781506304144.n6, ISBN 978-1-5063-0416-8, retrieved 2021-06-01
- Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. OUP. ISBN 0-19-850994-4.
- Investopedia, Descriptive Statistics Terms
- Trochim, William M. K. (2006). "Descriptive statistics". Research Methods Knowledge Base. Retrieved 14 March 2011.
- Babbie, Earl R. (2009). The Practice of Social Research (12th ed.). Wadsworth. pp. 436–440. ISBN 978-0-495-59841-1.
- Nick, Todd G. (2007). "Descriptive Statistics". Topics in Biostatistics. Methods in Molecular Biology. Vol. 404. New York: Springer. pp. 33–52. doi:10.1007/978-1-59745-530-5_3. ISBN 978-1-58829-531-6. PMID 18450044.
External links
- Descriptive Statistics Lecture: University of Pittsburgh Supercourse: http://www.pitt.edu/~super1/lecture/lec0421/index.htm