Start Date: 07/05/2020
Course Type: Common Course
Course Link: https://www.coursera.org/learn/inferential-statistics
Inferential statistics is concerned with generalizing relations found in a sample to relations in the population. It helps us decide, for example, whether the differences between groups that we see in our data are strong enough to support the hypothesis that group differences exist in general, in the entire population. We will start by considering the basic principles of significance testing: the sampling distribution and the test statistic distribution, the p-value, the significance level, power, and type I and type II errors. Then we will consider a large number of statistical tests and techniques that help us make inferences for different types of data and different types of research designs. For each statistical test we will consider how it works, for which data and designs it is appropriate, and how its results should be interpreted. You will also learn how to perform these tests using freely available software.
For those who are already familiar with statistical testing: we will look at z-tests for one and two proportions, McNemar's test for dependent proportions, t-tests for one mean (paired differences) and two means, the chi-square test for independence, Fisher's exact test, simple regression (linear and exponential) and multiple regression (linear and logistic), one-way and factorial analysis of variance, and non-parametric tests (Wilcoxon, Kruskal-Wallis, sign test, signed-rank test, runs test).
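To make the significance-testing vocabulary above concrete, here is a minimal sketch of a z-test for one proportion in Python. It is an illustration only, not the course's own software or method: the function name `one_proportion_z_test`, the sample counts, and the significance level of 0.05 are assumptions chosen for the example, and the test relies on the normal approximation.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0 (normal approximation).

    Hypothetical helper written for this illustration.
    """
    p_hat = successes / n             # sample proportion
    se = sqrt(p0 * (1 - p0) / n)      # standard error under H0
    z = (p_hat - p0) / se             # test statistic
    # p-value: probability of a statistic at least this extreme if H0 holds
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Assumed data: 60 successes in 100 trials, tested against p0 = 0.5
z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)

alpha = 0.05                          # significance level (assumed)
reject_h0 = p_value < alpha           # rejecting a true H0 would be a type I error
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

With these assumed numbers the test statistic is 2.0 and the two-sided p-value falls just below 0.05, so the null hypothesis would be rejected at that significance level.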
In this second module of week 1 we dive right in with a quick refresher on statistical hypothesis testing. Since we're assuming you just completed the course Basic Statistics, our treatment is a little more abstract and we go really fast! We provide the relevant Basic Statistics videos in case you need a gentler introduction. After the refresher we discuss methods to compare two groups on a categorical or quantitative dependent variable. We use different tests for independent and dependent groups.
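The distinction between independent and dependent groups can be sketched for a categorical outcome as follows. This is a hedged illustration, not the module's own material: the function names and all counts are made up, both tests use the normal approximation, and no continuity correction is applied. The independent-groups case uses a pooled z-test for two proportions; the dependent-groups case uses McNemar's test, which looks only at the discordant pairs.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def two_proportion_z_test(x1, n1, x2, n2):
    """Independent groups: pooled z-test of H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)    # common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, two_sided_p(z)


def mcnemar_z_test(b, c):
    """Dependent (paired) groups: McNemar's test, z form.

    b and c are the counts of discordant pairs (yes/no and no/yes).
    Concordant pairs carry no information about a difference.
    """
    z = (b - c) / sqrt(b + c)
    return z, two_sided_p(z)


# Assumed data: 45/100 successes in group 1 vs 30/100 in group 2
z_ind, p_ind = two_proportion_z_test(45, 100, 30, 100)

# Assumed data: 15 pairs changed yes->no, 5 pairs changed no->yes
z_dep, p_dep = mcnemar_z_test(15, 5)

print(f"independent: z = {z_ind:.3f}, p = {p_ind:.4f}")
print(f"dependent:   z = {z_dep:.3f}, p = {p_dep:.4f}")
```

The design point is that paired data are not analyzed as two separate samples: the dependence between measurements on the same units changes which counts enter the test statistic.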
Inferential Statistics
Inferential statistics are methods that use sample data to draw conclusions about relationships between two or more variables in a population, typically through a test statistic. They include both parametric and non-parametric tests. A statistical experiment is a collection of observations that provides evidence for or against a causal relationship. In working with inferential statistics, one learns the basic concepts used across experiments: the assumption of correlation, the design of the analysis, the assumption of linearity, and the nature of the test statistic. The course introduces these concepts through linearity, probabilistic inference, and instrumental variables. It enables students to understand and apply inference and model checking, and to use them in data analysis and model selection. The course also covers the design of a statistical model used to predict outcomes in a clinical trial; the basic principles of design, analysis, and interpretation of results; and methods for selecting the model that provides the evidence for a given outcome. The course is designed to be useful to all students
Article | Example |
---|---|
Descriptive statistics | Descriptive statistics are statistics that quantitatively describe or summarize features of a collection of information. Descriptive statistics are distinguished from inferential statistics (or inductive statistics), in that descriptive statistics aim to summarize a sample, rather than use the data to learn about the population that the sample of data is thought to represent. This generally means that descriptive statistics, unlike inferential statistics, are not developed on the basis of probability theory. Even when a data analysis draws its main conclusions using inferential statistics, descriptive statistics are generally also presented. For example in papers reporting on human subjects, typically a table is included giving the overall sample size, sample sizes in important subgroups (e.g., for each treatment or exposure group), and demographic or clinical characteristics such as the average age, the proportion of subjects of each sex, the proportion of subjects with related comorbidities etc. |
Statistics | "Applied statistics" comprises descriptive statistics and the application of inferential statistics. "Theoretical statistics" concerns the logical arguments underlying justification of approaches to statistical inference, as well as encompassing "mathematical statistics". Mathematical statistics includes not only the manipulation of probability distributions necessary for deriving results related to methods of estimation and inference, but also various aspects of computational statistics and the design of experiments. |
Multiset | See hypergeometric distribution and inferential statistics for further on the distribution of hits. |
Work–life interface | Simple inferential statistics are preferred (79%) instead of, for example, structural equation modeling (17%). |
Mathematical statistics | Statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation. Initial requirements of such a system of procedures for inference and induction are that the system should produce reasonable answers when applied to well-defined situations and that it should be general enough to be applied across a range of situations. Inferential statistics are used to test hypotheses and make estimations using sample data. Whereas descriptive statistics describe a sample, inferential statistics infer predictions about a larger population that the sample represents. |
Mathematical statistics | Nonparametric statistics are statistics not based on parameterized families of probability distributions. They include both descriptive and inferential statistics. The typical parameters are the mean, variance, etc. Unlike parametric statistics, nonparametric statistics make no assumptions about the probability distributions of the variables being assessed. |
List of open-source software for mathematics | Descriptive statistics involves methods of organizing, picturing and summarizing information from data. Inferential statistics involves methods of using information from a sample to draw conclusions about the population. |
List of open-source software for mathematics | Statistics is the study of how to collate and interpret numerical information from data. It is the science of learning from data and communicating uncertainty. There are two branches in statistics: "descriptive statistics" and "inferential statistics". |
Statistical inference | Inferential statistics can be contrasted with descriptive statistics. Descriptive statistics is solely concerned with properties of the observed data, and does not assume that the data came from a larger population. |
Partition of sums of squares | The above information is how sum of squares is used in descriptive statistics; see the article on total sum of squares for an application of this broad principle to inferential statistics. |
Design for Six Sigma | Although many tools used in DFSS consulting, such as response surface methodology, transfer function via linear and nonlinear modeling, axiomatic design, and simulation, have their origin in inferential statistics, statistical modeling may overlap with data analytics and mining, |
Number of Identified Specimens | NISP should not be used when calculating a sample size for inferential statistics, because it will inflate the statistical significance. Thus in these situations MNI should be used instead. |
Pseudomedian | In inferential statistics, the pseudomedian of a finite population is the location parameter computed by the Hodges–Lehmann statistic. It coincides with the population median when the population is symmetric. |
Clint Ballinger | Clint J. Ballinger is an American social scientist who writes about the misapplication of inferential statistics in the social sciences, international development, as well as geographic determinism from a consequentialist ethical perspective. |
Nonparametric statistics | Nonparametric statistics are statistics not based on parameterized families of probability distributions. They include both descriptive and inferential statistics. The typical parameters are the mean, variance, etc. Unlike parametric statistics, nonparametric statistics make no assumptions about the probability distributions of the variables being assessed. The difference between parametric models and non-parametric models is that the former has a fixed number of parameters, while the latter grows the number of parameters with the amount of training data. Note that the "non"-parametric model does, counterintuitively, contain parameters: the distinction is that parameters are determined by the training data in the case of non-parametric statistics, not the model. |
Foundations of statistics | Classical inferential statistics was largely developed in the second quarter of the 20th century, much of it in reaction to the (Bayesian) probability of the time, which used the controversial principle of indifference to establish prior probabilities. The rehabilitation of Bayesian inference was a reaction to the limitations of frequentist probability. More reactions followed. While the philosophical interpretations are old, the statistical terminology is not. The current statistical terms "Bayesian" and "frequentist" stabilized in the second half of the 20th century. |
Statistics | Two main statistical methods are used in data analysis: descriptive statistics, which summarizes data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draws conclusions from data that are subject to random variation (e.g., observational errors, sampling variation). Descriptive statistics are most often concerned with two sets of properties of a "distribution" (sample or population): "central tendency" (or "location") seeks to characterize the distribution's central or typical value, while "dispersion" (or "variability") characterizes the extent to which members of the distribution depart from its center and each other. Inferences on mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena. |
Basic and Applied Social Psychology | In 2015, the journal banned p-values (and related inferential statistics such as confidence intervals) as evidence in papers accepted by the journal, replacing hypothesis testing with "strong descriptive statistics, including effect sizes" on the grounds that "the state of the art [for hypothesis testing] remains uncertain". |
Foundations of statistics | Inferential statistics is based on models. Much of classical hypothesis testing, for example, was based on the assumed normality of the data. Robust and nonparametric statistics were developed to reduce the dependence on that assumption. Bayesian statistics interprets new observations from the perspective of prior knowledge – assuming a modeled continuity between past and present. The design of experiments assumes some knowledge of those factors to be controlled, varied, randomized and observed. Statisticians are well aware of the difficulties in proving causation (more of a modeling limitation than a mathematical one), saying "correlation does not imply causation". |
Informal inferential reasoning | In statistics education literature, the term "informal" is used to distinguish informal inferential reasoning from a formal method of statistical inference. |
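
The Pseudomedian entry in the table above mentions the Hodges–Lehmann statistic. As a minimal sketch, assuming the one-sample form of the estimator (the median of all pairwise Walsh averages, each value also paired with itself), it can be computed in Python as shown below. The function name `hodges_lehmann` and the sample values are made up for illustration.

```python
from itertools import combinations_with_replacement
from statistics import median


def hodges_lehmann(values):
    """One-sample Hodges-Lehmann estimate of the pseudomedian.

    Takes the median of the Walsh averages (x_i + x_j) / 2 over all
    pairs with i <= j. Hypothetical helper for illustration; it builds
    all n * (n + 1) / 2 averages, so it is only suited to small samples.
    """
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(values, 2)]
    return median(walsh)


symmetric = [1, 2, 3, 4, 5]       # symmetric around 3 (assumed data)
with_outlier = [1, 2, 3, 4, 100]  # one extreme value (assumed data)

print(hodges_lehmann(symmetric), median(symmetric))
print(hodges_lehmann(with_outlier), sum(with_outlier) / len(with_outlier))
```

For the symmetric sample the estimate equals the ordinary median, in line with the table entry. For the sample with an outlier it stays near the bulk of the data, while the mean is pulled toward the extreme value.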