Learning objectives
Students should be able to do each of the following by the end of this course:
What is statistics? Overview of the course
- Describe the central goals and fundamental concepts of statistics.
- Describe the difference between experimental and observational research with regard to what can be inferred about causality
- Explain how randomization provides the ability to make inferences about causation.
Working with data
- Distinguish between different types of variables (quantitative/qualitative, discrete/continuous)
- Describe the concept of measurement error
- Distinguish between the concepts of reliability and validity and apply each concept to a particular dataset
R Lab: Basics
- Interact with an RMarkdown notebook in RStudio
- Describe the difference between a variable and a function
- Create a vector, matrix, or data frame and access its elements
- Load data from an R package and view the data
Summarizing data
- Compute absolute, relative, and cumulative frequency distributions for a given dataset
- Generate a graphical representation of frequency distributions
- Describe the difference between a normal and a long-tailed distribution, and describe the situations that give rise to each
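As an illustration of the frequency-distribution objective above, here is a minimal sketch. It assumes plain Python with only the standard library (the course labs themselves use R), and the dataset is made up for the example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: number of siblings reported by ten students.
data = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]

counts = Counter(data)
values = sorted(counts)                          # distinct values, ascending

absolute = [counts[v] for v in values]           # how often each value occurs
relative = [c / len(data) for c in absolute]     # proportion of all observations
cumulative = list(accumulate(relative))          # running total of the proportions

for v, a, r, c in zip(values, absolute, relative, cumulative):
    print(f"value={v}  absolute={a}  relative={r:.2f}  cumulative={c:.2f}")
```

The cumulative column ends at 1 (up to floating-point rounding), which is a quick check that no observation was dropped.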
Visualizing data
- Describe the principles that distinguish good graphs from bad ones, and use them to evaluate a given graph.
R Lab: Data visualization
- Plot summary graphs using ggplot
Fitting models (central tendency)
- Describe the basic equation for statistical models (outcome = model + error)
- Describe different measures of central tendency, how they are computed, and which is appropriate under which circumstances.
Fitting models (dispersion) and z-scores
- Describe different measures of dispersion, how they are computed, and how to determine which is most appropriate in any given circumstance.
- Describe and compute z-scores.
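A z-score expresses each observation as its distance from the mean in units of the standard deviation. The sketch below is one possible illustration, assuming Python's standard library (the course labs use R), the sample standard deviation (n - 1 denominator), and made-up height data. The function name `z_scores` is hypothetical.

```python
import statistics

def z_scores(values):
    """Return (x - mean) / sd for each x, using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD, divides by n - 1
    return [(x - mean) / sd for x in values]

# Hypothetical heights in centimetres.
heights = [160.0, 165.0, 170.0, 175.0, 180.0]
z = z_scores(heights)
print(z)
```

By construction the z-scores have mean 0 and sample standard deviation 1, whatever the original units were.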
Probability
- Describe the sample space for a selected random experiment.
- Compute relative frequency and empirical probability for a given set of events
- Compute probabilities of single events, complementary events, and the unions and intersections of collections of events.
- Describe the law of large numbers.
- Describe the difference between a probability and a conditional probability
- Describe the concept of statistical independence
- Use Bayes’ theorem to compute the inverse conditional probability.
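To make the Bayes' theorem objective concrete, here is a small sketch for a diagnostic-test setting. It assumes Python (the course labs use R); the function name, parameter names, and numbers are all hypothetical and chosen only for illustration.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem.

    prior               = P(condition)
    sensitivity         = P(positive | condition)
    false_positive_rate = P(positive | no condition)
    """
    # Law of total probability: overall chance of a positive result.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# Hypothetical numbers: 1% prevalence, 90% sensitivity, 5% false positives.
posterior = posterior_given_positive(0.01, 0.90, 0.05)
print(round(posterior, 3))  # about 0.154
```

Even with a fairly accurate test, the inverse conditional probability stays low here because the condition is rare, which is why P(positive | condition) and P(condition | positive) should not be confused.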
Sampling
- Distinguish between a population and a sample, and between population parameters and sample statistics
- Describe the concepts of sampling error and sampling distribution
- Describe how the Central Limit Theorem determines the nature of the sampling distribution of the mean
Resampling
- Describe the concept of a Monte Carlo simulation.
- Describe the meaning of randomness in statistics
- Obtain random numbers from the uniform and normal distributions
- Describe the concept of the bootstrap
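The resampling objectives above can be sketched in a few lines. This example assumes Python's standard library (the course labs use R), a made-up sample, and a hypothetical helper named `bootstrap_se_of_mean`. It draws uniform and normal random numbers, then uses the bootstrap (resampling the observed data with replacement) to estimate the standard error of the mean.

```python
import random
import statistics

def bootstrap_se_of_mean(data, n_boot=2000, seed=0):
    """Estimate the standard error of the mean by resampling with replacement."""
    rng = random.Random(seed)           # seeded so the run is repeatable
    boot_means = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=len(data))   # same size as the original sample
        boot_means.append(statistics.mean(resample))
    # The spread of the resampled means approximates the sampling variability.
    return statistics.stdev(boot_means), boot_means

rng = random.Random(1)
uniform_draws = [rng.random() for _ in range(5)]         # uniform on [0, 1)
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(5)]   # normal, mean 0, SD 1

sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9]        # hypothetical measurements
se, boot_means = bootstrap_se_of_mean(sample)
print(f"bootstrap SE of the mean: {se:.3f}")
```

The number of resamples (2000) and the seeds are arbitrary choices; a larger `n_boot` gives a more stable estimate at the cost of run time.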
Hypothesis testing
- Identify the components of a hypothesis test, including the parameter of interest, the null and alternative hypotheses, and the test statistic.
- Describe the proper interpretations of a p-value as well as common misinterpretations
- Distinguish between the two types of error in hypothesis testing (Type I and Type II), and describe the factors that determine them.
- Describe how resampling can be used to compute a p-value.
- Describe the main criticisms of null hypothesis statistical testing
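One way resampling can yield a p-value is a permutation test: shuffle the group labels many times and count how often the shuffled difference in means is at least as large as the observed one. The sketch below assumes Python's standard library (the course labs use R), a two-sided test on the difference in means, and made-up data. `permutation_p_value` is a hypothetical name.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # reassign group labels at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):            # at least as extreme as observed
            extreme += 1
    return extreme / n_perm

# Hypothetical scores for two clearly separated groups.
treatment = [10, 11, 12, 13, 14]
control = [1, 2, 3, 4, 5]
p = permutation_p_value(treatment, control)
print(p)
```

The result is the proportion of shuffles at least as extreme as the data under the null hypothesis of no group difference. Some texts add 1 to both the count and the number of shuffles so the estimate is never exactly zero.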
Quantifying effects and designing studies
- Describe the proper interpretation of a confidence interval, and compute a confidence interval for the mean of a given dataset.
- Define the concept of effect size, and compute the effect size for a given test.
- Define the concept of statistical power, and compute statistical power for a given statistical test.
Modeling categorical relationships
- Describe the concept of a contingency table for categorical data.
- Describe the concept of the chi-squared test for association and compute it for a given contingency table.
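The chi-squared statistic compares each observed cell count with the count expected if rows and columns were independent. The sketch below is a minimal illustration, assuming Python (the course labs use R), no continuity correction, and a made-up 2x2 table. It returns only the statistic and its degrees of freedom, not a p-value, and the function name is hypothetical.

```python
def chi_squared_statistic(table):
    """Return (chi-squared statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows are two groups, columns are two outcomes.
observed_table = [[10, 20],
                  [30, 40]]
chi2, df = chi_squared_statistic(observed_table)
print(f"chi-squared = {chi2:.3f}, df = {df}")
```

A table whose rows are exact multiples of each other, such as [[10, 20], [20, 40]], gives a statistic of zero because every observed count equals its expected count.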
Modeling continuous relationships
- Describe the concept of the correlation coefficient and its interpretation, and compute it for a bivariate dataset
- Describe the potential causal influences that can give rise to a correlation.
The general linear model
- Describe the concept of linear regression and apply it to a bivariate dataset
- Describe the concept of the general linear model and provide examples of its application
Comparing means
- Determine whether a one-sample t-test or two-sample t-test is appropriate for a given hypothesis.
- Compute one-sample and two-sample t-tests on relevant datasets, and compute the effect size and confidence interval associated with each test.
R Lab: Statistical inference
- Demonstrate the ability to apply statistical models to real data in R
Statistical modeling: Practical examples
- Describe how to determine what kind of model to apply to a dataset
Doing reproducible research
- Describe the concept of p-hacking and its effects on scientific practice
- Describe the concept of positive predictive value and its relation to statistical power