Probability and Statistics: The Science of Uncertainty
A comprehensive introductory university-level course on the mathematical foundations of probability and statistics. Requiring one year of calculus, the course covers probability models, random variables, expectation, sampling distributions, likelihood and Bayesian inference, and relationships among variables.
Course Overview
📚 Content Summary
Master the rigorous mathematical science of uncertainty through calculus-based probability and statistical inference.
Authors: Michael J. Evans and Jeffrey S. Rosenthal
Acknowledgments: The authors acknowledge contributions from various reviewers and colleagues at institutions such as the University of Toronto, McMaster University, and Purdue University. Funding and infrastructure support from the University of Toronto are also noted.
🎯 Learning Objectives
- Define a formal probability model using sample spaces, events, and probability measures.
- Apply combinatorial principles (permutations, subsets, binomial coefficients) to solve uniform probability problems.
- Utilize the Law of Total Probability and Bayes' Theorem to analyze multi-stage systems and update beliefs based on new information.
- Define and distinguish between discrete and absolutely continuous random variables and their respective probability/density functions.
- Identify and apply key probability distributions (Bernoulli, Binomial, Poisson, Normal, etc.) to model real-world phenomena.
- Compute marginal densities, conditional distributions, and assess independence for multivariate distributions.
- Calculate the expected value, variance, and covariance for discrete, continuous, and mixed random variables.
- Apply the Law of the Unconscious Statistician (LOTUS) and linearity properties to compute expectations of transformed variables.
- Derive moments using Probability-Generating Functions (PGFs) and Moment-Generating Functions (MGFs).
- Define and derive sampling distributions for functions of i.i.d. sequences.
🔹 Lesson 1: Foundations of Probability Models
Overview: This lesson establishes the rigorous mathematical framework for probability, transitioning from the intuitive "measure of uncertainty" to formal axiomatic models. It covers the essential properties of probability measures, combinatorial counting techniques for finite spaces, and the foundational mechanics of conditional probability, including Bayes' Theorem and the continuity of probability measures.
Learning Outcomes:
- Define a formal probability model using sample spaces, events, and probability measures.
- Apply combinatorial principles (permutations, subsets, binomial coefficients) to solve uniform probability problems.
- Utilize the Law of Total Probability and Bayes' Theorem to analyze multi-stage systems and update beliefs based on new information.
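To make the two-stage reasoning concrete, here is a minimal Python sketch of the Law of Total Probability and Bayes' Theorem in a diagnostic-test setting; the prevalence, sensitivity, and specificity values are illustrative assumptions, not figures from the course.

```python
# A minimal sketch of the Law of Total Probability and Bayes' Theorem
# applied to a two-stage system. The numbers (prevalence, sensitivity,
# specificity) are illustrative assumptions, not from the course text.

prevalence = 0.01      # P(D): prior probability of the condition
sensitivity = 0.95     # P(+ | D): test detects the condition
specificity = 0.90     # P(- | not D): test clears the healthy

# Law of Total Probability: P(+) = P(+|D)P(D) + P(+|not D)P(not D)
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' Theorem: P(D | +) = P(+|D)P(D) / P(+)
posterior = sensitivity * prevalence / p_positive

print(f"P(+)     = {p_positive:.4f}")
print(f"P(D | +) = {posterior:.4f}")   # roughly 0.088 with these numbers
```

Even with a fairly accurate test, the low prior makes the posterior modest, which is exactly the kind of belief update Bayes' Theorem formalizes.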
🔹 Lesson 2: Random Variables and Probability Distributions
Overview: This lesson explores the mathematical framework for quantifying uncertainty through random variables (RVs). Students will progress from defining RVs and their distributions (discrete and continuous) to understanding joint distributions, transformations, and the methods used to simulate these variables numerically. The content bridges theoretical calculus-based probability with practical applications in modeling and statistical simulation.
Learning Outcomes:
- Define and distinguish between discrete and absolutely continuous random variables and their respective probability/density functions.
- Identify and apply key probability distributions (Bernoulli, Binomial, Poisson, Normal, etc.) to model real-world phenomena.
- Compute marginal densities, conditional distributions, and assess independence for multivariate distributions.
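As a taste of the simulation methods covered in this lesson, the sketch below uses the inverse-CDF (inverse transform) method to generate Exponential samples from Uniform(0,1) draws; the rate parameter is an arbitrary choice.

```python
import math
import random

def sample_exponential(lam: float) -> float:
    """Inverse transform method: if U ~ Uniform(0,1), then
    -ln(1 - U) / lam has the Exponential(lam) distribution, because
    F(x) = 1 - exp(-lam * x) inverts to F^{-1}(u) = -ln(1 - u) / lam."""
    u = random.random()
    return -math.log(1.0 - u) / lam

# Quick check: the sample mean should approach 1/lam = 0.5.
lam = 2.0
samples = [sample_exponential(lam) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 0.5
```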
🔹 Lesson 3: Mathematical Expectation and Moments
Overview: This lesson explores the fundamental concept of mathematical expectation as the "long-run average" of a random variable, extending from simple discrete and continuous cases to arbitrary random variables. We will analyze the variability of data through variance and covariance, utilize generating functions (PGF, MGF, and Characteristic Functions) to simplify moment calculations, and apply powerful probability inequalities to bound unknown distributions. Finally, the lesson covers conditional expectations and the Law of Total Expectation, which are essential for analyzing complex, multi-stage random processes.
Learning Outcomes:
- Calculate the expected value, variance, and covariance for discrete, continuous, and mixed random variables.
- Apply the Law of the Unconscious Statistician (LOTUS) and linearity properties to compute expectations of transformed variables.
- Derive moments using Probability-Generating Functions (PGFs) and Moment-Generating Functions (MGFs).
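The sketch below illustrates LOTUS on a small assumed discrete distribution, computing E[g(X)] directly from the probability function and checking it by simulation.

```python
import random

# X takes values 0, 1, 2 with these probabilities (an assumed toy distribution).
values = [0, 1, 2]
probs = [0.5, 0.3, 0.2]
g = lambda x: (x - 1) ** 2   # a transformation of X

# LOTUS: E[g(X)] = sum over x of g(x) * P(X = x); there is no need to
# work out the distribution of g(X) itself.
e_g = sum(g(x) * p for x, p in zip(values, probs))

# Sanity check by simulation.
draws = random.choices(values, weights=probs, k=200_000)
e_g_mc = sum(g(x) for x in draws) / len(draws)

print(e_g, e_g_mc)  # both near 0.5*1 + 0.3*0 + 0.2*1 = 0.7
```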
🔹 Lesson 4: Sampling Distributions and Limit Theorems
Overview: This lesson explores the behavior of random variables when they are functions of a sample (sampling distributions) and how these distributions behave as the sample size grows (limit theorems). Students will master the transition from finite-sample distributions to the asymptotic approximations justified by the Central Limit Theorem and investigate computational methods such as Monte Carlo approximations and Importance Sampling.
Learning Outcomes:
- Define and derive sampling distributions for functions of i.i.d. sequences.
- Distinguish between and apply Convergence in Probability and Convergence in Distribution.
- Utilize the Central Limit Theorem and Normal Approximation to the Binomial to estimate probabilities.
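A minimal sketch of the normal approximation to the binomial (with continuity correction), compared against the exact binomial probability; the parameters n, p, and k are arbitrary illustrative choices.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Normal approximation (with continuity correction) to P(X <= k) for
# X ~ Binomial(n, p), justified by the Central Limit Theorem.
n, p, k = 100, 0.5, 55
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx = normal_cdf((k + 0.5 - mu) / sigma)

# Exact value for comparison.
exact = sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))
print(f"approx = {approx:.4f}, exact = {exact:.4f}")   # both near 0.864
```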
🔹 Lesson 5: Fundamentals of Statistical Inference
Overview: This lesson explores the transition from pure probability to statistical inference, addressing how we use observed data to make statements about the true underlying probability measures of a system. Students will learn to construct formal statistical models (Bernoulli and Normal), understand rigorous data collection methods like simple random and stratified sampling from finite populations, and summarize findings through descriptive statistics, histograms, and empirical distribution functions.
Learning Outcomes:
- Define the role of statistical inference in addressing uncertainty caused by variation and limited data.
- Construct and interpret statistical models, identifying parameters and parameter spaces.
- Differentiate between population characteristics and sample estimates using simple random and stratified sampling techniques.
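As one concrete descriptive tool from this lesson, the sketch below builds an empirical distribution function from a made-up sample.

```python
from bisect import bisect_right

def ecdf(sample):
    """Empirical distribution function: F_hat(x) = #{observations <= x} / n."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect_right(xs, x) / n

data = [2.1, 3.5, 3.5, 4.0, 5.2]   # a made-up sample
F_hat = ecdf(data)
print(F_hat(3.5))   # 3 of the 5 observations are <= 3.5, so 0.6
```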
🔹 Lesson 6: Likelihood-Based Inference
Overview: This lesson explores the theoretical foundations and practical applications of likelihood-based statistical inference. It transitions from fundamental concepts like the Likelihood Principle and Sufficiency to the estimation of parameters via Maximum Likelihood Estimation (MLE) and the evaluation of these estimators through bias, consistency, and standard errors. Furthermore, the lesson covers both parametric approaches (z-intervals, t-intervals, and hypothesis testing) and distribution-free methods (Method of Moments, Bootstrapping, and Sign Statistics), culminating in the advanced study of Asymptotic Normality and Fisher Information.
Learning Outcomes:
- Define and apply the Likelihood Function and the Factorization Theorem to identify Sufficient and Minimal Sufficient Statistics.
- Calculate Maximum Likelihood Estimators (MLE) and evaluate their quality using Mean Squared Error (MSE), Bias, and Consistency.
- Construct and interpret Confidence Intervals and P-values for various statistical models using both parametric and non-parametric techniques.
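The sketch below computes the Bernoulli MLE (the sample proportion) for an assumed data set and attaches a nonparametric bootstrap standard error, combining two of the techniques named above.

```python
import random

# Maximum likelihood for a Bernoulli(theta) sample: the likelihood
# L(theta) = theta^s * (1 - theta)^(n - s) is maximized at
# theta_hat = s / n, the sample proportion.
data = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]   # an assumed observed sample
n = len(data)
theta_hat = sum(data) / n

# Nonparametric bootstrap: resample with replacement, recompute the MLE,
# and use the spread of the replicates as a standard error estimate.
B = 5000
reps = []
for _ in range(B):
    resample = [random.choice(data) for _ in range(n)]
    reps.append(sum(resample) / n)
mean_rep = sum(reps) / B
boot_se = (sum((r - mean_rep) ** 2 for r in reps) / (B - 1)) ** 0.5

print(f"MLE = {theta_hat:.2f}, bootstrap SE = {boot_se:.3f}")
```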
🔹 Lesson 7: Bayesian Statistical Inference
Overview: This lesson explores the Bayesian framework of statistical inference, where parameters are treated as random variables with probability distributions. Students will learn to combine prior beliefs (Prior Distributions) with observed data (Likelihood) to produce updated beliefs (Posterior Distributions). The curriculum covers theoretical foundations, practical estimation techniques (Bayes Factors, Prediction), computational methods (Gibbs Sampling, Asymptotic Normality), and the strategic selection of priors.
Learning Outcomes:
- Calculate posterior distributions using Bayes' Theorem for various models, including conjugate families.
- Perform Bayesian estimation (mean, mode) and hypothesis testing using Bayes Factors.
- Construct posterior predictive distributions for future observations.
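A minimal sketch of conjugate Beta-Bernoulli updating; the prior hyperparameters and data are illustrative assumptions, and the posterior mean and mode are the Bayesian point estimates discussed in this lesson.

```python
# Conjugate Beta-Bernoulli updating: if theta ~ Beta(a, b) a priori and we
# observe s successes in n Bernoulli(theta) trials, the posterior is
# Beta(a + s, b + n - s). Hyperparameters and data here are illustrative.

a, b = 2.0, 2.0   # prior: Beta(2, 2), mildly favoring theta near 1/2
s, n = 7, 10      # assumed data: 7 successes in 10 trials

a_post, b_post = a + s, b + n - s
posterior_mean = a_post / (a_post + b_post)
posterior_mode = (a_post - 1) / (a_post + b_post - 2)   # valid when a, b > 1

print(f"posterior: Beta({a_post:.0f}, {b_post:.0f})")
print(f"mean = {posterior_mean:.3f}, mode = {posterior_mode:.3f}")
```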
🔹 Lesson 8: Optimal Inferences and Decision Theory
Overview: This lesson explores the mathematical foundations for finding "best" statistical procedures. We transition from basic estimation to optimal unbiased estimation (UMVU), develop the theory of Uniformly Most Powerful (UMP) tests through the Neyman-Pearson Theorem, and integrate Bayesian perspectives and Decision Theory to evaluate estimators and tests using loss and risk functions.
Learning Outcomes:
- Apply the Rao-Blackwell Theorem and Lehmann-Scheffé Theorem to derive Uniformly Minimum Variance Unbiased (UMVU) estimators.
- Utilize the Cramér-Rao Information Inequality to determine the fundamental lower bound on the variance of unbiased estimators.
- Construct Uniformly Most Powerful (UMP) tests using the Neyman-Pearson Lemma and evaluate them via power functions and error types.
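The sketch below computes the power function of the one-sided z-test for a Normal mean with known variance, the textbook setting in which the Neyman-Pearson construction yields a UMP test; the sample size and level are illustrative.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Power of the level-alpha one-sided z-test of H0: mu = mu0 vs H1: mu > mu0
# for a Normal(mu, sigma^2) sample with known sigma. Rejecting for large
# xbar is the most powerful test in this setting.
def power(mu, mu0=0.0, sigma=1.0, n=25, z_alpha=1.645):   # alpha ~ 0.05
    # Reject when sqrt(n) * (xbar - mu0) / sigma > z_alpha; under mu the
    # test statistic is Normal(sqrt(n) * (mu - mu0) / sigma, 1).
    shift = math.sqrt(n) * (mu - mu0) / sigma
    return 1.0 - normal_cdf(z_alpha - shift)

for mu in (0.0, 0.2, 0.4, 0.6):
    print(f"mu = {mu:.1f}: power = {power(mu):.3f}")
```

At mu = mu0 the power equals the Type I error rate alpha; it then climbs toward 1 as the true mean moves away from the null, which is what a power function records.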
🔹 Lesson 9: Model Checking and Diagnostics
Overview: This lesson explores the critical process of validating the assumptions made during statistical modeling. Students will learn to use discrepancy and ancillary statistics to check sampling models, utilize visual tools like residual and probability plots, and perform formal tests such as Chi-Squared and Fisher’s Exact tests. Additionally, the lesson covers Bayesian model checking through prior-data conflict analysis and warns against the statistical pitfalls of performing multiple simultaneous checks.
Learning Outcomes:
- Define and identify Ancillary Statistics and Discrepancy Statistics used to measure model deviations.
- Construct and interpret Standardized Residuals and Normal Probability Plots to assess normality and model fit.
- Apply the Chi-Squared Goodness of Fit Test and Fisher’s Exact Test to categorical and grouped data.
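A minimal sketch of the Chi-Squared Goodness of Fit Test on made-up die-roll counts, comparing the statistic to the 5% critical value of the chi-squared distribution with 5 degrees of freedom.

```python
# Chi-squared goodness of fit: is a die fair? Observed counts are made up.
observed = [18, 24, 16, 20, 25, 17]   # 120 rolls of a six-sided die
n = sum(observed)
expected = [n / 6] * 6                # counts expected under the uniform model

# X^2 = sum of (observed - expected)^2 / expected over the categories.
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# 0.95 quantile of the chi-squared distribution with 5 degrees of freedom.
critical_value = 11.07

print(f"X^2 = {chi2_stat:.2f} on {df} df")
print("reject fairness at the 5% level" if chi2_stat > critical_value
      else "no evidence against fairness at the 5% level")
```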
🔹 Lesson 10: Relationships Among Variables and Regression
Overview: This lesson explores how statistical models describe the dependencies between different variables. It progresses from the fundamental definition of relationship—based on changes in conditional distributions—to sophisticated modeling techniques including Simple and Multiple Linear Regression, Analysis of Variance (ANOVA) for categorical predictors, and Logistic Regression for binary responses. Students will learn to estimate parameters using the Method of Least Squares, evaluate model fit through R-squared and ANOVA decomposition, and validate assumptions via Residual Analysis.
Learning Outcomes:
- Define and identify relationships between variables based on conditional distributions.
- Apply the Method of Least Squares to estimate parameters in simple and multiple linear regression models.
- Utilize ANOVA decomposition and F-statistics to test the significance of predictors and identify interactions.
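The sketch below fits a simple linear regression by least squares using the closed-form solution and reports R-squared; the data are made up for illustration.

```python
# Simple linear regression y = b0 + b1 * x + error, fit by least squares.
# Closed-form solution: b1 = Sxy / Sxx, b0 = ybar - b1 * xbar.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up data
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))

b1 = sxy / sxx
b0 = ybar - b1 * xbar

# R^2: fraction of the variation in y explained by the fitted line.
ss_tot = sum((y - ybar) ** 2 for y in ys)
ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1.0 - ss_res / ss_tot

print(f"y_hat = {b0:.3f} + {b1:.3f} x,  R^2 = {r_squared:.4f}")
```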
🔹 Lesson 11: Introduction to Stochastic Processes
Overview: This lesson provides a comprehensive foundation in stochastic processes—systems that evolve randomly over time. Students will progress from discrete-time models, such as Simple Random Walks and Markov Chains, to advanced computational techniques like Markov Chain Monte Carlo (MCMC), and finally to continuous-time processes including Martingales, Brownian Motion, and Poisson Processes.
Learning Outcomes:
- Calculate probabilities for random walks and determine the probability of "ruin" in gambling models.
- Analyze Markov chains for irreducibility, periodicity, and stationary distributions.
- Design and explain Metropolis-Hastings and Gibbs sampling algorithms for complex distributions.
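A minimal random-walk Metropolis sketch (a special case of Metropolis-Hastings) targeting a standard Normal density; the proposal step size and chain length are arbitrary choices.

```python
import math
import random

# Random-walk Metropolis sampler targeting a standard Normal density.
# Proposals are symmetric, so the acceptance ratio reduces to
# pi(proposal) / pi(current); log densities avoid numerical underflow.
def log_target(x):
    return -0.5 * x * x   # log of exp(-x^2 / 2), up to an additive constant

def metropolis(n_steps, step=1.0):
    x = 0.0
    chain = []
    for _ in range(n_steps):
        proposal = x + random.gauss(0.0, step)
        # Accept with probability min(1, pi(proposal) / pi(x)).
        if math.log(random.random()) < log_target(proposal) - log_target(x):
            x = proposal
        chain.append(x)
    return chain

chain = metropolis(50_000)
burned = chain[5_000:]   # discard burn-in before summarizing
mean = sum(burned) / len(burned)
var = sum((v - mean) ** 2 for v in burned) / len(burned)
print(f"sample mean ~ {mean:.3f} (target 0), variance ~ {var:.3f} (target 1)")
```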