Course taught as: B016441 - STATISTICAL INFERENCE, Second Cycle Degree in ECONOMICS AND DEVELOPMENT, Curriculum DEVELOPMENT ECONOMICS
Teaching Language
English
Course Content
The course aims at providing the cornerstones of inferential statistics: the concept of statistical model, the tools of point estimation, interval estimation and statistical hypothesis testing. The simple linear regression model is also introduced.
Hogg, R. V., Tanis, E. A., Zimmerman, D. L., Probability and Statistical Inference, Pearson, 9th edition or later
or
Mittelhammer, R. C., Mathematical Statistics for Economics and Business, Springer, 2nd edition or later
This second textbook does not include the linear regression part, for which additional material is provided.
Learning Objectives
Statistics deals with collecting, organizing and interpreting numerical data. Statistical literacy is an essential skill for understanding and taking sensible decisions based on the analysis of numerical information. Within this framework, the course aims at providing the cornerstones of inferential statistics: the concept of statistical model, the tools of point estimation, interval estimation, statistical hypothesis testing, and linear regression.
Prerequisites
1) Mathematical concepts: basic operations and properties; capital sigma (generalized sum) and pi (generalized product) operators and their properties; functions; special functions (power, exponential, logarithm); derivatives; basic notions of series and integrals
2) At least 9 CFU in Statistics
3) At least the basic notions of probability: random experiment, sample space, events, probability and its properties, conditional probability and its properties
4) At least the basic notions of random variables (r.v.'s): definition; discrete vs continuous r.v.'s; cumulative distribution function (cdf), probability mass function (pmf), probability density function (pdf); expectations, with special focus on mean and variance.
5) At least the basic notions of multiple r.v.'s: definition; joint, marginal and conditional distributions; expectations, with special focus on covariance and correlation coefficient.
Students lacking this background can review the corresponding chapters of the textbook.
Teaching Methods
Traditional lectures based on printed notes, which are uploaded to Moodle.
Further information
A preliminary short course is offered before the starting date.
All material is available on the Moodle page.
Type of Assessment
The exam consists of two parts:
1) A written test with exercises. Students may use formulas written on the back of the nine sheets of statistical tables. Duration: 1 hour and 30 minutes; weight: 60%.
2) An oral exam with questions on the theory. Duration: 30 minutes; weight: 40%.
Students have to show that they understand the basic notions of inference and the most commonly used probabilistic models. They must demonstrate not only the ability to apply a formula, but also the ability to understand the scenario at hand, choose a suitable probabilistic model, and select and employ the appropriate inferential procedures for the goal in question.
Course program
Preliminary short course
Probability
1) Random Variable (r.v.): definition; examples; domain (support) of a r.v.; discrete and continuous r.v.'s. Discrete r.v.: distribution of a discrete r.v. via probability mass function (p.m.f.); properties; examples.
2) Discrete r.v. The distribution of a discrete r.v. via the cumulative distribution function (c.d.f.). Properties of the c.d.f. Examples. Expectations of discrete r.v.'s: mean, variance, standard deviation (s.d.).
3) Discrete r.v. The mean and the variance of some transformations of a r.v. X: mean and variance of a constant (c), of the de-meaned r.v. (X - mu), of the standardized r.v. (X - mu)/sigma, of a linear transformation (a + b X). Continuous r.v. Motivations: why the p.m.f. does not make sense while the c.d.f. can still play a role. Using the c.d.f. to compute probabilities.
4) Continuous r.v. Definition, interpretation and properties of the p.d.f. Link between the c.d.f. and the p.d.f. of the same r.v. Expectations of continuous r.v.'s, with a specific emphasis on the mean and on the variance.
5) Multiple r.v.: definition; examples; domain of a multiple r.v.; discrete, continuous and mixed multiple r.v.'s. Multiple discrete r.v.: definition of the joint p.m.f.; relationships with the marginal p.m.f. and the conditional p.m.f.; properties.
6) Multiple discrete r.v. Expectations involving multiple discrete r.v.'s: mean, variance and standard deviation of the marginal components; covariance and correlations between couples of random variables and their interpretation. Multiple continuous r.v. Definition of the joint p.d.f.
7) Multiple continuous r.v. Properties of the joint p.d.f. Joint, marginal and conditional p.d.f.'s. Expectations involving multiple continuous r.v.'s: mean, variance and standard deviation of the marginal components; covariance and correlations between couples of random variables. Multiple r.v. Independence of r.v.'s; independence versus absence of correlation. Examples.
8) Multiple r.v. Properties of covariance and correlation coefficient. Mean and variance of a portfolio (linear combination) of random variables and some useful special cases. Special r.v.'s. Summary of the points touched in handling special r.v.'s: definition (in terms of p.m.f. or p.d.f.); main expectations (mean and variance); properties; some practical examples (when possible).
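The portfolio formulas of point 8 can be checked numerically. The sketch below uses simulated data of my own choosing (the distributions and weights are purely illustrative, not part of the course material) to verify that E[aX + bY] = aE[X] + bE[Y] and Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y) hold for the empirical moments.

```python
# Numeric check of the mean and variance of a portfolio aX + bY.
# The simulated data below are illustrative assumptions, not course material.
import random

random.seed(0)
n = 200_000
# Two correlated r.v.'s: Y = X + noise (an arbitrary illustrative choice)
xs = [random.gauss(1.0, 2.0) for _ in range(n)]
ys = [x + random.gauss(0.5, 1.0) for x in xs]

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    # Empirical covariance; cov(u, u) is the empirical variance
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

a, b = 0.3, 0.7  # portfolio weights
port = [a * x + b * y for x, y in zip(xs, ys)]

lhs_var = cov(port, port)
rhs_var = a**2 * cov(xs, xs) + b**2 * cov(ys, ys) + 2 * a * b * cov(xs, ys)
print(lhs_var, rhs_var)  # the two values coincide up to rounding
```

Because the empirical covariance is bilinear, the two sides agree exactly (up to floating-point error), mirroring the algebraic identity proved in class.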
Regular course
9) Special r.v.'s. The Bernoulli r.v. The Binomial r.v.; The Poisson r.v.
10) Special r.v.'s. The Continuous Uniform r.v. The Normal (or Gaussian) r.v.
11) Special r.v.'s. The use of the Standard Normal tables to compute probabilities and intervals with Normal r.v.'s
12) Special r.v.'s. The Gamma r.v., the Chi-squared r.v., the Student-T r.v., the Fisher-F r.v.
13) Point estimation. Introduction to the concepts of population, sample, parameter, statistic and estimator, statistic value and estimate, sample distribution of a statistic and related synthetic indices.
14) Point estimation. Properties of estimators: the mean squared error (MSE) and the concept of relative and absolute efficiency. In quest of the most efficient estimator: motivations for applying some restriction to the set of possible estimators taken into account; decomposition of the MSE as variance plus bias^2; unbiased estimators.
15) Point estimation. In quest of the most efficient estimator: the Cramer-Rao bound as benchmark for checking the absolute efficiency of unbiased estimators. The Maximum Likelihood (ML) method: definition of likelihood, log-likelihood, score vector. The ML method at work: the estimation of p in the Bernoulli model.
16) Point estimation. The ML method at work: the estimation of lambda in the Poisson model; the estimation of mu and/or sigma^2 (depending on if one or both parameters are unknown) in the Normal model.
17) Point estimation. Derivation of the properties (sample distribution, bias, variance, MSE, check of the Cramer-Rao bound for unbiased estimators) of the ML estimators computed.
18) Point estimation. ML estimation of parameters of the Gamma model as a motivation for introducing asymptotic properties. Asymptotic properties: consistency, asymptotic unbiasedness, asymptotic efficiency, asymptotic sample distribution.
19) Point estimation. ML estimators as C.A.N.E. (Consistent Asymptotically Normal Efficient) estimators. Interval Estimation. Introduction to the statistical problem by comparing interval estimation with point estimation.
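The ML method "at work" on the Bernoulli model (points 15 and 19) admits a short numerical sketch. The sample below is an illustrative assumption; the code checks that the grid maximizer of the log-likelihood l(p) = sum_i [x_i log p + (1 - x_i) log(1 - p)] agrees with the closed-form ML estimate, the sample mean.

```python
# ML estimation of p in the Bernoulli model: closed form vs numerical search.
# The data are an illustrative assumption, not taken from the course notes.
import math

sample = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # illustrative Bernoulli sample

def log_likelihood(p, xs):
    # l(p) = sum_i [ x_i log p + (1 - x_i) log(1 - p) ]
    return sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in xs)

# Closed-form ML estimate: the sample mean
p_hat = sum(sample) / len(sample)

# Numerical check: the log-likelihood over a fine grid peaks at p_hat
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: log_likelihood(p, sample))
print(p_hat, p_grid)
```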
20) Interval Estimation. Definition of interval estimate (confidence interval), confidence level, size of the interval. The Pivot method for finding confidence intervals: definition of pivot quantity and illustration of how the method works in practice. Interval Estimation. Pivots and corresponding intervals for: the mean of a Normal r.v. (variance known).
21) Interval Estimation. Pivots and corresponding intervals for: the mean of a Normal r.v. (variance unknown); the variance and the s.d. of a Normal r.v. (mean known and unknown).
22) Interval Estimation. Pivots and corresponding intervals for: the probability of a Bernoulli r.v.; the mean of a Poisson r.v. How to use the theory behind interval estimation for computing the sample size of a survey aiming at estimating a probability or a mean.
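The pivot method of point 20 can be sketched for its simplest case, the mean of a Normal r.v. with known variance, where the pivot Z = (Xbar - mu)/(sigma/sqrt(n)) is standard Normal. The sample, the assumed sigma and the 95% level below are illustrative choices, not course data.

```python
# 95% confidence interval for a Normal mean, variance known (pivot method).
# Sample values and sigma are illustrative assumptions.
import math

sample = [4.8, 5.1, 4.9, 5.3, 5.0, 4.7, 5.2, 5.0]  # illustrative data
sigma = 0.2          # standard deviation, assumed known
z = 1.959964         # 97.5% quantile of the standard Normal

n = len(sample)
xbar = sum(sample) / n
half = z * sigma / math.sqrt(n)       # half-width of the interval
ci = (xbar - half, xbar + half)
print(ci)
```

With variance unknown, the same construction is repeated with the sample s.d. and a Student-T quantile in place of z, as covered in point 21.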
23) Testing Hypotheses. Motivations, framework, definition of statistical hypothesis (simple and composite), definition of statistical test.
24) Testing Hypotheses. Table of decisions, type I and type II errors, significance level and power of a test. The Neyman-Pearson lemma and ensuing remarks. Examples.
25) Testing Hypotheses. Comparison of different specifications of the alternative hypothesis (pointwise, unidirectional, bidirectional) and consequences on the rejection region. More on the role of the power of a test. The factors influencing the power of a test.
26) Testing Hypotheses. Testing hypotheses concerning: the mean parameter of a Normal r.v. (cases sigma^2 known and sigma^2 unknown); the probability parameter of a Bernoulli r.v. The p-value: definition, computation and interpretation.
27) Testing Hypotheses. Testing hypotheses concerning: the variance of a Normal r.v. (cases mu known and mu unknown); the difference between the probabilities of two independent Bernoulli distributions (and remarks on point estimation and interval estimation in the same situation).
28) Testing Hypotheses. Testing hypotheses concerning: the difference between the means of two Normal r.v.'s by means of independent samples (with the two variances known; with large samples and the two variances unknown; with the two variances unknown but equal and, related to this case, the pooled sample variance).
29) Testing Hypotheses. Testing hypotheses concerning: the difference between the means of two Normal r.v.'s, by means of independent samples, with the Satterthwaite-Welch statistic; the difference between the means of two Normal r.v.'s by means of paired data.
30) Testing Hypotheses. Asymptotic tests: Likelihood-ratio, Score and Wald tests.
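The test on a Bernoulli probability and the p-value of point 26 can be sketched with the usual large-sample Normal approximation. The counts and the null value p0 below are illustrative assumptions; the standard Normal c.d.f. is obtained from the error function.

```python
# Large-sample two-sided test of H0: p = p0 for a Bernoulli probability,
# with approximate pivot Z = (p_hat - p0) / sqrt(p0 (1 - p0) / n).
# The data (n, successes) and p0 are illustrative assumptions.
import math

def std_normal_cdf(z):
    # Phi(z) via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, successes = 100, 62   # illustrative data
p0 = 0.5                 # null hypothesis value

p_hat = successes / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 2 * (1 - std_normal_cdf(abs(z)))   # two-sided p-value
print(z, p_value)
```

A p-value below the chosen significance level (e.g. 0.05) leads to rejecting H0, matching the definition and interpretation covered in point 26.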
31) Linear Regression Model. Introduction; model definition and corresponding properties; Ordinary Least Squares (OLS) estimators of the parameters; fitted values and residuals.
32) Linear Regression Model. Properties of OLS estimators: their sample distribution; Best Linear Unbiased Estimators (BLUE) and discussion of the Gauss-Markov property. Examples.
33) Linear Regression Model. Deviance decomposition and R^2 index; predictions of the conditional mean and of the dependent variable for a given value of the independent variable.
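The regression topics of points 31-33 can be sketched in a few lines: OLS estimates of the two parameters, fitted values, residuals, and the R^2 index from the deviance decomposition. The data are an illustrative assumption, not course material.

```python
# Simple linear regression y = beta0 + beta1 x + error: OLS estimates,
# fitted values, residuals and R^2. The data are illustrative assumptions.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n

# Sums of deviations entering the OLS formulas
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
sxx = sum((x - xbar) ** 2 for x in xs)

beta1 = sxy / sxx                # OLS slope
beta0 = ybar - beta1 * xbar      # OLS intercept

fitted = [beta0 + beta1 * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# Deviance decomposition: total = explained + residual, hence R^2
tss = sum((y - ybar) ** 2 for y in ys)
rss = sum(e ** 2 for e in residuals)
r2 = 1 - rss / tss
print(beta1, beta0, r2)
```

Predictions of the conditional mean for a new value x0 (point 33) are then obtained as beta0 + beta1 * x0.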