Statistics 560: Introduction to Mathematical Statistics
Fall 2013
Course Syllabus

I make these assumptions about my students:

 

·        They have had at least two semesters of calculus and appreciate math

·        They value the use of statistics to solve problems

·        They have the intellectual curiosity to read books that give
a higher-level understanding of what is covered in the course

·        They value learning technology as a way to improve their job
prospects. I will use a variety of technologies to help students in this goal.

Course Information
 

·        Catalog Description: STAT 560 – Introduction to Mathematical Statistics (3) Probability, probability distributions, simulation of random variables, sampling distributions, central limit theorem, testing of hypotheses, confidence intervals, maximum likelihood methods, Bayesian methods. Credit Restriction: Not for credit for MS with a major in statistics or management science. Recommended Background: Mathematics 241. Comment(s): A course equivalent to Mathematics 241 also is acceptable.

·        Instructor: Ramón V. León

o   Email: rleon@utk.edu

o   Cell phone: 865 773 2245.

You can text me at just about any time. Normally, I will respond right away or as soon as I can. You can also call me, but I prefer that you text me first to set up a convenient time for us to talk or Skype. (See next.) If I don’t respond within a reasonable time, call me on the phone directly.

o   Skype: ramonvleon.

A Skype account is required for this course. Once you have one, send me a contact request, and make sure to state in the request that you are my student. Otherwise I will not accept your request, as I am very popular with the women of Ghana. Think of my Skype address as my virtual office. With Skype I can share my computer screen with you to show you how to work a problem or to demo software, and we can see each other, which increases rapport. Whenever my computer is on, I am accessible on Skype. You can also chat with me using Skype’s chat feature.

o   Office Hours: 3:30 – 4:30 MTWR using Skype (ramonvleon). I will frequently be in my office (SMC 249) at these times. Please text me to check whether I will be there on a given day or to request that I meet you in my office on a given day. You can also meet me by appointment.

 

·        Teaching Assistant:  Thomas Tilson

o   Email: ttilson1@utk.edu

o   Skype: tltilson

o   Office Hours: TR 5:00 – 6:00 via Skype (tltilson) or in person in the common area in front of my office, that is, in front of SMC 249

 

·        Class Meeting Time and Place: MW 5:05 – 6:20 p.m. via Blackboard Collaborate. This program can be accessed from Blackboard by going to the “Tools” tab and then selecting “Blackboard Collaborate.”

 

·        OneNote Website: This course is supported by a OneNote website that you can access from Blackboard. Just click on the OneNote tab in the upper-left corner of Blackboard. All the class notes and other course material are available there. If you are signed in to the site, as explained below, you will be able to download files by right-clicking on them. (This may not work in Chrome; if so, just use Firefox or Internet Explorer.)

 

·        Windows Live Account: To download files from the course’s OneNote website by right-clicking on them, you need to have a Windows Live account and be signed in. You automatically have a Windows Live account if you have an email address with one of these suffixes: @hotmail.com, @outlook.com, or @live.com. If you sign in as you normally would to check your email, you will also be signing in to your Windows Live account and will thus be able to download files by right-clicking on them. (Again, this may not work in Chrome; if so, just use Firefox or Internet Explorer.) If you do not have one of these email addresses (or, what is the same thing, a Windows Live account), you can register for one here. Warning: At the OneNote website you can sign in using your UT ID and password, but for some reason signing in that way will not give you the ability to download files by right-clicking on them. Bottom line: You need to open a Windows Live account directly from Microsoft.

·        Textbook: None, since I supply very complete notes of my lectures. I will also refer you to Wikipedia and other web resources.

 

·        JMP Pro: JMP Pro statistical software (version 10.0) will be used throughout the course. JMP is very easy to learn, and I will be demoing it throughout the course. Both PC and Mac versions can be downloaded for free at the following web address: https://web.dii.utk.edu/softwaredistribution/. After logging into this site, click on “SAS”, then select JMP Pro 10.0 for Windows or JMP Pro 10.0 for Mac. Scroll up or down to see the “Download selected item” button. Detailed instructions for downloading and installing this software can be found at http://web.utk.edu/~cwiek/JMPinstall/. We strongly encourage you to obtain this software for your own computer. However, JMP software can also be accessed at many of the computing labs on campus, and through the “APPS Server” at http://apps.utk.edu/

·        Exams: There will be a take-home midterm and a take-home final exam. (The course schedule will be provided later.)

·        Optional Project: The project is optional and not really necessary for this mathematical course. However, if you have some data pertaining to your job that you would like my help in analyzing, I am game. If you decide to do a project, you need to send me a written proposal as an MS Word document describing the data set and the questions that you plan to answer with it. You need to submit with your proposal a JMP file containing the data. For each column of your JMP file, you need to use the Notes feature of JMP, found in the Column Properties drop-down menu, to fully explain what the variable in the column is. It is not enough to simply provide the necessarily cryptic column name. Please use this form to write the proposal.

 

Upon my approval of this proposal, you are to analyze the data and write a report with your analysis and conclusions. The report should be submitted as an MS Word file. The JMP file with the project data should be attached. This file should have the column notes required in the proposal and have your most important JMP analyses saved as scripts to the data table. (I will show you how to do this.) The report should be at most eight pages long and should contain a summary of the steps you used in your analysis. You should include only a few selected graphs and tables. The final model that you used in the analysis should be clearly stated at the end of the report with a summary of the reasons why you selected it. Use this form to write the project report.

 

·       Assignments: All assignments will be made available in the Submission tab of Blackboard, where you will also submit them. Assignments should be submitted as MS Word files, not as files from any other word processor or as PDFs. This makes it easier for us to give you detailed feedback. For several assignments you will have the opportunity to submit twice. The first submission will be graded on effort. The second submission will be graded on correctness, after we go over what students had difficulty with in the first submission during an online help session held outside class time. (There will also be comments particular to you when we return your first submission.)

 

·        Book and Other Reports: You must use this form to do your report. You need to write reports on:

 

o   Calculated Risks: How To Know When Numbers Deceive You by Gerd Gigerenzer

o   Part 1 (Ch. 1 to 8) of Uncontrolled: The Surprising Payoff of Trial-and-Error for Business, Politics, and Society by Jim Manzi 

o   Those who have done reports on one of these books in one of my earlier classes should instead do a report on one of these books:

§  Thinking, Fast and Slow by Daniel Kahneman

§  The Black Swan: Second Edition: The Impact of the Highly Improbable by Nassim Nicholas Taleb.

§  Against the Gods: The Remarkable Story of Risk by Peter L. Bernstein

 

o   Other short reports may be assigned throughout the semester and will count as homework. For these you don't need to use the book report form.

 

·        Grade: Your course grade will be computed as follows: 

 

Percent of Grade    Activity
25%                 Midterm exam
30%                 Final exam
10%                 Book report on Calculated Risks
10%                 Book report on Uncontrolled
20%                 Homework
5%                  Surveys (These will be conducted after every class, primarily to find out what needs more explanation.)
100%                Course Score

 

·        Letter grades will be assigned using the usual scale (90+ is an A, and so on).
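
The course score described above is a weighted average of the six graded components. The short Python sketch below is only an illustration of that arithmetic, not an official grade calculator: the weights come from the grade table, while the dictionary keys, the function name course_score, and the example scores are hypothetical and assume each component is scored on a 0 to 100 scale.

```python
# Weights (percent of grade) copied from the grade table in this syllabus.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "midterm": 25,
    "final": 30,
    "report_calculated_risks": 10,
    "report_uncontrolled": 10,
    "homework": 20,
    "surveys": 5,
}


def course_score(scores):
    """Return the weighted course score, assuming each component is on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


# Made-up scores for an imaginary student, for illustration only.
example_scores = {
    "midterm": 85,
    "final": 90,
    "report_calculated_risks": 100,
    "report_uncontrolled": 95,
    "homework": 92,
    "surveys": 100,
}

print(f"Course score: {course_score(example_scores):.2f}")
```

Because the weights sum to 100, a student who earns the same score on every component receives exactly that score for the course.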

·        Course schedule: To be provided. 

·        Attendance: Preferably, you should listen to the lectures in real time so that you can interact with me and other students. In cases where this is impossible you must listen to the lecture recordings available on Blackboard. Blackboard Collaborate will tell me who watched the lectures either in real-time or via their recording.

·        Disability: If you need course adaptations or accommodations because of a documented disability or if you have emergency information to share, please contact the Office of Disability Services at 191 Hoskins Library at 974-6087. This will ensure that you are properly registered for services.

Tentative Course Topics

1.      Bayes Analysis of Mammograms

·        Should women in their forties get mammograms?

·        Reasons for the recent doctors' recommendation that they should not.

o   Breast cancer rate among these women

o   Sensitivity and specificity of mammograms.

·        P(Cancer | Positive mammogram) as a function of the base rate of cancer in the population of interest

o   Plot of this probability versus the base rate

·        Connection to Bayesian inference in general
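
As a rough illustration of the topic above, the dependence of P(Cancer | Positive mammogram) on the base rate follows directly from Bayes' rule. The Python sketch below is not part of the course materials (the course itself uses JMP). The sensitivity and specificity values are placeholders chosen only for illustration, not actual mammogram figures, and the function name is hypothetical.

```python
def posterior_given_positive(base_rate, sensitivity, specificity):
    """P(Cancer | Positive test) computed with Bayes' rule."""
    true_positive = sensitivity * base_rate                   # P(Positive and Cancer)
    false_positive = (1.0 - specificity) * (1.0 - base_rate)  # P(Positive and No cancer)
    return true_positive / (true_positive + false_positive)


# Placeholder test characteristics, assumed for illustration only.
SENSITIVITY = 0.90  # P(Positive | Cancer)
SPECIFICITY = 0.93  # P(Negative | No cancer)

# Tabulate the posterior probability over a range of base rates.
for base_rate in (0.001, 0.01, 0.05, 0.10, 0.50):
    p = posterior_given_positive(base_rate, SENSITIVITY, SPECIFICITY)
    print(f"base rate {base_rate:.3f} -> P(Cancer | Positive) = {p:.3f}")
```

The point of tabulating (or plotting) the result is that, for a fixed test, the posterior probability can be quite small when the base rate is low, even if the test itself looks accurate.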

2.      What Is Statistics?

·        Drawing conclusions from data

·        Reasoning under uncertainty

·        Variation

 

3.      Describing Data

·        Bar charts

·        Histograms

·        Mean, variance and standard deviation

·        Median and IQR

·        Five-number summary

·        Box plots

o   Calculation and construction

o   Comparing distributions using them

4.      Reliability Data

·        Characteristics of reliability data

o   Right censoring

o   Left censoring

o   Interval censoring

o   Truncation

·        Entering reliability data in JMP

·        Life tests versus inspection data

·        Reliability data examples

o   Ball bearings

o   Integrated circuits

o   Shock absorbers: multiple failure modes

o   Heat exchangers

o   Turbine wheels inspections data

o   Circuit pack truncated data resulting from factory burn in

·        Estimates of the cumulative distribution function

o   Empirical distribution function

o   Kaplan-Meier estimate of the distribution function when one has right censoring

 

5.      Opinion Polls and Confidence Intervals for Proportions

·        Margin of error

·        Populations versus samples

·        Random samples

·        Definition of confidence based on repeated random samples

o   Simulation

·        Confidence intervals for proportions

o   Formula

o   Heuristic derivation of this formula

·        How JMP calculates confidence intervals

·        Conservative calculation of ME used in polls where p is assumed to be equal to 0.5 regardless of the sample proportions

·        ME as a function of sample size

o   Calculation of sample size for a desired ME

·        Non-effect of population size on the ME if the random sample is less than 10% of the population.

o   Justification using the finite population correction
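
The margin-of-error ideas in this topic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's JMP workflow: it assumes the usual normal-theory formula ME = z * sqrt(p(1 - p)/n) with z = 1.96 for 95% confidence, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # approximate standard normal multiplier for 95% confidence (assumed level)


def margin_of_error(n, p=0.5, z=Z_95):
    """Normal-theory margin of error for a proportion; p=0.5 gives the conservative value."""
    return z * math.sqrt(p * (1.0 - p) / n)


def sample_size_for(me, p=0.5, z=Z_95):
    """Smallest sample size whose margin of error is at most `me`."""
    return math.ceil(z ** 2 * p * (1.0 - p) / me ** 2)


def finite_population_correction(n, population_size):
    """Factor that multiplies the margin of error when sampling without replacement."""
    return math.sqrt((population_size - n) / (population_size - 1))


print(f"ME for n=1000 (conservative): {margin_of_error(1000):.4f}")
print(f"n needed for ME of 0.03: {sample_size_for(0.03)}")
print(f"Correction factor, n=1000 of N=10000: {finite_population_correction(1000, 10000):.4f}")
```

When the sample is 10% of the population, the correction factor is about 0.95, so ignoring the population size changes the margin of error by only about 5%. For smaller sampling fractions the factor is even closer to 1.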

6.      Bootstrap Confidence Intervals for Proportions

·        Concepts

·        Calculating them using JMP

·        Contrast between normal theory confidence intervals and the bootstrap ones

 

7.      Normal Distribution

·        Density function

·        Cumulative distribution function

·        Population mean and variance

·        Empirical (68-95-99.7) rule

·        Standardization

·        Calculations of probabilities based on online applet

·        Calculation of percentiles and vice versa using applet

·        Normal probability plots

·        Central Limit Theorem

·        Simulation of normal random variables

8.      Distributions in General

·        Random variables

·        Describing distributions

o   Cumulative distribution function

o   Density function and probability mass function

o   Survival function

o   Quantiles and percentiles

o   Hazard rate and its interpretation


 

 

9.      Concepts of Interest for Reliability and Maintainability Engineers

·        Bathtub curve of hazard rate: infant mortality, useful life and wear-out phases

·        Equivalence of mortality rates and (hazard) failure rates.

·        Social security life tables and mortality rates

·        B10

·        Problems with using the mean when there is right censoring.

 

10.   Distributions of Particular Interest for Reliability and Maintainability Engineers

·        Exponential

o   Memoryless property and its interpretation in terms of wear

o   Calculation of its mean and variance using symbolic mathematics via Wolfram Alpha.

·        Weibull

o   Interpretation of alpha and beta parameters

·        Lognormal

·        How the Weibull and the lognormal compare

·        Gamma

·        Simulation

11.   Analyzing Reliability Data

·        JMP’s Life Distribution and Reliability platforms

·        Heuristic interpretation of JMP output

·        Probability plotting to identify best distributions

·        Individual versus simultaneous confidence bands

·        Multiple modes of failure: competing risks

12.   Maximum Likelihood Estimation

·        Binomial case with heuristic interpretation

·        Review of MLE asymptotic theory and associated formulas

·        MLE for the exponential distribution with right censoring using asymptotic theory

13.   Basic Probability

·        Terminology of randomness

·        Law of large numbers

·        Nonexistence of “Law of Averages”

·        Types of probability

o   Classical probability based on symmetry

o   Frequentist

o   Personal

o   Axiomatic

·        Probability theorems

o   Venn diagrams

o   Conditional probability

o   Independence

o   Law of Total Probability and its derivation

o   Bayes rule and its derivation


 

14.   Bayesian Inference

·        Concepts

o   Priors

o   Using gambling odds for the elicitation of the prior

o   Likelihood

o   Posteriors

o   Credibility intervals

o   Prediction

·        Bayesian updating: first-principles calculations

o   Mammogram data revisited

o   Coin tossing with a three-point prior (mass at 0, 0.5 and 1)

o   Oranges and apples

·        Bayesian updating: Conjugate priors

o   Family of life distributions having conjugate priors

o   Bayesian inference for proportion using the Beta conjugate prior

·        Bayesian updating:  Markov chain Monte Carlo (MCMC)

o   Rejection-acceptance algorithm: mechanics, software, heuristic interpretations, and derivation

o   Application in reliability engineering when the engineer has information about the hazard rates

·        Informative versus non-informative priors

·        Bayesian network diagrams

 

15.   Comparison of the Normal Theory and Bootstrap Confidence Intervals with Credibility Intervals

·        Review of how confidence intervals are defined

·        Review of how credibility intervals are defined

·        Population proportion case

·        Numerical examples

16.   Concepts in the Testing of Hypotheses

·        Null vs. alternative hypotheses

·        Reasoning used in testing

·        P-values and tests of significance

o   Examples

o   The Higgs boson and what physicists mean when they say that they have 5-sigma evidence for its existence

·        More concepts important in testing

o   Statistical significance

o   Practical significance

o   Type I and II errors

o   Alpha

o   Beta

o   Tension between alpha and beta.

o   Power and OC curves

o   What affects power: effect size, alpha level and sample size

o   Determining the sample size for a given alpha, important effect size and corresponding beta

·        Relationship between two-sided tests and confidence intervals

·        Chi-square test

17.   Standard Deviation as a Ruler

·        Changes in location and scale and standardization

o   Celsius versus Fahrenheit scales

·        z-scores

·        When is a z-score big?

·        Empirical rule revisited

18.   Classical Normal Theory Confidence Intervals

·        Joint derivation based on parameter estimates and standard errors

·        Standard deviation as a ruler revisited

o   Standard errors

·        List of parameters involved

o   Proportions

o   Difference of proportions

o   Means

o   Difference of means: independent samples

o   Difference of means: paired samples

·        Examples of the problems that these confidence intervals address

·        Assumptions

o   Randomization

o   Independence

o   10% condition

·        Calculation of them using JMP

o   Data entry

o   Interpretation of JMP output in context

19.   Simple Regression

·        Logistic regression

·        Simple linear regression

o   Assumptions and how to check them

o   Outliers and high leverage points

·        Interpretation of JMP output

·        Regression wisdom

 

20.   Regression with Two Independent Variables

·        Continuous regressors

·        Categorical regressors and dummy variables

·        Effect of a variable after adjusting for the effect of another variable.

·        Interactions

o   Interaction in the context of one categorical and one continuous regressor

o   Analysis of covariance

o   Interactions when one has two continuous regressors

·        Multicollinearity

·        Example of general multivariate regression

o   Brief discussion of JMP output


 

 

21.   Further Topics in Regression: Highlights

·        Generalized Linear Models

o   Framework

o   Link function

o   JMP Generalized linear model platform

·        Weibull regression with right-censored data

o   Model

o   Accelerated life testing with temperature as a regressor

o   JMP output interpretation on the basis of an example

·        Poisson regression for counts and rates

o   Model

o   Over-dispersion

o   Excessive numbers of zeros

o   JMP output interpretation on the basis of an example

·        Cox proportional hazards model

o   Model

o   JMP output interpretation on the basis of an example

 

In addition to the lectures on the scheduled material, there will be occasional enrichment lectures motivated by students’ interests.