Preface
Statistical Modeling for Biomedical Researchers
by William D. Dupont (2002)
Publisher: Cambridge University Press
ISBN: 0-521-65578-1
Pages: 404
Price: £30.00+ p&p
Table of Contents
1 Introduction
- 1.1 Algebraic Notation
- 1.2 Descriptive Statistics
  - 1.2.1 Dot Plot
  - 1.2.2 Sample Mean
  - 1.2.3 Residual
  - 1.2.4 Sample Variance
  - 1.2.5 Sample Standard Deviation
  - 1.2.6 Percentile and Median
  - 1.2.7 Box Plot
  - 1.2.8 Histogram
  - 1.2.9 Scatter Plot
- 1.3 The Stata Statistical Software Package
  - 1.3.1 Downloading Data from My Web Site
  - 1.3.2 Creating Dot Plots with Stata
  - 1.3.3 Stata Command Syntax
  - 1.3.4 Obtaining Interactive Help from Stata
  - 1.3.5 Stata Log Files
  - 1.3.6 Displaying Other Descriptive Statistics with Stata
- 1.4 Inferential Statistics
  - 1.4.1 Probability Density Function
  - 1.4.2 Mean, Variance and Standard Deviation
  - 1.4.3 Normal Distribution
  - 1.4.4 Expected Value
  - 1.4.5 Standard Error
  - 1.4.6 Null Hypothesis, Alternative Hypothesis and P Value
  - 1.4.7 95% Confidence Interval
  - 1.4.8 Statistical Power
  - 1.4.9 The z and Student's t Distributions
  - 1.4.10 Paired t Test
  - 1.4.11 Performing Paired t Tests with Stata
  - 1.4.12 Independent t Test Using a Pooled Standard Error Estimate
  - 1.4.13 Independent t Test using Separate Standard Error Estimates
  - 1.4.14 Independent t Tests using Stata
  - 1.4.15 The Chi-Squared Distribution
- 1.5 Additional Reading
- 1.6 Exercises
2 Simple Linear Regression
- 2.1 Sample Covariance
- 2.2 Sample Correlation Coefficient
- 2.3 Population Covariance and Correlation Coefficient
- 2.4 Conditional Expectation
- 2.5 Simple Linear Regression Model
- 2.6 Fitting the Linear Regression Model
- 2.7 Historical Trivia: Origin of the Term Regression
- 2.8 Determining the Accuracy of Linear Regression Estimates
- 2.9 Ethylene Glycol Poisoning Example
- 2.10 95% Confidence Interval for y[x] = α + βx Evaluated at x
- 2.11 95% Prediction Interval for the Response of a New Patient
- 2.12 Simple Linear Regression with Stata
- 2.13 Lowess Regression
- 2.14 Plotting a Lowess Regression Curve in Stata
- 2.15 Residual Analyses
- 2.16 Studentized Residual Analysis Using Stata
- 2.17 Transforming the x and y Variables
  - 2.17.1 Stabilizing the Variance
  - 2.17.2 Correcting for Non-linearity
  - 2.17.3 Example: Research Funding and Morbidity for 29 Diseases
- 2.18 Analyzing Transformed Data with Stata
- 2.19 Testing the Equality of Regression Slopes
  - 2.19.1 Example: The Framingham Heart Study
- 2.20 Comparing Slope Estimates with Stata
- 2.21 Additional Reading
- 2.22 Exercises
3 Multiple Linear Regression
- 3.1 The Model
- 3.2 Confounding Variables
- 3.3 Estimating the Parameters for a Multiple Linear Regression Model
- 3.4 R² Statistic for Multiple Regression Models
- 3.5 Expected Response in the Multiple Regression Model
- 3.6 The Accuracy of Multiple Regression Parameter Estimates
- 3.7 Leverage
- 3.8 95% Confidence Interval for ŷᵢ
- 3.9 95% Prediction Intervals
- 3.10 Example: The Framingham Heart Study
  - 3.10.1 Preliminary Univariate Analyses
- 3.11 Scatterplot Matrix Graphs
  - 3.11.1 Producing Scatterplot Matrix Graphs with Stata
- 3.12 Modeling Interaction in Multiple Linear Regression
  - 3.12.1 The Framingham Example
- 3.13 Multiple Regression Modeling of the Framingham Data
- 3.14 Intuitive Understanding of a Multiple Regression Model
  - 3.14.1 The Framingham Example
- 3.15 Calculating 95% Confidence and Prediction Intervals
- 3.16 Multiple Linear Regression with Stata
- 3.17 Automatic Methods of Model Selection
  - 3.17.1 Forward Selection using Stata
  - 3.17.2 Backward Selection
  - 3.17.3 Forward Stepwise Selection
  - 3.17.4 Backward Stepwise Selection
  - 3.17.5 Pros and Cons of Automated Model Selection
- 3.18 Collinearity
- 3.19 Residual Analyses
- 3.20 Influence
  - 3.20.1 DFBETA Influence Statistic
  - 3.20.2 Cook's Distance
  - 3.20.3 The Framingham Example
- 3.21 Residual and Influence Analyses Using Stata
- 3.22 Additional Reading
- 3.23 Exercises
4 Simple Logistic Regression
- 4.1 Example: APACHE Score and Mortality in Patients with Sepsis
- 4.2 Sigmoidal Family of Logistic Regression Curves
- 4.3 The Log Odds of Death Given a Logistic Probability Function
- 4.4 The Binomial Distribution
- 4.5 Simple Logistic Regression Model
- 4.6 Generalized Linear Model
- 4.7 Contrast Between Logistic and Linear Regression
- 4.8 Maximum Likelihood Estimation
  - 4.8.1 Variance of Maximum Likelihood Parameter Estimates
- 4.9 Statistical Tests and Confidence Intervals
  - 4.9.1 Likelihood Ratio Tests
  - 4.9.2 Quadratic Approximations to the Log Likelihood Ratio Function
  - 4.9.3 Score Tests
  - 4.9.4 Wald Tests and Confidence Intervals
  - 4.9.5 Which Test Should You Use?
- 4.10 Sepsis Example
- 4.11 Logistic Regression with Stata
- 4.12 Odds Ratios and the Logistic Regression Model
- 4.13 95% Confidence Interval for the Odds Ratio Associated with a Unit Increase in x
  - 4.13.1 Calculating this Odds Ratio with Stata
- 4.14 Logistic Regression with Grouped Response Data
- 4.15 95% Confidence Interval for π[x]
- 4.16 95% Confidence Intervals for Proportions
- 4.17 Example: The Ibuprofen in Sepsis Trial
- 4.18 Logistic Regression with Grouped Data Using Stata
- 4.19 Simple 2x2 Case-Control Studies
  - 4.19.1 Example: The Ille-et-Vilaine Study of Esophageal Cancer and Alcohol
  - 4.19.2 Review of Classical Case-Control Theory
  - 4.19.3 95% Confidence Interval for the Odds Ratio: Woolf's Method
  - 4.19.4 Test of the Null Hypothesis that the Odds Ratio Equals One
  - 4.19.5 Test of the Null Hypothesis that Two Proportions are Equal
- 4.20 Logistic Regression Models for 2x2 Contingency Tables
  - 4.20.1 Nuisance Parameters
  - 4.20.2 95% Confidence Interval for the Odds Ratio: Logistic Regression
- 4.21 Creating a Stata Data File
- 4.22 Analyzing Case-Control Data with Stata
- 4.23 Regressing Disease Against Exposure
- 4.24 Additional Reading
- 4.25 Exercises
5 Multiple Logistic Regression
- 5.1 Mantel-Haenszel Estimate of an Age-Adjusted Odds Ratio
- 5.2 Mantel-Haenszel χ² Statistic for Multiple 2x2 Tables
- 5.3 95% Confidence Interval for the Age-Adjusted Odds Ratio
- 5.4 Breslow and Day's Test for Homogeneity
- 5.5 Calculating the Mantel-Haenszel Odds Ratio using Stata
- 5.6 Multiple Logistic Regression Model
- 5.7 95% Confidence Interval for an Adjusted Odds Ratio
- 5.8 Logistic Regression for Multiple 2x2 Contingency Tables
- 5.9 Analyzing Multiple 2x2 Tables with Stata
- 5.10 Handling Categorical Variables in Stata
- 5.11 Effect of Dose of Alcohol on Esophageal Cancer Risk
  - 5.11.1 Analyzing Model with Stata
- 5.12 Effect of Dose of Tobacco on Esophageal Cancer Risk
- 5.13 Deriving Odds Ratios from Multiple Parameters
- 5.14 The Standard Error of a Weighted Sum of Regression Coefficients
- 5.15 Confidence Intervals for Weighted Sums of Coefficients
- 5.16 Hypothesis Tests for Weighted Sums of Coefficients
- 5.17 The Estimated Variance-Covariance Matrix
- 5.18 Multiplicative Models of Two Risk Factors
- 5.19 Multiplicative Model of Smoking, Alcohol, and Esophageal Cancer
- 5.20 Fitting a Multiplicative Model with Stata
- 5.21 Model of Two Risk Factors with Interaction
- 5.22 Model of Alcohol, Tobacco, and Esophageal Cancer with Interaction Terms
- 5.23 Fitting a Model with Interaction using Stata
- 5.24 Model Fitting: Nested Models and Model Deviance
- 5.25 Effect Modifiers and Confounding Variables
- 5.26 Goodness-of-Fit Tests
  - 5.26.1 The Pearson χ² Goodness-of-Fit Statistic
- 5.27 Hosmer-Lemeshow Goodness-of-Fit Test
  - 5.27.1 An Example: The Ille-et-Vilaine Cancer Data Set
- 5.28 Residual and Influence Analysis
  - 5.28.1 Standardized Pearson Residual
  - 5.28.2 DFBETA Influence Statistic
  - 5.28.3 Residual Plots of the Ille-et-Vilaine Data on Esophageal Cancer
- 5.29 Using Stata for Goodness-of-Fit Tests and Residual Analyses
- 5.30 Frequency Matched Case-Control Studies
- 5.31 Conditional Logistic Regression
- 5.32 Analyzing Data with Missing Values
  - 5.32.1 Cardiac Output in the Ibuprofen in Sepsis Study
  - 5.32.2 Modeling Missing Values with Stata
- 5.33 Additional Reading
- 5.34 Exercises
6 Introduction to Survival Analysis
- 6.1 Survival and Cumulative Mortality Functions
- 6.2 Right Censored Data
- 6.3 Kaplan-Meier Survival Curves
- 6.4 An Example: Genetic Risk of Recurrent Intracerebral Hemorrhage
- 6.5 95% Confidence Intervals for Survival Functions
- 6.6 Cumulative Mortality Function
- 6.7 Censoring and Bias
- 6.8 Logrank Test
- 6.9 Using Stata to Derive Survival Functions and the Logrank Test
- 6.10 Logrank Test for Multiple Patient Groups
- 6.11 Hazard Functions
- 6.12 Proportional Hazards
- 6.13 Relative Risks and Hazard Ratios
- 6.14 Proportional Hazards Regression Analysis
- 6.15 Hazard Regression Analysis of the Intracerebral Hemorrhage Data
- 6.16 Proportional Hazards Regression Analysis with Stata
- 6.17 Tied Failure Times
- 6.18 Additional Reading
- 6.19 Exercises
7 Hazard Regression Analysis
- 7.1 Proportional Hazards Model
- 7.2 Relative Risks and Hazard Ratios
- 7.3 95% Confidence Intervals and Hypothesis Tests
- 7.4 Nested Models and Model Deviance
- 7.5 An Example: The Framingham Heart Study
  - 7.5.1 Univariate Analyses
  - 7.5.2 Multiplicative Model of DBP and Gender on Risk of CHD
  - 7.5.3 Using Interaction Terms to Model the Effects of Gender and DBP on CHD
  - 7.5.4 Adjusting for Confounding Variables
  - 7.5.5 Interpretation
  - 7.5.6 Alternate Models
- 7.6 Cox-Snell Generalized Residuals and Proportional Hazards Models
- 7.7 Proportional Hazards Regression Analysis using Stata
- 7.8 Stratified Proportional Hazards Models
- 7.9 Survival Analysis with Ragged Study Entry
  - 7.9.1 Kaplan-Meier Survival Curve and the Logrank Test with Ragged Entry
  - 7.9.2 Age, Sex, and CHD in the Framingham Heart Study
  - 7.9.3 Proportional Hazards Regression Analysis with Ragged Entry
  - 7.9.4 Survival Analysis with Ragged Entry using Stata
- 7.10 Hazard Regression Models with Time Dependent Covariates
  - 7.10.1 Cox-Snell Residuals for Models with Time-Dependent Covariates
  - 7.10.2 Testing the Proportional Hazards Assumption
  - 7.10.3 Alternative Models
- 7.11 Modeling Time-Dependent Covariates with Stata
- 7.12 Additional Reading
- 7.13 Exercises
8 Introduction to Poisson Regression: Inferences on Morbidity and Mortality Rates
- 8.1 Elementary Statistics Involving Rates
- 8.2 Calculating Relative Risks from Incidence Data Using Stata
- 8.3 The Binomial and Poisson Distributions
- 8.4 Simple Poisson Regression for 2x2 Tables
- 8.5 Poisson Regression and the Generalized Linear Model
- 8.6 Contrast Between Poisson, Logistic, and Linear Regression
- 8.7 Simple Poisson Regression with Stata
- 8.8 Poisson Regression and Survival Analysis
  - 8.8.1 Recoding Survival Data on Patients as Patient-Year Data
  - 8.8.2 Converting Survival Records to Person-Years of Follow-Up using Stata
- 8.9 Converting the Framingham Survival Data Set to Person-Time Data
- 8.10 Simple Poisson Regression with Multiple Data Records
- 8.11 Poisson Regression with a Classification Variable
- 8.12 Applying Simple Poisson Regression to the Framingham Data
- 8.13 Additional Reading
- 8.14 Exercises
9 Multiple Poisson Regression
- 9.1 Multiple Poisson Regression Model
- 9.2 An Example: The Framingham Heart Study
  - 9.2.1 A Multiplicative Model of Gender, Age and Coronary Heart Disease
  - 9.2.2 A Model of Age, Gender and CHD with Interaction Terms
  - 9.2.3 Adding Confounding Variables to the Model
- 9.3 Using Stata to Perform Poisson Regression
- 9.4 Residual Analyses for Poisson Regression Models
  - 9.4.1 Deviance Residuals
- 9.5 Residual Analysis of Poisson Regression Models Using Stata
- 9.6 Additional Reading
- 9.7 Exercises
10 Fixed Effects Analysis of Variance
- 10.1 One-Way Analysis of Variance
- 10.2 Multiple Comparisons
- 10.3 Reformulating Analysis of Variance as a Linear Regression Model
- 10.4 Non-parametric Methods
- 10.5 Kruskal-Wallis Test
- 10.6 Example: A Polymorphism in the Estrogen Receptor Gene
- 10.7 One-Way Analyses of Variance Using Stata
- 10.8 Two-Way Analysis of Variance, Analysis of Covariance, and Other Models
- 10.9 Additional Reading
- 10.10 Exercises
11 Repeated-Measures Analysis of Variance
- 11.1 Example: Effect of Race and Dose of Isoproterenol on Blood Flow
- 11.2 Exploratory Analysis of Repeated Measures Data Using Stata
- 11.3 Response Feature Analysis
- 11.4 Example: The Isoproterenol Data Set
- 11.5 Response Feature Analysis Using Stata
- 11.6 The Area-Under-the-Curve Response Feature
- 11.7 Generalized Estimating Equations
- 11.8 Common Correlation Structures
- 11.9 GEE Analysis and the Huber-White Sandwich Estimator
- 11.10 Example: Analyzing the Isoproterenol Data with GEE
- 11.11 Using Stata to Analyze the Isoproterenol Data Set Using GEE
- 11.12 GEE Analyses with Logistic or Poisson Models
- 11.13 Additional Reading
- 11.14 Exercises
Appendix A Summary of Stata Commands Used in this Text
References
Index
© Timberlake Consultants Limited
Last revised: 16/07/2007