A Short Introduction to Stata for Biostatistics
(book + CD with data and results of exercises)

by Michael Hills and Bianca L. De Stavola (2006)

Publisher: Timberlake Consultants Press
ISBN: 0-9552127-1-5
Pages: 172
Price: £25.00 + p&p


Preface

This book is a revised version of a book with a similar title which was prepared for Stata 7. The datasets and programs have been updated to Stata 8, and the book has been expanded to include new chapters on dialog boxes, the new graph commands, likelihood ratio tests, and Mantel-Haenszel methods. Several of the original chapters have also been expanded.

Starting to use Stata is relatively simple, but because of its size, and the wealth of information in the guide and manuals, the question of what comes next can be rather daunting. This book provides a short introduction which will help answer this question. Although written with biostatisticians in mind, much of the material in the book is equally relevant to other disciplines.

We believe that the only way to learn Stata is to try it out, so we assume that the reader is seated in front of a computer which is running Stata. For this reason we have not felt it necessary to print the output which follows Stata commands. Of course there are occasions when it is reassuring to see that the output on the screen is the same as the output on the page, and we occasionally include some output for this reason, but the spirit of the book is to try something and see what happens. In keeping with this we have not printed solutions to the exercises; instead, a program displays the solutions on the screen.

We have mostly used official Stata commands but, as one of the features of Stata is the large number of user-contributed commands, we have also felt free to use a few of these, where appropriate. In particular, Chapters 10 - 12 depend on two commands, written specifically for this book, which provide dialog boxes for making tables and estimating effects.

The datasets and additional program files are an integral part of the book. They are included on the CD-ROM which comes with the book, but are also available as a download from www.timberlake.co.uk or www.stata.com. Instructions on how to proceed with each of these media are given in Chapter 0: Getting Started. We would be grateful to be notified of any errors: as and when the need arises, updates and an errata file will be included on the websites.

The ideas in Chapters 10 - 12 arose from courses taught jointly by Michael Hills and David Clayton, and we gratefully acknowledge David's contribution. We are also grateful to Nick Cox for reading a draft of the Stata 7 book and making many helpful suggestions.



Contents

0 Getting Started

1 Some Basic Commands

1.1 The births data
1.2 A first look at the data
1.3 Tables of frequencies
1.4 Tables of means and other things
1.5 Restricting the scope of commands
1.6 Generating new variables
1.7 Ordering, dropping and keeping
1.8 Sorting data
1.9 Using Stata as a calculator
1.10 Shortcuts
1.11 Stata syntax
1.12 Using the Stata help facilities

2 Tabs, Menus and Dialog Boxes

2.1 Where to find the dialog boxes
2.2 A first look at the data
2.3 Tables of frequencies
2.4 Tables of means and other things
2.5 Restricting the scope of commands
2.6 Generating new variables
2.7 Ordering, dropping and keeping
2.8 Sorting data
2.9 Using Stata as a calculator

3 Housekeeping

3.1 Labelling a dataset
3.2 Notes
3.3 Labelling variables and their values
3.4 Data types and display formats
3.5 Recoding a variable
3.6 Missing values
3.7 Dates
3.8 Saving files
3.9 Log files
3.10 Do files

4 Data Input and Output

4.1 Data sources
4.2 Data from a spreadsheet
4.3 Data from a wordprocessor
4.4 Large datasets
4.5 More about dictionary files
4.6 Loading data from the keyboard
4.7 Data output
4.8 Import and export to other packages

5 Graph Commands

5.1 Box plots
5.2 Histograms
5.3 Scatter plots
5.4 Overlaying graphs
5.5 Line plots
5.6 Cumulative distribution plots
5.7 Adding lines
5.8 Graph titles
5.9 Titles and labels for axes
5.10 Naming, saving, and combining graphs
5.11 Printing and exporting graphs
5.12 Schemes
5.13 Help for graphics

6 Graph Dialog Boxes

6.1 Histograms
6.2 Box plots
6.3 Bar charts
6.4 Twoway graphs

7 More Basic Tools

7.1 The return list
7.2 Generating variables using functions
7.3 Grouping the values of a variable
7.4 Comparing two means or two proportions
7.5 Weights
7.6 Repeating commands for different sub-groups
7.7 Repeating commands for different variables

8 Data Management

8.1 Cleaning data
8.2 String variables
8.3 Appending to add more subjects
8.4 Merging to add more variables
8.5 Merging to update variables
8.6 Unmatched merges

9 Repeated Measurements

9.1 Wide and long coding
9.2 Graphing repeated measures
9.3 Working at the group level
9.4 Collapsing the data
9.5 Reshaping from long to wide and vice versa
9.6 Use of system variables with by
9.7 Merging files with long coding

10 Response and Explanatory Variables

10.1 Questions in statistical analysis
10.2 Producing tables with tabmore
10.3 A second explanatory variable
10.4 Odds
10.5 Case-control studies
10.6 Survival data and rates
10.7 Count data and rates

11 Measuring Effects

11.1 A metric response
11.2 A binary response
11.3 Case-control studies
11.4 A failure response
11.5 Metric exposure variables
11.6 Metric versus grouped

12 Stratifying and Controlling

12.1 Stratification
12.2 Controlling
12.3 Controlling the effect of a metric exposure
12.4 Metric control variables
12.5 Metric versus grouped

13 Regression Commands

13.1 Three important regression models
13.2 A metric exposure
13.3 A categorical exposure with two levels
13.4 Categorical exposures with more than 2 levels
13.5 Fitted values and residuals
13.6 Case-control studies

14 Tests of Hypotheses

14.1 Models and Likelihood
14.2 Log likelihood
14.3 Likelihood ratio and Wald tests in Stata
14.4 Joint tests of several parameters
14.5 Other regression commands

15 Controlling and Stratifying with Regression

15.1 Controlling with regression commands
15.2 Testing effects after controlling
15.3 Stratifying with regression commands
15.4 Testing for effect modification and interactions
15.5 Interactions with metric variables

16 Mantel-Haenszel Methods

16.1 The method
16.2 The Stata command
16.3 Matched case-control studies
16.4 Mantel-Haenszel methods for rates
16.5 Exposures on more than two levels

17 Survival Data and Stset

17.1 The response in survival data
17.2 Summarizing survival time
17.3 Calculating rates and rate ratios
17.4 Nelson-Aalen plots of cumulative rate
17.5 Variables created by stset
17.6 A metric exposure on a log scale
17.7 Rates that vary with time
17.8 Cox regression
17.9 Time updated exposures

18 Different Time Scales and Standardization

18.1 Follow-up time
18.2 The diet data
18.3 Rates that change with time
18.4 Using non-st commands with st data
18.5 Two time-scales
18.6 Standardization

19 Writing Stata Programs

19.1 Starting with a do file
19.2 Making the do file into an ado file
19.3 Cutting out unwanted output
19.4 Making the program accept arguments
19.5 Allowing if, in, and options
19.6 Discarding previous versions of a program
19.7 Another example
19.8 Some additional programming points

20 How Stata is Organized

20.1 Paths and programs
20.2 Updating Stata
20.3 The Stata Journal
20.4 User-contributed programs
20.5 The Statalist
20.6 Other sources of help


Download data files and programs

To download the files from our website, first make sure that the folder C:\data\hs exists, then start Stata and type the following commands in the Command window:

. cd C:\data\hs
. net from http://www.timberlake.co.uk/data/hs/
. net install hsbook
. net get hsbook
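
The commands above assume that C:\data\hs already exists. As a minimal sketch, and not part of the book's own instructions, the folder can also be created from within Stata before downloading. The `capture` prefix, `net describe` and `dir` steps are additions for illustration; the folder path, URL and package name are those given above.

```stata
* Create the working folder one level at a time, since mkdir does not
* create intermediate folders; capture suppresses the error if a folder
* already exists.
capture mkdir "C:\data"
capture mkdir "C:\data\hs"
cd "C:\data\hs"

* Point Stata at the download site and inspect the package before installing.
net from http://www.timberlake.co.uk/data/hs/
net describe hsbook

* net install adds the program files; net get copies the ancillary
* files (here, the datasets) into the current folder.
net install hsbook
net get hsbook

* List the downloaded files to confirm that they arrived.
dir
```

The lines are written without the leading dot prompt so that they can be pasted into the Command window or saved in a do file.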


©Timberlake Consultants Limited
Last revised: 16/07/2007

