A Short Introduction to Stata for Biostatistics updated for Stata 10
(book + CD with data and results of exercises)

by Michael Hills and Bianca L. De Stavola (2007)

Publisher: Timberlake Consultants Press
ISBN: 978-0-9557076-1-2
Pages: 177
Price: £25.00 + p&p


Preface

Starting to use Stata is relatively simple, but because of its size, and the wealth of information in the guide and manuals, the question of what comes next can be rather daunting. This book provides a short introduction which will help answer this question. Although written with biostatisticians in mind, much of the material in the book is equally relevant to other disciplines. At the time of writing, the current version of Stata is 10, but most of what is in the book also applies to version 9. The only differences are in Chapter 6, because most of the graphics menus changed from version 9 to 10.

We believe that the only way to learn Stata is to try it out, so we assume that the reader is seated in front of a computer which is running Stata. For this reason we have not felt it necessary to print the output which follows Stata commands. Of course there are occasions when it is reassuring to see that the output on the screen is the same as the output on the page, and we occasionally include some output for this reason, but the spirit of the book is to try something and see what happens. In keeping with this we have not provided printed solutions to the exercises, but have instead supplied a program which produces the solutions on the screen.

We have mostly used official Stata commands but, as one of the features of Stata is the large number of user-contributed commands, we have also felt free to use a few of these, where appropriate. In particular, Chapters 10-12 depend on two commands, written specifically for this book, which provide dialog boxes for making tables and estimating effects.

The datasets and additional program files are an integral part of the book. They are included on the CD-ROM which comes with the book, but are also available as a download from www.timberlake.co.uk and www.stata.com. Instructions on how to proceed with each of these media are given in Chapter 0: Getting Started. We would be grateful to be notified of any errors. The ideas in Chapters 10-12 arose from courses taught jointly by Michael Hills and David Clayton and we gratefully acknowledge David's contribution.


Contents

0 Getting started
1 Some basic commands
1.1 The births data
1.2 A first look at the data
1.3 Tables of frequencies
1.4 Tables of means and other things
1.5 Restricting the scope of commands
1.6 Generating new variables
1.7 Ordering, dropping and keeping
1.8 Sorting data
1.9 Using Stata as a calculator
1.10 Shortcuts
1.11 Stata syntax
1.12 Using the Stata help facilities

2 Tabs, menus and dialog boxes
2.1 Where to find the dialog boxes
2.2 A first look at the data
2.3 Tables of frequencies
2.4 Tables of means and other things
2.5 Restricting the scope of commands
2.6 Generating new variables
2.7 Ordering, dropping and keeping
2.8 Sorting data
2.9 Using Stata as a calculator

3 Housekeeping
3.1 Labelling a dataset
3.2 Notes
3.3 Labelling variables and their values
3.4 Data types and display formats
3.5 Recoding a variable
3.6 Missing values
3.7 Dates
3.8 Saving files
3.9 Log files
3.10 Do files

4 Data input and output
4.1 Data sources
4.2 Data from a spreadsheet
4.3 Data from a wordprocessor
4.4 Large datasets
4.5 More about dictionary files
4.6 Loading data from the keyboard
4.7 Data output
4.8 Import and export to other packages

5 Graph commands
5.1 Box plots
5.2 Histograms
5.3 Scatter plots
5.4 Overlaying graphs
5.5 Line plots
5.6 Cumulative distribution plots
5.7 Adding lines
5.8 Graph titles
5.9 Titles and labels for axes
5.10 Naming, saving, and combining graphs
5.11 Printing and exporting graphs
5.12 Schemes
5.13 Help for graphics

6 Graph dialog boxes
6.1 Histograms
6.2 Box plots
6.3 Bar charts
6.4 Twoway graphs
6.5 The graph editor

7 More basic tools
7.1 The return list
7.2 Generating variables using functions
7.3 Grouping the values of a variable
7.4 Comparing two means or two proportions
7.5 Weights
7.6 Repeating commands for different sub-groups
7.7 Repeating commands for different variables

8 Data management
8.1 Cleaning data
8.2 String variables
8.3 Appending to add more subjects
8.4 Merging to add more variables
8.5 Merging to update variables
8.6 Unmatched merges

9 Repeated measurements
9.1 Wide and long coding
9.2 Graphing repeated measures
9.3 Working at the group level
9.4 Collapsing the data
9.5 Reshaping from long to wide and vice versa
9.6 Use of system variables with by:
9.7 Merging files with long coding

10 Response and explanatory variables
10.1 Questions in statistical analysis
10.2 Producing tables with tabmore
10.3 A second explanatory variable
10.4 Odds
10.5 Case-control studies
10.6 Survival data and rates
10.7 Count data and rates

11 Measuring effects
11.1 A metric response
11.2 A binary response
11.3 Case-control studies
11.4 A failure response
11.5 Metric exposure variables
11.6 Metric versus grouped

12 Stratifying and controlling
12.1 Stratification
12.2 Controlling
12.3 Controlling the effect of a metric exposure
12.4 Metric control variables
12.5 Metric versus grouped

13 Regression commands
13.1 Three important regression models
13.2 A metric exposure
13.3 A categorical exposure with two levels
13.4 Categorical exposures with more than 2 levels
13.5 Fitted values and residuals
13.6 Case-control studies

14 Tests of hypotheses
14.1 Models and Likelihood
14.2 Log likelihood
14.3 Likelihood ratio and Wald tests in Stata
14.4 Joint tests of several parameters
14.5 Other regression commands

15 Controlling and stratifying with regression
15.1 Controlling with regression commands
15.2 Testing effects after controlling
15.3 Stratifying with regression commands
15.4 Testing for effect modification and interactions
15.5 Interactions with metric variables

16 Mantel-Haenszel methods
16.1 The method
16.2 The Stata command
16.3 Matched case-control studies
16.4 Mantel-Haenszel methods for rates
16.5 Exposures on more than two levels

17 Survival data and stset
17.1 The response in survival data
17.2 Summarizing survival time
17.3 Calculating rates and rate ratios
17.4 Nelson-Aalen plots of cumulative rate
17.5 Variables created by stset
17.6 A metric exposure on a log scale
17.7 Rates that vary with time
17.8 Cox regression
17.9 Time updated exposures

18 Different time scales and standardization
18.1 Follow-up time
18.2 The diet data
18.3 Rates that change with time
18.4 Using non-st commands with st data
18.5 Two time-scales
18.6 Standardization

19 Writing Stata programs
19.1 Starting with a do file
19.2 Making the do file into an ado file
19.3 Cutting out unwanted output
19.4 Making the program accept arguments
19.5 Allowing if, in, and options
19.6 Discarding previous versions of a program
19.7 Another example
19.8 Some additional programming points

20 How Stata is organized
20.1 Paths and programs
20.2 Updating Stata
20.3 The Stata Journal
20.4 User-contributed programs
20.5 The Statalist
20.6 Other sources of help


Download data files and programs

To download the files from our website, first make sure that the folder C:\data\shortintro exists, then start Stata and type the following commands in the Command window:

. cd C:\data\shortintro
. net from http://www.timberlake.co.uk/data/hs/
. net install hsbook
. net get hsbook

If you are still using 'A Short Introduction to Stata for Biostatistics' for Stata version 9 then you can download the data using the following commands:

. cd C:\data\shortintro
. net from http://www.timberlake.co.uk/data/hs9/
. net install book
. net get book


©Timberlake Consultants Limited
Last revised: 10/12/2007

