Data Analysis using Stata
by Ulrich Kohler and Frauke Kreuter (2005)

Publisher: Stata Press
ISBN: 1-59718-007-6
Pages: 395 pages, paperback
Price: £35.00 + p&p

Download data sets from:

To extract the file kk.zip, first create a new folder, c:\data\kk. On a Windows system, open Windows Explorer, move into the directory c:\data, select File > New > Folder, and enter kk as the folder name. Then copy kk.zip into this folder and unzip it with any program that can unzip zip archives. Make sure to preserve the kksoep subdirectory contained in the zip file.
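For readers who prefer to script the extraction rather than use Windows Explorer, the same steps can be sketched in a few lines of Python. This is only an illustration, not part of the book's materials: the function name `extract_archive` is hypothetical, and the paths passed to it are whatever you chose when downloading kk.zip.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Unpack zip_path into target_dir, keeping the archive's folder layout.

    Hypothetical helper for illustration; returns the archive's member names.
    """
    target = Path(target_dir)
    # Create the target folder (and any missing parents); no error if it exists.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # extractall() recreates subdirectories stored in the archive,
        # so a kksoep/ folder inside the zip ends up as target_dir/kksoep/.
        archive.extractall(target)
        names = archive.namelist()
    return names
```

Assuming kk.zip has been saved in the current directory, a call such as `extract_archive("kk.zip", r"c:\data\kk")` would create c:\data\kk if needed and unpack the archive there with its kksoep subdirectory intact.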
0 About the book
| 0.1 | Structure |
| 0.2 | Using this book: Materials and hints |
| 0.3 | Teaching with this manual |
1 "The first time"
| 1.1 | Starting Stata |
| 1.2 | Setting up your screen |
| 1.3 | Your first analysis |
| 1.4 | Do-files |
| 1.5 | Exiting Stata |
2 Working with do-files
| 2.1 | From interactive work to working with a do-file |
| 2.1.1 | Alternative 1 |
| 2.1.2 | Alternative 2 |
| 2.2 | Designing do-files |
| 2.2.1 | Comments |
| 2.2.2 | Line breaks |
| 2.2.3 | Some crucial commands |
| 2.3 | Organizing your work |
| 2.4 | Summary |
3 The grammar of Stata
| 3.1 | The elements of Stata commands |
| 3.1.1 | Stata commands |
| 3.1.2 | The variable list |
| List of variables: required or optional | |
| Abbreviation rules | |
| Special listings | |
| 3.1.3 | Options |
| 3.1.4 | The in qualifier |
| 3.1.5 | The if qualifier |
| 3.1.6 | Expressions |
| Operators | |
| Functions | |
| 3.1.7 | List of numbers |
| 3.1.8 | Using filenames |
| 3.2 | Repeating similar commands |
| 3.2.1 | The by prefix |
| 3.2.2 | The foreach loop |
| 3.2.3 | The forvalues loop |
| 3.3 | Weights |
4 Some general comments on the statistical commands
5 Creating and changing variables
| 5.1 | The commands generate and replace |
| 5.1.1 | Variable names |
| 5.1.2 | Some examples |
| 5.1.3 | Changing codes with by, _n, and _N |
| 5.1.4 | Subscripts |
| 5.2 | Specialized recoding commands |
| 5.2.1 | The recode command |
| 5.2.2 | The egen command |
| 5.3 | Additional tools for recoding data |
| 5.3.1 | String functions |
| 5.3.2 | Date functions |
| 5.4 | Commands for dealing with missing values |
| 5.5 | Labels |
| 5.6 | Storage types, or, the ghost in the machine |
6 Creating and changing graphs
| 6.1 | A primer on graph syntax |
| 6.2 | Graph types |
| 6.2.1 | Examples |
| 6.2.2 | Specialized graphs |
| 6.3 | Graph elements |
| 6.3.1 | Appearance of data |
| Choice of marker | |
| Marker colors | |
| Marker size | |
| Lines | |
| 6.3.2 | Graphs and plot regions |
| Graph size | |
| Plot region | |
| Scaling the axes | |
| 6.3.3 | Information inside the plot region |
| Reference lines | |
| Labeling inside the plot region | |
| 6.3.4 | Information outside the plot region |
| Labeling the axes | |
| Tick lines | |
| Axis titles | |
| The legend | |
| Graph titles | |
| 6.4 | Multiple graphs |
| 6.4.1 | Overlaying numerous twoway graphs |
| 6.4.2 | Option by() |
| 6.4.3 | Combining graphs |
| 6.5 | Saving and printing graphs |
7 Describing and comparing distributions
| 7.1 | Categories: Few or many? |
| 7.2 | Variables with few categories |
| 7.2.1 | Tables |
| Frequency tables | |
| More than one frequency table | |
| Comparing distributions | |
| Summary statistics | |
| 7.2.2 | Graphs |
| Histograms | |
| Bar charts | |
| Dot chart | |
| 7.3 | Variables with many categories |
| 7.3.1 | Frequencies of grouped data |
| Some remarks on grouping data | |
| Special techniques for grouping data | |
| 7.3.2 | Describing data using statistics |
| Important summary statistics | |
| The summarize command | |
| The tabstat command | |
| Comparing distributions using statistics | |
| 7.3.3 | Graphs |
| Box plots | |
| Histograms | |
| Kernel density estimation | |
| Quantile plot | |
| 7.3.4 | Summary |
| 7.4 | Summary |
8 Introduction to linear regression
| 8.1 | Simple linear regression |
| 8.1.1 | The basic principle |
| 8.1.2 | Linear regression using Stata |
| The table of coefficients | |
| Standard errors | |
| The table of ANOVA results | |
| The model fit table | |
| 8.2 | Multiple regression |
| 8.2.1 | Multiple regression using Stata |
| 8.2.2 | Additional components |
| 8.2.3 | What does "under control" mean? |
| 8.3 | Regression diagnostics |
| 8.3.1 | Violation of E(εi) = 0 |
| Linearity | |
| Influential cases | |
| Omitted variables | |
| 8.3.2 | Violation of Var(εi) = σ² |
| 8.3.3 | Violation of Cov(εi, εj) = 0, i ≠ j |
| 8.4 | Model extensions |
| 8.4.1 | Categorical independent variables |
| 8.4.2 | Interaction terms |
| 8.4.3 | Regression models using transformed variables |
| Nonlinear relations | |
| Eliminating heteroskedasticity | |
| 8.5 | More on standard errors |
| 8.5.1 | Bootstrap techniques |
| 8.5.2 | Confidence intervals on cluster samples |
| 8.6 | Advanced techniques |
| 8.6.1 | Median regression |
| 8.6.2 | Regression models for panel data |
| From wide to long format | |
| Fixed-effects models | |
| 8.6.3 | Error-component models |
| 8.7 | Summary |
9 Regression models for categorical dependent variables
| 9.1 | The linear probability model |
| 9.2 | Basic concepts |
| 9.2.1 | Odds, log odds, and odds ratios |
| 9.2.2 | Excursion: The maximum likelihood principle |
| 9.3 | Logistic regression with Stata |
| 9.3.1 | The coefficients block |
| Sign interpretation | |
| Interpretation with odds ratios | |
| Probability interpretation | |
| 9.3.2 | The iteration block |
| 9.3.3 | The model fit block |
| Classification tables | |
| Pearson chi-squared | |
| 9.4 | Logistic regression diagnostics |
| 9.4.1 | Linearity |
| 9.4.2 | Influential cases |
| 9.5 | Likelihood-ratio test |
| 9.6 | Refined models |
| 9.7 | Advanced techniques |
| 9.7.1 | Probit models |
| 9.7.2 | Multinomial logistic regression |
| 9.7.3 | Models for ordinal data |
| 9.8 | Summary |
10 Reading and writing data
| 10.1 | The goal: The data matrix |
| 10.2 | Importing machine-readable data |
| 10.2.1 | Reading system files from other packages |
| 10.2.2 | Reading ASCII text files |
| Reading data in spreadsheet format | |
| Reading data in free format | |
| Reading data in fixed format | |
| 10.3 | Inputting data |
| 10.3.1 | Input data using the editor |
| 10.3.2 | The input command |
| 10.4 | Combining data |
| 10.4.1 | The GSOEP database |
| 10.4.2 | The merge command |
| The merge procedure | |
| Keeping track of observations | |
| Merging more than two files | |
| Merging data on different levels | |
| 10.4.3 | The append command |
| 10.5 | Saving and exporting data |
| 10.6 | Handling big datasets |
| 10.6.1 | Rules for handling the working memory |
| 10.6.2 | Using oversized datasets |
| 10.7 | Summary |
11 Do-files for advanced users and user-written programs
| 11.1 | Two examples of usage |
| 11.2 | Four programming tools |
| 11.2.1 | Local macros |
| 11.2.2 | Do-files |
| 11.2.3 | Programs |
| 11.2.4 | Programs in do-files and ado-files |
| 11.3 | User-written Stata commands |
| 11.3.1 | Parsing variable lists |
| 11.3.2 | Parsing options |
| 11.3.3 | Parsing if and in qualifiers |
| 11.3.4 | Generating an unknown number of variables |
| 11.3.5 | Default values |
| 11.3.6 | Extended macro functions |
| 11.3.7 | Avoiding changes in the dataset |
| 11.3.8 | Help files |
| 11.4 | Summary |
12 Around Stata
| 12.1 | Resources and information |
| 12.2 | Taking care of Stata |
| 12.3 | Additional procedures |
| 12.3.1 | SJ and STB ado-files |
| 12.3.2 | SSC ado-files |
| 12.3.3 | Other ado-files |
| 12.4 | Summary |
References
©Timberlake Consultants Limited