Creating Stata datasets
- Input data from the command line
- Input data saved from spreadsheets
- Read data using a dictionary
- Read any type of ASCII data
- Read and write data in the format required by the FDA for NDA submittals
- Read and write XML-formatted data files, including those produced by Microsoft Excel
- Convert datasets directly from other statistical packages, spreadsheets, and databases using third-party software
ODBC support
- Import data from any ODBC data source, such as Access, Excel, Postgres, or MySQL
- Export data to new or existing ODBC tables
- Execute raw SQL commands individually or in batches
- Support for ODBC on Windows, Macintosh, and Linux
Built-in spreadsheet editor
- For Windows, Macintosh, and Unix

Data-management functions
Data reorganization
- Row-column transposition
- Data reshaping
- Stacking of variables
- Collapsing into means, totals, etc.
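As a rough illustration of the reshaping and collapsing items above, here is a minimal sketch in Stata. The variable names (`id`, `inc2001`, `inc2002`) and the values are made up for the example, and exact syntax may differ between Stata releases.

```stata
* Toy wide-format data; id, inc2001, and inc2002 are hypothetical names
clear
input id inc2001 inc2002
1 100 110
2 200 215
end

* Data reshaping: wide to long, one row per id-year pair
reshape long inc, i(id) j(year)
list

* Collapsing into means: one row per id holding the mean of inc
collapse (mean) inc, by(id)
list
```

Note that `collapse` replaces the dataset in memory, so save the original first if it is still needed.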
Labels
- Dataset labels
- Variable labels
- Value labels (e.g., male and female for 0 and 1)
- Ability to switch between multiple sets of data, variable, and value labels
- Missing value labels
- Multiple-language support
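The dataset, variable, and value labels listed above might be attached along these lines. This is only a sketch: the toy dataset, the variable `sex`, and the label name `sexlbl` are assumptions made for the example.

```stata
* Toy data; the variable sex and the label name sexlbl are hypothetical
clear
input id sex
1 0
2 1
end

* Dataset label and variable label
label data "Toy respondent data"
label variable sex "Sex of respondent"

* Value label: show male and female in place of 0 and 1
label define sexlbl 0 "male" 1 "female"
label values sex sexlbl
list
```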
Sorting
- Ascending or descending sorts
- Multiple-key sorts
- Numeric and string sorts
Merging datasets
- Merge datasets
  - By key variables
  - By observations
- Join datasets
  - Outer join
- Append datasets
- Append time series
Special datasets
- Panel data/cross-sectional time series
- Survival/duration data
- Time series
- Survey
(under development)
Utilities
- Compress (make dataset as small as possible without loss of accuracy)
- Formatted and unformatted disk I/O
Variable management
- Generation of new variables
- Replacement of existing variables
- Encoding and decoding string variables
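A short sketch of the variable-management items above, assuming the `auto` example dataset that ships with Stata is available through `sysuse`. The new variable names (`price_k`, `origin_str`, `origin_num`) are invented for the illustration.

```stata
* Assumes the auto example dataset shipped with Stata
sysuse auto, clear

* Generation of a new variable, then replacement of its values
generate price_k = price / 1000
replace price_k = round(price_k, 0.1)

* Decoding: labeled numeric variable -> string variable
decode foreign, generate(origin_str)

* Encoding: string variable -> labeled numeric variable
encode origin_str, generate(origin_num)
list make price_k origin_str origin_num in 1/5
```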
Dataset reports
- Flexible description of variables, labels, and types
- Codebooks for variables
- Value-label reports
- Duplicates and missing values
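The reports listed above could be produced roughly as follows. The sketch again assumes the `auto` example dataset; `origin` is the value label that dataset attaches to `foreign`.

```stata
* Assumes the auto example dataset shipped with Stata
sysuse auto, clear

* Description of variables, storage types, and labels
describe

* Codebook for one variable, including its missing-value count
codebook rep78

* Value-label report for the label named origin
label list origin

* Report on duplicate values of make
duplicates report make
```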
Variable types
- Byte
- Integer (int)
- Long
- Float
- Double
- String
- Dates
Notes
- Extensive notes can be attached to a dataset