Stata - Cluster Analysis

New in Stata 10

Hierarchical clustering

  • Single linkage
  • Complete linkage
  • Average linkage
  • Ward's linkage (including Ward's method)
  • Weighted average linkage
  • Centroid linkage
  • Median linkage

Nonhierarchical clustering

  • Kmeans
  • Kmedians

Cluster on observations

Cluster using any proximity matrix

Dendrograms

  • Full trees
  • Subtrees
  • Upper portion of tree
  • Vertical or horizontal orientation
  • Branch counts

Stopping rules

  • Caliński and Harabasz pseudo-F index
  • Duda and Hart Je(2)/Je(1) index
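
To make the first stopping rule concrete, the following is a minimal Python sketch of the Caliński and Harabasz pseudo-F index, computed as the between-group sum of squares over (g - 1), divided by the within-group sum of squares over (n - g). The function name `pseudo_f` and its arguments are hypothetical and chosen only for illustration. This is not Stata code or Stata's implementation, and it assumes numeric points of equal dimension with a nonzero within-group sum of squares.

```python
from collections import defaultdict


def pseudo_f(points, labels):
    """Calinski-Harabasz pseudo-F for a partition of points into groups.

    points: list of equal-length numeric sequences (one per observation)
    labels: list of group identifiers, one per observation
    Hypothetical helper for illustration; not part of Stata.
    """
    n = len(points)
    dim = len(points[0])

    # Collect the observations that belong to each group.
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    g = len(groups)
    if g < 2 or g >= n:
        raise ValueError("need 2 <= number of groups < number of points")

    def mean(rows):
        return [sum(row[j] for row in rows) / len(rows) for j in range(dim)]

    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    grand_mean = mean(points)
    between = 0.0  # between-group sum of squares
    within = 0.0   # within-group sum of squares
    for rows in groups.values():
        centre = mean(rows)
        between += len(rows) * sqdist(centre, grand_mean)
        within += sum(sqdist(row, centre) for row in rows)

    # Raises ZeroDivisionError if every point sits exactly on its group mean.
    return (between / (g - 1)) / (within / (n - g))
```

Larger values indicate more distinct clustering, so the index is typically compared across candidate numbers of groups.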

Similarity/dissimilarity measures for binary data

  • Matching
  • Jaccard
  • Russell
  • Hamann
  • Dice
  • Antidice
  • Sneath
  • Rogers
  • Ochiai
  • Yule
  • Anderberg
  • Kulczynski
  • Gower2
  • Pearson
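
As an illustration of how several of the binary measures above are defined, here is a minimal Python sketch built on the usual 2x2 table of counts: a (both 1), b (first only), c (second only), and d (both 0). The function names are hypothetical, only five of the listed measures are shown, and returning NaN when a + b + c is zero is an assumption of this sketch rather than a statement about how Stata treats that case.

```python
def binary_counts(x, y):
    """Return (a, b, c, d) for two equal-length 0/1 sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be nonempty and of equal length")
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        if xi and yi:
            a += 1      # both present
        elif xi:
            b += 1      # present in x only
        elif yi:
            c += 1      # present in y only
        else:
            d += 1      # both absent
    return a, b, c, d


def binary_similarities(x, y):
    """Compute a few binary similarity coefficients (illustrative only)."""
    a, b, c, d = binary_counts(x, y)
    n = a + b + c + d
    nan = float("nan")  # assumption: undefined ratios are reported as NaN
    return {
        "matching": (a + d) / n,
        "jaccard": a / (a + b + c) if a + b + c else nan,
        "russell": a / n,
        "hamann": ((a + d) - (b + c)) / n,
        "dice": 2 * a / (2 * a + b + c) if a + b + c else nan,
    }
```

Matching and Hamann count joint absences (d) as agreement, whereas Jaccard and Dice ignore them, which matters when zeros are far more common than ones.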

Result-management utilities

  • dir
  • list
  • drop
  • use
  • rename

User-extensible commands

  • Ability to add new clustering methods and utilities
  • Full set of tools to ease making additions

Support tools

  • Generate summary and grouping variables
  • Attach notes to analyses

Similarity/dissimilarity measures for continuous data

  • L2/Euclidean
  • L1/absolute/cityblock/manhattan
  • L(#)
  • Canberra
  • correlation
  • angular
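
The continuous measures listed above can be sketched in a few lines of Python. The function names are hypothetical and the code is illustrative rather than Stata's implementation. It assumes that L(#) denotes the Minkowski distance with exponent #, that "angular" is the cosine of the angle between two vectors, and that "correlation" is the Pearson correlation between them. Skipping Canberra terms where both values are zero is a further assumption of this sketch.

```python
import math


def minkowski(x, y, p):
    """L(p) distance: (sum |x_i - y_i|^p)^(1/p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def euclidean(x, y):
    """L2 distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """L1 (absolute, city-block) distance."""
    return sum(abs(a - b) for a, b in zip(x, y))


def canberra(x, y):
    """Sum of |x_i - y_i| / (|x_i| + |y_i|), skipping 0/0 terms."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total


def angular(x, y):
    """Cosine similarity; raises ZeroDivisionError for a zero vector."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def correlation(x, y):
    """Pearson correlation: cosine similarity of mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return angular([a - mean_x for a in x], [b - mean_y for b in y])
```

The first four functions are dissimilarities (larger means further apart), while the last two are similarities (larger means more alike).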

©Timberlake Consultants Limited
Last revised: 17/06/2007