Predictive Modeling, Clustering Categorical Variables


Categorical variables with many levels can be collapsed into a smaller number of levels by clustering them with Greenacre's method. Earlier we worked through the method by hand on a simple example that displayed quasi-complete separation. In this topic we use PROC CLUSTER to implement Greenacre's method on a variable with too many levels to cluster by hand.
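To make the idea concrete before turning to PROC CLUSTER, here is a minimal Python sketch of the greedy merging behind Greenacre's method. It is not the course's SAS code. At each step it merges the pair of levels whose combination keeps the most Pearson chi-square between the categorical variable and a binary target. The level names, the counts, and the function names `pearson_chi2` and `greenacre_merge` are hypothetical and chosen only for illustration; level D has no events, which mimics quasi-complete separation.

```python
from itertools import combinations


def pearson_chi2(groups):
    """Pearson chi-square of a k x 2 table given as (events, nonevents) rows.

    Assumes the table contains at least one event and one non-event overall.
    """
    total_events = sum(e for e, _ in groups)
    total_nonevents = sum(n for _, n in groups)
    total = total_events + total_nonevents
    chi2 = 0.0
    for e, n in groups:
        row = e + n
        exp_e = row * total_events / total
        exp_n = row * total_nonevents / total
        chi2 += (e - exp_e) ** 2 / exp_e + (n - exp_n) ** 2 / exp_n
    return chi2


def greenacre_merge(counts):
    """Greedily merge levels, keeping as much chi-square as possible.

    counts maps level -> (events, nonevents). Returns a list of
    (number_of_clusters, clusters, share_of_chi_square_retained).
    Assumes the full table has a non-zero chi-square.
    """
    clusters = {(level,): tuple(c) for level, c in counts.items()}
    full = pearson_chi2(list(clusters.values()))
    history = [(len(clusters), sorted(clusters), 1.0)]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            merged = (clusters[a][0] + clusters[b][0],
                      clusters[a][1] + clusters[b][1])
            rest = [v for k, v in clusters.items() if k not in (a, b)]
            chi2 = pearson_chi2(rest + [merged])
            # Keep the merge that leaves the largest chi-square.
            if best is None or chi2 > best[0]:
                best = (chi2, a, b, merged)
        chi2, a, b, merged = best
        del clusters[a], clusters[b]
        clusters[tuple(sorted(a + b))] = merged
        history.append((len(clusters), sorted(clusters), chi2 / full))
    return history


# Hypothetical counts: level -> (events, nonevents).
counts = {"A": (10, 90), "B": (12, 88), "C": (40, 60), "D": (0, 100)}
history = greenacre_merge(counts)
for n_clusters, members, retained in history:
    print(n_clusters, members, round(retained, 4))
```

Each row of the printed history gives the number of clusters, the levels in each cluster, and the share of the original chi-square that the collapsed variable retains. With these made-up counts, A and B merge first because their event rates are closest. Choosing where to stop merging is a separate step and is not part of this sketch.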



Introduction



A repeat of the simple example using PROC CLUSTER for Greenacre's method



The branch variable on the develop data set



The cluster step for the branch variable



Find the optimum number of clusters



Define indicator variables for cluster membership




The slides used in the videos are found here