Clustering list for hclust function

Accepted answer
Score: 48

I will use a dataset that ships with R to demonstrate how to cut a tree into a desired number of pieces. The result is a named vector that assigns each observation to a group.

Construct an hclust object.

hc <- hclust(dist(USArrests), "ave")
#plot(hc)

You can now cut the tree into as many branches as you want. For my next trick, I will split the tree into two groups. You set the number of groups with the k parameter. See ?cutree and the use of parameter h, which cuts the tree at a given height and may be more useful to you (see cutree(hc, k = 2) == cutree(hc, h = 110)).

cutree(hc, k = 2)
       Alabama         Alaska        Arizona       Arkansas     California 
             1              1              1              2              1 
      Colorado    Connecticut       Delaware        Florida        Georgia 
             2              2              1              1              2 
        Hawaii          Idaho       Illinois        Indiana           Iowa 
             2              2              1              2              2 
        Kansas       Kentucky      Louisiana          Maine       Maryland 
             2              2              1              2              1 
 Massachusetts       Michigan      Minnesota    Mississippi       Missouri 
             2              1              2              1              2 
       Montana       Nebraska         Nevada  New Hampshire     New Jersey 
             2              2              1              2              2 
    New Mexico       New York North Carolina   North Dakota           Ohio 
             1              1              1              2              2 
      Oklahoma         Oregon   Pennsylvania   Rhode Island South Carolina 
             2              2              2              2              1 
  South Dakota      Tennessee          Texas           Utah        Vermont 
             2              2              2              2              2 
      Virginia     Washington  West Virginia      Wisconsin        Wyoming 
             2              2              2              2              2
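
If you want the assignment as an actual table rather than a named vector, something along these lines should work. This is only a sketch: the names `groups`, `by_height` and `membership` are introduced here for illustration and are not part of the answer above.

```r
# Rebuild the tree so the example is self-contained.
hc <- hclust(dist(USArrests), method = "average")

# Cut by number of groups (k) or by height (h).
groups <- cutree(hc, k = 2)
by_height <- cutree(hc, h = 110)

# Check whether both cuts give the same assignment.
all(groups == by_height)

# One row per state with its cluster number.
membership <- data.frame(state = names(groups), cluster = unname(groups))
head(membership)

# Number of states in each cluster.
table(groups)
```

`table(groups)` gives the cluster sizes, while `membership` keeps the state-to-cluster mapping in a form that is easy to merge back onto other data.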
Score: 19

Let's say:

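y <- dist(x)                    # pairwise distances between the rows of x
clust <- hclust(y)              # hierarchical clustering
groups <- cutree(clust, k = 3)  # assign each row to one of 3 clusters
x <- cbind(x, groups)           # append the cluster number as a column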

Now each record carries its cluster group. You can also subset the dataset by group:

x1 <- subset(x, groups == 1)
x2 <- subset(x, groups == 2)
x3 <- subset(x, groups == 3)
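
If the number of clusters changes, writing one subset() call per group gets tedious. A possible alternative is split(), sketched below. Note the assumptions: USArrests stands in for `x`, which is not defined in the answer, and the name `parts` is made up for this example.

```r
# Assumption: USArrests stands in for the data set `x`.
x <- USArrests
clust <- hclust(dist(x))
groups <- cutree(clust, k = 3)

# split() returns a list with one data frame per cluster.
parts <- split(x, groups)
names(parts)

# Number of rows in each cluster.
sapply(parts, nrow)
```

Each element can then be accessed by cluster label, for example `parts[["1"]]`, which plays the role of `x1` above.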
