Data set used for clustering and classification