Abstract—Video summarization is an efficient and flexible way to represent video data. In this paper, we use kernel PCA and clustering-based key frame extraction to realize a multilevel video representation. To remove the redundancy caused by large scene changes, SIFT flow scene alignment