Collaborative filtering (CF) is one of the essential algorithms in recommendation system. Based on the performance analysis, two computational kernels are identified. In order to accelerate CF on large-scale data, a CUDA-enabled parallel CF approach is proposed where an efficient data partition sche