Text Similarity: Shingling and MinHash Algorithm Experiment Report, Guangong (with Java source code)