n-gram estimates in probabilistic models for Pinyin to Hanzi transcription