A toolkit for large-scale data processing, including word segmentation, part-of-speech tagging, and dependency parsing.