ccjieba
A C++17 library for Chinese text segmentation, refactored from and inspired by cppjieba, with zero external dependencies.
Features
- Five segmentation algorithms: MixSegment (default), MPSegment, HMMSegment, FullSegment, QuerySegment
- Keyword extraction: TF-IDF based ranking of segmented words
- POS tagging: Part-of-speech tagging for words and sentences
- User dictionary: Incremental word additions without rebuilding
- Zero dependencies: Header-only usage possible for most features
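
To give a feel for what TF-IDF based keyword extraction involves, here is a minimal, self-contained sketch. It is not ccjieba's API: `top_keywords`, its parameters, and the fallback `default_idf` are hypothetical names chosen for illustration, and the input is assumed to be text that has already been segmented into tokens.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper, NOT part of ccjieba: ranks already-segmented tokens
// by TF-IDF. `idf` maps a word to its inverse document frequency; words
// missing from the table fall back to `default_idf` (an assumption here).
std::vector<std::pair<std::string, double>> top_keywords(
    const std::vector<std::string>& tokens,
    const std::unordered_map<std::string, double>& idf,
    double default_idf,
    std::size_t k) {
  assert(default_idf >= 0.0);

  // Term frequency: raw count of each token in this document.
  std::unordered_map<std::string, double> tf;
  for (const auto& t : tokens) tf[t] += 1.0;

  // Score each distinct word as tf * idf.
  std::vector<std::pair<std::string, double>> scored;
  scored.reserve(tf.size());
  for (const auto& [word, count] : tf) {
    auto it = idf.find(word);
    double weight = (it != idf.end()) ? it->second : default_idf;
    scored.emplace_back(word, count * weight);
  }

  // Highest score first; ties broken alphabetically for deterministic output.
  std::sort(scored.begin(), scored.end(), [](const auto& a, const auto& b) {
    if (a.second != b.second) return a.second > b.second;
    return a.first < b.first;
  });

  // Keep at most k results.
  if (scored.size() > k) scored.resize(k);
  return scored;
}
```

Calling `top_keywords(tokens, idf, 1.0, 2)` on a token list returns up to two `(word, score)` pairs in descending score order. Frequent but uninformative words get a low IDF weight, so they fall to the bottom of the ranking even when they occur often.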
Quick Start
See Getting Started for build instructions and basic usage.