Static create: Construct a WordSegmenter that automatically selects the best available LSTM or dictionary payload data, using compiled data. This does not assume any content locale.
Note: currently, it uses dictionary for Chinese and Japanese, and LSTM for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_auto for more information.
Static create: Construct a WordSegmenter that automatically selects the best available LSTM or dictionary payload data, using compiled data.
Note: currently, it uses dictionary for Chinese and Japanese, and LSTM for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_auto for more information.
Static create: Construct a WordSegmenter that automatically selects the best available LSTM or dictionary payload data, using a particular data source.
Note: currently, it uses dictionary for Chinese and Japanese, and LSTM for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_auto for more information.
Static create: Construct a WordSegmenter with dictionary payload data for Chinese, Japanese, Burmese, Khmer, Lao, and Thai, using compiled data. This does not assume any content locale.
Note: currently, it uses dictionary data for all six languages: Chinese, Japanese, Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_dictionary for more information.
Static create: Construct a WordSegmenter with dictionary payload data for Chinese, Japanese, Burmese, Khmer, Lao, and Thai, using compiled data.
Note: currently, it uses dictionary data for all six languages: Chinese, Japanese, Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_dictionary for more information.
Static create: Construct a WordSegmenter with dictionary payload data for Chinese, Japanese, Burmese, Khmer, Lao, and Thai, using a particular data source.
Note: currently, it uses dictionary data for all six languages: Chinese, Japanese, Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_dictionary for more information.
Static create: Construct a WordSegmenter with LSTM payload data for Burmese, Khmer, Lao, and Thai, using compiled data. This does not assume any content locale.
Note: currently, it uses dictionary for Chinese and Japanese, and LSTM for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_lstm for more information.
Static create: Construct a WordSegmenter with LSTM payload data for Burmese, Khmer, Lao, and Thai, using compiled data.
Note: currently, it uses dictionary for Chinese and Japanese, and LSTM for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_lstm for more information.
Static create: Construct a WordSegmenter with LSTM payload data for Burmese, Khmer, Lao, and Thai, using a particular data source.
Note: currently, it uses dictionary for Chinese and Japanese, and LSTM for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_lstm for more information.
Segments a string.
Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs according to the WHATWG Encoding Standard.
See the Rust documentation for segment_utf16 for more information.
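The result of segmenting a string is typically consumed as a sequence of breakpoint indices. The following is a minimal sketch, not the library's API: it assumes the iterator returned by the segment call exposes a next() method that yields the next breakpoint as a UTF-16 code unit offset and -1 once the input is exhausted. The names BreakIteratorLike, collectBreakpoints, sliceAtBreakpoints, and stubIterator are hypothetical and exist only for illustration; the stub stands in for a real segmenter so the example runs on its own.

```typescript
// Hypothetical structural type (an assumption, not the library's declared type):
// next() yields the next breakpoint index, or -1 when there are no more.
interface BreakIteratorLike {
  next(): number;
}

// Drain an iterator into an array of breakpoint indices.
function collectBreakpoints(iter: BreakIteratorLike): number[] {
  const breakpoints: number[] = [];
  for (let i = iter.next(); i !== -1; i = iter.next()) {
    breakpoints.push(i);
  }
  return breakpoints;
}

// Cut the text into the pieces delimited by consecutive breakpoints.
// Indices are treated as UTF-16 code unit offsets, which matches
// JavaScript string indexing. Text after the last breakpoint is ignored.
function sliceAtBreakpoints(text: string, breakpoints: number[]): string[] {
  const pieces: string[] = [];
  let start = 0;
  for (const end of breakpoints) {
    if (end > start) {
      pieces.push(text.slice(start, end));
    }
    start = end;
  }
  return pieces;
}

// Stand-in for a real segmenter result, for demonstration only.
function stubIterator(indices: number[]): BreakIteratorLike {
  let pos = 0;
  return { next: () => (pos < indices.length ? indices[pos++] : -1) };
}

const sample = "Hello world";
// Breakpoints chosen by hand for the demo, not produced by a segmenter.
const breakpoints = collectBreakpoints(stubIterator([0, 5, 6, 11]));
const pieces = sliceAtBreakpoints(sample, breakpoints);
```

With a real segmenter, the stub would be replaced by the iterator obtained from segmenting the input string; the two helpers only rely on the assumed next() contract.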
An ICU4X word-break segmenter, capable of finding word breakpoints in strings.
See the Rust documentation for WordSegmenter for more information.