Static
Construct a WordSegmenter that automatically selects the best available LSTM or dictionary payload data, using compiled data. This does not assume any content locale.
Note: currently, it uses dictionary data for Chinese and Japanese, and LSTM data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_auto for more information.
Static
Construct a WordSegmenter that automatically selects the best available LSTM or dictionary payload data, using compiled data.
Note: currently, it uses dictionary data for Chinese and Japanese, and LSTM data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_auto for more information.
Static
Construct a WordSegmenter that automatically selects the best available LSTM or dictionary payload data, using a particular data source.
Note: currently, it uses dictionary data for Chinese and Japanese, and LSTM data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_auto for more information.
Static
Construct a WordSegmenter with dictionary payload data for Chinese, Japanese, Burmese, Khmer, Lao, and Thai, using compiled data. This does not assume any content locale.
Note: currently, it uses dictionary data for all of these languages: Chinese, Japanese, Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_dictionary for more information.
Static
Construct a WordSegmenter with dictionary payload data for Chinese, Japanese, Burmese, Khmer, Lao, and Thai, using compiled data.
Note: currently, it uses dictionary data for all of these languages: Chinese, Japanese, Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_dictionary for more information.
Static
Construct a WordSegmenter with dictionary payload data for Chinese, Japanese, Burmese, Khmer, Lao, and Thai, using a particular data source.
Note: currently, it uses dictionary data for all of these languages: Chinese, Japanese, Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_dictionary for more information.
Static
Construct a WordSegmenter with LSTM payload data for Burmese, Khmer, Lao, and Thai, using compiled data. This does not assume any content locale.
Note: currently, it uses dictionary data for Chinese and Japanese, and LSTM data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_lstm for more information.
Static
Construct a WordSegmenter with LSTM payload data for Burmese, Khmer, Lao, and Thai, using compiled data.
Note: currently, it uses dictionary data for Chinese and Japanese, and LSTM data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_lstm for more information.
Static
Construct a WordSegmenter with LSTM payload data for Burmese, Khmer, Lao, and Thai, using a particular data source.
Note: currently, it uses dictionary data for Chinese and Japanese, and LSTM data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for try_new_lstm for more information.
Segments a string.
Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs according to the WHATWG Encoding Standard.
See the Rust documentation for segment_utf16 for more information.
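The handling of ill-formed input can be pictured with a small sketch. The function below is a hypothetical helper, not part of ICU4X; it only illustrates what "errors replaced with REPLACEMENT CHARACTERs" means for a UTF-16 string, namely that each unpaired surrogate code unit behaves as if it were U+FFFD. The segmenter itself is assumed to do the equivalent internally, so callers should not need to pre-process their input this way.

```typescript
// Hypothetical helper (not an ICU4X API): returns a copy of `input` in which
// every unpaired surrogate code unit is replaced with U+FFFD.
function replaceLoneSurrogates(input: string): string {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const isLow = unit >= 0xdc00 && unit <= 0xdfff;
    if (isHigh && i + 1 < input.length) {
      const next = input.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        // Valid surrogate pair: copy both code units unchanged.
        out += input.charAt(i) + input.charAt(i + 1);
        i++;
        continue;
      }
    }
    // Any surrogate reaching this point is unpaired.
    out += isHigh || isLow ? "\uFFFD" : input.charAt(i);
  }
  return out;
}
```

Well-formed text, including valid surrogate pairs such as emoji, passes through this helper unchanged.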
An ICU4X word-break segmenter, capable of finding word breakpoints in strings.
See the Rust documentation for WordSegmenter for more information.
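As a rough illustration of what word breakpoints are useful for, the sketch below cuts a string into segments at a list of offsets. It assumes breakpoints are expressed as ascending UTF-16 code unit offsets into the string, which this page does not state explicitly; `sliceAtBreakpoints` is a hypothetical helper, not an ICU4X API, and the offsets in the usage note are written by hand rather than produced by a segmenter.

```typescript
// Hypothetical helper (not an ICU4X API): splits `text` at the given offsets.
// Assumption: `breakpoints` holds ascending UTF-16 code unit offsets.
function sliceAtBreakpoints(text: string, breakpoints: number[]): string[] {
  const segments: string[] = [];
  let start = 0;
  for (const end of breakpoints) {
    // Skip offsets that would produce an empty or out-of-range segment.
    if (end > start && end <= text.length) {
      segments.push(text.slice(start, end));
      start = end;
    }
  }
  // Keep any trailing text after the last usable breakpoint.
  if (start < text.length) {
    segments.push(text.slice(start));
  }
  return segments;
}
```

For example, `sliceAtBreakpoints("Hello world", [0, 5, 6, 11])` returns `["Hello", " ", "world"]`.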