ICU4X
International Components for Unicode
#include <LineSegmenter.d.hpp>
Public Member Functions | |
std::unique_ptr< icu4x::LineBreakIteratorUtf8 > | segment (std::string_view input) const |
std::unique_ptr< icu4x::LineBreakIteratorUtf16 > | segment16 (std::u16string_view input) const |
std::unique_ptr< icu4x::LineBreakIteratorLatin1 > | segment_latin1 (diplomat::span< const uint8_t > input) const |
An ICU4X line-break segmenter, capable of finding breakpoints in strings.
See the Rust documentation for LineSegmenter for more information.
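The three `segment*` members listed above each return an iterator object rather than a list of offsets. The following is a minimal sketch of draining such an iterator. It assumes the iterator exposes a `next()` method that returns the next breakpoint offset as an `int32_t` and `-1` once exhausted; this page does not show the iterator interface, so check `LineBreakIteratorUtf8` before relying on that convention. `collect_breakpoints` and `FakeIterator` are hypothetical names introduced here so the sketch compiles without ICU4X.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Drains any iterator-like object whose next() yields breakpoint offsets
// and returns -1 once it is exhausted (assumed convention, see lead-in).
template <typename BreakIterator>
std::vector<std::int32_t> collect_breakpoints(BreakIterator& it) {
    std::vector<std::int32_t> out;
    for (std::int32_t offset = it.next(); offset != -1; offset = it.next()) {
        out.push_back(offset);
    }
    return out;
}

// Hypothetical stand-in for a break iterator; NOT part of ICU4X.
// It simply replays a fixed list of offsets, then reports -1.
struct FakeIterator {
    std::vector<std::int32_t> offsets;
    std::size_t pos = 0;

    std::int32_t next() {
        return pos < offsets.size() ? offsets[pos++] : -1;
    }
};
```

With a real segmenter, the argument would presumably be the object owned by the `std::unique_ptr` that `segment`, `segment16`, or `segment_latin1` returns. The input string should stay alive for as long as the iterator is in use.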
|
inlinestatic |
Construct a LineSegmenter with default options (no locale-based tailoring) using compiled data. It automatically loads the best available payload data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_auto for more information.
|
inlinestatic |
Construct a LineSegmenter with custom options using compiled data. It automatically loads the best available payload data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_auto for more information.
|
inlinestatic |
Construct a LineSegmenter with custom options, using a particular data source. It automatically loads the best available payload data for Burmese, Khmer, Lao, and Thai from that source.
See the Rust documentation for new_auto for more information.
|
inlinestatic |
Construct a LineSegmenter with default options (no locale-based tailoring) and dictionary payload data for Burmese, Khmer, Lao, and Thai, using compiled data.
See the Rust documentation for new_dictionary for more information.
|
inlinestatic |
Construct a LineSegmenter with custom options and dictionary payload data for Burmese, Khmer, Lao, and Thai, using compiled data.
See the Rust documentation for new_dictionary for more information.
|
inlinestatic |
Construct a LineSegmenter with custom options and dictionary payload data for Burmese, Khmer, Lao, and Thai, using a particular data source.
See the Rust documentation for new_dictionary for more information.
|
inlinestatic |
Construct a LineSegmenter with default options (no locale-based tailoring) and LSTM payload data for Burmese, Khmer, Lao, and Thai, using compiled data.
See the Rust documentation for new_lstm for more information.
|
inlinestatic |
Construct a LineSegmenter with custom options and LSTM payload data for Burmese, Khmer, Lao, and Thai, using compiled data.
See the Rust documentation for new_lstm for more information.
|
inlinestatic |
Construct a LineSegmenter with custom options and LSTM payload data for Burmese, Khmer, Lao, and Thai, using a particular data source.
See the Rust documentation for new_lstm for more information.
|
inlinestatic |
|
inline |
Segments a UTF-8 string.
Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs (U+FFFD) according to the WHATWG Encoding Standard.
See the Rust documentation for segment_utf8 for more information.
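To make the "treated as if replaced" wording concrete, here is a minimal standalone sketch of lossy UTF-8 decoding that follows the decoder steps of the WHATWG Encoding Standard. It is an illustration of the rule, not ICU4X's implementation, and `decode_utf8_lossy` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Decodes UTF-8 to code points, emitting U+FFFD for each ill-formed
// sequence in the manner of the WHATWG Encoding Standard's UTF-8 decoder.
std::vector<char32_t> decode_utf8_lossy(std::string_view input) {
    constexpr char32_t kReplacement = 0xFFFD;
    std::vector<char32_t> out;
    char32_t cp = 0;                          // code point being assembled
    int needed = 0;                           // continuation bytes still expected
    std::uint8_t lower = 0x80, upper = 0xBF;  // valid range for the next byte
    std::size_t i = 0;
    while (i < input.size()) {
        const auto b = static_cast<std::uint8_t>(input[i]);
        if (needed == 0) {
            if (b <= 0x7F) {
                out.push_back(b);
            } else if (b >= 0xC2 && b <= 0xDF) {
                needed = 1; cp = b & 0x1F;
            } else if (b >= 0xE0 && b <= 0xEF) {
                if (b == 0xE0) lower = 0xA0;  // reject overlong forms
                if (b == 0xED) upper = 0x9F;  // reject surrogates
                needed = 2; cp = b & 0x0F;
            } else if (b >= 0xF0 && b <= 0xF4) {
                if (b == 0xF0) lower = 0x90;  // reject overlong forms
                if (b == 0xF4) upper = 0x8F;  // reject values above U+10FFFF
                needed = 3; cp = b & 0x07;
            } else {
                out.push_back(kReplacement);  // invalid lead byte
            }
            ++i;
            continue;
        }
        if (b < lower || b > upper) {
            // Broken sequence: emit one replacement, then reprocess this
            // byte as a potential lead byte (i is deliberately not advanced).
            out.push_back(kReplacement);
            cp = 0; needed = 0; lower = 0x80; upper = 0xBF;
            continue;
        }
        lower = 0x80; upper = 0xBF;
        cp = (cp << 6) | (b & 0x3F);
        ++i;
        if (--needed == 0) {
            out.push_back(cp);
            cp = 0;
        }
    }
    if (needed != 0) out.push_back(kReplacement);  // truncated at end of input
    return out;
}
```

Under this rule a stray `0xFF` byte between `a` and `b` decodes as `a`, U+FFFD, `b`, and a sequence cut off at the end of the input yields a single U+FFFD.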
|
inline |
Segments a UTF-16 string.
Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs (U+FFFD) according to the WHATWG Encoding Standard.
See the Rust documentation for segment_utf16 for more information.
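For UTF-16, ill-formed input means unpaired surrogates. The sketch below shows the replacement rule in isolation: a valid surrogate pair becomes one supplementary code point, and any lone surrogate becomes U+FFFD. It is an illustration only, not ICU4X code, and `decode_utf16_lossy` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Decodes UTF-16 code units to code points, replacing every unpaired
// surrogate with U+FFFD.
std::vector<char32_t> decode_utf16_lossy(std::u16string_view input) {
    constexpr char32_t kReplacement = 0xFFFD;
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const char32_t unit = input[i];
        const bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
        const bool is_low = unit >= 0xDC00 && unit <= 0xDFFF;
        if (is_high && i + 1 < input.size()) {
            const char32_t next = input[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                // Valid pair: combine into one supplementary code point.
                out.push_back(0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00));
                ++i;  // consume the low surrogate as well
                continue;
            }
        }
        out.push_back((is_high || is_low) ? kReplacement : unit);
    }
    return out;
}
```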
|
inline |
Segments a Latin-1 string.
See the Rust documentation for segment_latin1 for more information.
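In Latin-1 every byte is exactly one code point in the range U+0000 to U+00FF, so no byte sequence can be ill-formed, which is presumably why this entry carries no replacement-character note. The sketch below illustrates that mapping by re-encoding Latin-1 bytes as UTF-8; `latin1_to_utf8` is a hypothetical helper, not part of ICU4X.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Re-encodes Latin-1 bytes as UTF-8. Bytes below 0x80 are copied as-is;
// bytes 0x80..0xFF become the two-byte UTF-8 form of U+0080..U+00FF.
std::string latin1_to_utf8(const std::vector<std::uint8_t>& latin1) {
    std::string out;
    out.reserve(latin1.size());
    for (const std::uint8_t b : latin1) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));
        } else {
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}
```

Note that the same text can have different lengths in the two encodings ("café" is 4 bytes in Latin-1 and 5 in UTF-8), so offsets computed against one encoding should not be applied to the other.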