ICU4X
International Components for Unicode
#include <ICU4XLineSegmenter.hpp>
Public Member Functions | |
ICU4XLineBreakIteratorUtf8 | segment_utf8 (const std::string_view input) const |
ICU4XLineBreakIteratorUtf16 | segment_utf16 (const std::u16string_view input) const |
ICU4XLineBreakIteratorLatin1 | segment_latin1 (const diplomat::span< const uint8_t > input) const |
ICU4XLineSegmenter (capi::ICU4XLineSegmenter *i) | |
ICU4XLineSegmenter ()=default | |
ICU4XLineSegmenter (ICU4XLineSegmenter &&) noexcept=default | |
ICU4XLineSegmenter & | operator= (ICU4XLineSegmenter &&other) noexcept=default |
Static Public Member Functions | |
static diplomat::result< ICU4XLineSegmenter, ICU4XError > | create_auto (const ICU4XDataProvider &provider) |
static diplomat::result< ICU4XLineSegmenter, ICU4XError > | create_lstm (const ICU4XDataProvider &provider) |
static diplomat::result< ICU4XLineSegmenter, ICU4XError > | create_dictionary (const ICU4XDataProvider &provider) |
static diplomat::result< ICU4XLineSegmenter, ICU4XError > | create_auto_with_options_v1 (const ICU4XDataProvider &provider, ICU4XLineBreakOptionsV1 options) |
static diplomat::result< ICU4XLineSegmenter, ICU4XError > | create_lstm_with_options_v1 (const ICU4XDataProvider &provider, ICU4XLineBreakOptionsV1 options) |
static diplomat::result< ICU4XLineSegmenter, ICU4XError > | create_dictionary_with_options_v1 (const ICU4XDataProvider &provider, ICU4XLineBreakOptionsV1 options) |
An ICU4X line-break segmenter, capable of finding breakpoints in strings.
See the Rust documentation for LineSegmenter for more information.
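ICU4XLineSegmenter (capi::ICU4XLineSegmenter *i) | inline, explicit |
ICU4XLineSegmenter ()=default | default |
ICU4XLineSegmenter (ICU4XLineSegmenter &&) noexcept=default | default, noexcept |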
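static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_auto (const ICU4XDataProvider &provider) | inline, static |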
Construct an ICU4XLineSegmenter with default options. It automatically loads the best available payload data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_auto for more information.
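static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_auto_with_options_v1 (const ICU4XDataProvider &provider, ICU4XLineBreakOptionsV1 options) | inline, static |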
Construct an ICU4XLineSegmenter with custom options. It automatically loads the best available payload data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_auto_with_options for more information.
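static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_dictionary (const ICU4XDataProvider &provider) | inline, static |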
Construct an ICU4XLineSegmenter with default options and dictionary payload data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_dictionary for more information.
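static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_dictionary_with_options_v1 (const ICU4XDataProvider &provider, ICU4XLineBreakOptionsV1 options) | inline, static |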
Construct an ICU4XLineSegmenter with custom options and dictionary payload data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_dictionary_with_options for more information.
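static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_lstm (const ICU4XDataProvider &provider) | inline, static |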
Construct an ICU4XLineSegmenter with default options and LSTM payload data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_lstm for more information.
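static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_lstm_with_options_v1 (const ICU4XDataProvider &provider, ICU4XLineBreakOptionsV1 options) | inline, static |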
Construct an ICU4XLineSegmenter with custom options and LSTM payload data for Burmese, Khmer, Lao, and Thai.
See the Rust documentation for new_lstm_with_options for more information.
ICU4XLineSegmenter & operator= (ICU4XLineSegmenter &&other) noexcept=default | default, noexcept |
ICU4XLineBreakIteratorLatin1 segment_latin1 (const diplomat::span< const uint8_t > input) const | inline |
Segments a Latin-1 string.
See the Rust documentation for segment_latin1 for more information.
Lifetimes: this and input must live at least as long as the output.
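segment_latin1 takes raw bytes rather than a string view, with each byte standing for the code point of the same value (U+0000 to U+00FF). The following is a minimal sketch of one way to prepare such input from UTF-16 text. The helper name to_latin1 is hypothetical and not part of ICU4X; it uses only the C++ standard library.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical helper (not an ICU4X API): narrows UTF-16 text to Latin-1 bytes.
// Returns std::nullopt if any code unit lies outside U+0000..U+00FF, in which
// case the text cannot be represented as Latin-1.
std::optional<std::vector<uint8_t>> to_latin1(std::u16string_view text) {
    std::vector<uint8_t> bytes;
    bytes.reserve(text.size());
    for (char16_t unit : text) {
        if (unit > 0xFF) {
            return std::nullopt;  // not representable in Latin-1
        }
        bytes.push_back(static_cast<uint8_t>(unit));
    }
    return bytes;
}
```

The returned vector owns the bytes, so under the lifetime rule above it would need to stay alive for as long as the iterator produced from it is in use.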
ICU4XLineBreakIteratorUtf16 segment_utf16 (const std::u16string_view input) const | inline |
Segments a UTF-16 string.
Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs according to the WHATWG Encoding Standard.
See the Rust documentation for segment_utf16 for more information.
Lifetimes: this and input must live at least as long as the output.
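To make the ill-formed input rule concrete for UTF-16, here is a minimal standalone sketch of lossy decoding in which every unpaired surrogate becomes U+FFFD REPLACEMENT CHARACTER. The function decode_utf16_lossy is a hypothetical illustration, not an ICU4X API, and it does not show how the segmenter handles such input internally.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical illustration (not an ICU4X API): decodes UTF-16 into code
// points, substituting U+FFFD for each unpaired surrogate.
std::u32string decode_utf16_lossy(std::u16string_view input) {
    std::u32string out;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const char16_t unit = input[i];
        const bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
        const bool is_low = unit >= 0xDC00 && unit <= 0xDFFF;
        const bool next_is_low = i + 1 < input.size() &&
                                 input[i + 1] >= 0xDC00 && input[i + 1] <= 0xDFFF;
        if (is_high && next_is_low) {
            // Well-formed surrogate pair: combine into one supplementary code point.
            const char32_t hi = static_cast<char32_t>(unit) - 0xD800;
            const char32_t lo = static_cast<char32_t>(input[i + 1]) - 0xDC00;
            out.push_back(0x10000 + (hi << 10) + lo);
            ++i;  // the low surrogate has been consumed
        } else if (is_high || is_low) {
            // Unpaired surrogate: replace it and continue with the next unit.
            out.push_back(U'\uFFFD');
        } else {
            out.push_back(static_cast<char32_t>(unit));
        }
    }
    return out;
}
```

A lone high surrogate does not swallow the code unit after it, so text following the error is still decoded normally.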
ICU4XLineBreakIteratorUtf8 segment_utf8 (const std::string_view input) const | inline |
Segments a UTF-8 string.
Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs according to the WHATWG Encoding Standard.
See the Rust documentation for segment_utf8 for more information.
Lifetimes: this and input must live at least as long as the output.
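Because the returned iterator borrows from both the segmenter and the input, one way to sidestep lifetime problems is to drain it into an owned container right away. The sketch below assumes the break iterator exposes a next() member returning an int32_t offset, with -1 signalling the end; verify this against the ICU4XLineBreakIteratorUtf8 reference. FakeBreakIterator is a hypothetical stand-in so the sketch compiles without the ICU4X headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Drains any iterator whose next() yields the next breakpoint offset and
// returns -1 once exhausted (assumed contract, see the lead-in).
template <typename BreakIterator>
std::vector<int32_t> collect_breakpoints(BreakIterator& it) {
    std::vector<int32_t> breakpoints;
    for (int32_t bp = it.next(); bp != -1; bp = it.next()) {
        breakpoints.push_back(bp);
    }
    return breakpoints;
}

// Hypothetical stand-in for a real break iterator, used only for illustration.
struct FakeBreakIterator {
    explicit FakeBreakIterator(std::vector<int32_t> o) : offsets(std::move(o)) {}
    int32_t next() { return pos < offsets.size() ? offsets[pos++] : -1; }

    std::vector<int32_t> offsets;
    std::size_t pos = 0;
};
```

With a real segmenter, the iterator returned by segment_utf8 would presumably be passed in place of FakeBreakIterator. The resulting vector holds plain offsets, so it stays valid after the input string and the segmenter are destroyed.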