ICU4X
International Components for Unicode
ICU4XLineSegmenter Class Reference

#include <ICU4XLineSegmenter.hpp>

Public Member Functions

ICU4XLineBreakIteratorUtf8 segment_utf8 (const std::string_view input) const
 
ICU4XLineBreakIteratorUtf16 segment_utf16 (const std::u16string_view input) const
 
ICU4XLineBreakIteratorLatin1 segment_latin1 (const diplomat::span< const uint8_t > input) const
 
 ICU4XLineSegmenter (capi::ICU4XLineSegmenter *i)
 
 ICU4XLineSegmenter ()=default
 
 ICU4XLineSegmenter (ICU4XLineSegmenter &&) noexcept=default
 
ICU4XLineSegmenter & operator= (ICU4XLineSegmenter &&other) noexcept=default
 

Static Public Member Functions

static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_auto (const ICU4XDataProvider &provider)
 
static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_lstm (const ICU4XDataProvider &provider)
 
static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_dictionary (const ICU4XDataProvider &provider)
 
static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_auto_with_options_v1 (const ICU4XDataProvider &provider, ICU4XLineBreakOptionsV1 options)
 
static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_lstm_with_options_v1 (const ICU4XDataProvider &provider, ICU4XLineBreakOptionsV1 options)
 
static diplomat::result< ICU4XLineSegmenter, ICU4XError > create_dictionary_with_options_v1 (const ICU4XDataProvider &provider, ICU4XLineBreakOptionsV1 options)
 

Detailed Description

An ICU4X line-break segmenter, capable of finding breakpoints in strings.

See the Rust documentation for LineSegmenter for more information.

Constructor & Destructor Documentation

◆ ICU4XLineSegmenter() [1/3]

ICU4XLineSegmenter::ICU4XLineSegmenter ( capi::ICU4XLineSegmenter * i)
inline explicit

◆ ICU4XLineSegmenter() [2/3]

ICU4XLineSegmenter::ICU4XLineSegmenter ( )
default

◆ ICU4XLineSegmenter() [3/3]

ICU4XLineSegmenter::ICU4XLineSegmenter ( ICU4XLineSegmenter && )
default noexcept

Member Function Documentation

◆ create_auto()

diplomat::result< ICU4XLineSegmenter, ICU4XError > ICU4XLineSegmenter::create_auto ( const ICU4XDataProvider & provider)
inline static

Constructs an ICU4XLineSegmenter with default options. It automatically loads the best available payload data for Burmese, Khmer, Lao, and Thai.

See the Rust documentation for new_auto for more information.

◆ create_auto_with_options_v1()

diplomat::result< ICU4XLineSegmenter, ICU4XError > ICU4XLineSegmenter::create_auto_with_options_v1 ( const ICU4XDataProvider & provider,
ICU4XLineBreakOptionsV1 options )
inline static

Constructs an ICU4XLineSegmenter with custom options. It automatically loads the best available payload data for Burmese, Khmer, Lao, and Thai.

See the Rust documentation for new_auto_with_options for more information.

◆ create_dictionary()

diplomat::result< ICU4XLineSegmenter, ICU4XError > ICU4XLineSegmenter::create_dictionary ( const ICU4XDataProvider & provider)
inline static

Constructs an ICU4XLineSegmenter with default options and dictionary payload data for Burmese, Khmer, Lao, and Thai.

See the Rust documentation for new_dictionary for more information.

◆ create_dictionary_with_options_v1()

diplomat::result< ICU4XLineSegmenter, ICU4XError > ICU4XLineSegmenter::create_dictionary_with_options_v1 ( const ICU4XDataProvider & provider,
ICU4XLineBreakOptionsV1 options )
inline static

Constructs an ICU4XLineSegmenter with custom options and dictionary payload data for Burmese, Khmer, Lao, and Thai.

See the Rust documentation for new_dictionary_with_options for more information.

◆ create_lstm()

diplomat::result< ICU4XLineSegmenter, ICU4XError > ICU4XLineSegmenter::create_lstm ( const ICU4XDataProvider & provider)
inline static

Constructs an ICU4XLineSegmenter with default options and LSTM payload data for Burmese, Khmer, Lao, and Thai.

See the Rust documentation for new_lstm for more information.

◆ create_lstm_with_options_v1()

diplomat::result< ICU4XLineSegmenter, ICU4XError > ICU4XLineSegmenter::create_lstm_with_options_v1 ( const ICU4XDataProvider & provider,
ICU4XLineBreakOptionsV1 options )
inline static

Constructs an ICU4XLineSegmenter with custom options and LSTM payload data for Burmese, Khmer, Lao, and Thai.

See the Rust documentation for new_lstm_with_options for more information.

◆ operator=()

ICU4XLineSegmenter & ICU4XLineSegmenter::operator= ( ICU4XLineSegmenter && other)
default noexcept

◆ segment_latin1()

ICU4XLineBreakIteratorLatin1 ICU4XLineSegmenter::segment_latin1 ( const diplomat::span< const uint8_t > input) const
inline

Segments a Latin-1 string.

See the Rust documentation for segment_latin1 for more information.

Lifetimes: this, input must live at least as long as the output.
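As an illustration of the input format, the sketch below assumes that "Latin-1" here means one code point in the range U+0000 to U+00FF per input byte. Under that assumption, callers that hold Latin-1 bytes but prefer segment_utf8 could re-encode first. The helper name latin1_to_utf8 is hypothetical and is not part of ICU4X.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not an ICU4X function): re-encode Latin-1 bytes as UTF-8.
// Assumes each input byte is exactly one code point in U+0000..U+00FF.
std::string latin1_to_utf8(const std::vector<uint8_t>& latin1) {
    std::string out;
    out.reserve(latin1.size());
    for (uint8_t byte : latin1) {
        if (byte < 0x80) {
            // ASCII range: identical in Latin-1 and UTF-8.
            out.push_back(static_cast<char>(byte));
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence 110000xx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (byte >> 6)));
            out.push_back(static_cast<char>(0x80 | (byte & 0x3F)));
        }
    }
    return out;
}
```

Note that breakpoint offsets computed on the re-encoded string are UTF-8 byte offsets, so they no longer match indices into the original Latin-1 buffer once a non-ASCII byte has been seen.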

◆ segment_utf16()

ICU4XLineBreakIteratorUtf16 ICU4XLineSegmenter::segment_utf16 ( const std::u16string_view input) const
inline

Segments a UTF-16 string.

Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs according to the WHATWG Encoding Standard.

See the Rust documentation for segment_utf16 for more information.

Lifetimes: this, input must live at least as long as the output.
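To make the ill-formed input rule concrete, the following sketch decodes UTF-16 code units and substitutes U+FFFD REPLACEMENT CHARACTER for every unpaired surrogate. It only models the text that the segmenter is described as "seeing"; the function name decode_utf16_lossy is hypothetical, and this is not how ICU4X is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not an ICU4X function): decode UTF-16 to code points,
// replacing each unpaired surrogate with U+FFFD.
std::u32string decode_utf16_lossy(std::u16string_view input) {
    const char32_t replacement = 0xFFFD;
    std::u32string out;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const char32_t unit = input[i];
        const bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
        const bool is_low = unit >= 0xDC00 && unit <= 0xDFFF;
        const bool next_is_low = i + 1 < input.size() &&
                                 input[i + 1] >= 0xDC00 && input[i + 1] <= 0xDFFF;
        if (is_high && next_is_low) {
            // Well-formed surrogate pair: combine into one supplementary code point.
            const char32_t low = input[i + 1];
            out.push_back(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00));
            ++i;  // the low surrogate has been consumed
        } else if (is_high || is_low) {
            // Unpaired surrogate: ill-formed, so substitute U+FFFD.
            out.push_back(replacement);
        } else {
            // Ordinary BMP code unit.
            out.push_back(unit);
        }
    }
    return out;
}
```

For example, the three code units 'a', 0xD800, 'b' decode to 'a', U+FFFD, 'b', so the number of code units is unchanged and offsets into the original buffer stay meaningful.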

◆ segment_utf8()

ICU4XLineBreakIteratorUtf8 ICU4XLineSegmenter::segment_utf8 ( const std::string_view input) const
inline

Segments a UTF-8 string.

Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs according to the WHATWG Encoding Standard.

See the Rust documentation for segment_utf8 for more information.

Lifetimes: this, input must live at least as long as the output.
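The lifetime rule can be sketched as follows. The helper assumes the returned iterator exposes a next() method that yields the next breakpoint as an int32_t and -1 once exhausted; that method is not described on this page and should be checked against the iterator's own reference. SpaceBreakIterator is a hypothetical stand-in used only to keep the sketch self-contained; it is not an ICU4X type and does not implement real line breaking.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Drain a break iterator into a vector of offsets.
// Assumption: it.next() returns the next breakpoint, or -1 when exhausted.
template <typename BreakIterator>
std::vector<int32_t> collect_breakpoints(BreakIterator& it) {
    std::vector<int32_t> breaks;
    for (int32_t pos = it.next(); pos != -1; pos = it.next()) {
        breaks.push_back(pos);
    }
    return breaks;
}

// Hypothetical stand-in: reports a break after every space and at the end
// of the input. Like the real iterator, it only borrows its input.
class SpaceBreakIterator {
public:
    explicit SpaceBreakIterator(std::string_view input) : input_(input) {}

    int32_t next() {
        while (pos_ < input_.size()) {
            const char c = input_[pos_++];
            if (c == ' ') {
                // Avoid reporting the end of the input a second time.
                if (pos_ == input_.size()) finished_ = true;
                return static_cast<int32_t>(pos_);
            }
        }
        if (!finished_) {
            finished_ = true;
            return static_cast<int32_t>(input_.size());
        }
        return -1;
    }

private:
    std::string_view input_;  // borrowed, not owned
    std::size_t pos_ = 0;
    bool finished_ = false;
};

std::vector<int32_t> demo_breakpoints() {
    const std::string text = "hello world";  // owns the bytes and outlives the iterator
    SpaceBreakIterator it{text};              // borrows text for as long as it is used
    return collect_breakpoints(it);           // plain integers, safe to return
}
```

With the real class, the same pattern would presumably read `auto it = segmenter.segment_utf8(text);` followed by `collect_breakpoints(it)`, with both segmenter and text declared in a scope that encloses every use of the iterator. Passing a temporary std::string to segment_utf8 would leave the iterator pointing at freed memory.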


The documentation for this class was generated from the following file: