ICU4X
International Components for Unicode
#include <SentenceSegmenter.d.hpp>
Public Member Functions
| std::unique_ptr< icu4x::SentenceBreakIteratorUtf8 > | segment (std::string_view input) const |
| std::unique_ptr< icu4x::SentenceBreakIteratorUtf16 > | segment16 (std::u16string_view input) const |
| std::unique_ptr< icu4x::SentenceBreakIteratorLatin1 > | segment_latin1 (diplomat::span< const uint8_t > input) const |
Static Public Member Functions
| static std::unique_ptr< icu4x::SentenceSegmenter > | create () |
| static diplomat::result< std::unique_ptr< icu4x::SentenceSegmenter >, icu4x::DataError > | create_with_content_locale (const icu4x::Locale &locale) |
| static diplomat::result< std::unique_ptr< icu4x::SentenceSegmenter >, icu4x::DataError > | create_with_content_locale_and_provider (const icu4x::DataProvider &provider, const icu4x::Locale &locale) |
| static void | operator delete (void *ptr) |
An ICU4X sentence-break segmenter, capable of finding sentence breakpoints in strings.
See the Rust documentation for SentenceSegmenter for more information.
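The breakpoints produced by a segmenter are typically consumed in a loop. The sketch below shows one way to turn them into sentence slices. It is templated on the segmenter type so that it stays self-contained; `split_sentences` is a hypothetical helper, not part of ICU4X. It assumes that the iterator returned by `segment` exposes a `next()` method that yields ascending byte offsets and returns -1 when exhausted; check the documentation of `SentenceBreakIteratorUtf8` before relying on this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper: splits `text` into sentence slices.
// Assumption: segmenter.segment(text) returns a pointer-like iterator whose
// next() yields ascending byte offsets and returns -1 when there are no more.
template <typename Segmenter>
std::vector<std::string_view> split_sentences(const Segmenter& segmenter,
                                              std::string_view text) {
    std::vector<std::string_view> sentences;
    auto it = segmenter.segment(text);
    std::int32_t start = 0;
    for (std::int32_t end = it->next(); end != -1; end = it->next()) {
        assert(end >= start);  // breakpoints are expected in ascending order
        if (end > start) {     // ignore an initial breakpoint at offset 0
            sentences.push_back(text.substr(static_cast<std::size_t>(start),
                                            static_cast<std::size_t>(end - start)));
        }
        start = end;
    }
    return sentences;
}
```

With a real segmenter this could be called as `split_sentences(*icu4x::SentenceSegmenter::create(), text)`. The returned views point into `text`, so `text` must outlive them.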
std::unique_ptr< icu4x::SentenceSegmenter > icu4x::SentenceSegmenter::create () [inline, static]
Construct a SentenceSegmenter using compiled data. This does not assume any content locale.
See the Rust documentation for new for more information.
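diplomat::result< std::unique_ptr< icu4x::SentenceSegmenter >, icu4x::DataError > icu4x::SentenceSegmenter::create_with_content_locale (const icu4x::Locale &locale) [inline, static]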
Construct a SentenceSegmenter for content known to be of a given locale, using compiled data.
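diplomat::result< std::unique_ptr< icu4x::SentenceSegmenter >, icu4x::DataError > icu4x::SentenceSegmenter::create_with_content_locale_and_provider (const icu4x::DataProvider &provider, const icu4x::Locale &locale) [inline, static]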
Construct a SentenceSegmenter for content known to be of a given locale, using a particular data source.
void icu4x::SentenceSegmenter::operator delete (void *ptr) [inline, static]
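std::unique_ptr< icu4x::SentenceBreakIteratorUtf8 > icu4x::SentenceSegmenter::segment (std::string_view input) const [inline]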
Segments a UTF-8 string.
Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs according to the WHATWG Encoding Standard.
See the Rust documentation for segment_utf8 for more information.
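To make the handling of ill-formed UTF-8 concrete, the following sketch decodes bytes to code points and substitutes U+FFFD in the way the WHATWG Encoding Standard's UTF-8 decoder does. It only illustrates the "as if" behaviour described above; `decode_utf8_lossy` is a hypothetical helper, and nothing here is taken from the ICU4X implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper: decodes UTF-8 to code points, emitting U+FFFD for each
// ill-formed sequence, following the WHATWG UTF-8 decoder's state machine.
std::vector<char32_t> decode_utf8_lossy(std::string_view input) {
    constexpr char32_t kReplacement = 0xFFFD;
    std::vector<char32_t> out;
    char32_t code_point = 0;
    int bytes_needed = 0;                     // continuation bytes still expected
    std::uint8_t lower = 0x80, upper = 0xBF;  // allowed range for the next byte

    std::size_t i = 0;
    while (i < input.size()) {
        const auto byte = static_cast<std::uint8_t>(input[i]);
        if (bytes_needed == 0) {
            if (byte <= 0x7F) {
                out.push_back(byte);
            } else if (byte >= 0xC2 && byte <= 0xDF) {
                bytes_needed = 1;
                code_point = byte & 0x1F;
            } else if (byte >= 0xE0 && byte <= 0xEF) {
                if (byte == 0xE0) lower = 0xA0;  // reject overlong forms
                if (byte == 0xED) upper = 0x9F;  // reject surrogates
                bytes_needed = 2;
                code_point = byte & 0x0F;
            } else if (byte >= 0xF0 && byte <= 0xF4) {
                if (byte == 0xF0) lower = 0x90;  // reject overlong forms
                if (byte == 0xF4) upper = 0x8F;  // reject values above U+10FFFF
                bytes_needed = 3;
                code_point = byte & 0x07;
            } else {
                out.push_back(kReplacement);     // byte can never start a sequence
            }
            ++i;
            continue;
        }
        if (byte < lower || byte > upper) {
            // Bad continuation: emit one U+FFFD, then reprocess this byte
            // as the possible start of a new sequence (no ++i here).
            bytes_needed = 0;
            lower = 0x80;
            upper = 0xBF;
            out.push_back(kReplacement);
            continue;
        }
        lower = 0x80;
        upper = 0xBF;
        code_point = (code_point << 6) | (byte & 0x3F);
        ++i;
        if (--bytes_needed == 0) out.push_back(code_point);
    }
    if (bytes_needed != 0) out.push_back(kReplacement);  // truncated at end of input
    return out;
}
```

Under these rules a truncated multi-byte sequence collapses to a single U+FFFD, while a byte that can never begin a sequence, such as 0xFF, produces one U+FFFD of its own.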
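std::unique_ptr< icu4x::SentenceBreakIteratorUtf16 > icu4x::SentenceSegmenter::segment16 (std::u16string_view input) const [inline]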
Segments a UTF-16 string.
Ill-formed input is treated as if errors had been replaced with REPLACEMENT CHARACTERs according to the WHATWG Encoding Standard.
See the Rust documentation for segment_utf16 for more information.
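For UTF-16, the only ill-formed input is an unpaired surrogate. The sketch below shows the equivalent lossy decoding: valid surrogate pairs are combined, and each lone surrogate becomes U+FFFD. `decode_utf16_lossy` is a hypothetical helper for illustration, not an ICU4X function.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: decodes UTF-16 to code points, replacing each
// unpaired surrogate code unit with U+FFFD.
std::vector<char32_t> decode_utf16_lossy(std::u16string_view input) {
    constexpr char32_t kReplacement = 0xFFFD;
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const char32_t unit = static_cast<char32_t>(input[i]);
        const bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
        const bool is_low = unit >= 0xDC00 && unit <= 0xDFFF;
        if (is_high && i + 1 < input.size() &&
            input[i + 1] >= 0xDC00 && input[i + 1] <= 0xDFFF) {
            // Well-formed pair: combine into a supplementary code point.
            const char32_t low = static_cast<char32_t>(input[i + 1]);
            out.push_back(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00));
            ++i;  // the low surrogate has been consumed
        } else if (is_high || is_low) {
            out.push_back(kReplacement);  // lone surrogate
        } else {
            out.push_back(unit);          // ordinary BMP code unit
        }
    }
    return out;
}
```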
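std::unique_ptr< icu4x::SentenceBreakIteratorLatin1 > icu4x::SentenceSegmenter::segment_latin1 (diplomat::span< const uint8_t > input) const [inline]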
Segments a Latin-1 string.
See the Rust documentation for segment_latin1 for more information.
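In Latin-1 (ISO 8859-1), every byte is exactly one character, and its value is the code point U+0000 to U+00FF, so no input can be ill-formed. The sketch below makes that mapping explicit by transcoding Latin-1 bytes to UTF-8, which could be useful when the same text also has to go through `segment`. `latin1_to_utf8` is a hypothetical helper and not part of ICU4X.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: transcodes Latin-1 bytes to UTF-8.
// Bytes below 0x80 are ASCII; the rest become two-byte UTF-8 sequences.
std::string latin1_to_utf8(const std::vector<std::uint8_t>& latin1) {
    std::string utf8;
    utf8.reserve(latin1.size());
    for (std::uint8_t byte : latin1) {
        if (byte < 0x80) {
            utf8.push_back(static_cast<char>(byte));
        } else {
            utf8.push_back(static_cast<char>(0xC0 | (byte >> 6)));    // lead byte
            utf8.push_back(static_cast<char>(0x80 | (byte & 0x3F)));  // continuation
        }
    }
    return utf8;
}
```

Offsets into the transcoded string differ from offsets into the original Latin-1 bytes whenever a byte of 0x80 or above is present, so breakpoints from one form cannot be applied to the other without adjustment.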