pub struct LineSegmenter { /* private fields */ }

Supports loading line break data, and creating line break iterators for different string encodings.

🚧 This code is experimental; it may change at any time, in breaking or non-breaking ways, including in SemVer minor releases. It can be enabled with the "experimental" Cargo feature of the icu meta-crate. Use with caution. #2259

Examples

Segment a string with default options:

use icu_segmenter::LineSegmenter;

let segmenter =
    LineSegmenter::try_new_unstable(&icu_testdata::unstable())
        .expect("Data exists");

let breakpoints: Vec<usize> =
    segmenter.segment_str("Hello World").collect();
assert_eq!(&breakpoints, &[6, 11]);
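The returned breakpoints can be used to cut the input into the pieces between line-break opportunities. The following is a minimal sketch using only the standard library, assuming the offsets are ascending byte indices into the str, as the assertion above suggests. `split_at_breakpoints` is a hypothetical helper for illustration, not part of icu_segmenter.

```rust
/// Hypothetical helper (not an icu_segmenter API): slice `text` at each
/// breakpoint. Assumes ascending byte offsets that fall on char boundaries.
fn split_at_breakpoints<'a>(text: &'a str, breakpoints: &[usize]) -> Vec<&'a str> {
    let mut segments = Vec::with_capacity(breakpoints.len());
    let mut start = 0;
    for &end in breakpoints {
        // Skip a leading 0 or a repeated offset so no empty segment is produced.
        if end > start {
            segments.push(&text[start..end]);
            start = end;
        }
    }
    segments
}

fn main() {
    // Offsets taken from the example above.
    let segments = split_at_breakpoints("Hello World", &[6, 11]);
    assert_eq!(segments, ["Hello ", "World"]);
}
```

Slicing panics if an offset is out of range or not on a char boundary, so this sketch is only suitable for offsets produced for the same string.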

Segment a string with CSS option overrides:

use icu_segmenter::{
    LineBreakOptions, LineBreakRule, LineSegmenter, WordBreakRule,
};

let mut options = LineBreakOptions::default();
options.line_break_rule = LineBreakRule::Strict;
options.word_break_rule = WordBreakRule::BreakAll;
options.ja_zh = false;
let segmenter = LineSegmenter::try_new_with_options_unstable(
    &icu_testdata::unstable(),
    options,
)
.expect("Data exists");

let breakpoints: Vec<usize> =
    segmenter.segment_str("Hello World").collect();
assert_eq!(&breakpoints, &[1, 2, 3, 4, 6, 7, 8, 9, 10, 11]);

Segment a Latin1 byte string:

use icu_segmenter::LineSegmenter;

let segmenter =
    LineSegmenter::try_new_unstable(&icu_testdata::unstable())
        .expect("Data exists");

let breakpoints: Vec<usize> =
    segmenter.segment_latin1(b"Hello World").collect();
assert_eq!(&breakpoints, &[6, 11]);

Implementations

Construct a LineSegmenter with default LineBreakOptions.

Creates a new instance using an AnyProvider.

For details on the behavior of this function, see: Self::try_new_unstable

📚 Help choosing a constructor

Enabled with the "serde" feature.

Creates a new instance using a BufferProvider.

For details on the behavior of this function, see: Self::try_new_unstable


Construct a LineSegmenter with custom LineBreakOptions.

Creates a new instance using an AnyProvider.

For details on the behavior of this function, see: Self::try_new_with_options_unstable


Enabled with the "serde" feature.

Creates a new instance using a BufferProvider.

For details on the behavior of this function, see: Self::try_new_with_options_unstable


Create a line break iterator for an str (a UTF-8 string).

Create a line break iterator for a potentially ill-formed UTF-8 string.

Invalid byte sequences are treated as U+FFFD REPLACEMENT CHARACTER.
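The replacement behavior can be pictured with the standard library's lossy decoder. This is only an analogy under the assumption that the idea is the same; it is not the segmenter's own code path, and `decode_lossy` is a hypothetical helper name.

```rust
/// Hypothetical helper: decode bytes as UTF-8, replacing each invalid
/// sequence with U+FFFD REPLACEMENT CHARACTER.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // The byte 0xFF can never appear in well-formed UTF-8.
    let decoded = decode_lossy(b"Hello\xFFWorld");
    assert_eq!(decoded, "Hello\u{FFFD}World");

    // Well-formed input passes through unchanged.
    assert_eq!(decode_lossy(b"Hello World"), "Hello World");
}
```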

Create a line break iterator for a Latin-1 (8-bit) string.
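In Latin-1, every byte is the Unicode code point of the same value, so offsets into a Latin-1 buffer count characters as well as bytes. The sketch below uses only the standard library to show how those offsets can differ from UTF-8 byte offsets for the same text. `latin1_to_string` is a hypothetical helper, not an icu_segmenter function.

```rust
/// Hypothetical helper: decode Latin-1 bytes into a Rust `String`.
fn latin1_to_string(bytes: &[u8]) -> String {
    // Each Latin-1 byte equals the Unicode code point of the same value,
    // so a plain `u8 -> char` cast is a complete decoder.
    bytes.iter().map(|&b| b as char).collect()
}

fn main() {
    let latin1 = b"caf\xE9 au lait";
    let text = latin1_to_string(latin1);
    assert_eq!(text, "café au lait");

    // One byte per character in Latin-1, but "é" takes two bytes in UTF-8.
    assert_eq!(latin1.len(), 12);
    assert_eq!(text.len(), 13);
}
```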

Create a line break iterator for a UTF-16 string.
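A UTF-16 input is a slice of 16-bit code units. The sketch below, using only the standard library, shows one way to prepare such a buffer from a str and how its length can differ from the UTF-8 byte length. It assumes that offsets for UTF-16 input are counted in code units. `to_utf16` is a hypothetical helper name.

```rust
/// Hypothetical helper: encode a `str` as UTF-16 code units.
fn to_utf16(text: &str) -> Vec<u16> {
    text.encode_utf16().collect()
}

fn main() {
    // ASCII text has one code unit per character in both encodings.
    let ascii = to_utf16("Hello World");
    assert_eq!(ascii.len(), 11);

    // U+1F600 needs a surrogate pair in UTF-16 and four bytes in UTF-8.
    let text = "a\u{1F600}";
    assert_eq!(to_utf16(text).len(), 3);
    assert_eq!(text.len(), 5);
}
```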

Auto Trait Implementations

Blanket Implementations

Gets the TypeId of self.

Immutably borrows from an owned value.

Mutably borrows from an owned value.

Returns the argument unchanged.

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

The type returned in the event of a conversion error.

Performs the conversion.

The type returned in the event of a conversion error.

Performs the conversion.