#[repr(transparent)]
pub struct GeneralCategoryGroup(_);

Groupings of multiple General_Category property values.

Instances of GeneralCategoryGroup represent the defined multi-category values that are useful in certain contexts, such as regex. In other words, unlike GeneralCategory, this type supports groups of general categories: for example, Letter is the union of UppercaseLetter, LowercaseLetter, etc.

See UAX #44 (https://www.unicode.org/reports/tr44/).

The discriminants correspond to the U_GC_XX_MASK constants in ICU4C.

See UCharCategory and U_GET_GC_MASK in ICU4C.
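The correspondence with ICU4C's mask constants suggests a simple mental model: each single category occupies one bit (ICU4C's U_GET_GC_MASK yields 1 shifted left by the category's UCharCategory value), and a group is the bitwise OR of its members. The sketch below illustrates that model only. `Category` and `CategoryMask` are hypothetical stand-ins defined here, not the icu::properties types, and only a handful of categories are included.

```rust
/// Hypothetical stand-in for a few general categories, numbered as in
/// ICU4C's `UCharCategory` enum (assumption of this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Category {
    Unassigned = 0,
    UppercaseLetter = 1,
    LowercaseLetter = 2,
    TitlecaseLetter = 3,
    ModifierLetter = 4,
    OtherLetter = 5,
    NonspacingMark = 6,
}

/// Hypothetical model of a category group: one bit per category.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CategoryMask(u32);

impl CategoryMask {
    /// Mask with only the bit for `c` set.
    const fn of(c: Category) -> Self {
        CategoryMask(1 << (c as u32))
    }

    /// Union of two groups.
    const fn union(self, other: Self) -> Self {
        CategoryMask(self.0 | other.0)
    }

    /// Whether the bit for `c` is set in this group.
    const fn contains(self, c: Category) -> bool {
        (self.0 & (1 << (c as u32))) != 0
    }
}

// (LC) Lu | Ll | Lt
const CASED_LETTER: CategoryMask = CategoryMask::of(Category::UppercaseLetter)
    .union(CategoryMask::of(Category::LowercaseLetter))
    .union(CategoryMask::of(Category::TitlecaseLetter));

// (L) LC | Lm | Lo
const LETTER: CategoryMask = CASED_LETTER
    .union(CategoryMask::of(Category::ModifierLetter))
    .union(CategoryMask::of(Category::OtherLetter));

fn main() {
    assert!(CASED_LETTER.contains(Category::UppercaseLetter));
    assert!(!CASED_LETTER.contains(Category::OtherLetter));
    assert!(LETTER.contains(Category::OtherLetter));
    assert!(!LETTER.contains(Category::NonspacingMark));
    assert!(!LETTER.contains(Category::Unassigned));
}
```

Under this model a membership test is a single AND, which is why a mask representation suits repeated per-code-point checks.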

Implementations

(Lu) An uppercase letter

(Ll) A lowercase letter

(Lt) A digraphic letter, with first part uppercase

(Lm) A modifier letter

(Lo) Other letters, including syllables and ideographs

(LC) The union of UppercaseLetter, LowercaseLetter, and TitlecaseLetter

(L) The union of all letter categories

(Mn) A nonspacing combining mark (zero advance width)

(Mc) A spacing combining mark (positive advance width)

(Me) An enclosing combining mark

(M) The union of all mark categories

(Nd) A decimal digit

(Nl) A letterlike numeric character

(No) A numeric character of other type

(N) The union of all number categories

(Zs) A space character (of various non-zero widths)

(Zl) U+2028 LINE SEPARATOR only

(Zp) U+2029 PARAGRAPH SEPARATOR only

(Z) The union of all separator categories

(Cc) A C0 or C1 control code

(Cf) A format control character

(Co) A private-use character

(Cs) A surrogate code point

(Cn) A reserved unassigned code point or a noncharacter

(C) The union of all control code, reserved, and unassigned categories

(Pd) A dash or hyphen punctuation mark

(Ps) An opening punctuation mark (of a pair)

(Pe) A closing punctuation mark (of a pair)

(Pc) A connecting punctuation mark, like a tie

(Pi) An initial quotation mark

(Pf) A final quotation mark

(Po) A punctuation mark of other type

(P) The union of all punctuation categories

(Sm) A symbol of mathematical use

(Sc) A currency sign

(Sk) A non-letterlike modifier symbol

(So) A symbol of other type

(S) The union of all symbol categories
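The one- and two-letter abbreviations in the list above are the same short names that regex-style property classes such as `\p{L}` or `\p{Lu}` use. As a rough sketch of how such an abbreviation could be resolved to a group, the hypothetical helper `mask_for_alias` below maps a few letter and mark abbreviations to bitmasks. The function name is invented for illustration, the bit positions assume ICU4C's UCharCategory numbering, and nothing here is part of the icu::properties API.

```rust
/// Hypothetical helper: resolve a General_Category abbreviation to a bitmask.
/// Bit positions assume ICU4C's `UCharCategory` numbering; only letter and
/// mark abbreviations are covered in this sketch.
fn mask_for_alias(alias: &str) -> Option<u32> {
    const LU: u32 = 1 << 1;
    const LL: u32 = 1 << 2;
    const LT: u32 = 1 << 3;
    const LM: u32 = 1 << 4;
    const LO: u32 = 1 << 5;
    const MN: u32 = 1 << 6;
    const ME: u32 = 1 << 7;
    const MC: u32 = 1 << 8;

    Some(match alias {
        "Lu" => LU,
        "Ll" => LL,
        "Lt" => LT,
        "Lm" => LM,
        "Lo" => LO,
        // Group abbreviations are unions of the single-category bits.
        "LC" => LU | LL | LT,
        "L" => LU | LL | LT | LM | LO,
        "Mn" => MN,
        "Me" => ME,
        "Mc" => MC,
        "M" => MN | ME | MC,
        _ => return None,
    })
}

fn main() {
    let letter = mask_for_alias("L").unwrap();
    let cased = mask_for_alias("LC").unwrap();
    let mark = mask_for_alias("M").unwrap();

    // Every cased letter category is also a letter category.
    assert_eq!(letter & cased, cased);
    // Letters and marks do not overlap.
    assert_eq!(letter & mark, 0);
    // Unknown abbreviations are rejected.
    assert_eq!(mask_for_alias("Xx"), None);
}
```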

Returns whether the given GeneralCategory value (typically the one looked up for a code point) belongs to this multi-value category.

use icu::properties::{maps, GeneralCategory, GeneralCategoryGroup};

let data = maps::load_general_category(&icu_testdata::unstable())
    .expect("The data should be valid");
let gc = data.as_borrowed();

assert_eq!(gc.get('A'), GeneralCategory::UppercaseLetter);
assert!(GeneralCategoryGroup::CasedLetter.contains(gc.get('A')));

// U+0B1E ORIYA LETTER NYA
assert_eq!(gc.get('ଞ'), GeneralCategory::OtherLetter);
assert!(GeneralCategoryGroup::Letter.contains(gc.get('ଞ')));
assert!(!GeneralCategoryGroup::CasedLetter.contains(gc.get('ଞ')));

// U+0301 COMBINING ACUTE ACCENT
assert_eq!(gc.get32(0x0301), GeneralCategory::NonspacingMark);
assert!(GeneralCategoryGroup::Mark.contains(gc.get32(0x0301)));
assert!(!GeneralCategoryGroup::Letter.contains(gc.get32(0x0301)));

assert_eq!(gc.get('0'), GeneralCategory::DecimalNumber);
assert!(GeneralCategoryGroup::Number.contains(gc.get('0')));
assert!(!GeneralCategoryGroup::Mark.contains(gc.get('0')));

assert_eq!(gc.get('('), GeneralCategory::OpenPunctuation);
assert!(GeneralCategoryGroup::Punctuation.contains(gc.get('(')));
assert!(!GeneralCategoryGroup::Number.contains(gc.get('(')));

// U+2713 CHECK MARK
assert_eq!(gc.get('✓'), GeneralCategory::OtherSymbol);
assert!(GeneralCategoryGroup::Symbol.contains(gc.get('✓')));
assert!(!GeneralCategoryGroup::Punctuation.contains(gc.get('✓')));

assert_eq!(gc.get(' '), GeneralCategory::SpaceSeparator);
assert!(GeneralCategoryGroup::Separator.contains(gc.get(' ')));
assert!(!GeneralCategoryGroup::Symbol.contains(gc.get(' ')));

// U+E007F CANCEL TAG
assert_eq!(gc.get32(0xE007F), GeneralCategory::Format);
assert!(GeneralCategoryGroup::Other.contains(gc.get32(0xE007F)));
assert!(!GeneralCategoryGroup::Separator.contains(gc.get32(0xE007F)));

Trait Implementations

Returns a copy of the value.

Performs copy-assignment from source.

Formats the value using the given formatter.

Converts to this type from the input type.

Converts to this type from the input type.

This method tests for self and other values to be equal, and is used by ==.

This method tests for !=.

Auto Trait Implementations

Blanket Implementations

Gets the TypeId of self.

Immutably borrows from an owned value.

Mutably borrows from an owned value.

Returns the argument unchanged.

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

The resulting type after obtaining ownership.

Creates owned data from borrowed data, usually by cloning.

🔬 This is a nightly-only experimental API. (toowned_clone_into)

Uses borrowed data to replace owned data, usually by cloning.

The type returned in the event of a conversion error.

Performs the conversion.

The type returned in the event of a conversion error.

Performs the conversion.