Value | Meaning |
---|---|
UTF8PROC_NULLTERM (1 << 0) | The given UTF-8 input is null-terminated. |
UTF8PROC_STABLE (1 << 1) | Unicode Versioning Stability has to be respected. |
UTF8PROC_COMPAT (1 << 2) | Compatibility decomposition (i.e. formatting information is lost). |
UTF8PROC_COMPOSE (1 << 3) | Return a result with composed characters. |
UTF8PROC_DECOMPOSE (1 << 4) | Return a result with decomposed characters. |
UTF8PROC_IGNORE (1 << 5) | Strip "default ignorable characters" such as SOFT-HYPHEN or ZERO-WIDTH-SPACE. |
UTF8PROC_REJECTNA (1 << 6) | Return an error if the input contains unassigned codepoints. |
UTF8PROC_NLF2LS (1 << 7) | Indicates that NLF sequences (LF, CRLF, CR, NEL) represent a line break and should be converted to the codepoint for line separation (LS). |
UTF8PROC_NLF2PS (1 << 8) | Indicates that NLF sequences represent a paragraph break and should be converted to the codepoint for paragraph separation (PS). |
UTF8PROC_NLF2LF (UTF8PROC_NLF2LS \| UTF8PROC_NLF2PS) | Indicates that the meaning of NLF sequences is unknown. |
UTF8PROC_STRIPCC (1 << 9) | Strips and/or converts control characters. NLF sequences are transformed into a space, unless one of the NLF2LS/PS/LF options is given. HorizontalTab (HT) and FormFeed (FF) are treated as NLF sequences in this case. All other control characters are simply removed. |
UTF8PROC_CASEFOLD (1 << 10) | Performs Unicode case folding, to allow case-insensitive string comparison. |
UTF8PROC_CHARBOUND (1 << 11) | Inserts 0xFF bytes at the beginning of each sequence that represents a single grapheme cluster (see UAX #29). |
UTF8PROC_LUMP (1 << 12) | Lumps certain characters together, e.g. HYPHEN U+2010 and MINUS U+2212 are mapped to ASCII "-". See lump.md for details. If NLF2LF is set, this includes a transformation of paragraph and line separators to ASCII line feed (LF). |
UTF8PROC_STRIPMARK (1 << 13) | Strips all character markings. This includes non-spacing, spacing and enclosing marks (i.e. accents). Note: this option works only with UTF8PROC_COMPOSE or UTF8PROC_DECOMPOSE. |
Option flags used by several functions in the library.
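
Because each option is a single bit, options are combined with bitwise OR, and the composite UTF8PROC_NLF2LF has to be tested against its full two-bit mask. The sketch below illustrates this along with the UTF8PROC_STRIPMARK constraint noted in the table. It is an assumption-laden illustration, not library code: the `DEMO_*` enum is a local mirror of the values listed above so that the example compiles without the library, and the helper functions `demo_is_nlf2lf`, `demo_stripmark_ok` and `demo_search_key_options` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the values in the table above (hypothetical DEMO_* names).
   Real code would include <utf8proc.h> and use the UTF8PROC_* constants. */
enum demo_option {
    DEMO_NULLTERM  = 1 << 0,
    DEMO_STABLE    = 1 << 1,
    DEMO_COMPAT    = 1 << 2,
    DEMO_COMPOSE   = 1 << 3,
    DEMO_DECOMPOSE = 1 << 4,
    DEMO_IGNORE    = 1 << 5,
    DEMO_REJECTNA  = 1 << 6,
    DEMO_NLF2LS    = 1 << 7,
    DEMO_NLF2PS    = 1 << 8,
    DEMO_NLF2LF    = DEMO_NLF2LS | DEMO_NLF2PS,
    DEMO_STRIPCC   = 1 << 9,
    DEMO_CASEFOLD  = 1 << 10,
    DEMO_CHARBOUND = 1 << 11,
    DEMO_LUMP      = 1 << 12,
    DEMO_STRIPMARK = 1 << 13
};

/* NLF2LF is two bits, so a plain "options & DEMO_NLF2LF" would also be
   nonzero for NLF2LS or NLF2PS alone. Compare against the full mask. */
static bool demo_is_nlf2lf(int options) {
    return (options & DEMO_NLF2LF) == DEMO_NLF2LF;
}

/* STRIPMARK is documented to work only with COMPOSE or DECOMPOSE. */
static bool demo_stripmark_ok(int options) {
    if (!(options & DEMO_STRIPMARK)) {
        return true; /* constraint does not apply */
    }
    return (options & (DEMO_COMPOSE | DEMO_DECOMPOSE)) != 0;
}

/* One plausible option set for building case- and accent-insensitive
   comparison keys from null-terminated input. */
static int demo_search_key_options(void) {
    int options = DEMO_NULLTERM | DEMO_STABLE | DEMO_DECOMPOSE
                | DEMO_CASEFOLD | DEMO_STRIPMARK;
    assert(demo_stripmark_ok(options));
    return options;
}
```

With the real header, the same combination would presumably be written as `UTF8PROC_NULLTERM | UTF8PROC_STABLE | UTF8PROC_DECOMPOSE | UTF8PROC_CASEFOLD | UTF8PROC_STRIPMARK` and passed as the options argument of whichever library function accepts these flags.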