CodePoint

CodePoint

A Unicode code point.

to_u32 : CodePoint -> U32

Converts a CodePoint to its underlying Unicode code point integer representation.

from_u32 : U32 -> Result CodePoint [InvalidCodePoint]

Converts a U32 to a CodePoint after verifying that it is a valid Unicode code point (that is, it is between 0 and 0x10FFFF, inclusive).
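
The range check described above can be illustrated with a minimal Python sketch. This is not the library's implementation: the Python function only mirrors the documented behavior, and raising `ValueError("InvalidCodePoint")` is a stand-in for the `InvalidCodePoint` error tag.

```python
# Largest value the description above accepts as a code point.
MAX_CODE_POINT = 0x10FFFF

def from_u32(value: int) -> int:
    """Return value unchanged if it lies in 0..=0x10FFFF, else raise.

    ValueError("InvalidCodePoint") stands in for the Result error tag.
    """
    if not 0 <= value <= MAX_CODE_POINT:
        raise ValueError("InvalidCodePoint")
    return value
```

Under this range-only check, surrogate values such as 0xD800 are accepted, which is consistent with the module offering separate surrogate predicates.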

is_valid_scalar : CodePoint -> Bool

Returns false if this code point is a surrogate, that is, if [is_high_surrogate] or [is_low_surrogate] returns true for it.

is_high_surrogate : CodePoint -> Bool

Returns true if this is a high-surrogate code point, from U+D800 to U+DBFF.

is_low_surrogate : CodePoint -> Bool

Returns true if this is a low-surrogate code point, from U+DC00 to U+DFFF.
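
As a rough illustration of how the three predicates relate, here is a Python sketch that uses the ranges stated above. The functions take plain integers rather than a `CodePoint` value, which is a simplification for this example.

```python
def is_high_surrogate(cp: int) -> bool:
    # High surrogates occupy U+D800..U+DBFF.
    return 0xD800 <= cp <= 0xDBFF

def is_low_surrogate(cp: int) -> bool:
    # Low surrogates occupy U+DC00..U+DFFF.
    return 0xDC00 <= cp <= 0xDFFF

def is_valid_scalar(cp: int) -> bool:
    # A scalar value is any code point that is not a surrogate.
    return not (is_high_surrogate(cp) or is_low_surrogate(cp))
```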

utf8_len : CodePoint -> Result U8 [InvalidCodePoint]

Returns the number of bytes that the UTF-8 representation of the given code point would require.

append_utf8 : List U8, CodePoint -> List U8

Encodes a code point as UTF-8 bytes and appends those bytes to an existing list of UTF-8 bytes.

count_utf8_bytes : CodePoint -> U8

The number of UTF-8 bytes it takes to represent this code point.
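
The byte counts and encoding follow the standard UTF-8 layout. The Python sketch below shows that layout. It is an illustration rather than the library's code, it works on plain integers and lists of integers, and it assumes the input is already a valid scalar value.

```python
def count_utf8_bytes(cp: int) -> int:
    # Standard UTF-8 length boundaries.
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4

def append_utf8(buf: list, cp: int) -> list:
    """Return a new list: buf followed by the UTF-8 bytes of cp."""
    n = count_utf8_bytes(cp)
    if n == 1:
        return buf + [cp]
    if n == 2:
        return buf + [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)]
    if n == 3:
        return buf + [
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ]
    return buf + [
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ]
```

For example, `append_utf8([], 0x20AC)` produces the three bytes that Python's own `"€".encode("utf-8")` yields.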

Utf8ParseErr : [ OverlongEncoding, ExpectedContinuation, EncodesSurrogateHalf, InvalidUtf8, ListWasEmpty, CodepointTooLarge ]

parse_utf8 : List U8 -> Result (List CodePoint) Utf8ParseErr

Parses a list of bytes into a list of code points.

parse_partial_utf8 : List U8 -> Result { code_point : CodePoint, bytes_parsed : U64 } Utf8ParseErr

Parses the first code point found in a list of bytes encoded as UTF-8. Returns ListWasEmpty if the list was empty, or InvalidUtf8 if the bytes were not valid UTF-8.
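
To make the error cases more concrete, here is a hedged Python sketch of decoding a single leading code point. Which condition maps to which `Utf8ParseErr` tag is an assumption made for illustration (in particular, truncated input is reported here as `ExpectedContinuation`), so the library may classify some inputs differently. `ValueError` carrying the tag name stands in for the `Result` error.

```python
def parse_partial_utf8(data: bytes):
    """Decode the first code point; return (code_point, bytes_parsed).

    Raises ValueError whose message is an assumed Utf8ParseErr tag.
    """
    if len(data) == 0:
        raise ValueError("ListWasEmpty")
    first = data[0]
    if first < 0x80:
        return first, 1  # single-byte (ASCII) sequence
    # Determine sequence length, payload bits of the lead byte, and the
    # smallest code point that legitimately needs this many bytes.
    if 0xC0 <= first < 0xE0:
        length, cp, minimum = 2, first & 0x1F, 0x80
    elif 0xE0 <= first < 0xF0:
        length, cp, minimum = 3, first & 0x0F, 0x800
    elif 0xF0 <= first < 0xF8:
        length, cp, minimum = 4, first & 0x07, 0x10000
    else:
        # Stray continuation byte or an invalid lead byte.
        raise ValueError("InvalidUtf8")
    if len(data) < length:
        raise ValueError("ExpectedContinuation")  # assumed tag for truncation
    for b in data[1:length]:
        if b & 0xC0 != 0x80:
            raise ValueError("ExpectedContinuation")
        cp = (cp << 6) | (b & 0x3F)
    if cp < minimum:
        raise ValueError("OverlongEncoding")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("EncodesSurrogateHalf")
    if cp > 0x10FFFF:
        raise ValueError("CodepointTooLarge")
    return cp, length
```

Returning the number of bytes consumed lets a caller decode a whole buffer by repeatedly slicing off `bytes_parsed` bytes and calling the function again.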

to_str : List CodePoint -> Result Str [BadUtf8]

Converts a list of code points to a Str, returning BadUtf8 on failure.

east_asian_width_property : CodePoint -> EastAsianProperty

Computes the East Asian Width property of a given code point. See https://www.unicode.org/Public/15.1.0/ucd/EastAsianWidth.txt

visual_width : CodePoint -> U32

Computes the visual width of a code point as assigned by the Unicode Character Database.
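
One common way to derive a display width from the East Asian Width property is to treat Wide (`W`) and Fullwidth (`F`) characters as two columns and everything else as one. The Python sketch below uses the standard `unicodedata` module to show that convention. Whether this library applies exactly the same rule is an assumption, and Python's bundled Unicode data may not be version 15.1.0.

```python
import unicodedata

def east_asian_width_property(ch: str) -> str:
    # Returns the property value, e.g. "Na", "W", "F", "H", "A", or "N".
    return unicodedata.east_asian_width(ch)

def visual_width(ch: str) -> int:
    # Assumed convention: Wide and Fullwidth take two columns, the rest one.
    # Zero-width and combining characters are not treated specially here.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
```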