CodePoint
to_u32 : CodePoint -> U32
Converts a CodePoint to its underlying Unicode code point integer representation.
from_u32 : U32 -> Result CodePoint [InvalidCodePoint]
Converts a U32 to a CodePoint by verifying that it is a valid Unicode code point (that is, it's between 0 and 0x10FFFF).
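The range check described for from_u32 can be illustrated outside Roc. The following Python sketch is not the package's implementation: the name from_u32_sketch and the use of ValueError in place of Err InvalidCodePoint are assumptions made for illustration only.

```python
MAX_CODE_POINT = 0x10FFFF  # highest code point defined by Unicode


def from_u32_sketch(value: int) -> int:
    """Return value unchanged if it lies in 0..0x10FFFF (inclusive).

    Hypothetical stand-in for from_u32: raises ValueError where the
    Roc function would return Err InvalidCodePoint.
    """
    if 0 <= value <= MAX_CODE_POINT:
        return value
    raise ValueError("InvalidCodePoint")
```

Under these assumptions, from_u32_sketch(0x41) returns 0x41 unchanged, while from_u32_sketch(0x110000) raises.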
is_valid_scalar : CodePoint -> Bool
Returns false if this is a surrogate code point (see is_high_surrogate and is_low_surrogate), and true otherwise.
is_high_surrogate : CodePoint -> Bool
Returns true if this is a high-surrogate code point from U+D800 to U+DBFF
is_low_surrogate : CodePoint -> Bool
Returns true if this is a low-surrogate code point from U+DC00 to U+DFFF
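The three surrogate-related checks reduce to range comparisons on the code point's integer value. A minimal Python sketch, assuming code points are represented as plain integers (the package itself uses the opaque CodePoint type), might look like this:

```python
def is_high_surrogate(cp: int) -> bool:
    # High surrogates occupy U+D800 through U+DBFF, inclusive.
    return 0xD800 <= cp <= 0xDBFF


def is_low_surrogate(cp: int) -> bool:
    # Low surrogates occupy U+DC00 through U+DFFF, inclusive.
    return 0xDC00 <= cp <= 0xDFFF


def is_valid_scalar(cp: int) -> bool:
    # A scalar value is any code point that is not a surrogate.
    return not (is_high_surrogate(cp) or is_low_surrogate(cp))
```

The two surrogate ranges are adjacent, so is_valid_scalar is false exactly for U+D800 through U+DFFF.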
utf8_len : CodePoint -> Result U8 [InvalidCodePoint]
Returns the number of bytes the UTF-8 representation would require for the given code point (description adapted from the Zig docs).
append_utf8 : List U8, CodePoint -> List U8
Encodes a CodePoint as UTF-8 bytes and appends those bytes to an existing list of UTF-8 bytes.
count_utf8_bytes : CodePoint -> U8
Returns the number of UTF-8 bytes it takes to represent this CodePoint.
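To make the byte counts concrete, here is a Python sketch of standard UTF-8 encoding. It is not taken from the package's source: it assumes the input is an integer code point in 0..0x10FFFF that is not a surrogate, and it reuses the names count_utf8_bytes and append_utf8 purely for readability.

```python
def count_utf8_bytes(cp: int) -> int:
    """Number of bytes UTF-8 needs for cp (assumed valid, non-surrogate)."""
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


def append_utf8(buf: list, cp: int) -> list:
    """Return a new list: buf followed by the UTF-8 bytes of cp."""
    n = count_utf8_bytes(cp)
    if n == 1:
        encoded = [cp]
    elif n == 2:
        encoded = [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)]
    elif n == 3:
        encoded = [
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ]
    else:
        encoded = [
            0xF0 | (cp >> 18),
            0x80 | ((cp >> 12) & 0x3F),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ]
    return buf + encoded
```

For example, U+20AC (the euro sign) takes three bytes, so append_utf8([0x61], 0x20AC) yields [0x61, 0xE2, 0x82, 0xAC].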
Utf8ParseErr :
[
OverlongEncoding,
ExpectedContinuation,
EncodesSurrogateHalf,
InvalidUtf8,
ListWasEmpty,
CodepointTooLarge
]
parse_utf8 : List U8 -> Result (List CodePoint) Utf8ParseErr
Parses a list of UTF-8 encoded bytes into a list of code points.
parse_partial_utf8 :
List U8
-> Result
{
code_point : CodePoint,
bytes_parsed : U64
} Utf8ParseErr
Parses the first code point found in a list of bytes encoded as UTF-8. Returns ListWasEmpty if the list was empty, or InvalidUtf8 if the bytes were not valid UTF-8.
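The following Python sketch shows one way a "parse the first code point" routine can work and how each Utf8ParseErr tag might arise. Only ListWasEmpty and InvalidUtf8 are described in this documentation; which condition maps to which of the other tags is an assumption here, and the Utf8ParseError exception class is hypothetical.

```python
class Utf8ParseError(ValueError):
    """Hypothetical error type; its message is one Utf8ParseErr tag name."""


def parse_partial_utf8(data: bytes):
    """Decode the first code point in data; return (code_point, bytes_parsed)."""
    if len(data) == 0:
        raise Utf8ParseError("ListWasEmpty")
    first = data[0]
    if first < 0x80:
        return first, 1  # single-byte (ASCII) sequence
    # Pick sequence length, payload bits of the lead byte, and the
    # smallest code point that legitimately needs that length.
    if 0xC0 <= first < 0xE0:
        length, cp, minimum = 2, first & 0x1F, 0x80
    elif 0xE0 <= first < 0xF0:
        length, cp, minimum = 3, first & 0x0F, 0x800
    elif 0xF0 <= first < 0xF8:
        length, cp, minimum = 4, first & 0x07, 0x10000
    else:
        # Stray continuation byte (0x80..0xBF) or invalid lead (0xF8..0xFF).
        raise Utf8ParseError("InvalidUtf8")
    if len(data) < length:
        raise Utf8ParseError("ExpectedContinuation")
    for byte in data[1:length]:
        if byte & 0xC0 != 0x80:
            raise Utf8ParseError("ExpectedContinuation")
        cp = (cp << 6) | (byte & 0x3F)
    if cp < minimum:
        raise Utf8ParseError("OverlongEncoding")
    if 0xD800 <= cp <= 0xDFFF:
        raise Utf8ParseError("EncodesSurrogateHalf")
    if cp > 0x10FFFF:
        raise Utf8ParseError("CodepointTooLarge")
    return cp, length
```

Because the result reports how many bytes were consumed, a caller can slice the input at bytes_parsed and call the function again to walk through a buffer one code point at a time.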
to_str : List CodePoint -> Result Str [BadUtf8]
east_asian_width_property : CodePoint -> EastAsianProperty
Computes the "East Asian Width" property for a given code point. See https://www.unicode.org/Public/15.1.0/ucd/EastAsianWidth.txt
visual_width : CodePoint -> U32
Computes the visual width of a code point as assigned by the Unicode Character Database
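As a rough illustration of how East Asian Width relates to display width, the Python sketch below uses the standard library's unicodedata module. The mapping it applies (Wide and Fullwidth count as two columns, everything else as one) is a common convention and an assumption here; it is not confirmed to match what visual_width returns, and the function name visual_width_sketch is hypothetical. Python's bundled Unicode data may also differ in version from the 15.1.0 file linked above.

```python
import unicodedata


def visual_width_sketch(cp: int) -> int:
    """Approximate column width of a code point from its East Asian Width.

    Assumption: 'W' (Wide) and 'F' (Fullwidth) occupy two columns,
    all other classes occupy one.
    """
    east_asian_class = unicodedata.east_asian_width(chr(cp))
    return 2 if east_asian_class in ("W", "F") else 1
```

With this convention, the Latin letter A (U+0041) has width 1 and the CJK ideograph U+4E2D has width 2.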