# LLM Prompt for Documentation

## Documentation

### CodePoint

#### CodePoint

**Type Annotation**

**Description**

A [Unicode code point](http://www.unicode.org/glossary/#code_point).

#### to_u32

**Type Annotation**

```roc
CodePoint -> U32
```

**Description**

Converts a [CodePoint] to its underlying [Unicode code point](http://www.unicode.org/glossary/#code_point) integer representation.

#### from_u32

**Type Annotation**

```roc
U32 -> Result CodePoint [InvalidCodePoint]
```

**Description**

Converts a [U32] to a [CodePoint] by verifying that it is a valid [Unicode code point](http://www.unicode.org/glossary/#code_point) (that is, it's between `0` and `0x10FFFF`).

#### is_valid_scalar

**Type Annotation**

```roc
CodePoint -> Bool
```

**Description**

Returns false if this code point is a high or low surrogate (see [is_high_surrogate] and [is_low_surrogate]).

#### is_high_surrogate

**Type Annotation**

```roc
CodePoint -> Bool
```

**Description**

Returns true if this is a [high-surrogate code point](http://www.unicode.org/glossary/#high_surrogate_code_point), from U+D800 to U+DBFF.

#### is_low_surrogate

**Type Annotation**

```roc
CodePoint -> Bool
```

**Description**

Returns true if this is a [low-surrogate code point](https://www.unicode.org/glossary/#low_surrogate_code_point), from U+DC00 to U+DFFF.

#### utf8_len

**Type Annotation**

```roc
CodePoint -> Result U8 [InvalidCodePoint]
```

**Description**

Per the Zig docs: the number of bytes the UTF-8 representation would require for the given code point.

#### append_utf8

**Type Annotation**

```roc
List U8, CodePoint -> List U8
```

**Description**

Encode a Scalar as UTF-8 bytes and append those bytes to an existing list of UTF-8 bytes.

#### count_utf8_bytes

**Type Annotation**

```roc
CodePoint -> U8
```

**Description**

The number of UTF-8 bytes it takes to represent this Scalar.
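
The conversion and encoding functions above can be combined as in the following minimal sketch. It is not taken from the package: the helper names `encode_u32` and `utf8_width` are hypothetical, the `unicode` package shorthand is assumed to be declared in the enclosing app's header, and the syntax assumes a recent Roc compiler that accepts `|arg|` lambdas and parenthesized calls.

```roc
module [encode_u32, utf8_width]

# Assumption: the app header maps the shorthand `unicode` to this package.
import unicode.CodePoint

## Hypothetical helper: encode one integer as UTF-8 bytes,
## or report that it is not a valid code point.
encode_u32 : U32 -> Result (List U8) [InvalidCodePoint]
encode_u32 = |n|
    when CodePoint.from_u32(n) is
        # Start from an empty byte list and append the encoded bytes.
        Ok(cp) -> Ok(CodePoint.append_utf8([], cp))
        Err(InvalidCodePoint) -> Err(InvalidCodePoint)

## Hypothetical helper: number of UTF-8 bytes needed for one integer,
## if it is a valid code point.
utf8_width : U32 -> Result U8 [InvalidCodePoint]
utf8_width = |n|
    when CodePoint.from_u32(n) is
        Ok(cp) -> Ok(CodePoint.count_utf8_bytes(cp))
        Err(InvalidCodePoint) -> Err(InvalidCodePoint)

# U+0041 (A) is a single byte in UTF-8.
expect encode_u32(0x41) == Ok([0x41])

# U+00E9 (é) takes two bytes.
expect encode_u32(0xE9) == Ok([0xC3, 0xA9])

# U+1F600 lies outside the Basic Multilingual Plane and takes four bytes.
expect utf8_width(0x1F600) == Ok(4)

# Values above 0x10FFFF are rejected by from_u32.
expect encode_u32(0x110000) == Err(InvalidCodePoint)
```

Validating through `from_u32` first keeps out-of-range integers from ever reaching the encoder, so the only error a caller of these helpers has to handle is `InvalidCodePoint`.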
#### Utf8ParseErr

**Type Annotation**

```roc
[
    OverlongEncoding,
    ExpectedContinuation,
    EncodesSurrogateHalf,
    InvalidUtf8,
    ListWasEmpty,
    CodepointTooLarge
]
```

#### parse_utf8

**Type Annotation**

```roc
List U8 -> Result (List CodePoint) Utf8ParseErr
```

**Description**

Parses a list of UTF-8 encoded bytes into a list of code points.

#### parse_partial_utf8

**Type Annotation**

```roc
List U8 -> Result { code_point : CodePoint, bytes_parsed : U64 } Utf8ParseErr
```

**Description**

Parses the first code point found in a list of bytes encoded as UTF-8. Returns `ListWasEmpty` if the list was empty, or `InvalidUtf8` if the bytes were not valid UTF-8.

#### to_str

**Type Annotation**

```roc
List CodePoint -> Result Str [BadUtf8]
```

#### east_asian_width_property

**Type Annotation**

```roc
CodePoint -> EastAsianProperty
```

**Description**

Computes the East Asian Width property for a given code point. See https://www.unicode.org/Public/15.1.0/ucd/EastAsianWidth.txt

#### visual_width

**Type Annotation**

```roc
CodePoint -> U32
```

**Description**

Computes the visual width of a code point as assigned by the Unicode Character Database.

### Grapheme

#### Grapheme

**Type Annotation**

**Description**

An extended grapheme cluster.

#### split

**Type Annotation**

```roc
Str -> Result (List Str) Utf8ParseErr
```

**Description**

Splits a string into extended grapheme clusters.
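
To illustrate how code points differ from extended grapheme clusters, here is a minimal sketch that counts both for the same string. The helper names `code_point_count` and `grapheme_count` are hypothetical, the `unicode` shorthand is assumed to be declared in the enclosing app's header, `Utf8ParseErr` is assumed to be exposed by the `CodePoint` module as documented above, and the syntax assumes a recent Roc compiler.

```roc
module [code_point_count, grapheme_count]

# Assumption: the app header maps the shorthand `unicode` to this package.
import unicode.CodePoint
import unicode.Grapheme

## Hypothetical helper: number of code points in a string.
code_point_count : Str -> Result U64 CodePoint.Utf8ParseErr
code_point_count = |str|
    when CodePoint.parse_utf8(Str.to_utf8(str)) is
        Ok(code_points) -> Ok(List.len(code_points))
        Err(err) -> Err(err)

## Hypothetical helper: number of extended grapheme clusters in a string.
grapheme_count : Str -> Result U64 CodePoint.Utf8ParseErr
grapheme_count = |str|
    when Grapheme.split(str) is
        Ok(clusters) -> Ok(List.len(clusters))
        Err(err) -> Err(err)

# "e" followed by U+0301 (combining acute accent) is two code points...
expect code_point_count("e\u(301)") == Ok(2)

# ...but a single extended grapheme cluster, because the combining mark
# attaches to the preceding base character.
expect grapheme_count("e\u(301)") == Ok(1)
```

Counting clusters rather than code points is usually what matters when the result has to match what a reader perceives as one character, for example when truncating text for display.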