# LLM Prompt for Documentation

## Documentation

### CodePoint

#### CodePoint

**Type Annotation**

**Description**

A [Unicode code point](http://www.unicode.org/glossary/#code_point).

#### to_u32

**Type Annotation**

```roc
CodePoint -> U32
```

**Description**

Converts a [CodePoint] to its underlying [Unicode code point](http://www.unicode.org/glossary/#code_point)
integer representation.

#### from_u32

**Type Annotation**

```roc
U32 -> Result CodePoint [InvalidCodePoint]
```

**Description**

Converts a [U32] to a [CodePoint] by verifying that it is a valid [Unicode code point](http://www.unicode.org/glossary/#code_point)
(that is, it's between `0` and `0x10FFFF`).
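The range check described above can be sketched in a few lines. This is an illustration in Python, not the library's Roc implementation: the function name mirrors `from_u32`, and raising `ValueError` stands in for returning `Err InvalidCodePoint`. Going by the description alone, surrogate values such as `0xD800` pass this check, since they fall inside the range.

```python
MAX_CODE_POINT = 0x10FFFF  # upper bound stated in the description above


def from_u32(value):
    """Return `value` unchanged if it lies in the code point range.

    Raises ValueError where the Roc function would return
    `Err InvalidCodePoint` (an assumption made for this sketch).
    """
    if 0 <= value <= MAX_CODE_POINT:
        return value
    raise ValueError("InvalidCodePoint")


print(hex(from_u32(0x1F600)))  # a value inside the range
try:
    from_u32(0x110000)  # one past the maximum
except ValueError as err:
    print(err)
```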

#### is_valid_scalar

**Type Annotation**

```roc
CodePoint -> Bool
```

**Description**

Returns false if this code point is a surrogate, that is, if [is_high_surrogate] or [is_low_surrogate] returns true for it.

#### is_high_surrogate

**Type Annotation**

```roc
CodePoint -> Bool
```

**Description**

Returns true if this is a [high-surrogate code point](http://www.unicode.org/glossary/#high_surrogate_code_point),
that is, one in the range U+D800 to U+DBFF.

#### is_low_surrogate

**Type Annotation**

```roc
CodePoint -> Bool
```

**Description**

Returns true if this is a [low-surrogate code point](https://www.unicode.org/glossary/#low_surrogate_code_point),
that is, one in the range U+DC00 to U+DFFF.
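The surrogate predicates and `is_valid_scalar` come down to simple range checks. The sketch below restates them in Python for illustration only. It operates on plain integers rather than the `CodePoint` type, and it assumes that `is_valid_scalar` is exactly the negation of the two surrogate checks, as its description suggests.

```python
def is_high_surrogate(cp):
    # High surrogates occupy U+D800 through U+DBFF.
    return 0xD800 <= cp <= 0xDBFF


def is_low_surrogate(cp):
    # Low surrogates occupy U+DC00 through U+DFFF.
    return 0xDC00 <= cp <= 0xDFFF


def is_valid_scalar(cp):
    # A scalar value is any code point that is not a surrogate.
    return not (is_high_surrogate(cp) or is_low_surrogate(cp))


for cp in (0x41, 0xD800, 0xDFFF, 0xE000):
    print(f"U+{cp:04X}", is_high_surrogate(cp), is_low_surrogate(cp), is_valid_scalar(cp))
```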

#### utf8_len

**Type Annotation**

```roc
CodePoint -> Result U8 [InvalidCodePoint]
```

**Description**

Returns the number of bytes that the UTF-8 representation of the given code point would require.
(Description adapted from the Zig docs.)
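The byte count follows directly from the UTF-8 encoding ranges. The Python sketch below shows the usual thresholds. Treating values above `0x10FFFF` as the `InvalidCodePoint` case is an assumption based on the error tag in the type annotation, and `ValueError` stands in for the `Err` result.

```python
def utf8_len(cp):
    """Number of bytes UTF-8 needs for the code point `cp`."""
    if cp < 0:
        raise ValueError("InvalidCodePoint")
    if cp < 0x80:
        return 1  # ASCII range
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    if cp < 0x110000:
        return 4
    # Assumption: out-of-range values map to the InvalidCodePoint error.
    raise ValueError("InvalidCodePoint")


for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    print(f"U+{cp:04X} needs {utf8_len(cp)} byte(s)")
```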

#### append_utf8

**Type Annotation**

```roc
List U8, CodePoint -> List U8
```

**Description**

Encodes a [CodePoint] as UTF-8 bytes and appends those bytes to an existing list of UTF-8 bytes.
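To make the encoding step concrete, here is a minimal Python sketch of appending a UTF-8 encoded code point to a byte list. It is an illustration of standard UTF-8 bit packing, not the library's code. It assumes the input is already a valid code point and returns a new list instead of mutating its argument.

```python
def append_utf8(buf, cp):
    """Return `buf` plus the UTF-8 bytes of code point `cp`."""
    if cp < 0x80:
        encoded = [cp]
    elif cp < 0x800:
        encoded = [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)]
    elif cp < 0x10000:
        encoded = [
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ]
    else:
        encoded = [
            0xF0 | (cp >> 18),
            0x80 | ((cp >> 12) & 0x3F),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ]
    return buf + encoded


# Append the euro sign (U+20AC) to the bytes of "Hi".
print([hex(b) for b in append_utf8([0x48, 0x69], 0x20AC)])
```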

#### count_utf8_bytes

**Type Annotation**

```roc
CodePoint -> U8
```

**Description**

Returns the number of UTF-8 bytes needed to represent this [CodePoint].

#### Utf8ParseErr

**Type Annotation**

```roc
[
    OverlongEncoding,
    ExpectedContinuation,
    EncodesSurrogateHalf,
    InvalidUtf8,
    ListWasEmpty,
    CodepointTooLarge
]
```

#### parse_utf8

**Type Annotation**

```roc
List U8 -> Result (List CodePoint) Utf8ParseErr
```

**Description**

Parses a list of UTF-8 bytes into a list of code points.
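As a rough analogue, the Python sketch below decodes a whole byte sequence into integer code points and turns them back into a string, similar in spirit to `parse_utf8` followed by `to_str`. It leans on Python's built-in decoder, so invalid input raises `UnicodeDecodeError` rather than one of the `Utf8ParseErr` tags. The function names only mirror the Roc ones for readability.

```python
def parse_utf8(data):
    """Decode UTF-8 bytes into a list of integer code points."""
    return [ord(ch) for ch in data.decode("utf-8")]


def to_str(code_points):
    """Build a string from a list of integer code points."""
    return "".join(chr(cp) for cp in code_points)


code_points = parse_utf8(b"Hi \xe2\x82\xac")  # "Hi " followed by U+20AC
print(code_points)
print(to_str(code_points))
```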

#### parse_partial_utf8

**Type Annotation**

```roc
List U8 -> Result { code_point : CodePoint, bytes_parsed : U64 } Utf8ParseErr
```

**Description**

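Parses the first code point found in a list of bytes encoded as UTF-8. On success, the returned record holds
the decoded code point and the number of bytes consumed (`bytes_parsed`). Returns `ListWasEmpty`
if the list was empty, or `InvalidUtf8` if the bytes were not valid UTF-8.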
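The following Python sketch shows one way to decode just the first code point and report how many bytes it used. It is not the library's implementation. In particular, the mapping of failure cases to the `Utf8ParseErr` tag names is a guess made for illustration, and `ValueError` stands in for the `Err` result.

```python
def parse_partial_utf8(data):
    """Decode the first code point; return (code_point, bytes_parsed)."""
    if not data:
        raise ValueError("ListWasEmpty")
    first = data[0]
    if first < 0x80:
        return first, 1  # single-byte (ASCII) case
    # Pick sequence length, payload bits of the lead byte, and the
    # smallest value that legitimately needs this many bytes.
    if 0xC0 <= first < 0xE0:
        length, cp, minimum = 2, first & 0x1F, 0x80
    elif 0xE0 <= first < 0xF0:
        length, cp, minimum = 3, first & 0x0F, 0x800
    elif 0xF0 <= first < 0xF8:
        length, cp, minimum = 4, first & 0x07, 0x10000
    else:
        raise ValueError("InvalidUtf8")  # stray continuation or bad lead byte
    if len(data) < length:
        raise ValueError("ExpectedContinuation")
    for byte in data[1:length]:
        if byte & 0xC0 != 0x80:
            raise ValueError("ExpectedContinuation")
        cp = (cp << 6) | (byte & 0x3F)
    if cp < minimum:
        raise ValueError("OverlongEncoding")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("EncodesSurrogateHalf")
    if cp > 0x10FFFF:
        raise ValueError("CodepointTooLarge")
    return cp, length


code_point, bytes_parsed = parse_partial_utf8(b"\xe2\x82\xacuro")
print(hex(code_point), bytes_parsed)
```

Calling such a function in a loop, advancing by `bytes_parsed` each time, is a natural way to walk a byte buffer one code point at a time.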

#### to_str

**Type Annotation**

```roc
List CodePoint -> Result Str [BadUtf8]
```

#### east_asian_width_property

**Type Annotation**

```roc
CodePoint -> EastAsianProperty
```

**Description**

Computes the East Asian Width property of a given code point.
See https://www.unicode.org/Public/15.1.0/ucd/EastAsianWidth.txt
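For a feel of what this property looks like, Python's standard `unicodedata` module exposes the same Unicode data. The sketch below prints the short category codes used in `EastAsianWidth.txt` (`Na` narrow, `W` wide, `F` fullwidth, `H` halfwidth, `A` ambiguous, `N` neutral). The variants of `EastAsianProperty` are not listed in this document, so how they correspond to these codes is left open here.

```python
import unicodedata

samples = [
    "A",       # U+0041 LATIN CAPITAL LETTER A
    "\u4e2d",  # U+4E2D, a CJK ideograph
    "\uff21",  # U+FF21 FULLWIDTH LATIN CAPITAL LETTER A
    "\uff71",  # U+FF71 HALFWIDTH KATAKANA LETTER A
]

for ch in samples:
    # east_asian_width returns the short code from the Unicode database.
    print(f"U+{ord(ch):04X} -> {unicodedata.east_asian_width(ch)}")
```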

#### visual_width

**Type Annotation**

```roc
CodePoint -> U32
```

**Description**

Computes the visual width of a code point as assigned by the Unicode Character Database.

### Grapheme

#### Grapheme

**Type Annotation**

**Description**

An extended grapheme cluster: a user-perceived character, which may consist of several code points.

#### split

**Type Annotation**

```roc
Str -> Result (List Str) Utf8ParseErr
```

**Description**

Splits a string into extended grapheme clusters.
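To illustrate why splitting by grapheme cluster differs from splitting by code point, the Python sketch below groups combining marks with the character before them. This is a deliberately simplified stand-in: real extended grapheme cluster segmentation (Unicode Standard Annex #29) also covers Hangul syllables, emoji sequences, regional indicators, and more. The function name `split_simplified` is hypothetical and not part of the library.

```python
import unicodedata


def split_simplified(text):
    """Group each combining mark with the preceding character.

    Only a rough approximation of grapheme clusters, for illustration.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # attach the mark to the previous cluster
        else:
            clusters.append(ch)
    return clusters


text = "e\u0301a"  # "e" + COMBINING ACUTE ACCENT, then "a"
print(len(text))  # number of code points
print(len(split_simplified(text)))  # number of simplified clusters
```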