# `Unicode.String.Break.Grapheme`
[🔗](https://github.com/elixir-unicode/unicode_string/blob/v2.1.0/lib/unicode/string/break/grapheme.ex#L1)

Single-pass DFA-style implementation of UAX #29 grapheme cluster
segmentation.

The state carried between characters is intentionally small:

* `prev` — the Grapheme_Cluster_Break property of the previous codepoint
* `ri_parity` — `:even` or `:odd`, tracking the parity of the run of
  Regional_Indicators ending at `prev` (used by GB12/GB13)
* `ext_pict_zwj` — `true` when the prefix ends with
  `\p{Extended_Pictographic} \p{Extend}* \p{ZWJ}` (used by GB11)
* `incb` — `:none | :consonant | :linker`, tracking progress through
  the GB9c sequence
  `\p{InCB=Consonant} [\p{InCB=Extend}\p{InCB=Linker}]* \p{InCB=Linker}
   [\p{InCB=Extend}\p{InCB=Linker}]* × \p{InCB=Consonant}`

Each character is classified once via `Unicode.GraphemeClusterBreak`,
`Unicode.IndicConjunctBreak` and a compile-time set of
Extended_Pictographic ranges, then a constant-time decision determines
whether to emit a break or continue the cluster.
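The way such a state could be threaded through a reducer can be sketched as follows. This is a minimal illustration covering only GB9, GB11 and GB12/GB13, not the module's actual code. The module name `GraphemeStateSketch`, the functions `breaks/1` and `advance/2`, the property atoms, and the extra `pict_run` flag (which the state list above does not include) are all assumptions made for the example. Classification is assumed to have already happened, so the input is a list of property atoms.

```elixir
defmodule GraphemeStateSketch do
  # Hypothetical illustration only; not the library's implementation.
  # Input properties: :ri, :ext_pict, :extend, :zwj, :other.

  # `pict_run` is an assumed helper flag: true while the prefix ends with
  # Extended_Pictographic Extend*.
  @initial %{prev: nil, ri_parity: :even, pict_run: false, ext_pict_zwj: false}

  # Returns one boolean per property after the first:
  # true means "break before this codepoint".
  def breaks([]), do: []

  def breaks([first | rest]) do
    state = advance(@initial, first)

    {decisions, _final_state} =
      Enum.map_reduce(rest, state, fn prop, st ->
        {break?(st, prop), advance(st, prop)}
      end)

    decisions
  end

  # GB12/GB13: no break between two RIs when an odd number of RIs precede.
  defp break?(%{prev: :ri, ri_parity: :odd}, :ri), do: false
  # GB11: ExtPict Extend* ZWJ x ExtPict.
  defp break?(%{ext_pict_zwj: true}, :ext_pict), do: false
  # GB9: never break before Extend or ZWJ.
  defp break?(_st, prop) when prop in [:extend, :zwj], do: false
  # GB999: otherwise break everywhere.
  defp break?(_st, _prop), do: true

  # Constant-time state update after consuming one classified codepoint.
  defp advance(st, prop) do
    %{
      prev: prop,
      ri_parity: if(prop == :ri and st.ri_parity == :even, do: :odd, else: :even),
      pict_run: prop == :ext_pict or (st.pict_run and prop == :extend),
      ext_pict_zwj: st.pict_run and prop == :zwj
    }
  end
end
```

With this sketch, `GraphemeStateSketch.breaks([:ri, :ri, :ri, :ri])` returns `[false, true, false]`: the four Regional_Indicators pair up into two clusters, which is the behaviour the parity flag exists to produce.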

# `break?`

```elixir
@spec break?(String.t(), String.t()) :: boolean()
```

Returns `true` if there is a grapheme cluster boundary between
`string_before` (the first argument) and `string_after` (the second
argument).

When `string_before` is empty there is always a boundary (GB1).
When `string_after` is empty there is always a boundary (GB2).
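The contract can be illustrated without the library by using Elixir's standard `String.graphemes/1`, which also segments text into extended grapheme clusters. The helper `BoundarySketch.boundary?/2` below is hypothetical and only meant as a cross-check. It is not how `break?/2` is implemented, and its answers for newer rules such as GB9c depend on the Unicode version shipped with your Elixir release.

```elixir
defmodule BoundarySketch do
  # Hypothetical reference check built on the standard library.
  def boundary?("", _following), do: true   # GB1: start of text
  def boundary?(_preceding, ""), do: true   # GB2: end of text

  def boundary?(before, rest) when is_binary(before) and is_binary(rest) do
    target = byte_size(before)

    # Segment the joined text, then collect the byte offset at which each
    # cluster ends. A boundary exists iff one cluster ends exactly where
    # `before` ends.
    (before <> rest)
    |> String.graphemes()
    |> Enum.scan(0, fn grapheme, offset -> offset + byte_size(grapheme) end)
    |> Enum.member?(target)
  end
end
```

Under these assumptions, `BoundarySketch.boundary?("a", "b")` is `true`, while `BoundarySketch.boundary?("\r", "\n")` and `BoundarySketch.boundary?("e", "\u0301")` are both `false`, because CR LF and a base letter followed by a combining mark each form a single cluster.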

# `next`

```elixir
@spec next(String.t()) :: {String.t(), String.t()} | nil
```

Splits `string` at the first grapheme cluster boundary after position 0
and returns the result as a `{first_grapheme, rest}` tuple.

Returns `nil` for the empty string.
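The return shape matches that of the standard library's `String.next_grapheme/1`, so code written against one can be pointed at the other. A minimal sketch of consuming a string with any `next`-shaped function follows. `NextSketch.count/2` is a hypothetical helper, and it defaults to the standard-library function so the example runs without extra dependencies.

```elixir
defmodule NextSketch do
  # Hypothetical helper: counts clusters by repeatedly applying a function
  # of shape (String.t() -> {first, rest} | nil).
  def count(string, next_fun \\ &String.next_grapheme/1) do
    do_count(next_fun.(string), next_fun, 0)
  end

  # `nil` signals that the input is exhausted.
  defp do_count(nil, _next_fun, acc), do: acc

  defp do_count({_first, rest}, next_fun, acc) do
    do_count(next_fun.(rest), next_fun, acc + 1)
  end
end
```

Assuming the package is available as a dependency, `NextSketch.count(string, &Unicode.String.Break.Grapheme.next/1)` would walk the same string with this module's `next/1` instead.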

# `split`

```elixir
@spec split(String.t()) :: [String.t()]
```

Splits `string` into a list of grapheme clusters according to UAX #29.
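Because a `{first_grapheme, rest} | nil` function has exactly the shape `Stream.unfold/2` expects, a split can be derived from any such function. The sketch below shows that relationship. `SplitSketch.split/2` is a hypothetical name, the default is the standard library's `String.next_grapheme/1`, and whether the library builds `split/1` this way is not stated in this document.

```elixir
defmodule SplitSketch do
  # Hypothetical: derives a list of clusters from a `next`-shaped function.
  def split(string, next_fun \\ &String.next_grapheme/1) do
    string
    |> Stream.unfold(next_fun)   # stops when next_fun returns nil
    |> Enum.to_list()
  end
end
```

A useful invariant for any grapheme splitter is that joining the pieces reproduces the input: `Enum.join(SplitSketch.split(s)) == s` should hold for every valid UTF-8 string `s`.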
