       Update context for libunistring on the libgrapheme page - sites - public wiki contents of suckless.org
       git clone git://git.suckless.org/sites
       commit bfea05be7798690d75fa3b547e9908d77aa8796d
       parent ab029cafc41c976c061eed2e49367e0400fd8fd2
       Author: Laslo Hunhold <laslo@hunhold.de>
       Date:   Sat,  3 Jan 2026 11:40:55 +0100
       
       Update context for libunistring on the libgrapheme page
       
       Some of the points raised in this old rant are not true (anymore) or
       were imprecise/wrong regarding libunistring. Thank you Bruno Haible for
       reaching out about this!
       
       Signed-off-by: Laslo Hunhold <dev@frign.de>
       
       Diffstat:
         M libs.suckless.org/libgrapheme/inde… |      28 +++++++++++++++-------------
       
       1 file changed, 15 insertions(+), 13 deletions(-)
       ---
       diff --git a/libs.suckless.org/libgrapheme/index.md b/libs.suckless.org/libgrapheme/index.md
       @@ -152,19 +152,21 @@ embedded applications.
        The problem can be easily seen when looking at the sizes of the respective
        libraries: The ICU library (libicudata.a, libicui18n.a, libicuio.a,
        libicutest.a, libicutu.a, libicuuc.a) is around 38MB and libunistring
       -(libunistring.a) is around 2MB, which is unacceptable for static
       -linking. Both take many minutes to compile even on a good computer and
       -require a lot of dependencies, including Python for ICU. On
       -the other hand libgrapheme (libgrapheme.a) only weighs in at around 300K
       -and is compiled (including Unicode data parsing and compression) in
       -under a second, requiring nothing but a C99 compiler and POSIX make(1).
       -
       -Some libraries, like libutf8proc and libunistring, are incorrect by
       -basing their API on assumptions that haven't been true for years
       -(e.g. offering stateless grapheme cluster segmentation even though the
       -underlying algorithm is not stateless). As an additional factor,
       -libutf8proc's UTF-8-decoder is unsafe, as it allows overlong encodings
       -that can be easily used for exploits.
       +(libunistring.a) is around 2MB. Both take many minutes to compile even on
       +a good computer, and ICU depends on Python, among others. On the other hand,
       +libgrapheme (libgrapheme.a) only weighs in at around 400K and is compiled
       +(including Unicode data parsing and compression) in under a second,
       +requiring nothing but a C99 compiler and POSIX make(1).
       +
       +Some libraries, like libutf8proc, are incorrect by basing their API on
       +assumptions that haven't been true for years (e.g. offering stateless
       +grapheme cluster segmentation even though the underlying algorithm is
       +not stateless). As an additional factor, libutf8proc's UTF-8-decoder
       +is unsafe, as it allows overlong encodings that can be easily used for
       +exploits. While libunistring has expanded their API offering e.g.
       +u8_grapheme_next() and u8_grapheme_prev() that are standard conformant,
       +its API still contains not-explicitly deprecated functions assuming
       +an older data model, for instance uc_is_grapheme_break().
        
        While ICU and libunistring offer a lot of functions and the weight mostly
        comes from locale-data provided by the Unicode standard, which is applied