gh-119396: Optimize PyUnicode_FromFormat() UTF-8 decoder (#119398)

Add unicode_decode_utf8_writer() to write characters directly into a
_PyUnicodeWriter writer, avoiding the creation of a temporary string.
Optimize PyUnicode_FromFormat() by using the new
unicode_decode_utf8_writer().
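
A minimal standalone sketch of the idea, not the CPython code: decode
UTF-8 bytes straight into a growable output buffer instead of building
an intermediate string and copying it. The names cp_writer,
cp_writer_reserve() and cp_writer_write_utf8() are hypothetical
stand-ins for _PyUnicodeWriter and the new helper, and the decoder is
simplified (it does not reject overlong forms or surrogates).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for _PyUnicodeWriter: a growable array of
   code points. Initialize with {0}; release with free(w.buf). */
typedef struct {
    uint32_t *buf;
    size_t len;
    size_t cap;
} cp_writer;

/* Make room for `extra` more code points. Returns 0, or -1 on OOM. */
static int
cp_writer_reserve(cp_writer *w, size_t extra)
{
    if (w->len + extra <= w->cap) {
        return 0;
    }
    size_t cap = w->cap ? w->cap : 16;
    while (cap < w->len + extra) {
        cap *= 2;
    }
    uint32_t *p = realloc(w->buf, cap * sizeof(*p));
    if (p == NULL) {
        return -1;
    }
    w->buf = p;
    w->cap = cap;
    return 0;
}

/* Decode the NUL-terminated UTF-8 string `s` and append each code
   point directly to the writer: no temporary string is created.
   Returns 0 on success, -1 on malformed input or OOM.
   Simplification: overlong encodings and surrogates are not rejected. */
static int
cp_writer_write_utf8(cp_writer *w, const char *s)
{
    assert(w != NULL && s != NULL);
    const unsigned char *p = (const unsigned char *)s;
    while (*p) {
        uint32_t cp;
        int ncont;  /* number of continuation bytes expected */
        if (*p < 0x80)                { cp = *p;        ncont = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; ncont = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; ncont = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; ncont = 3; }
        else {
            return -1;  /* invalid lead byte */
        }
        p++;
        for (int i = 0; i < ncont; i++, p++) {
            if ((*p & 0xC0) != 0x80) {
                return -1;  /* truncated or invalid sequence */
            }
            cp = (cp << 6) | (uint32_t)(*p & 0x3F);
        }
        if (cp_writer_reserve(w, 1) < 0) {
            return -1;
        }
        w->buf[w->len++] = cp;
    }
    return 0;
}
```

Several calls on the same writer append to the same buffer, which is
roughly how consecutive %s arguments can share one output without
per-argument temporaries.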

Rename unicode_fromformat_write_cstr() to
unicode_fromformat_write_utf8().

Microbenchmark on the code:

    return PyUnicode_FromFormat(
        "%s %s %s %s %s.",
        "format", "multiple", "utf8", "short", "strings");

Result: 620 ns +- 8 ns -> 382 ns +- 2 ns: 1.62x faster.
Victor Stinner committed
9b422fc6af87b81812aaf3010c004eb27c4dc280
Parent: 14b063c
Committed by GitHub <noreply@github.com> on 5/22/2024, 9:05:26 PM