[3.13] Correctly fold unknown-8bit originating from encoded words. (GH-142517) (#143147)
The unknown-8bit trick was designed to deal with unknown bytes in an ASCII message, and it works fine for that. However, I also tried to extend it to handle bytes that can't be decoded using the charset specified in an encoded word, and there it fails because there can be other non-ASCII characters that were *successfully* decoded. The fix is simple: do the unknown-8bit encoding using the utf-8 codec. This is especially appropriate since anyone trying to do recovery on an unknown byte string will probably attempt utf-8 first.

(cherry picked from commit 1e17ccd030a2285ad53db5952360fffa33a8a877)

Co-authored-by: R. David Murray <rdmurray@bitdance.com>
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
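The failure mode described in the commit message can be reproduced with plain codecs. The following is a minimal sketch, not the email package's actual code: `encode_unknown_8bit` is a hypothetical helper, and the sample string is an assumption chosen to mix a successfully decoded non-ASCII character with one undecodable byte carried as a surrogate escape.

```python
import base64


def encode_unknown_8bit(text: str) -> str:
    """Hypothetical helper (not the email package's API).

    Builds a base64 encoded word labelled unknown-8bit, using the
    utf-8 codec so that both real non-ASCII characters and
    surrogate-escaped bytes can be encoded.
    """
    payload = text.encode("utf-8", "surrogateescape")
    return "=?unknown-8bit?b?" + base64.b64encode(payload).decode("ascii") + "?="


# "é" was decoded successfully; the byte 0xFF was not, so it is carried
# through as the lone surrogate U+DCFF (assumed sample data).
text = "caf\u00e9 " + b"\xff".decode("ascii", "surrogateescape")

# The ascii codec can restore the escaped byte but cannot encode "é".
try:
    text.encode("ascii", "surrogateescape")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The utf-8 codec handles both kinds of character.
payload = text.encode("utf-8", "surrogateescape")
encoded_word = encode_unknown_8bit(text)
```

Decoding `payload` with `"utf-8"` and `"surrogateescape"` gives back the original string, so the unknown byte is preserved for anyone attempting recovery later.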
Miss Islington (bot) committed
88025560aa2c275b811907ef21e9cb7e09cdcdca
Parent: 86504f2
Committed by GitHub <noreply@github.com>
on 12/24/2025, 6:19:28 PM