upb/json: fix sign-extended index in jsondec_base64_tablelookup (#27215)
## Problem
`jsondec_base64_tablelookup()` in `upb/json/decode.c` indexes a 256-byte `signed char` table with `table[(unsigned)ch]`. When `ch` is a signed character type, the cast converts its *value*, not its bit pattern: any input byte with the high bit set (`0x80..0xFF`) is a negative value that sign-extends and wraps to a huge unsigned index (`0xFFFFFF80..0xFFFFFFFF`). The table is then read approximately 4 GiB past its base, which is an out-of-bounds read.
The same pattern exists in the amalgamated copies that the Ruby and PHP extensions ship:
- `ruby/ext/google/protobuf_c/ruby-upb.c`
- `php/ext/google/protobuf/php-upb.c`
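
The wrap-around described above can be reproduced in isolation. The following is a minimal sketch, not code from upb: the helper names (`index_before_fix`, `index_after_fix`, `show_index`, `check_index_casts`) are hypothetical, and `signed char` is used explicitly because the signedness of plain `char` is implementation-defined.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Index as computed before the fix: the signed byte is converted
 * straight to unsigned int, so negative values wrap modulo UINT_MAX + 1. */
static unsigned index_before_fix(signed char ch) {
  return (unsigned)ch;
}

/* Index as computed after the fix: narrowing to unsigned char first
 * keeps the result in [0, 255]. */
static unsigned index_after_fix(signed char ch) {
  return (unsigned char)ch;
}

/* Prints both indices for one byte value (illustration only). */
static void show_index(signed char ch) {
  printf("ch=%4d  before=0x%08X  after=0x%02X\n", ch,
         index_before_fix(ch), index_after_fix(ch));
}

/* Self-check: byte 0x80 stored in a signed char has the value -128. */
static void check_index_casts(void) {
  signed char high = -128;
  assert(index_before_fix(high) == UINT_MAX - 127u); /* far outside the table */
  assert(index_after_fix(high) == 0x80u);            /* inside the table */
  assert(index_before_fix('A') == index_after_fix('A'));
}
```

Calling `show_index(-128)` from a small `main` prints the two indices side by side; with a 32-bit `unsigned`, the "before" value is `0xFFFFFF80`.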
## Fix
Cast through `unsigned char` so that a high-bit byte zero-extends to an index in `[0x80, 0xFF]`, which lies inside the table, before it is used. The same one-token change is applied in three files.
```diff
- return table[(unsigned)ch];
+ return table[(unsigned char)ch];
```
## Compatibility
- `ch` in `[0x00, 0x7F]`: `(unsigned)ch` and `(unsigned char)ch` produce identical values — no behavior change.
- `ch` in `[0x80, 0xFF]`: previously an out-of-bounds read. With the fix, the lookup returns the table's `-1` sentinel, which `jsondec_base64()` already handles as "invalid base64 char" via `if (val < 0)`.
No public API changes, no new allocations, no new branches.
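
Both compatibility claims can be checked exhaustively over all 256 byte values. The sketch below assumes a stand-in table (`demo_table`) that holds only the standard base64 alphabet and `-1` everywhere else; it is not the table from `decode.c`, and `demo_table_init`, `demo_lookup`, and `demo_check_all_bytes` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the decoder's table: -1 marks an invalid byte. */
static signed char demo_table[256];

static void demo_table_init(void) {
  static const char alphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  memset(demo_table, -1, sizeof demo_table); /* every byte becomes 0xFF, i.e. -1 */
  for (int i = 0; i < 64; i++) {
    demo_table[(unsigned char)alphabet[i]] = (signed char)i;
  }
}

/* Lookup using the fixed cast: the index is always in [0, 255]. */
static int demo_lookup(char ch) {
  return demo_table[(unsigned char)ch];
}

/* Walks every byte value and checks the two compatibility cases. */
static void demo_check_all_bytes(void) {
  demo_table_init();
  for (int b = 0; b < 256; b++) {
    char ch = (char)b;
    unsigned idx = (unsigned char)ch;
    assert(idx == (unsigned)b); /* fixed index stays inside the table */
    if (b < 0x80) {
      assert((unsigned)ch == idx); /* old and new casts agree */
    } else {
      assert(demo_lookup(ch) == -1); /* high-bit bytes hit the sentinel */
    }
  }
}
```

The low half of the byte range exercises the "no behavior change" case, and the high half exercises the sentinel path that a caller can reject with a `val < 0` check.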
## Test plan
- Adds `optional bytes data = 11;` to `upb_test.Box` in `upb/json/test.proto` so a `bytes`-typed field is reachable from the existing `JsonDecode` helper in `decode_test.cc`.
- Adds `TEST(JsonTest, RejectsBase64WithHighBitBytes)` to `upb/json/decode_test.cc`, which decodes a JSON object whose `data` string contains raw bytes with the high bit set (`0x80..0xFF`) and verifies that the decoder fails gracefully (no crash, returns nullptr). On the unfixed code under ASan this test exhibits the OOB read.
- Existing `upb/json/decode_test.cc` cases continue to pass.
- **Verified locally on `f331eba78` with `bazel test //upb/json:decode_test`**: with the fix all 5 tests pass; with the fix reverted (test kept), the new test fails with **SIGBUS** in `jsondec_base64_tablelookup` while the other 4 still pass — confirming the test exercises the exact code path the fix repairs.
## Files changed
| File | Change |
|---|---|
| `upb/json/decode.c` | Cast fix (`unsigned` to `unsigned char`) |
| `ruby/ext/google/protobuf_c/ruby-upb.c` | Same fix in amalgamated copy |
| `php/ext/google/protobuf/php-upb.c` | Same fix in amalgamated copy |
| `upb/json/test.proto` | `+optional bytes data = 11;` (test-only) |
| `upb/json/decode_test.cc` | Regression test |
If the project regenerates `ruby-upb.c` / `php-upb.c` from `upb/json/decode.c` automatically, please let me know and I will drop those two files from the PR.
## Reference
Reported via Google Bug Hunters / OSS VRP.
Closes #27215
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/27215 from sukhoon0975:fix/upb-json-base64-sign-extend 18d34d7b070d2a4c3e2b80b18aba43d8132eddbc
PiperOrigin-RevId: 915053186
Koon committed
58edfa50c0cc2481eeff701ad5ba6df6e997de15
Parent: 51f4c18
Committed by Copybara-Service <copybara-worker@google.com>
on 5/13/2026, 9:19:07 PM