CVE-2025-49847

HIGH

llama.cpp < b5662 - Buffer Overflow via Malicious GGUF Model Vocabulary

Title source: llm
STIX 2.1

Description

llama.cpp is an inference of several LLM models in C/C++. Prior to version b5662, an attacker‐supplied GGUF model vocabulary can trigger a buffer overflow in llama.cpp’s vocabulary‐loading code. Specifically, the helper _try_copy in llama.cpp/src/vocab.cpp: llama_vocab::impl::token_to_piece() casts a very large size_t token length into an int32_t, causing the length check (if (length < (int32_t)size)) to be bypassed. As a result, memcpy is still called with that oversized size, letting a malicious model overwrite memory beyond the intended buffer. This can lead to arbitrary memory corruption and potential code execution. This issue has been patched in version b5662.

Scores

CVSS v3 8.8
EPSS 0.0044
EPSS Percentile 35.3%
Attack Vector NETWORK
CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CISA SSVC

Vulnrichment
Exploitation poc
Automatable no
Technical Impact total

Details

CWE
CWE-119 CWE-195
Status published
Products (1)
ggml/llama.cpp < b5662
Published Jun 17, 2025
Tracked Since Feb 18, 2026