CVE-2025-49847

HIGH

ggml llama.cpp < b5662 - Buffer Overflow

Description

llama.cpp is an inference engine for several LLM models, written in C/C++. Prior to version b5662, an attacker-supplied GGUF model vocabulary can trigger a buffer overflow in llama.cpp's vocabulary-loading code. Specifically, the helper _try_copy in llama_vocab::impl::token_to_piece() (llama.cpp/src/vocab.cpp) casts a very large size_t token length to int32_t. The narrowed value can wrap to a negative number, so the length check (if (length < (int32_t)size)) is bypassed. memcpy is then still called with the original oversized size, letting a malicious model overwrite memory beyond the intended buffer. This can lead to arbitrary memory corruption and potentially code execution. The issue is patched in version b5662.
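
The signed/unsigned mismatch described above can be illustrated with a minimal C++ sketch. This is not the llama.cpp source or the actual patch: the function names flawed_check_allows_copy, safe_check_allows_copy, and try_copy_checked are hypothetical, and the -1 rejection value is an assumption made for the example. The sketch also assumes the common behavior where an out-of-range size_t to int32_t conversion wraps (guaranteed from C++20, implementation-defined before).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of the flawed pattern: the token size is narrowed to
// int32_t before the comparison. A size above INT32_MAX can wrap to a negative
// number, so "length < (int32_t)size" is false and the copy is allowed.
bool flawed_check_allows_copy(std::size_t size, std::int32_t length) {
    return !(length < static_cast<std::int32_t>(size));
}

// One possible safer check (an assumption, not the upstream fix): reject
// negative buffer lengths, then compare in the unsigned domain so that a
// huge size can never look small.
bool safe_check_allows_copy(std::size_t size, std::int32_t length) {
    if (length < 0) {
        return false;
    }
    return size <= static_cast<std::size_t>(length);
}

// Hypothetical checked copy built on the safer comparison.
// Returns the number of bytes copied, or -1 if the buffer is too small.
std::int32_t try_copy_checked(const char * token, std::size_t size,
                              char * buf, std::int32_t length) {
    if (!safe_check_allows_copy(size, length)) {
        return -1;
    }
    std::memcpy(buf, token, size);
    // size <= length <= INT32_MAX here, so the narrowing is lossless.
    return static_cast<std::int32_t>(size);
}
```

With a 16-byte buffer and a claimed token size of INT32_MAX + 1, the flawed predicate allows the copy while the safer one rejects it. Both predicates agree for ordinary sizes such as 8 or 32.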

References (2)

Core References

Mitigation, Vendor Advisory x_refsource_confirm

https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8wwf-w4qm-gpqr

Patch x_refsource_misc

https://github.com/ggml-org/llama.cpp/commit/3cfbbdb44e08fd19429fed6cc85b982a91f0efd5

Scores

CVSS v3 8.8

EPSS 0.0061

EPSS Percentile 69.9%

Attack Vector NETWORK

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
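
As a cross-check, the 8.8 base score can be reproduced from this vector with the CVSS v3.1 base score equations. The sketch below hard-codes the metric weights for this vector only (AV:N 0.85, AC:L 0.77, PR:N 0.85, UI:R 0.62, C/I/A:H 0.56, scope unchanged). The function names cvss31_roundup and base_score_for_this_vector are hypothetical, and this is not a general-purpose vector parser.

```cpp
#include <algorithm>
#include <cmath>

// Roundup as defined in the CVSS v3.1 specification: round up to one decimal
// place, using integer arithmetic to avoid floating-point artifacts.
double cvss31_roundup(double x) {
    const long long scaled = std::llround(x * 100000.0);
    if (scaled % 10000 == 0) {
        return static_cast<double>(scaled) / 100000.0;
    }
    return (std::floor(static_cast<double>(scaled) / 10000.0) + 1.0) / 10.0;
}

// Base score for AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H only.
double base_score_for_this_vector() {
    const double av = 0.85, ac = 0.77, pr = 0.85, ui = 0.62;  // exploitability weights
    const double c = 0.56, i = 0.56, a = 0.56;                // impact weights (High)

    const double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    const double impact = 6.42 * iss;                          // scope unchanged
    const double exploitability = 8.22 * av * ac * pr * ui;

    if (impact <= 0.0) {
        return 0.0;
    }
    return cvss31_roundup(std::min(impact + exploitability, 10.0));
}
```

The impact term is about 5.87 and the exploitability term about 2.84, so their sum of roughly 8.71 rounds up to 8.8.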

CISA SSVC

Vulnrichment

Exploitation poc

Automatable no

Technical Impact total

Details

CWE

CWE-119 CWE-195

Status published

Products (1)

ggml/llama.cpp < b5662

Published Jun 17, 2025

Tracked Since Feb 18, 2026