CVE-2026-34760

MEDIUM

vLLM: Downmix Implementation Differences as Attack Vectors Against Audio AI Models

Title source: cna

Description

vLLM is an inference and serving engine for large language models (LLMs). In versions 0.5.5 up to, but not including, 0.18.0, mono downmixing relies on Librosa, whose to_mono defaults to numpy.mean (an equal-weight average across channels), whereas the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. Because of this discrepancy, the audio a human hears (e.g., through headphones or regular speakers) can differ from the audio processed by AI models whose infrastructure downmixes via Librosa, such as vLLM and Transformers. This issue has been patched in version 0.18.0.
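The discrepancy described above can be illustrated with a minimal sketch. It assumes a channels-first array and uses numpy.mean directly to stand in for the equal-weight downmix attributed to Librosa. The function names, the three-channel layout, and the weights in ILLUSTRATIVE_WEIGHTS are hypothetical and chosen only for illustration; they are not taken from ITU-R BS.775-4 or from vLLM's code.

```python
import numpy as np


def mean_downmix(audio):
    """Equal-weight mono downmix: average over the channel axis (axis 0)."""
    return np.mean(audio, axis=0)


def weighted_downmix(audio, weights):
    """Weighted mono downmix: apply one gain per channel, then sum."""
    w = np.asarray(weights, dtype=audio.dtype)
    return np.tensordot(w, audio, axes=1)


# Hypothetical per-channel gains, NOT the coefficients from the standard.
ILLUSTRATIVE_WEIGHTS = (0.7071, 0.7071, 1.0)

sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
benign = np.sin(2 * np.pi * 440 * t)        # content meant for the listener
hidden = 0.5 * np.sin(2 * np.pi * 880 * t)  # content meant for the model

# Channels 0 and 1 carry the benign tone. Channel 2 is crafted so that the
# equal-weight mean cancels the benign tone and leaves only the hidden one.
audio = np.stack([benign, benign, 3 * hidden - 2 * benign])

model_input = mean_downmix(audio)                           # equals `hidden`
weighted = weighted_downmix(audio, ILLUSTRATIVE_WEIGHTS)    # keeps a benign component

print(np.allclose(model_input, hidden))
print(np.allclose(model_input, weighted))
```

In this sketch the equal-weight mean contains none of the benign tone, while the weighted downmix still does, so the two mono signals differ even though they come from the same file.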

Scores

CVSS v3 5.9
EPSS 0.0006
EPSS Percentile 20.0%
Attack Vector NETWORK
CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L

Details

CWE
CWE-20
Status published
Products (1)
vllm-project/vllm >= 0.5.5, < 0.18.0
Published Apr 02, 2026
Tracked Since Apr 03, 2026
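As a rough aid for checking a deployment against the affected range listed above (>= 0.5.5, < 0.18.0), the following sketch compares plain numeric version strings. The helper names are hypothetical, and the parser assumes simple dotted release versions such as "0.17.1"; pre-release or local version suffixes are not handled.

```python
def parse_version(version):
    """Turn a dotted numeric version such as '0.17.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_affected(version):
    """Return True if the version is in the range >= 0.5.5 and < 0.18.0."""
    return (0, 5, 5) <= parse_version(version) < (0, 18, 0)


for candidate in ("0.5.4", "0.5.5", "0.17.1", "0.18.0"):
    print(candidate, is_affected(candidate))
```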