CVE-2026-34760
MEDIUM
vLLM: Downmix Implementation Differences as Attack Vectors Against Audio AI Models
Title source: cna
Description
vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.5.5 up to, but not including, 0.18.0, audio is downmixed to mono through Librosa, whose to_mono defaults to numpy.mean (an unweighted average of the channels), whereas the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. As a result, the audio a human hears (e.g., through headphones or regular speakers) can differ from the audio an AI model processes when inference goes through Librosa (e.g., vLLM, transformers). This issue has been patched in version 0.18.0.
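The discrepancy described above can be illustrated with a minimal NumPy sketch. It does not call Librosa or vLLM; `mean_downmix` only mirrors the unweighted averaging the description attributes to to_mono, and `weighted_downmix` stands in for a weighted playback downmix. The three-channel layout and the values in `ILLUSTRATIVE_WEIGHTS` are assumptions chosen for the demonstration, not coefficients quoted from ITU-R BS.775-4.

```python
import numpy as np


def mean_downmix(channels):
    """Unweighted average over channels; input shape is (channels, samples)."""
    return np.mean(channels, axis=0)


def weighted_downmix(channels, weights):
    """Weighted sum over channels; one weight per channel."""
    weights = np.asarray(weights, dtype=float)
    return weights @ channels


# One second of a 440 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

# Hypothetical 3-channel signal (left, right, center): the tone sits in the
# center channel, and the side channels carry half of it with inverted phase.
channels = np.stack([-0.5 * tone, -0.5 * tone, tone])

# Assumed weights for illustration only (sides attenuated relative to center).
ILLUSTRATIVE_WEIGHTS = [0.7071, 0.7071, 1.0]

mean_mix = mean_downmix(channels)
weighted_mix = weighted_downmix(channels, ILLUSTRATIVE_WEIGHTS)

# The unweighted mean cancels the tone, the weighted downmix does not.
print("peak after unweighted mean:", np.max(np.abs(mean_mix)))
print("peak after weighted downmix:", np.max(np.abs(weighted_mix)))
```

In this constructed case a model fed the unweighted mean receives silence, while a listener whose playback chain applies the weighted downmix still hears the tone. The same mismatch can be arranged in the opposite direction.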
Scores
CVSS v3
5.9
EPSS
0.0006
EPSS Percentile
20.0%
Attack Vector
NETWORK
CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L
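As a cross-check on the listed score, the sketch below recomputes a base score from the vector string above. It follows the CVSS v3.1 base-score equations as commonly published, handles only Scope: Unchanged vectors (which is what this vector uses), and the function names `roundup` and `base_score` are chosen here for illustration.

```python
import math

# Metric weights from the CVSS v3.1 base metrics (Scope: Unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place using the integer-based v3.1 method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][parts[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L")
print(score)  # 5.9
```

The result agrees with the CVSS v3 score of 5.9 listed in this entry.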
Details
CWE
CWE-20
Status
published
Products (1)
vllm-project/vllm
>= 0.5.5, < 0.18.0
Published
Apr 02, 2026
Tracked Since
Apr 03, 2026