CVE-2026-34760
MEDIUM
vLLM: Downmix Implementation Differences as Attack Vectors Against Audio AI Models
Title source: cna
Description
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.5.5 up to, but not including, 0.18.0, audio is downmixed to mono through Librosa, whose to_mono function defaults to numpy.mean, whereas the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. As a result, the audio a human hears (e.g., through headphones or regular speakers) can differ from the audio processed by AI models in stacks that rely on Librosa, such as vLLM and Transformers. This issue has been patched in version 0.18.0.
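The gap between the two downmix strategies can be illustrated with a small NumPy sketch. It is not taken from vLLM or Librosa: the function names, the channel order (L, R, C, Ls, Rs), and the weight values are assumptions chosen for illustration, and the normative coefficients should be taken from ITU-R BS.775-4 itself. The sketch places a tone in the centre channel and its inverse in one surround channel, so an equal-weight mean cancels to silence while a weighted sum leaves the tone audible.

```python
import numpy as np

# Assumed channel order for this sketch: L, R, C, Ls, Rs.
# Illustrative weights only; consult ITU-R BS.775-4 for the normative values.
ASSUMED_WEIGHTS = np.array([0.7071, 0.7071, 1.0, 0.5, 0.5])


def mean_downmix(y):
    """Equal-weight mono downmix: plain average over the channel axis."""
    return np.mean(y, axis=0)


def weighted_downmix(y, weights=ASSUMED_WEIGHTS):
    """Weighted mono downmix: apply a gain to each channel, then sum."""
    return weights @ y


sr = 16000                      # sample rate in Hz (arbitrary for the demo)
t = np.arange(sr) / sr          # one second of sample times
tone = np.sin(2 * np.pi * 440.0 * t)

# Five-channel signal with shape (channels, samples).
y = np.zeros((5, sr))
y[2] = tone                     # centre channel carries the tone
y[3] = -tone                    # left surround carries the inverted tone

mono_mean = mean_downmix(y)
mono_weighted = weighted_downmix(y)

# Peak amplitude of each mono result.
peak_mean = np.max(np.abs(mono_mean))
peak_weighted = np.max(np.abs(mono_weighted))

print(f"equal-weight mean peak: {peak_mean:.3f}")
print(f"weighted downmix peak:  {peak_weighted:.3f}")
```

Under these assumptions, the equal-weight path sees silence while the weighted path retains the tone at roughly half amplitude, which is the kind of human/model mismatch the description refers to.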
References (4)
Core References
x_refsource_confirm
https://github.com/vllm-project/vllm/security/advisories/GHSA-6c4r-fmh3-7rh8
x_refsource_misc
https://github.com/vllm-project/vllm/pull/37058
x_refsource_misc
https://github.com/vllm-project/vllm/commit/c7f98b4d0a63b32ed939e2b6dfaa8a626e9b46c4
x_refsource_misc
https://github.com/vllm-project/vllm/releases/tag/v0.18.0
Scores
CVSS v3
5.9
EPSS
0.0027
EPSS Percentile
18.1%
Attack Vector
NETWORK
CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L
CISA SSVC
Vulnrichment
Exploitation
none
Automatable
no
Technical Impact
partial
Details
CWE
CWE-20
Status
published
Products (2)
vllm/vllm
>= 0.5.5, < 0.18.0
vllm-project/vllm
>= 0.5.5, < 0.18.0
Published
Apr 02, 2026
Tracked Since
Apr 03, 2026