Description
vLLM is an inference and serving engine for large language models (LLMs). Versions from 0.6.1 up to, but not including, 0.20.0 contain a token injection vulnerability in vLLM’s multimodal processing. Text-only prompts from unauthenticated users that spell out special tokens are interpreted as control tokens rather than as plain text. When image or video placeholder sequences are supplied without matching image or video data, vLLM indexes into empty grids during input-position computation. This raises an unhandled IndexError, which terminates the worker or degrades availability. Multimodal paths that rely on image_grid_thw/video_grid_thw are affected. The vulnerability is fixed in 0.20.0.
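The failure mode described above can be illustrated with a minimal sketch. This is not vLLM's code: the function names, the IMAGE_TOKEN_ID value, and the one-placeholder-per-grid-entry simplification are all assumptions made for illustration. It only shows how walking placeholder tokens against an empty grid list raises IndexError, and how an up-front count check could turn that into a rejectable request error.

```python
# Hypothetical placeholder token id; real models define their own.
IMAGE_TOKEN_ID = 9999


def count_placeholder_positions(token_ids, image_grid_thw):
    """Unguarded sketch: consume one (t, h, w) grid entry per placeholder."""
    grid_index = 0
    total = 0
    for tok in token_ids:
        if tok == IMAGE_TOKEN_ID:
            # Raises IndexError when the prompt contains a placeholder
            # but no image data (and so no grid entry) was supplied.
            t, h, w = image_grid_thw[grid_index]
            total += t * h * w
            grid_index += 1
    return total


def validate_placeholders(token_ids, image_grid_thw):
    """Guarded sketch: reject mismatched prompts before any indexing."""
    placeholders = sum(1 for tok in token_ids if tok == IMAGE_TOKEN_ID)
    if placeholders != len(image_grid_thw):
        raise ValueError(
            f"{placeholders} image placeholder(s) but "
            f"{len(image_grid_thw)} grid entr(ies) supplied"
        )


def safe_count_placeholder_positions(token_ids, image_grid_thw):
    """Validate first, then run the same position computation."""
    validate_placeholders(token_ids, image_grid_thw)
    return count_placeholder_positions(token_ids, image_grid_thw)
```

In this sketch the guarded variant fails with a ValueError that a server could map to a client error, instead of an IndexError escaping from deep inside position computation.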
References (2)
x_refsource_confirm
https://github.com/vllm-project/vllm/security/advisories/GHSA-hpv8-x276-m59f
x_refsource_misc
https://github.com/vllm-project/vllm/issues/32656
Scores
CVSS v3
6.5
EPSS
0.0001
EPSS Percentile
3.1%
Attack Vector
NETWORK
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
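As a worked check, the 6.5 base score can be reproduced from the vector string above. The sketch below uses the metric weights and the base-score and round-up formulas from the CVSS v3.1 specification, restricted to Scope: Unchanged; the function and variable names are illustrative only.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    # Skip the leading "CVSS:3.1" element, then split each "metric:value" pair.
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 6.5
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 2.84, which sum to about 6.43 and round up to 6.5.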
CISA SSVC
Vulnrichment
Exploitation
none
Automatable
no
Technical Impact
partial
Details
CWE
CWE-129
Status
published
Products (3)
pypi/vllm
0.6.1 - 0.20.0 (PyPI)
vllm/vllm
0.6.1 - 0.20.0
vllm-project/vllm
>= 0.6.1, < 0.20.0
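The affected range above can be checked against a version string with a few lines of standard-library Python. This is a minimal sketch: the helper names are hypothetical, and it compares only the leading major.minor.patch numbers, so pre-release and post-release suffixes are ignored rather than ordered as PEP 440 would order them.

```python
import re

AFFECTED_MIN = (0, 6, 1)    # inclusive lower bound
FIXED_VERSION = (0, 20, 0)  # exclusive upper bound (first fixed release)


def release_tuple(version):
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version):
    """Return True if the version falls in >= 0.6.1, < 0.20.0."""
    return AFFECTED_MIN <= release_tuple(version) < FIXED_VERSION
```

In practice the installed version could be obtained with importlib.metadata.version("vllm") and passed to is_affected; for exact PEP 440 ordering, a dedicated version-parsing library would be the safer choice.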
Published
May 12, 2026
Tracked Since
May 13, 2026