木叶吟
Congestion Control
CONCUR: Controlling Mid-Phase Thrashing in Agentic Batch Inference
A technical note on CONCUR, an agent-level admission control layer that prevents KV cache collapse during long-running agentic LLM inference.
Zhisheng YE
May 17, 2026
4 min read
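CONCUR: Steering Agent Batch Inference Clear of Mid-Phase Congestion
A technical note on CONCUR: it applies admission control at the agent level to keep long-running LLM agent inference from pushing the KV cache into a runaway regime.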
Zhisheng YE
May 17, 2026