mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-28 06:51:16 +08:00
docs(skills): salvage dropped trigger content into skill bodies
For 14 of 74 compressed skills, the original description contained trigger keywords, technique counts, attribution, or use-case phrases not covered by the existing body content. Prepends a 'When to use' / 'What's inside' block near the top so the agent still has the full context when the skill is loaded. Skills salvaged: - codex, ascii-video, creative-ideation, excalidraw, manim-video, p5js - gif-search, heartmula, youtube-content - lm-evaluation-harness, obliteratus, vllm, axolotl - powerpoint Remaining 60 skills were verified to already cover the dropped content in their existing body sections (When to Use, overview, intro prose) or had short descriptions fully captured by the new compressed form.
This commit is contained in:
@@ -13,6 +13,10 @@ metadata:
|
||||
|
||||
# vLLM - High-Performance LLM Serving
|
||||
|
||||
## When to use
|
||||
|
||||
Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.
|
||||
|
||||
## Quick start
|
||||
|
||||
vLLM achieves 24x higher throughput than standard transformers through PagedAttention (block-based KV cache) and continuous batching (mixing prefill/decode requests).
|
||||
|
||||
Reference in New Issue
Block a user