feat(webhook): direct delivery mode for zero-LLM push notifications (#12473)

External services can now push plain-text notifications to a user's chat
via the webhook adapter without invoking the agent. Set deliver_only=true
on a route and the rendered prompt template becomes the literal message
body — dispatched directly to the configured target (Telegram, Discord,
Slack, GitHub PR comment, etc.).

Reuses all existing webhook infrastructure: HMAC-SHA256 signature
validation, per-route rate limiting, idempotency cache, body-size limits,
template rendering with dot-notation, home-channel fallback. No new HTTP
server, no new auth scheme, no new port.

Use cases: Supabase/Firebase webhooks → user notifications, monitoring
alert forwarding, inter-agent pings, background job completion alerts.

Changes:
- gateway/platforms/webhook.py: new _direct_deliver() helper + early
  dispatch branch in _handle_webhook when deliver_only=true. Startup
  validation rejects deliver_only with deliver=log.
- hermes_cli/main.py + hermes_cli/webhook.go: --deliver-only flag on
  subscribe; list/show output marks direct-delivery routes.
- website/docs/user-guide/messaging/webhooks.md: new Direct Delivery
  Mode section with config example, CLI example, response codes.
- skills/devops/webhook-subscriptions/SKILL.md: document --deliver-only
  with use cases (bumped to v1.1.0).
- tests/gateway/test_webhook_deliver_only.py: 14 new tests covering
  agent bypass, template rendering, status codes, HMAC still enforced,
  idempotency still applies, rate limit still applies, startup
  validation, and direct-deliver dispatch.

Validation: 78 webhook tests pass (64 existing + 14 new). E2E verified
with real aiohttp server + real urllib POST — agent not invoked, target
adapter.send() called with rendered template, duplicate delivery_id
suppressed.

Closes the gap identified in PR #12117 (thanks to @H1an1 / Antenna team)
without adding a second HTTP ingress server.
This commit is contained in:
Teknium
2026-04-19 05:18:19 -07:00
committed by GitHub
parent 66ee081dc1
commit 206a449b29
6 changed files with 699 additions and 4 deletions

View File

@@ -72,6 +72,7 @@ Routes define how different webhook sources are handled. Each route is a named e
| `skills` | No | List of skill names to load for the agent run. |
| `deliver` | No | Where to send the response: `github_comment`, `telegram`, `discord`, `slack`, `signal`, `sms`, `whatsapp`, `matrix`, `mattermost`, `homeassistant`, `email`, `dingtalk`, `feishu`, `wecom`, `weixin`, `bluebubbles`, `qqbot`, or `log` (default). |
| `deliver_extra` | No | Additional delivery config — keys depend on `deliver` type (e.g. `repo`, `pr_number`, `chat_id`). Values support the same `{dot.notation}` templates as `prompt`. |
| `deliver_only` | No | If `true`, skip the agent entirely — the rendered `prompt` template becomes the literal message that gets delivered. Zero LLM cost, sub-second delivery. See [Direct Delivery Mode](#direct-delivery-mode) for use cases. Requires `deliver` to be a real target (not `log`). |
### Full example
@@ -240,6 +241,80 @@ For cross-platform delivery, the target platform must also be enabled and connec
---
## Direct Delivery Mode {#direct-delivery-mode}
By default, every webhook POST triggers an agent run — the payload becomes a prompt, the agent processes it, and the agent's response is delivered. This costs LLM tokens on every event.
For use cases where you just want to **push a plain notification** — no reasoning, no agent loop, just deliver the message — set `deliver_only: true` on the route. The rendered `prompt` template becomes the literal message body, and the adapter dispatches it directly to the configured delivery target.
### When to use direct delivery
- **External service push** — Supabase/Firebase webhook fires on a database change → notify a user in Telegram instantly
- **Monitoring alerts** — Datadog/Grafana alert webhook → push to a Discord channel
- **Inter-agent pings** — Agent A notifies Agent B's user that a long-running task finished
- **Background job completion** — Cron job finishes → post result to Slack
Benefits:
- **Zero LLM tokens** — the agent is never invoked
- **Sub-second delivery** — a single adapter call, no reasoning loop
- **Same security as agent mode** — HMAC auth, rate limits, idempotency, and body-size limits all still apply
- **Synchronous response** — the POST returns `200 OK` once delivery succeeds, or `502` if the target rejects it, so your upstream service can retry intelligently
### Example: Telegram push from Supabase
```yaml
platforms:
webhook:
enabled: true
extra:
port: 8644
secret: "global-secret"
routes:
antenna-matches:
secret: "antenna-webhook-secret"
deliver: "telegram"
deliver_only: true
prompt: "🎉 New match: {match.user_name} matched with you!"
deliver_extra:
chat_id: "{match.telegram_chat_id}"
```
Your Supabase edge function signs the payload with HMAC-SHA256 and POSTs to `https://your-server:8644/webhooks/antenna-matches`. The webhook adapter validates the signature, renders the template from the payload, delivers to Telegram, and returns `200 OK`.
### Example: Dynamic subscription via CLI
```bash
hermes webhook subscribe antenna-matches \
--deliver telegram \
--deliver-chat-id "123456789" \
--deliver-only \
--prompt "🎉 New match: {match.user_name} matched with you!" \
--description "Antenna match notifications"
```
### Response codes
| Status | Meaning |
|--------|---------|
| `200 OK` | Delivered successfully. Body: `{"status": "delivered", "route": "...", "target": "...", "delivery_id": "..."}` |
| `200 OK` (status=duplicate) | Duplicate `X-GitHub-Delivery` ID within the idempotency TTL (1 hour). Not re-delivered. |
| `401 Unauthorized` | HMAC signature invalid or missing. |
| `400 Bad Request` | Malformed JSON body. |
| `404 Not Found` | Unknown route name. |
| `413 Payload Too Large` | Body exceeded `max_body_bytes`. |
| `429 Too Many Requests` | Route rate limit exceeded. |
| `502 Bad Gateway` | Target adapter rejected the message or raised. The error is logged server-side; the response body is a generic `Delivery failed` to avoid leaking adapter internals. |
### Configuration gotchas
- `deliver_only: true` requires `deliver` to be a real target. `deliver: log` (or omitting `deliver`) is rejected at startup — the adapter refuses to start if it finds a misconfigured route.
- The `skills` field is ignored in direct delivery mode (no agent runs, so there's nothing to inject skills into).
- Template rendering uses the same `{dot.notation}` syntax as agent mode, including the `{__raw__}` token.
- Idempotency uses the same `X-GitHub-Delivery` / `X-Request-ID` header — retries with the same ID return `status=duplicate` and do NOT re-deliver.
---
## Dynamic Subscriptions (CLI) {#dynamic-subscriptions}
In addition to static routes in `config.yaml`, you can create webhook subscriptions dynamically using the `hermes webhook` CLI command. This is especially useful when the agent itself needs to set up event-driven triggers.