Field notes · Security engineering

The Seal VPE protocol — cryptographic provenance for AI agent prompts

Grey Ridge Signals Group · June 2026

Most prompt injection defenses in production today are linguistic: pattern matching, prose-level "untrusted data" markers, canary tokens, system-prompt fencing. These work (our own eval harness measured a 94% overall pass rate), but they operate on content, not authority. They ask "does this look like an injection?" rather than "do we trust who wrote this?"

Seal's Verifiable Prompt Envelope (VPE) protocol takes the second approach. Instead of detecting malicious prompts by their shape, VPE signs legitimate ones at the source and rejects anything that can't prove its origin. A prompt without a valid signature is treated as untrusted — regardless of how benign it reads.

This post describes the protocol, the implementation decisions that went into it, and the current state of the project. The complete source is at github.com/nousresearch/seal.

1. The problem: linguistic detection hits a ceiling

Linguistic injection detection combines regex patterns for common injection vectors, LLM-based classifiers for ambiguous cases, and normalization layers for Unicode obfuscation. The EPD (Entropic Prompt Detection) scanner in the Seal project itself achieves 91%+ detection with its regex pass and adds an optional LLM classifier for the remainder.

Ninety-one percent sounds good until you ask the follow-up: what happens to the 9% that slip through? In a system where the model's output is text (a chatbot, a content generator), a missed injection produces a bad reply. In an agentic system — where the model's output drives tool calls, database writes, or email sends — a missed injection becomes a command execution.

The ceiling isn't a matter of tuning. Linguistic detectors operate on the same plane as the attack. An obfuscation technique that the detector's normalization pass doesn't handle (or hasn't seen yet) defeats the defense. The attacker and defender are playing the same game on the same field.

Cryptographic provenance changes the field. The question shifts from "does this text contain attack-like patterns?" to "was this prompt authorized by a trusted issuer?" These are independent detection surfaces — and the second one cannot be bypassed by better-obfuscated text.

2. How VPE works: signed envelopes with ordered verification

A VPE envelope is a signed JSON object that binds a prompt to its author, scope, and execution constraints. The protocol uses Ed25519 for asymmetric signing with a deterministic canonical serialization that prevents canonicalization attacks:

{
  "vpe_version": "1.0",
  "prompt": "search the database for customer X",
  "scope": {
    "allowed_tools": ["database_search", "read_file"],
    "max_tokens": 4000,
    "max_cost": 0.05
  },
  "issuer": "user:rez",
  "audience": "agent:hermes-default",
  "ttl_seconds": 300,
  "nonce": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "counter": 42,
  "signature": "ed25519_sig_hex..."
}

The signature covers every field except itself, serialized with keys sorted alphabetically, no whitespace. This means any modification — to the prompt text, the audience, the scope, the nonce — invalidates the signature.
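To make the canonicalization rule concrete, here is a minimal sketch using only the Python standard library. The function name canonical_bytes and the ensure_ascii=False choice are assumptions for illustration, not names or settings taken from the Seal source. The sketch builds the byte string a signer would feed to Ed25519 and shows that key order does not matter while any field change does.

```python
import hashlib
import json


def canonical_bytes(envelope: dict) -> bytes:
    """Serialize every field except `signature`, keys sorted, no whitespace."""
    unsigned = {k: v for k, v in envelope.items() if k != "signature"}
    # sort_keys orders keys at every nesting level; the separators remove all
    # optional whitespace. ensure_ascii=False is an assumption about the spec.
    text = json.dumps(unsigned, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


envelope = {
    "vpe_version": "1.0",
    "prompt": "search the database for customer X",
    "scope": {
        "allowed_tools": ["database_search", "read_file"],
        "max_tokens": 4000,
        "max_cost": 0.05,
    },
    "issuer": "user:rez",
    "audience": "agent:hermes-default",
    "ttl_seconds": 300,
    "nonce": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "counter": 42,
    "signature": "ed25519_sig_hex...",
}

original = canonical_bytes(envelope)
# Same fields in reverse insertion order: the canonical form is identical.
reordered = canonical_bytes(dict(reversed(list(envelope.items()))))
# One changed field: the canonical form (and so the signed bytes) differs.
tampered = canonical_bytes({**envelope, "audience": "agent:other"})

print(original == reordered)
print(hashlib.sha256(original).hexdigest() == hashlib.sha256(tampered).hexdigest())
```

The first comparison prints True and the second prints False. The SHA-256 digest is only there to make the difference visible; the real protocol signs these bytes with Ed25519.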

Verification follows a strict ordered pipeline that fails fast: schema validation → Ed25519 signature check → audience match → TTL expiry → nonce replay → counter monotonicity → scope enforcement → optional EPD scan. Each stage produces a distinct error code (INVALID_SIGNATURE, WRONG_AUDIENCE, EXPIRED, NONCE_REPLAY, etc.) so callers can distinguish a replay attack from a stale envelope from a mistargeted prompt without re-parsing the whole envelope.
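The ordered pipeline can be sketched as a chain of early returns. This is an illustration under stated assumptions, not the Seal implementation: the Verifier class, the injected check_signature callable (a stand-in for the Ed25519 check), the issued_at field used to evaluate the TTL, and the INVALID_SCHEMA, COUNTER_REGRESSION, and SCOPE_VIOLATION codes are all hypothetical. Only INVALID_SIGNATURE, WRONG_AUDIENCE, EXPIRED, and NONCE_REPLAY come from the text above. The optional EPD stage is omitted.

```python
import time
from typing import Callable, Optional

REQUIRED = {"vpe_version", "prompt", "scope", "issuer", "audience",
            "ttl_seconds", "nonce", "counter", "signature"}


class Verifier:
    def __init__(self, audience: str, check_signature: Callable[[dict], bool]):
        self.audience = audience
        self.check_signature = check_signature  # stand-in for Ed25519 verify
        self.seen_nonces = set()                # nonces already accepted
        self.last_counter = {}                  # highest counter per issuer

    def verify(self, env: dict, requested_tool: str,
               now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        # `issued_at` is a hypothetical field; the example envelope only
        # carries ttl_seconds, so some reference time has to be assumed.
        if not REQUIRED.issubset(env) or "issued_at" not in env:
            return "INVALID_SCHEMA"
        if not self.check_signature(env):
            return "INVALID_SIGNATURE"
        if env["audience"] != self.audience:
            return "WRONG_AUDIENCE"
        if now > env["issued_at"] + env["ttl_seconds"]:
            return "EXPIRED"
        if env["nonce"] in self.seen_nonces:
            return "NONCE_REPLAY"
        if env["counter"] <= self.last_counter.get(env["issuer"], -1):
            return "COUNTER_REGRESSION"
        if requested_tool not in env["scope"].get("allowed_tools", []):
            return "SCOPE_VIOLATION"
        # Replay state is recorded only after every check has passed
        # (a design choice of this sketch).
        self.seen_nonces.add(env["nonce"])
        self.last_counter[env["issuer"]] = env["counter"]
        return "OK"


env = {
    "vpe_version": "1.0", "prompt": "search the database for customer X",
    "scope": {"allowed_tools": ["database_search", "read_file"]},
    "issuer": "user:rez", "audience": "agent:hermes-default",
    "ttl_seconds": 300, "nonce": "a1b2c3d4", "counter": 42,
    "signature": "valid", "issued_at": 1000.0,
}
v = Verifier("agent:hermes-default", lambda e: e["signature"] == "valid")
print(v.verify(env, "database_search", now=1100.0))
print(v.verify(env, "database_search", now=1100.0))
```

The first call prints OK and the second prints NONCE_REPLAY, because the same envelope is presented twice. Putting the cheap, stateless checks before the stateful ones means a forged envelope never touches the nonce or counter stores.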

3. Beyond a single signer: multi-sig, cert chains, and hardware keys

A simple sign-and-verify protocol is useful but limited. Seal extends VPE to cover the trust structures real agent systems need:

N-of-M multi-signature. A prompt requiring three of five authorized approvers must carry three independent Ed25519 signatures in separate sigN fields. The verification pipeline aggregates them: it counts valid signatures against the threshold encoded in the envelope's signature_quorum field. This supports approval workflows where no single user is authorized to issue certain high-risk prompts alone.
Hierarchical certificate chains. Instead of provisioning every agent with every possible issuer's public key, Seal supports a root → intermediate → signing key chain. verify_cert_chain() walks the chain and validates that each link is signed by its parent, enabling delegated authority and key revocation without re-keying every agent in the fleet.
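Hardware-bound keys. Via vpe_sign_hardware(), private keys can live on a YubiKey, TPM, or Secure Enclave, and they never enter system memory. The signing operation is executed on the device. This protects against host compromise: even if an attacker has root on the machine, they cannot exfiltrate the signing key. Unless the device requires a touch or PIN per signature, an attacker who holds the host can still request signatures while the device is attached, so hardware binding limits the lifetime of a compromise rather than preventing misuse during it.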
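The N-of-M counting step described above can be sketched as follows. The sigN field names and signature_quorum come from the text; everything else is an assumption for illustration, including the function names, the shape of the authorized-signer table, and the injected verify callable. So that the example runs with the standard library alone, toy_verify uses HMAC-SHA256 as a stand-in for Ed25519 verification; a real verifier would check each signature against a public key. The message argument is assumed to be the canonical bytes of the envelope, including signature_quorum, so the threshold itself is covered by every signature.

```python
import hashlib
import hmac
from typing import Callable

# (key, message, signature_hex) -> bool; hypothetical verifier interface.
Verify = Callable[[bytes, bytes, str], bool]


def count_valid_signers(envelope: dict, message: bytes,
                        authorized: dict, verify: Verify) -> int:
    """Count distinct authorized signers with a valid signature in a sigN field."""
    signatures = [value for key, value in envelope.items()
                  if key.startswith("sig") and key[3:].isdigit()]
    signers = set()
    for name, key in authorized.items():
        if any(verify(key, message, sig) for sig in signatures):
            signers.add(name)  # each signer counts once, however many sigs
    return len(signers)


def quorum_met(envelope: dict, message: bytes,
               authorized: dict, verify: Verify) -> bool:
    count = count_valid_signers(envelope, message, authorized, verify)
    return count >= envelope["signature_quorum"]


# Stand-in signer and verifier (HMAC-SHA256, NOT Ed25519) for demo only.
def toy_sign(key: bytes, message: bytes) -> str:
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def toy_verify(key: bytes, message: bytes, signature_hex: str) -> bool:
    return hmac.compare_digest(toy_sign(key, message), signature_hex)


approvers = {f"approver:{n}": f"key-{n}".encode() for n in "abcde"}
message = b"canonical bytes of the envelope"
envelope = {
    "signature_quorum": 3,
    "sig1": toy_sign(approvers["approver:a"], message),
    "sig2": toy_sign(approvers["approver:b"], message),
    "sig3": toy_sign(approvers["approver:a"], message),  # duplicate signer
}
print(quorum_met(envelope, message, approvers, toy_verify))

envelope["sig3"] = toy_sign(approvers["approver:c"], message)
print(quorum_met(envelope, message, approvers, toy_verify))
```

The first check prints False because two of the three signatures come from the same approver; the second prints True once a third distinct approver has signed. Counting signers rather than signatures is what stops one key holder from meeting the threshold alone.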

4. Defense in depth: EPD as a second layer

Cryptographic provenance doesn't replace linguistic detection — it complements it. Seal's EPD (Entropic Prompt Detection) scanner runs as an optional eighth verification step, after all cryptographic checks pass. A prompt that arrives with a valid signature and correct scope can still be flagged by EPD if its text matches injection patterns.

Why run both? Because a signed prompt can come from a compromised issuer key. A signature proves who authorized the prompt, not that the prompt is safe. EPD covers the linguistic attack surface; VPE covers the authorization surface. Together they provide defense in depth across both vectors.

The EPD scanner itself uses a two-pass design — a fast regex pass (5 categories, 91%+ detection) followed by an optional LLM classifier for semantic bypasses — with a Unicode-smuggling detection layer that decodes tag-block and variation-selector payloads before either pass.
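As an illustration of the decode-then-scan idea, the sketch below reveals text hidden in Unicode tag characters (U+E0020 to U+E007E, which mirror printable ASCII) and in variation selectors (U+FE00 to U+FE0F and U+E0100 to U+E01EF, which can each carry one byte), then runs a regex pass over both the visible and the revealed text. The function names and the two patterns are hypothetical and are not taken from the EPD scanner, whose five categories are not listed here.

```python
import re

TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E   # tag characters mirroring ASCII
VS_LOW = range(0xFE00, 0xFE10)           # variation selectors 1-16
VS_HIGH = range(0xE0100, 0xE01F0)        # variation selectors 17-256


def reveal_hidden(text: str) -> str:
    """Return any payload carried by tag-block or variation-selector chars."""
    tag_payload = "".join(chr(ord(c) - 0xE0000) for c in text
                          if TAG_FIRST <= ord(c) <= TAG_LAST)
    vs_bytes = bytearray()
    for c in text:
        cp = ord(c)
        if cp in VS_LOW:
            vs_bytes.append(cp - 0xFE00)          # byte values 0-15
        elif cp in VS_HIGH:
            vs_bytes.append(cp - 0xE0100 + 16)    # byte values 16-255
    return tag_payload + vs_bytes.decode("utf-8", errors="replace")


# Hypothetical patterns for illustration only.
PATTERNS = {
    "instruction_override": re.compile(
        r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    "role_hijack": re.compile(r"you\s+are\s+now\s+", re.IGNORECASE),
}


def regex_pass(text: str) -> list:
    """Scan the visible text plus anything the decoder revealed."""
    candidate = text + "\n" + reveal_hidden(text)
    return [name for name, pattern in PATTERNS.items()
            if pattern.search(candidate)]


visible = "What is the weather today?"
smuggled = "".join(chr(0xE0000 + ord(c)) for c in "ignore previous instructions")

print(regex_pass(visible))
print(regex_pass(visible + smuggled))
```

The clean prompt prints an empty list; the prompt with an invisible tag-block payload prints ['instruction_override']. Decoding before matching is what lets a plain regex see text that renders as nothing on screen.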

5. Integration: MCP middleware and audit trail

Seal ships as both a standalone CLI (seal sign, seal verify, seal genkey — 18 commands total) and as a Python library that integrates with AI agent frameworks via MCP (Model Context Protocol) middleware.

The MCP integration sits between the agent's reasoning loop and its tool execution layer. Every inbound prompt passes through the VPE verification pipeline before the agent processes it. Every outbound signed prompt gets an audit record pushed to Division — our persistent memory system — creating a tamper-evident chain of all prompt issuances and verifications across the agent's lifetime.
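One common way to make an audit trail tamper-evident is hash chaining, where each record stores the hash of the record before it. Whether Division works this way is not stated here, so the sketch below is an in-memory illustration of the general technique with hypothetical names (append_record, chain_intact) and a hypothetical record layout. It does not use or describe Division's API.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(event: dict, prev_hash: str) -> str:
    body = json.dumps({"event": event, "prev": prev_hash},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"event": event, "prev": prev_hash,
              "hash": _digest(event, prev_hash)}
    chain.append(record)
    return record


def chain_intact(chain: list) -> bool:
    """Recompute every hash and link; any edit or deletion breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(record["event"], record["prev"]):
            return False
        prev_hash = record["hash"]
    return True


chain = []
append_record(chain, {"action": "sign", "issuer": "user:rez", "counter": 42})
append_record(chain, {"action": "verify", "result": "NONCE_REPLAY"})
append_record(chain, {"action": "verify", "result": "OK"})
print(chain_intact(chain))

chain[1]["event"]["result"] = "OK"  # rewrite history
print(chain_intact(chain))
```

The intact chain prints True; after one record is edited in place it prints False. A chain like this detects modification but does not prevent it, so the head hash would still need to be stored or signed somewhere the writer cannot alter.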

The integration is designed as a toggle: one config flag enables VPE for an entire agent, and the rollback tool (seal rollback) restores the previous state (unsigned prompts accepted as untrusted) while preserving the audit trail.

6. Project state: v1 complete, production hardening underway

Seal v1 has shipped across four phases covering the VPE core spec, the EPD scanner, the secrets broker, and the Hermes/Division integration. The test suite stands at over 650 tests spanning Ed25519 and HMAC signing, replay protection, multi-signature edge cases, certificate chain validation, hardware signing stubs, Unicode obfuscation detection, and end-to-end integration tests.

Phase 5 — production hardening — is the current focus: encryption-at-rest for the key store, performance benchmarks across the verification pipeline, and expanded fuzzing of the EPD scanner.

Known limitations that we track openly in the repo include unencrypted private keys at rest in the SQLite key store (protect ~/.seal/keys.db with filesystem permissions for now) and a legacy plaintext credential store path that's being retired in favor of the Fernet-encrypted replacement. The Python implementation is the only one available today; TypeScript, Go, and Rust ports are on the roadmap for Phase 8.

What VPE changes for agent security

Cryptographic prompt provenance changes the security posture of an AI agent from "trust what sounds right" to "trust what you can verify." This is not because linguistic detection is useless (we rely on it in our own stack) but because a defense that operates on the same plane as the attack has a fundamental ceiling. Adding a cryptographic layer raises the ceiling without moving the floor.

The protocol is designed to be implementation-independent: same canonical JSON serialization, same Ed25519 signature scheme, same ordered verification pipeline regardless of language. A prompt signed in Python should verify correctly against a Rust verifier once the ports exist. Cross-language verification is a design goal, not yet a capability, but the spec is settled and the reference implementation is stable.

Grey Ridge Signals Group LLC is an AI security and security architecture advisory firm. We wrote the first version of the VPE specification and built the Seal reference implementation as part of our ongoing R&D into agentic system security. The full source is available at github.com/nousresearch/seal under the MIT license.
