[SECURITY FEATURE]: Policy-as-Code Engine - Rego Prototype #271

@crivetimihai

🧭 Epic

Title: Policy-as-Code Engine – Rego Prototype


Goal

Embed an Open Policy Agent (OPA) Rego engine inside the MCP Gateway so operators can declare fine-grained allow/deny rules without changing code.
The prototype must:

  • intercept every initialize, tools/call, and resource/* request
  • consult Rego policies (bundled or remote) in < 2 ms per decision
  • default to permissive (log-only) behind EXPERIMENTAL_POLICY_ENGINE=true
  • support hot-reload via bundle polling or SIGHUP
  • ship with starter policies for tool allow-lists, argument bounds, and output sanitization
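
A minimal sketch of the interception check implied by the first bullet. The function name and the use of glob matching for `resource/*` are assumptions for illustration, not part of the gateway today.

```python
import fnmatch

# Request kinds the prototype is expected to intercept (taken from the list above).
INTERCEPTED_PATTERNS = ("initialize", "tools/call", "resource/*")


def is_intercepted(method: str) -> bool:
    """Return True when a JSON-RPC method should be sent to the policy engine.

    Hypothetical helper: "resource/*" is treated as a glob pattern.
    """
    return any(fnmatch.fnmatchcase(method, pattern) for pattern in INTERCEPTED_PATTERNS)
```

Methods outside the three patterns (for example `tools/list`) would bypass the policy middleware under this reading.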

Milestone: Release 0.8.0 (Enterprise Security & Policy Guardrails)
Repository impact: core gateway (mcpgateway/)


🧭 Type of Feature

  • Security hardening
  • Extensibility / configuration

🛠 Rego Policy Scope (v0.1)

| Input JSON sent to OPA | Example Fields |
|---|---|
| input.kind | "tools/call", "initialize" |
| input.user | { id: "alice", roles: ["dev"] } |
| input.tool.name | "weather.get" |
| input.tool.args | { city: "Berlin" } |
| input.request_ip | "192.0.2.44" |
| input.headers | full HTTP headers |
| input.response (post-eval hook only) | tool result payload |
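
The fields above could be assembled roughly as follows. This is a sketch only: `build_policy_input` and its parameters are hypothetical names, and the document is a plain dict that would be serialized to JSON before evaluation.

```python
from typing import Any, Mapping, Optional


def build_policy_input(
    kind: str,
    user: Mapping[str, Any],
    request_ip: str,
    headers: Mapping[str, str],
    tool_name: Optional[str] = None,
    tool_args: Optional[Mapping[str, Any]] = None,
    response: Any = None,
) -> dict:
    """Assemble the `input` document using the field names from the table above."""
    doc = {
        "kind": kind,
        "user": dict(user),
        "request_ip": request_ip,
        "headers": dict(headers),
    }
    if tool_name is not None:
        # Only tools/call requests carry a tool section.
        doc["tool"] = {"name": tool_name, "args": dict(tool_args or {})}
    if response is not None:
        # Only the post-eval hook attaches the tool result payload.
        doc["response"] = response
    return doc
```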

OPA must return:

{
  "allow": true,
  "patch": { ... },       // optional, for arg/output rewrite
  "reason": "string..."   // optional, logged on deny
}
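
One way the middleware might validate that result before acting on it. `PolicyDecision` and `parse_decision` are hypothetical names, and rejecting malformed results with `ValueError` is an assumption rather than a decided behavior.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class PolicyDecision:
    allow: bool
    patch: Optional[dict] = None   # optional arg/output rewrite
    reason: Optional[str] = None   # optional, logged on deny


def parse_decision(raw: Mapping[str, Any]) -> PolicyDecision:
    """Validate the result shape shown above and convert it to a typed object."""
    allow = raw.get("allow")
    if not isinstance(allow, bool):
        raise ValueError("policy result must contain a boolean 'allow'")
    patch = raw.get("patch")
    if patch is not None and not isinstance(patch, Mapping):
        raise ValueError("'patch' must be an object when present")
    reason = raw.get("reason")
    if reason is not None and not isinstance(reason, str):
        raise ValueError("'reason' must be a string when present")
    return PolicyDecision(
        allow=allow,
        patch=dict(patch) if patch is not None else None,
        reason=reason,
    )
```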

🙋‍♂️ User Stories & Acceptance Criteria

Story 1 – Role-based Tool Allow-List

Scenario: Dev user tries to call admin-only tool
Given policy "devs may NOT call database.backup"
And alice@org has role "dev"
When alice calls tools/call {name:"database.backup"}
Then gateway responds 403 "policy_deny"
And audit log contains "blocked by rego: role_mismatch"
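
The real rule would be written in Rego. The plain-Python stand-in below only illustrates the expected decision for this scenario and could serve as a test oracle. The `RESTRICTED_TOOLS` table and the function name are assumptions.

```python
from typing import Any, Mapping

# Hypothetical policy data: tools that require at least one of the listed roles.
RESTRICTED_TOOLS = {"database.backup": {"admin"}}


def evaluate_tool_allow_list(input_doc: Mapping[str, Any]) -> dict:
    """Return a decision shaped like the OPA result for a tools/call input."""
    tool_name = input_doc.get("tool", {}).get("name")
    required_roles = RESTRICTED_TOOLS.get(tool_name)
    if required_roles is None:
        # Tool is not restricted, so any role may call it.
        return {"allow": True}
    user_roles = set(input_doc.get("user", {}).get("roles", []))
    if user_roles & required_roles:
        return {"allow": True}
    return {"allow": False, "reason": "role_mismatch"}
```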

Story 2 – Argument Bounds Enforcement

Scenario: Temperature conversion only supports -273..10 000 °C
Given policy that denies temp.convert if temp < -273
When request temp = -500
Then gateway responds 422 "validation_failed"
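
A plain-Python stand-in for the bounds rule, again only as a reference for the expected outcome. It assumes the argument is named `temp`, applies the upper bound from the scenario title as well as the lower bound from the Given step, and reuses "validation_failed" as the reason string.

```python
from typing import Any, Mapping

MIN_CELSIUS = -273
MAX_CELSIUS = 10_000


def evaluate_temp_bounds(input_doc: Mapping[str, Any]) -> dict:
    """Deny temp.convert calls whose `temp` argument is missing or out of range."""
    tool = input_doc.get("tool", {})
    if tool.get("name") != "temp.convert":
        return {"allow": True}
    temp = tool.get("args", {}).get("temp")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(temp, bool) or not isinstance(temp, (int, float)):
        return {"allow": False, "reason": "validation_failed"}
    if temp < MIN_CELSIUS or temp > MAX_CELSIUS:
        return {"allow": False, "reason": "validation_failed"}
    return {"allow": True}
```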

Story 3 – Output Sanitization Hook

Scenario: Strip ANSI escapes before response
Given policy patches response text replacing /\u001B\[[0-9;]*m/ with ""
When tool returns "\u001B[31mALERT\u001B[0m"
Then client sees "ALERT"
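
The effect of that patch can be checked with the same pattern in Python. This is a sketch of the expected transformation, not the Rego implementation, and `strip_ansi` is a hypothetical helper name.

```python
import re

# Same pattern as in the scenario: ESC, "[", digits/semicolons, "m" (SGR color codes).
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove ANSI SGR escape sequences from tool output."""
    return ANSI_SGR.sub("", text)
```

Note that the pattern covers only SGR (color/style) sequences. Other escape sequences such as cursor movement would pass through unchanged.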

Story 4 – Hot-Reload Without Restart

Scenario: Update bundle on disk
Given gateway started with --policy-bundle=/policies/bundle.tar.gz
When bundle file is replaced and SIGHUP sent
Then new policies apply to the next request
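
A rough sketch of the SIGHUP path, assuming a hypothetical `PolicyStore` that holds the currently active policies. `signal.SIGHUP` is not available on Windows, and the handler must be installed from the main thread.

```python
import logging
import signal
from typing import Any, Callable

log = logging.getLogger("mcpgateway.policy")  # hypothetical logger name


class PolicyStore:
    """Holds the active policies and swaps them on reload."""

    def __init__(self, load_bundle: Callable[[], Any]) -> None:
        self._load_bundle = load_bundle
        self._policies = load_bundle()

    def reload(self) -> None:
        try:
            new_policies = self._load_bundle()
        except Exception:
            # Keep serving the previous policies if the new bundle is unreadable.
            log.exception("policy reload failed; keeping previous bundle")
            return
        # Rebinding one attribute is atomic in CPython, so a request sees
        # either the old bundle or the new one, never a mix.
        self._policies = new_policies

    def current(self) -> Any:
        return self._policies


def install_sighup_reload(store: PolicyStore) -> Callable[[int, Any], None]:
    """Register a SIGHUP handler that reloads the store; returns the handler."""

    def _handler(signum: int, frame: Any) -> None:
        store.reload()

    signal.signal(signal.SIGHUP, _handler)
    return _handler
```

Because the swap happens before the next request reads `current()`, the "new policies apply to the next request" expectation holds under this design.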

Story 5 – Feature Flag Defaults to Permissive

Scenario: Engine disabled
Given env EXPERIMENTAL_POLICY_ENGINE is unset
When request violates sample policy
Then gateway still allows request
And violation logged at WARN "would_deny"
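
One possible reading of the permissive default, sketched in Python. The assumption here (not confirmed by the issue) is that `EXPERIMENTAL_POLICY_ENGINE=true` switches from log-only to enforcement; `apply_decision` and the logger name are hypothetical.

```python
import logging
import os
from typing import Mapping, Optional

log = logging.getLogger("mcpgateway.policy")  # hypothetical logger name


def apply_decision(allow: bool, reason: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the request may proceed.

    Assumption: enforcement is active only when the flag equals "true";
    otherwise a deny is logged at WARN as "would_deny" and the request passes.
    """
    env = os.environ if env is None else env
    enforcing = env.get("EXPERIMENTAL_POLICY_ENGINE", "").strip().lower() == "true"
    if allow:
        return True
    if enforcing:
        return False
    log.warning("would_deny: %s", reason)
    return True
```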

📐 Design Sketch (Mermaid)

graph TD
    subgraph Request Path
        A[Inbound JSON-RPC] --> P[Policy Middleware]
        P -- allow --> H[Handler]
        P -- deny  --> D[403/422]
    end
    subgraph Response Path
        H --> P2[Policy Middleware - PostHook]
        P2 --> O[Outbound JSON-RPC]
    end
    class P,P2 opa;
    classDef opa fill:#ffd,stroke:#333,stroke-width:1px;

Evaluation Budget: < 2 ms / decision (benchmark on Ryzen 7)
Bundle Format: OPA bundle with policy.rego + data.json
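
A minimal sketch of reading such a bundle from bytes with the standard library, including the SHA256 verification mentioned for the loader. `load_bundle` is a hypothetical name, and the sketch assumes a gzip-compressed tarball containing exactly the two files named above; real OPA bundles may carry more files and a manifest.

```python
import hashlib
import io
import json
import posixpath
import tarfile
from typing import Optional, Tuple


def load_bundle(raw: bytes, expected_sha256: Optional[str] = None) -> Tuple[str, dict]:
    """Return (rego_source, data) from a .tar.gz bundle held in memory."""
    if expected_sha256 is not None:
        actual = hashlib.sha256(raw).hexdigest()
        if actual != expected_sha256.lower():
            raise ValueError("bundle SHA256 mismatch")
    rego_source = None
    data: dict = {}
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Match on the file name only, so "./policy.rego" and "/policy.rego" both work.
            base = posixpath.basename(member.name)
            if base == "policy.rego":
                rego_source = tar.extractfile(member).read().decode("utf-8")
            elif base == "data.json":
                data = json.loads(tar.extractfile(member).read())
    if rego_source is None:
        raise ValueError("bundle has no policy.rego")
    return rego_source, data
```

The files are read in memory rather than extracted to disk, which avoids path-traversal concerns with untrusted archives.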


📂 Component Matrix

| Path / Component | Status | Purpose |
|---|---|---|
| mcpgateway/policy_middleware.py | NEW | Pre/post hooks, JSON marshalling, deny/patch logic |
| mcpgateway/policy_loader.py | NEW | Load bundle (disk, HTTP), verify SHA256, hot-reload |
| mcpgateway/config.py | UPDATE | POLICY_BUNDLE_PATH, POLICY_BUNDLE_URL, POLICY_POLL_SEC, POLICY_ENABLED |
| scripts/example_policy/ | NEW | Sample Rego & bundle build script |
| helm/values.yaml | UPDATE | policy.enabled, policy.bundleUrl, policy.pollSeconds |
| CI: .github/workflows/policy.yml | NEW | Unit tests (rego eval via opa exec), bundle lint |
| Docs: docs/security/policy.md | NEW | How to write, test, bundle, and hot-reload policies |

📋 Global Acceptance Checklist

  • Middleware denies/patches per Stories 1-3.
  • Hot-reload via SIGHUP or poll loop works (Story 4).
  • Flag off → noop; flag on with missing bundle → fails open, logs error.
  • Performance test: 99th pct decision < 2 ms.
  • k8s Helm values wire OPA bundle ConfigMap or remote URL.
  • CI runs opa test on sample policies.
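
The 99th-percentile check could be measured along these lines. This is a sketch with hypothetical helper names: `decide` stands in for whatever callable performs one policy evaluation, and the percentile uses the nearest-rank definition.

```python
import time
from typing import Any, Callable, Iterable, List, Sequence


def measure_decisions(decide: Callable[[Any], Any], inputs: Iterable[Any]) -> List[float]:
    """Time each decision and return the latencies in milliseconds."""
    latencies_ms = []
    for doc in inputs:
        start = time.perf_counter()
        decide(doc)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def p99_ms(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 99th percentile of the samples."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.99 * n, avoiding floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

A benchmark would then assert `p99_ms(measure_decisions(decide, inputs)) < 2.0` on the reference hardware.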

🔄 Roll-Out Plan

  1. Phase 0 – Spike: embed [github.com/open-policy-agent/opa/rego] with hard-coded policy.
  2. Phase 1 – Loader: implement bundle loader + SHA256 check + hot-reload.
  3. Phase 2 – Middleware: wire pre & post hooks, patch/deny logic, metrics (policy_eval_ms).
  4. Phase 3 – Helm & Docs: add chart values, sample ConfigMap, authoring guide.
  5. Phase 4 – Benchmarks & CI: k6 load test at 10k RPS, opa test, size impact audit.
  6. Merge to develop, behind EXPERIMENTAL_POLICY_ENGINE env flag.

Labels: devops, enhancement, python, security, triage
