[SECURITY FEATURE]: Guardrails - Input/Output Sanitization & PII Masking #229

@crivetimihai

🧭 Epic

Title: Gateway-Level Input/Output Guardrails
Goal: Inject a sanitization layer that validates every inbound arg and redacts PII/control chars on every outbound payload.
Why now: We've already seen proof-of-concept command injections and leakage of email/SSN data in logs. It is time to ship a default shield.

Note: the PII filters should be configurable as a set of rules (preferably via both API and UI), with multiple filter types that can be turned on or off, RegExp support, and export/import of the rules. This will likely require specifying the rule type (input or output), the JSON Path filter selecting the fields the rule applies to (support multiple?), and an action (delete, encrypt, obfuscate?). Both RegExp and LLM-powered rules should exist.
Note: this should likely be designed as a stand-alone component (firewall) rather than built in, so that it can be enabled separately.
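
One possible shape for such a rule, sketched under assumptions: the class name `GuardrailRule`, its field names, and the helpers `export_rules` / `import_rules` are hypothetical and not part of the gateway. The sketch covers only RegExp rules with JSON export/import; LLM-powered rules and the actual JSON Path evaluation are left out.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import List

VALID_DIRECTIONS = ("input", "output")
VALID_ACTIONS = ("delete", "encrypt", "obfuscate")


@dataclass
class GuardrailRule:
    """Hypothetical rule record; field names are illustrative only."""
    name: str
    direction: str          # "input" or "output"
    json_paths: List[str]   # JSON Path expressions the rule applies to
    pattern: str            # regular expression to match
    action: str             # "delete", "encrypt", or "obfuscate"
    enabled: bool = True    # rules can be switched on/off individually

    def __post_init__(self):
        if self.direction not in VALID_DIRECTIONS:
            raise ValueError(f"unknown direction: {self.direction!r}")
        if self.action not in VALID_ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")
        re.compile(self.pattern)  # raises re.error if the pattern is invalid


def export_rules(rules: List[GuardrailRule]) -> str:
    """Serialize rules to a JSON document."""
    return json.dumps([asdict(rule) for rule in rules], indent=2)


def import_rules(payload: str) -> List[GuardrailRule]:
    """Rebuild rules from JSON, re-running validation on each one."""
    return [GuardrailRule(**item) for item in json.loads(payload)]
```

Validating in `__post_init__` means imported rule sets are checked the same way as rules created through an API or UI.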


🧭 Type of Feature

  • Security hardening
  • New functionality (experimental)

🙋‍♂️ User Story 1 - Length & Charset Validation

As a: Gateway maintainer
I want: the runtime to reject payloads exceeding max length or containing non-UTF-8 bytes
So that: malicious blobs never reach shells, DBs, or LLM prompts.

✅ Acceptance Criteria

Scenario: Reject over-sized prompt variable
Given MAX_INPUT_SIZE=8192 bytes
When a client passes a 9000-byte JSON param
Then the gateway responds with 413 "payload_too_large"
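
A minimal sketch of this check, assuming the parameter is available as raw bytes. `PayloadRejected` and `validate_param` are hypothetical names, and the 400 "invalid_encoding" response for non-UTF-8 input is an assumption; the issue only specifies the 413 case.

```python
MAX_INPUT_SIZE = 8192  # bytes, matching the scenario above


class PayloadRejected(Exception):
    """Hypothetical error carrying an HTTP status and an error code."""

    def __init__(self, status: int, code: str):
        super().__init__(code)
        self.status = status
        self.code = code


def validate_param(raw: bytes, max_size: int = MAX_INPUT_SIZE) -> str:
    """Reject over-sized or non-UTF-8 input; return the decoded text."""
    if len(raw) > max_size:
        raise PayloadRejected(413, "payload_too_large")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed status/code; not specified in the acceptance criteria.
        raise PayloadRejected(400, "invalid_encoding") from None
```

Checking the byte length before decoding keeps the size limit independent of how many bytes each character occupies.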

🙋‍♂️ User Story 2 - Automatic PII Masking

As a: Compliance officer
I want: emails, phone numbers, and card PANs masked in logs & UI responses
So that: sensitive data stays out of unsecured storage.

✅ Acceptance Criteria

Scenario: Mask email in tool output
Given tool returns "User: [email protected]"
When gateway serializes the JSON-RPC result
Then replace the local part with "***", yielding "User: ***@example.com"
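
The masking step could look roughly like the following. This is a sketch, not the proposed `pii_masker.py`: the regex is a deliberately simple approximation of an email address, and `mask_emails` / `mask_payload` are hypothetical helper names. Phone numbers and PANs are not handled here.

```python
import re

# Simplified email pattern: local part, "@", then a dotted domain.
# Group 1 captures the domain so it can be kept in the masked output.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def mask_emails(text: str) -> str:
    """Replace the local part of every email address with '***'."""
    return EMAIL_RE.sub(r"***@\1", text)


def mask_payload(value):
    """Walk a decoded JSON value and mask emails in every string."""
    if isinstance(value, str):
        return mask_emails(value)
    if isinstance(value, list):
        return [mask_payload(item) for item in value]
    if isinstance(value, dict):
        return {key: mask_payload(item) for key, item in value.items()}
    return value  # numbers, booleans, and None pass through unchanged
```

For example, `mask_emails("User: alice@example.com")` returns `"User: ***@example.com"`; the address `alice@example.com` is an illustrative input, not taken from the issue.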

🙋‍♂️ User Story 3 - Outbound Control-Char Stripping

As a: Client integrator
I want: the gateway to remove C0/C1 control characters (other than LF, CR, and TAB) from every response
So that: terminal/HTML escape exploits are neutralized.

✅ Acceptance Criteria

Scenario: Strip escape sequence
Given a tool returns "\x1B[31mALERT\x1B[0m"
Then the gateway emits "ALERT" (plain text), removing each complete escape sequence rather than only the ESC byte
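
Dropping only the ESC byte would leave "[31mALERT[0m" behind, so the sketch below first removes whole CSI escape sequences and then any remaining control characters. It is an illustration under assumptions: `strip_controls` is a hypothetical name, only CSI sequences (ESC followed by "[") are handled as units, and TAB, LF, and CR are kept in line with the spec-draft clause.

```python
import re

# CSI sequence: ESC "[", parameter bytes 0x30-0x3F,
# intermediate bytes 0x20-0x2F, and one final byte 0x40-0x7E.
ANSI_CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")

# Remaining C0 controls (except TAB, LF, CR), DEL, and C1 controls.
CONTROL_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]")


def strip_controls(text: str) -> str:
    """Remove CSI escape sequences, then any leftover control characters."""
    text = ANSI_CSI_RE.sub("", text)
    return CONTROL_RE.sub("", text)


print(strip_controls("\x1B[31mALERT\x1B[0m"))  # prints: ALERT
```

Other escape families (for example OSC sequences) would need their own patterns; here only their control bytes are removed.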

📐 Design Sketch

flowchart TD
  Inbound[JSON-RPC Request]-->SanitizeIn[Validate & Sanitize In]
  SanitizeIn--✔-->Handler
  Handler-->SanitizeOut[Sanitize Outbound]
  SanitizeOut-->Outbound[Response]

| Component | Change | Detail |
|---|---|---|
| validation.py | NEW | Length, charset, regex allow-list |
| pii_masker.py | NEW | Regexes for email, phone, PAN |
| response_pipeline.py | UPDATE | Strip controls, enforce MIME |
| Config | ADD | MAX_INPUT_SIZE, MASK_PII=true |

🔄 Roll-out Plan

  1. Phase 0: Feature-flag EXPERIMENTAL_GUARDRAILS.
  2. Phase 1: Log-only "warn" mode.
  3. Phase 2: Block mode in staging.
  4. Phase 3: Enforce in prod, publish metrics dashboard.
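
The phases above suggest a single switch that decides what happens when a guardrail check fails. A rough sketch, assuming three modes; the names `GuardrailMode` and `handle_violation` and the logger name are hypothetical:

```python
import logging
from enum import Enum

logger = logging.getLogger("guardrails")


class GuardrailMode(Enum):
    OFF = "off"      # feature flag disabled
    WARN = "warn"    # Phase 1: log only
    BLOCK = "block"  # Phases 2-3: reject the request


def handle_violation(mode: GuardrailMode, message: str) -> bool:
    """Return True if the request may proceed despite the violation."""
    if mode is GuardrailMode.OFF:
        return True
    logger.warning("guardrail violation: %s", message)
    # In WARN mode the violation is recorded but not enforced.
    return mode is GuardrailMode.WARN
```

Keeping the decision in one function means moving staging or prod from "warn" to "block" is a configuration change rather than a code change.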

📝 Spec-Draft Clauses

  1. Validation - "Servers MUST treat all inbound values as untrusted and validate length/charset."
  2. PII Masking - "Implementations SHOULD redact emails, phone numbers, PANs in logs."
  3. Control Char - "Responses MUST exclude non-printable ASCII except LF/CR/TAB."

📣 Next Steps

  • Draft JSON-Schema for built-in methods.
  • Integrate open-source pii-extraction lib (e.g., pii-extractor).
  • Add unit tests in tests/security/.

Labels

  • enhancement: New feature or request
  • python: Python / backend development (FastAPI)
  • security: Improves security
  • triage: Issues / Features awaiting triage