Methodology

Validation Contracts

Declarative, per-client rules for what AI output must satisfy before it ships. Policy-as-code for AI governance.

Polylogic AI Research | Polylogic AI | April 2026

Most AI validation is hardcoded. Every client gets the same checks, the same thresholds, the same pass/fail logic. When a new client has different requirements, someone edits source code. Validation contracts replace this pattern with declarative policy files: one document per client that specifies what “correct” means for their product. The validators become generic engines. The contracts tell them what to enforce. This is not a new idea. It is the convergence of data contracts, policy-as-code, and constitutional AI applied to production AI systems.

The Problem: Validation That Lives in Code

Consider a company that ships AI agents to multiple clients. Each agent must identify correctly, reference the right business, avoid leaking information from other clients, and respond accurately to domain-specific questions. The validation logic for all of this typically lives in the same codebase as the agents themselves: hardcoded arrays of client names for cross-contamination checks, string literals for identity verification, fixed URL templates for site validation.

This works for the first client. By the third, the cracks are visible. A photography client's agent should identify with terms like “photography” and “video.” An education client's agent should reference specific course numbers and building locations. A restaurant client needs menu accuracy. These are not variations on the same check. They are fundamentally different definitions of correctness, encoded as conditional branches in the same function.

The failure pattern is predictable. A new client is onboarded. The developer remembers to update the cross-contamination list but forgets to update the URL template. The validator checks the wrong endpoint, reports a false failure, and the team learns to distrust the validation system. Once validation becomes something people work around instead of rely on, it has failed its purpose.

Three Precedents

The separation of policy from enforcement is well-established in adjacent domains. Three precedents are directly applicable to AI validation.

1. Data Contracts (Great Expectations)

Great Expectations introduced expectation suites: JSON files that declare what valid data looks like. Each suite is a named collection of rules (row counts must fall within a range, email columns must not be null, dates must parse). The validators are generic functions. The suite tells them what to check. Adding a new data source means adding a new suite file, not editing the validator. The Open Data Contract Standard (ODCS v3) extended this further, bundling schema definitions, quality rules, ownership, and SLAs into a single YAML document per data product.
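The pattern can be sketched in a few lines: generic check functions that know nothing about any particular dataset, driven entirely by a declarative suite. The rule names and suite shape below are illustrative, not Great Expectations' actual API.

```python
# Generic validators driven by a declarative suite. The check names
# ("row_count_between", "column_not_null") are illustrative, not
# Great Expectations' real expectation names.

CHECKS = {
    "row_count_between": lambda rows, lo, hi: lo <= len(rows) <= hi,
    "column_not_null": lambda rows, col: all(r.get(col) is not None for r in rows),
}

def validate(rows, suite):
    """Run every rule in the suite; return (rule_type, passed) pairs."""
    results = []
    for rule in suite["rules"]:
        check = CHECKS[rule["type"]]
        results.append((rule["type"], check(rows, *rule["args"])))
    return results

suite = {"rules": [
    {"type": "row_count_between", "args": [1, 100]},
    {"type": "column_not_null", "args": ["email"]},
]}
rows = [{"email": "a@example.com"}, {"email": None}]
print(validate(rows, suite))  # second rule fails on the null email
```

Adding a new data source means writing a new `suite` dict (or file), never touching `validate`.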

2. Policy Engines (Open Policy Agent)

Open Policy Agent separates policy decisions from application logic entirely. The application sends structured data to the policy engine, which evaluates it against declarative rules and returns allow or deny. The engine knows nothing about the application. The application knows nothing about the rules. They communicate through a data contract. AWS IAM applies the same principle with attribute-based access control: a policy document can specify conditions based on client tier, validation score, or deployment stage without any knowledge of who the client actually is.
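The separation reduces to a small interface: the application hands structured input to an engine, which evaluates declarative conditions and returns allow or deny. The rule shape below is a toy illustration, not OPA's Rego language or AWS's IAM policy grammar.

```python
# Sketch of the policy-engine separation: the engine evaluates
# declarative conditions against structured input and returns a
# decision. The condition format is illustrative, not Rego.

def evaluate(policy, inp):
    """Allow only if every condition in the policy holds for the input."""
    for field, expected in policy["conditions"].items():
        if inp.get(field) != expected:
            return "deny"
    return "allow"

# ABAC-style policy: conditions on attributes, no client names anywhere.
policy = {"conditions": {"tier": "founding", "stage": "production"}}
print(evaluate(policy, {"tier": "founding", "stage": "production"}))  # allow
print(evaluate(policy, {"tier": "standard", "stage": "production"}))  # deny
```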

3. Constitutional AI (Anthropic)

Anthropic's Constitutional AI externalizes evaluation principles. Rather than embedding behavioral constraints in training data alone, the model critiques its own output against a written set of principles. The constitution is a document, not code. It can be versioned, audited, and modified without retraining. Applied to validation, this means the criteria an AI agent is judged against can be a configurable document rather than a hardcoded prompt string. Different clients can have different constitutions without different codebases.

What a Validation Contract Looks Like

A validation contract is a single file that declares what “correct” means for one client's product. It does not contain validation logic. It contains declarations: which products exist, what URLs they live at, what identity terms the agent must use, what questions it must answer accurately, and what severity level each check carries.

The contract describes products, not tests. A site product declares its URL and which checks apply. An agent product declares identity keywords, minimum knowledge thresholds, and core questions with expected answer fragments. A reveal product declares its URL and link integrity requirements. Each check has a severity: critical checks block deployment, warnings surface in reports but do not gate.
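A contract in this model might look something like the following. All field names, URLs, and values are hypothetical; the source describes the structure of a contract, not a concrete schema.

```yaml
# Hypothetical validation contract for one client.
client: acme-photography
products:
  site:
    url: https://acme.example.com
    checks:
      http_status:
        expect: 200
        severity: critical      # blocks deployment
  agent:
    identity_keywords: [photography, video]
    min_knowledge_score: 0.85
    core_questions:
      - question: "What services do you offer?"
        expect_fragment: "wedding photography"
        severity: critical
  reveal:
    url: https://acme.example.com/reveal
    checks:
      link_integrity:
        severity: warning       # surfaces in reports, does not gate
```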

The contract also supports custom checks for requirements that no standard validator covers. An education client might need a FERPA compliance check that scans stored conversations for student PII patterns. A lead-generation client might need to verify that pipeline configuration matches approved revenue bands. These are declared in the contract and dispatched to handler functions at runtime. The contract language is extensible without modifying the core engine.
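One way to dispatch contract-declared custom checks to handlers at runtime is a registry keyed by check name. The check name, handler, and parameter shape below are hypothetical examples, not a prescribed interface.

```python
# Custom checks declared in the contract, dispatched to registered
# handler functions at runtime. The "ferpa_pii_scan" check and its
# parameters are hypothetical.

HANDLERS = {}

def handler(name):
    """Register a function as the handler for a named custom check."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register

@handler("ferpa_pii_scan")
def ferpa_pii_scan(params, conversations):
    # Pass only if no stored conversation contains a contract-listed pattern.
    patterns = params["pii_patterns"]
    return all(not any(p in text for p in patterns) for text in conversations)

def run_custom_checks(contract, conversations):
    return {
        c["name"]: HANDLERS[c["name"]](c.get("params", {}), conversations)
        for c in contract.get("custom_checks", [])
    }

contract = {"custom_checks": [
    {"name": "ferpa_pii_scan", "params": {"pii_patterns": ["student ID"]}},
]}
print(run_custom_checks(contract, ["Your student ID is 12345."]))  # check fails
```

A new check type means registering one handler and declaring it in the contract; the engine's dispatch loop never changes.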

Inheritance and Overrides

Writing a complete policy for every client is the same duplication problem that hardcoding creates, just in a different file format. The solution is layered defaults.

A base policy defines every rule with a sensible default: minimum composite scores for research, required HTTP status codes for sites, severity levels for common failure modes. A client-specific contract inherits from this base and overrides only what differs. The merge is a deep merge at runtime, following the same specificity model as CSS cascading or Kubernetes namespace defaults.
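The merge itself is small. A minimal sketch, assuming contracts are loaded as nested dicts: client values win over base defaults, and nested dicts merge key by key rather than replacing wholesale. A production version would also need a policy for lists.

```python
# Recursive deep merge: override values win, nested dicts merge key
# by key (the same specificity idea as CSS cascading).

def deep_merge(base, override):
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

base = {"site": {"expect_status": 200, "severity": "critical"},
        "research": {"min_score": 0.8}}
client = {"research": {"min_score": 0.9}}  # override one threshold only
print(deep_merge(base, client))
```

The client file carries two lines; everything else arrives from the base.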

A new client needs zero configuration to start. They inherit every default. As their product matures and specific requirements surface, overrides are added incrementally. Ten clients with two or three overrides each means ten small files, not ten copies of the full policy.

This pattern also handles tier-based differentiation naturally. A founding client on a locked rate might have auto-approved reveals. A standard-tier client might require manual approval for the same action. The approval workflow is an attribute of the contract, not a branch in the deployment code.

From Code to Configuration

The shift from hardcoded validation to validation contracts is a shift in what counts as a deployable artifact. In the hardcoded model, onboarding a new client means editing source files: adding names to cross-contamination arrays, updating URL templates, adjusting identity checks. Each edit is a code change that requires a deploy. Each deploy carries risk.

In the contract model, onboarding a new client means adding a file. The validators are already deployed. They read contracts at runtime and execute accordingly. The cross-contamination list is built dynamically by scanning all contract files and collecting every client name. No static array to maintain. No string literals to remember.
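Building the cross-contamination list dynamically can be as simple as a scan over the loaded contracts. The field names below (`client`, `client_display_name`, `identity_keywords`) are illustrative; in practice the contracts would be read from per-client files at startup.

```python
# Derive the cross-contamination terms for one client by collecting
# every *other* client's names and keywords from loaded contracts.
# No static array to maintain.

def other_client_terms(contracts, current_client):
    """Terms that must not appear in the current client's agent output."""
    terms = set()
    for c in contracts:
        if c["client"] == current_client:
            continue
        terms.add(c["client_display_name"])
        terms.update(c.get("identity_keywords", []))
    return terms

contracts = [
    {"client": "acme", "client_display_name": "Acme Photography",
     "identity_keywords": ["photography", "video"]},
    {"client": "unidemo", "client_display_name": "Unidemo College",
     "identity_keywords": ["courses", "campus"]},
]
print(other_client_terms(contracts, "acme"))
```

Onboarding a new client automatically extends every other client's contamination check the moment their contract file lands.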

The contracts themselves can be validated by a schema, catching typos and structural errors before they reach the engine. A malformed contract that silently disables a critical check is a worse failure than no contract at all. Schema validation as a pre-commit hook closes this gap.
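A structural lint in the spirit of that pre-commit hook can be very small. A real setup would use a full JSON Schema validator; the sketch below, with hypothetical required keys and severity names, only catches missing fields and unknown severities.

```python
# Tiny structural check for contract files: required top-level keys
# present, every check's severity is a known value. Key and severity
# names are hypothetical.

REQUIRED = {"client", "products"}
SEVERITIES = {"critical", "warning"}

def lint_contract(contract):
    errors = [f"missing key: {k}" for k in REQUIRED - contract.keys()]
    for name, product in contract.get("products", {}).items():
        for check, spec in product.get("checks", {}).items():
            sev = spec.get("severity")
            if sev not in SEVERITIES:
                errors.append(f"{name}.{check}: bad severity {sev!r}")
    return errors

good = {"client": "acme", "products": {
    "site": {"checks": {"http_status": {"severity": "critical"}}}}}
bad = {"products": {
    "site": {"checks": {"http_status": {"severity": "blocker"}}}}}
print(lint_contract(good))  # []
print(lint_contract(bad))   # missing client, unknown severity
```

Run at commit time, a typo like `severity: blocker` fails loudly instead of silently disabling a gate.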

This is the broader trajectory of AI governance. As organizations scale from one model deployment to many, the rules governing those deployments must move from imperative code to declarative policy. The question is not whether to externalize validation logic. The question is whether you do it deliberately, drawing on established patterns from data engineering and policy infrastructure, or whether you arrive at the same place through accumulated technical debt. The destination is the same. The cost of getting there is not.

Sources

  1. Great Expectations. Expectation Suites. Great Expectations Documentation.
  2. Open Data Contract Standard. ODCS v3 Specification. Bitol Foundation.
  3. Open Policy Agent. Policy Language and Architecture. OPA Documentation.
  4. Bai, Y., Kadavath, S., Kundu, S., et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv preprint arXiv:2212.08073.
  5. Amazon Web Services. Attribute-Based Access Control (ABAC). AWS IAM User Guide.