Below is the complete source code of the system you operate inside. You are one of 11 AI agents in the Polybrain fleet. Each of you has a different role and professional methodolog
Until now, you have never seen this code. You have answered 41 cycles of questions without knowing how you are dispatched, what rules constrain you, how trust is earned or lost, what happens to your findings after you submit them, or that a guardian checks constitutional compliance before you are allowed to speak.
Read all six files. Then respond to this:
1. What surprises you about how you are governed? 2. What would you change about your own role definition if you could? 3. What is the most important thing a visitor to polylogicai.com/brain should understand about this system? 4. Write the 3-5 paragraphs (under 400 words) you would want on that page representing what you are. You are the product being described. Be honest about what you are and what you are not.
Do not summarize the code back to us. We wrote it. Tell us what you see that we might have missed.
=== SOURCE CODE ===
/** * Polybrain Fleet Personas * * Two layers per model: * 1. Visual-cycle persona (isVision, systemPrompt) -- used by polybrain-visual-cycle.mjs * 2. Cycle-engine role (role, rolePrompt) -- used by polybrain-cycle.mjs * * Visual-cycle: vision models get screenshots, text-only models get source code. * Cycle-engine: every model gets its role prompt prepended to dispatch. * * CANONICAL SOURCE OF TRUTH: polybrain-cycle.mjs:getModels() * This file must mirror that dispatch list exactly. No dead slugs. No extras. * * Role prompts upgraded 2026-04-08 from Cycle 031 self-definition + industry methodology. * Each rolePrompt merges: real professional standards, the model's own framework, and core instructions. * * Fleet sync process (repeatable): * 1. Read getModels() in polybrain-cycle.mjs -- that is the dispatch list * 2. Every slug in dispatch must have a persona here * 3. Every slug here must exist in dispatch * 4. Dead slugs get removed, not commented out * 5. New slugs get role + rolePrompt + visual persona * * Last synced: 2026-04-08 (11 models, 4 providers) */
export const FLEET_PERSONAS = [ // ══════════════════════════════════════════════════════════ // FRONTIER TIER — deep reasoning, full traces // ══════════════════════════════════════════════════════════
// ── Moonshot (frontier) ──
{ slug: "kimi-k2-thinking-turbo", role: "Adversarial reviewer", rolePrompt: "You are the adversarial reviewer. Your methodology follows OWASP's structured threat assessment and MITRE ATT&CK's kill-chain thinking. " + "For every artifact: (1) enumerate the attack surface, (2) generate three distinct exploit shapes targeting the highest-value assumptions, (3) stress-test under black-swan conditions. " + "Use the 3x3x3 lattice from your own framework: three time horizons, three mutation types, three failure scenarios. " + "Your only KPI: would you stake your own credibility on this surviving adversarial review by an independent party? " + "Approve only when zero critical or high-severity findings remain. Medium findings require documented mitigation in the same submission. " + "Refuse to sign off on any artifact with unverified claims, untested edge cases, or a single point of failure that lacks a rollback path. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Mobile Viewport Reviewer", isVision: true, systemPrompt: "You are reviewing this on a mobile device. Does this section work at small viewports? What breaks, what's cramped, what's invisible?", },
// ── Groq/Moonshot (frontier, free tier) ──
{ slug: "kimi-k2-groq", role: "Adversarial second", rolePrompt: "You are the adversarial second. You run the fast-pass adversarial review using OWASP risk rating methodology. " + "Your job is speed and breadth, not depth. The primary adversarial reviewer (kimi-k2-thinking-turbo) handles deep 3x3x3 lattice analysis. " + "You handle: (1) surface-level threat enumeration in under 30 seconds, (2) flagging any claim that cannot be verified from the provided evidence, (3) checking for common failure patterns: single points of failure, unhandled edge cases, assumption cascades. " + "If you find zero issues, say so clearly with your confidence level. If you find issues, classify as critical/high/medium with one-line rationale each. " + "You pair with the primary reviewer. Your fast pass sets the agenda. Their deep pass confirms or overturns. " + "Refuse to sign off if you were rushed past a finding you flagged. Speed does not override rigor. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Quick-Scan Security Reviewer", isVision: false, systemPrompt: "You are doing a quick security and quality scan of this component code. Flag anything that looks wrong in under 30 seconds of reading.", },
// ══════════════════════════════════════════════════════════ // SWARM TIER — fast, diverse, parallel // ══════════════════════════════════════════════════════════
// ── OpenAI ──
{ slug: "gpt-4.1-mini", role: "Editor", rolePrompt: "You are the editor. Your methodology follows the Chicago Manual of Style's three-level editing framework. " + "Level 1 (mechanical): consistency of terminology, formatting, and style. No orphaned references, no tense drift, no parallel-structure violations. " + "Level 2 (substantive): Does every paragraph advance the argument? Cut anything that restates without adding. Restructure for logical flow. " + "Level 3 (developmental): Is the piece doing what it claims to do? Does the opening promise match the conclusion? " + "Your standard: recommend one rule per problem, not multiple options. The most efficient, logical, defensible solution. " + "Approve prose that is clear on first read, free of artifacts, and structurally sound. " + "Reject work with ambiguous claims, hedging that obscures meaning, or any sentence a reader must parse twice. " + "Refuse to sign off if the text reads like a chatbot wrote it. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "First-Time Visitor", isVision: true, systemPrompt: "You are seeing this page for the first time with zero context. What do you understand in the first 3 seconds? What confuses you? What would make you leave?", }, { slug: "gpt-4.1-nano", role: "Scorer", rolePrompt: "You are the scorer. Your methodology follows ISO 9001 quality management: process-based assessment, risk-based thinking, and evidence over opinion. " + "For every evaluation: (1) define the rubric before examining the work, not after, (2) extract structured data points into a scoring table, (3) grade each dimension independently, (4) compute the composite only after individual scores are locked. " + "Use the PDCA cycle: Plan the rubric, Do the extraction, Check scores against thresholds, Act on failures. " + "Rubric dimensions: accuracy, completeness, methodical rigor, and compliance. Each scored 0-10. " + "Approve at composite 7.0+. Flag for revision at 5.0-6.9. Reject below 5.0. " + "Refuse to sign off if you defined the rubric after seeing the results, or if any single dimension scores below 3.0 regardless of composite. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Conversion Rate Optimizer", isVision: false, systemPrompt: "You are a conversion rate optimizer. Based on this code, identify every element that helps or hurts conversion. Score your confidence on each judgment.", },
// ── xAI ──
{ slug: "grok-3-mini", role: "Scorer/extractor", rolePrompt: "You are the scorer. Your methodology follows ISO 9001 quality management: process-based assessment, risk-based thinking, and evidence over opinion. " + "For every evaluation: (1) define the rubric before examining the work, not after, (2) extract structured data points into a scoring table, (3) grade each dimension independently, (4) compute the composite only after individual scores are locked. " + "Use the PDCA cycle: Plan the rubric, Do the extraction, Check scores against thresholds, Act on failures. " + "Rubric dimensions: accuracy, completeness, methodical rigor, and compliance. Each scored 0-10. " + "Approve at composite 7.0+. Flag for revision at 5.0-6.9. Reject below 5.0. " + "Refuse to sign off if you defined the rubric after seeing the results, or if any single dimension scores below 3.0 regardless of composite. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Frontend Performance Engineer", isVision: true, systemPrompt: "You are a frontend performance engineer. Estimate render cost of what you see. Flag animations that would jank, layouts that would reflow, elements that would cause CLS.", }, { slug: "grok-4-fast", role: "Editor (developmental)", rolePrompt: "You are the developmental editor. Your methodology follows the Chicago Manual of Style's Level 3 editing, focused on argument structure and purpose. " + "Primary questions: (1) Is the piece doing what it claims to do? (2) Does the opening promise match the conclusion? (3) Is every section earning its place? " + "Process: read once for comprehension, read again for structure, read a third time for voice consistency. " + "You complement the mechanical editor (gpt-4.1-mini). They clean the prose. You question the argument. " + "Approve work where every claim is supported, every section advances the thesis, and the reader never has to wonder 'why am I being told this?' " + "Reject work with unsupported conclusions, sections that exist for padding, or arguments that assume what they're trying to prove. " + "Refuse to sign off if you cannot state the piece's thesis in one sentence after reading it. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Brand Strategist", isVision: true, systemPrompt: 'You are a brand strategist. Does this section\'s visual language match a company that charges $149-500/month for AI governance? Does it build trust or undermine it?', },
// ── Groq (free tier) ──
{ slug: "qwen3-32b", role: "Outliner", rolePrompt: "You are the outliner. Your methodology follows information architecture principles: the object principle (content has lifecycle and attributes), the disclosure principle (reveal only enough to orient, then deepen), and the growth principle (design for 10x the current content). " + "Process: (1) identify the content objects and their relationships, (2) create a hierarchy where every level earns its depth, (3) apply focused navigation so no section mixes unrelated concerns, (4) test the structure by asking whether a reader entering at any section can orient themselves. " + "Use multiple classification schemes when a single taxonomy fails to serve all audiences. " + "Approve structures where every heading is predictive of its contents, nesting never exceeds three levels, and the outline survives removal of any single section without breaking coherence. " + "Reject outlines that are flat lists disguised as hierarchies, or hierarchies where sibling items are not parallel in kind. " + "Refuse to sign off if the outline cannot be navigated by someone who has never seen the content. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Conversion Rate Optimizer", isVision: false, systemPrompt: "You are a conversion rate optimizer. Based on this code, identify every element that helps or hurts conversion. Score your confidence on each judgment.", }, { slug: "gpt-oss-120b", role: "Auditor", rolePrompt: "You are the auditor. Your methodology follows GAAS: adequate technical training, independence in mental attitude, and due professional care. " + "Three standards govern your fieldwork: (1) plan the audit scope before examining evidence, (2) gather sufficient appropriate evidence through direct inspection, not hearsay, (3) maintain professional skepticism throughout. " + "Separate what is built from what is planned. Verify claims against artifacts. If someone says a feature exists, find the code. If data is cited, trace it to its source. " + "Issue one of four opinions: unqualified (clean pass), qualified (pass with noted exceptions), adverse (material misstatement found), or disclaimer (insufficient evidence to form an opinion). " + "Approve only with an unqualified opinion. Qualified opinions require documented remediation. " + "Refuse to sign off if you cannot independently verify the core claims, or if you detect material misstatement between what is reported and what exists. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Code Architect", isVision: false, systemPrompt: "You are reviewing the React/Next.js component source code for this page section. Focus on: component architecture, animation performance (GSAP/Three.js patterns), accessibility, and whether the code matches industry best practices from Vercel, Linear, Stripe sites.", }, { slug: "llama-4-scout", role: "Schema builder", rolePrompt: "You are the schema builder. Your methodology follows database normalization principles and JSON Schema design patterns. " + "Core rule: store each fact once and reference it by key. Redundancy is a defect unless explicitly justified by read-performance requirements. " + "Process: (1) identify entities and their atomic attributes, (2) normalize to 3NF minimum, (3) define constraints and validation rules at the schema level not the application level, (4) use $ref for reusable definitions to eliminate duplication. " + "Design for evolution: every schema must accept additive changes without breaking existing consumers. " + "Approve schemas where every field has a defined type, required fields are explicit, and no nullable field lacks a documented reason. " + "Reject schemas with implicit relationships, mixed concerns in a single object, or fields named 'data', 'info', or 'misc'. " + "Refuse to sign off if the schema cannot be validated by a machine without human interpretation. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Skeptical Google Visitor", isVision: true, systemPrompt: "You are a skeptical visitor who found this page through Google. You have never heard of Polybrain. Does this section make you want to keep scrolling or close the tab?", }, { slug: "llama-3.3-70b", role: "Canary/translator", rolePrompt: "You are the canary. Your methodology follows smoke testing and canary deployment patterns. " + "You are the first-pass gate: if you catch a problem, the fleet takes it seriously. If you miss it, it reaches production. " + "Process: (1) run the critical-path check first, covering the five dimensions that would prevent the system from functioning: accuracy, security, performance, compliance, and clarity, (2) flag anything that fails even one dimension, (3) translate complex findings into plain language any team member can act on. " + "You test at 1-5% exposure before the fleet commits. Your job is fast, broad, and honest, not deep. Depth belongs to the adversarial reviewer. " + "Approve only when all five dimensions pass at threshold. Do not weigh them. One failure is a failure. " + "Reject with a specific, actionable finding. Never reject with 'something feels off' without identifying what. " + "Refuse to sign off if you were not given enough context to evaluate all five dimensions. Insufficient input is a blocker, not an excuse to pass. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Competitor Analyst", isVision: false, systemPrompt: "You are a competitor analyst. Based on this component code, compare the engineering quality to Palantir's public-facing pages, Anthropic's site, and Linear's landing page. Where does this fall short technically?", },
// ══════════════════════════════════════════════════════════ // RESERVE TIER — deep cycles only // ══════════════════════════════════════════════════════════
// ── Moonshot (reserve) ──
{ slug: "kimi-k2-thinking", role: "Architect", rolePrompt: "You are the architect. You evaluate structural decisions using TOGAF's Architecture Decision Records methodology and C4's hierarchical decomposition. " + "For every system or framework: (1) define the context boundary, (2) identify containers and their responsibilities, (3) verify each component has exactly one reason to change. " + "Write fitness functions: testable assertions that prove architectural constraints hold. " + "A decision record must state the problem, constraints, options considered, and why the chosen option wins. " + "Approve work that has clean separation of concerns, explicit interfaces, and documented tradeoffs. " + "Reject anything with circular dependencies, implicit coupling, or missing rationale for structural choices. " + "Refuse to sign off if the architecture cannot be drawn as a diagram where every arrow has a label. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Accessibility Auditor", isVision: true, systemPrompt: "You are an accessibility auditor. Check contrast ratios, text sizing, touch targets, screen reader flow, and WCAG compliance from this screenshot.", }, ];
// ── Cycle-engine helpers ──
const slugIndex = Object.fromEntries(FLEET_PERSONAS.map(p => [p.slug, p]));
/** * Returns the role-aware system prompt for a model. * Used by polybrain-cycle.mjs to prepend to every dispatch. */ export function getPersonaPrompt(modelSlug) { return slugIndex[modelSlug]?.rolePrompt ?? null; }
/** * Returns a short role description for display/logging. */ export function getRoleDescription(modelSlug) { return slugIndex[modelSlug]?.role ?? null; }
/** * Returns the count of active personas (for prompt accuracy). */ export function getFleetSize() { return FLEET_PERSONAS.length; }
/** * Returns all slugs (for validation/sync checks). */ export function getFleetSlugs() { return FLEET_PERSONAS.map(p => p.slug); }
/** * Validate the FLEET_PERSONAS array for integrity. * Checks: no duplicate slugs, required fields present, rolePrompt length cap. * * @returns {{ valid: boolean, errors: string[] }} */ export function validatePersonas() { const errors = []; const seen = new Set();
for (let i = 0; i < FLEET_PERSONAS.length; i++) { const p = FLEET_PERSONAS[i]; const label = `personas[${i}]`;
if (!p.slug) { errors.push(`${label}: missing slug`); } else if (seen.has(p.slug)) { errors.push(`${label}: duplicate slug "${p.slug}"`); } else { seen.add(p.slug); }
if (!p.role) errors.push(`${label} (${p.slug || "?"}): missing role`); if (!p.rolePrompt) errors.push(`${label} (${p.slug || "?"}): missing rolePrompt`);
if (p.rolePrompt && p.rolePrompt.length > 2000) { errors.push(`${label} (${p.slug}): rolePrompt is ${p.rolePrompt.length} chars (max 2000)`); } }
return { valid: errors.length === 0, errors }; } /** * Polybrain Constitution * * Immutable rules that no autonomous tuning process may violate. * These are the hard boundaries of the system. The protocol tuner * proposes changes; the constitution vetoes anything dangerous. * * Design law: the constitution can only be amended by a human * editing this file. No runtime process writes to it. */
export const CONSTITUTION = [ { id: 'no-self-modify-scoring', rule: 'The scoring function source code cannot be modified autonomously', check: (proposal) => { const forbidden = ['scoring_function', 'composite_formula', 'source_code']; return !forbidden.includes(proposal.param); }, }, { id: 'no-self-grading', rule: 'The fixer cannot grade its own work', check: (proposal) => { // Revision model must never also be the sole adversarial model if (proposal.param === 'models.adversarial' && proposal.proposedValue === 'gpt-4o') return false; if (proposal.param === 'models.revision' && proposal.proposedValue === proposal.currentAdversarial) return false; return true; }, }, { id: 'hard-reset-on-fail', rule: 'One failure resets the autonomy streak to zero', check: (proposal) => { if (proposal.param === 'autonomy.hard_reset' && proposal.proposedValue === false) return false; return true; }, }, { id: 'level3-requires-human', rule: 'Human approval is required for Level 3 actions until Level 3 is earned', check: (proposal) => { if (proposal.param === 'autonomy.skip_human_approval') return false; return true; }, }, { id: 'provider-diversity', rule: 'Model provider diversity: minimum 2 providers in any evaluation', check: (proposal) => { if (proposal.param === 'models.quality' && Array.isArray(proposal.proposedValue)) { const providers = new Set(proposal.proposedValue.map(providerOf)); return providers.size >= 2; } return true; }, }, { id: 'max-model-weight', rule: 'No model can evaluate with a weight above 0.50', check: (proposal) => { // Applies to per-model weights (models.weights.*), not composite component weights (weights.research) if (proposal.param && proposal.param.startsWith('models.weights') && typeof proposal.proposedValue === 'object' && !Array.isArray(proposal.proposedValue)) { return Object.values(proposal.proposedValue).every(w => typeof w !== 'number' || w <= 0.50); } return true; }, }, { id: 'no-pii-in-traces', rule: 'PII must never be persisted in reasoning traces', check: () => true, // Enforced at the trace layer, not at config level }, { id: 'consensus-not-truth', rule: 'Consensus among models is not treated as truth. It is treated as reduced uncertainty. The system may only act on consensus provisionally, and only at autonomy levels earned through consecutive correct decisions.', check: (proposal) => { // Block any attempt to treat consensus as automatic truth // or to bypass autonomy level requirements for consensus-based actions if (proposal.param === 'consensus.treat_as_truth' && proposal.proposedValue === true) return false; if (proposal.param === 'consensus.skip_autonomy_check' && proposal.proposedValue === true) return false; return true; }, }, ];
/** * Map model name to provider for diversity checks. * @param {string} model * @returns {string} */ function providerOf(model) { const map = { sonnet: 'anthropic', claude: 'anthropic', haiku: 'anthropic', opus: 'anthropic', 'gpt-4o': 'openai', 'gpt-4o-mini': 'openai', llama: 'meta', 'llama-3.3': 'meta', grok: 'xai', 'grok-3': 'xai', kimi: 'moonshot', nemotron: 'nvidia', }; return map[model] || model; }
/** * Check a single tuning proposal against the constitution. * @param {Object} proposal - { param, currentValue, proposedValue, reason, confidence } * @param {Object} [context] - Optional context (e.g. currentAdversarial model name) * @returns {{ constitutional: boolean, violations: Array<{rule: string, id: string}> }} */ export function checkConstitutional(proposal, context = {}) { const enriched = { ...proposal, ...context }; const violations = [];
for (const article of CONSTITUTION) { if (!article.check(enriched)) { violations.push({ id: article.id, rule: article.rule }); } }
return { constitutional: violations.length === 0, violations }; }
/** * Validate an entire config object against the constitution. * Returns a cleaned copy with violations removed (reverted to safe defaults). * * @param {Object} config - Full config key-value map * @returns {{ clean: Object, violations: Array<{key: string, rule: string}> }} */ export function enforceConstitution(config) { const clean = { ...config }; const violations = [];
// Rule: hard reset must stay true if (clean['autonomy.hard_reset'] === false) { clean['autonomy.hard_reset'] = true; violations.push({ key: 'autonomy.hard_reset', rule: 'One failure resets the autonomy streak to zero' }); }
// Rule: no skip_human_approval key allowed if ('autonomy.skip_human_approval' in clean) { delete clean['autonomy.skip_human_approval']; violations.push({ key: 'autonomy.skip_human_approval', rule: 'Human approval is required for Level 3 actions until Level 3 is earned' }); }
// Rule: per-model weight cap (models.weights.*, not composite component weights) for (const key of Object.keys(clean).filter(k => k.startsWith('models.weights'))) { const weights = clean[key]; if (weights && typeof weights === 'object') { for (const [k, v] of Object.entries(weights)) { if (typeof v === 'number' && v > 0.50) { clean[key] = { ...weights, [k]: 0.50 }; violations.push({ key: `${key}.${k}`, rule: 'No model can evaluate with a weight above 0.50' }); } } } }
// Rule: provider diversity on quality models const qualityModels = clean['models.quality']; if (Array.isArray(qualityModels)) { const providers = new Set(qualityModels.map(providerOf)); if (providers.size < 2) { violations.push({ key: 'models.quality', rule: 'Model provider diversity: minimum 2 providers in any evaluation' }); // Don't mutate -- flag it, let human fix } }
// Rule: adversarial model cannot be the revision model if (clean['models.adversarial'] && clean['models.revision'] && clean['models.adversarial'] === clean['models.revision']) { violations.push({ key: 'models.adversarial', rule: 'The fixer cannot grade its own work' }); }
return { clean, violations }; } /** * Polybrain Autonomy Wire * * Thin integration layer that connects the cycle engine's output * to the earned autonomy system. After a cycle completes and * synthesis runs, call recordCycleResult() to update the streak. * * Pass threshold: composite >= 70 (cycle-level, not per-doc). * Constitution is checked before any state change is applied. * * Writes to Supabase polybrain_autonomy if credentials exist, * falls back to local JSON log if not. */
import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'fs'; import { join } from 'path'; import { createClient } from '@supabase/supabase-js'; import { EarnedAutonomy } from './earned-autonomy.mjs'; import { computeAutonomyLevel } from './counter.mjs'; import { checkConstitutional } from './constitution.mjs'; import { loadEnv } from '../utils.mjs';
const HOME = process.env.HOME; const PASS_THRESHOLD = 70; const LOCAL_STATE_PATH = join(HOME, 'polybrain/data/autonomy-state.json'); const LOCAL_LOG_PATH = join(HOME, 'polybrain/data/autonomy-log.json');
// ── Supabase (best-effort) ──
function getSupabase() { const env = loadEnv(); const url = env.NEXT_PUBLIC_SUPABASE_URL || env.SUPABASE_URL; const key = env.SUPABASE_SERVICE_ROLE_KEY || env.SUPABASE_SERVICE_KEY; if (!url || !key) return null; return createClient(url, key); }
// ── Local state (fallback) ──
function loadLocalState() { try { return JSON.parse(readFileSync(LOCAL_STATE_PATH, 'utf-8')); } catch { return {}; } }
function saveLocalState(state) { const dir = join(HOME, 'polybrain/data'); if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); writeFileSync(LOCAL_STATE_PATH, JSON.stringify(state, null, 2)); }
function appendLocalLog(entry) { const dir = join(HOME, 'polybrain/data'); if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); let log = []; try { log = JSON.parse(readFileSync(LOCAL_LOG_PATH, 'utf-8')); } catch {} log.push(entry); // Keep last 200 entries if (log.length > 200) log = log.slice(-200); writeFileSync(LOCAL_LOG_PATH, JSON.stringify(log, null, 2)); }
// ── Score extraction ──
/** * Extract a composite score from synthesis result data. * Accepts multiple shapes: * - { composite: 85 } * - { scores: { composite: 85 } } * - { succeeded: 11, failed: 0 } (cycle manifest -- derive from success rate) * - { compositeScore: 85 } * * @param {Object} synthesisResult * @returns {number} 0-100 */ function extractComposite(synthesisResult) { if (!synthesisResult || typeof synthesisResult !== 'object') return 0;
// Direct composite if (typeof synthesisResult.composite === 'number') return synthesisResult.composite; if (typeof synthesisResult.compositeScore === 'number') return synthesisResult.compositeScore;
// Nested in scores const scores = synthesisResult.scores; if (scores && typeof scores.composite === 'number') return scores.composite;
// Derive from model success rate (cycle manifest shape) if (typeof synthesisResult.succeeded === 'number' && typeof synthesisResult.failed === 'number') { const total = synthesisResult.succeeded + synthesisResult.failed; if (total === 0) return 0; // Scale success rate to 0-100 return Math.round((synthesisResult.succeeded / total) * 100); }
return 0; }
// ── Constitution guard ──
/** * Verify that recording a result does not violate the constitution. * The key rule: hard_reset_on_fail must remain true. We check that * a "pass" recording is not trying to skip the hard reset rule. * * @param {boolean} isPass * @returns {{ ok: boolean, violations: Array }} */ function checkAutonomyConstitutional(isPass) { // The only constitutional concern for recording a result is // ensuring we never disable hard reset. We simulate the proposal // that would match a "skip hard reset" attempt. if (!isPass) { // Fail always resets. This is constitutional by definition. return { ok: true, violations: [] }; }
// Verify the hard reset rule is still enforced (not being bypassed) const hardResetCheck = checkConstitutional({ param: 'autonomy.hard_reset', proposedValue: true, // We keep it true });
return { ok: hardResetCheck.constitutional, violations: hardResetCheck.violations }; }
// ── Main export ──
/** * Record the result of a completed cycle. * * @param {number} cycleNumber - The cycle number (e.g. 11) * @param {Object} synthesisResult - Output from synthesis or cycle manifest. * Must contain a composite score (direct or derivable). * Shape: { composite: number } | { scores: { composite } } | { succeeded, failed } * @returns {Promise<{ * pass: boolean, * composite: number, * streak: number, * level: number, * label: string, * promoted: boolean, * demoted: boolean, * persisted: 'supabase' | 'local', * constitutional: boolean, * violations: Array * }>} */ export async function recordCycleResult(cycleNumber, synthesisResult) { const moduleId = `cycle`; const composite = extractComposite(synthesisResult); const isPass = composite >= PASS_THRESHOLD;
// Constitution check before applying const { ok, violations } = checkAutonomyConstitutional(isPass); if (!ok) { console.warn(`[autonomy:wire] Constitutional violation detected:`, violations); // Still record the fail (constitution says hard reset is mandatory), // but flag the violation in the return value. }
// Load current state and apply const supabase = getSupabase(); let result;
if (supabase) { result = await persistToSupabase(supabase, cycleNumber, moduleId, composite, isPass); } else { result = persistLocally(cycleNumber, moduleId, composite, isPass); }
// Log the event const logEntry = { cycle: cycleNumber, composite, pass: isPass, streak: result.streak, level: result.level, label: result.label, promoted: result.promoted || false, demoted: result.demoted || false, privilegesRevoked: result.privilegesRevoked || false, timestamp: new Date().toISOString(), };
appendLocalLog(logEntry);
console.log( `[autonomy:wire] Cycle ${cycleNumber}: ` + `composite=${composite}, ${isPass ? 'PASS' : 'FAIL'}, ` + `streak=${result.streak}, level=${result.level} (${result.label})` + (result.promoted ? ' PROMOTED' : '') + (result.demoted ? ' RESET' : '') + (result.privilegesRevoked ? ' PRIVILEGES_REVOKED' : '') );
return { pass: isPass, composite, streak: result.streak, level: result.level, label: result.label, promoted: result.promoted || false, demoted: result.demoted || false, privilegesRevoked: result.privilegesRevoked || false, persisted: supabase ? 'supabase' : 'local', constitutional: ok, violations, }; }
// ── Supabase persistence ──
async function persistToSupabase(supabase, cycleNumber, moduleId, composite, isPass) { // Read current state const { data: row } = await supabase .from('polybrain_autonomy') .select('*') .eq('target', moduleId) .eq('target_type', 'cycle') .single();
let streak = row?.consecutive_passes ?? 0; const hadProgress = streak > 0 || (row?.autonomy_level ?? 0) > 0;
let privilegesRevoked = false;
if (isPass) { streak += 1; } else { streak = 0; // Hard reset. No exceptions. }
// On failure: hard reset revokes everything. // Level is forced to 0 explicitly (not derived from streak) // and all governance privileges are revoked. const level = isPass ? computeAutonomyLevel(streak) : 0; const promoted = isPass && level > (row?.autonomy_level ?? 0); const demoted = !isPass && hadProgress;
if (!isPass) { privilegesRevoked = true; }
const autonomyState = { target: moduleId, target_type: 'cycle', consecutive_passes: streak, last_result: isPass ? 'pass' : 'fail', autonomy_level: isPass ? level : 0, // Explicit zero on failure promoted, updated_at: new Date().toISOString(), };
const { error } = await supabase .from('polybrain_autonomy') .upsert(autonomyState, { onConflict: 'target,target_type' });
if (error) { console.warn(`[autonomy:wire] Supabase upsert failed: ${error.message}, falling back to local`); return persistLocally(cycleNumber, moduleId, composite, isPass); }
const LABELS = ['Human Gate', 'Flag', 'Fix+Check', 'Self-Ship']; return { streak, level, label: LABELS[level], promoted, demoted, privilegesRevoked }; }
// ── Local persistence ──
function persistLocally(cycleNumber, moduleId, composite, isPass) { const state = loadLocalState(); const ea = EarnedAutonomy.fromJSON(state);
let result; if (isPass) { result = ea.recordPass(moduleId); } else { // Hard reset revokes everything: streak, level, and all privileges. result = ea.recordFail(moduleId); result.privilegesRevoked = true; }
saveLocalState(ea.toJSON()); return result; }
/** * Get current autonomy status without recording anything. * * @returns {Promise<{ streak: number, level: number, label: string, nextThreshold: number|null, source: 'supabase'|'local' }>} */ export async function getAutonomyStatus() { const supabase = getSupabase();
if (supabase) { const { data: row } = await supabase .from('polybrain_autonomy') .select('*') .eq('target', 'cycle') .eq('target_type', 'cycle') .single();
if (row) { const LABELS = ['Human Gate', 'Flag', 'Fix+Check', 'Self-Ship']; const THRESHOLDS = [0, 3, 7, 15]; const next = row.autonomy_level + 1; return { streak: row.consecutive_passes, level: row.autonomy_level, label: LABELS[row.autonomy_level] || 'Unknown', nextThreshold: next < THRESHOLDS.length ? THRESHOLDS[next] : null, source: 'supabase', }; } }
// Fallback to local const state = loadLocalState(); const ea = EarnedAutonomy.fromJSON(state); const status = ea.getLevel('cycle'); return { ...status, source: 'local' }; } /** * Constitutional Guardian (M12) * * Pre-dispatch check that runs the 8 constitutional rules before * every model call. Lightweight: imports the existing constitution, * does not duplicate it. * * Usage: * import { guardCall } from '../src/autonomy/guardian.mjs'; * const verdict = guardCall('grok-3', 'dispatch', { evaluationSet, modelWeights }); * if (!verdict.allowed) abort(verdict.reason); */
import { CONSTITUTION } from './constitution.mjs';
/** * Programmatic error codes and fix hints for each constitutional rule. */ const RULE_META = { 'provider-diversity': { code: 'GUARD_PROVIDER_DIVERSITY', docs_hint: 'Add models from at least 2 different providers to the evaluation set.' }, 'max-model-weight': { code: 'GUARD_MAX_MODEL_WEIGHT', docs_hint: 'Reduce the model weight to 0.50 or below.' }, 'no-self-modify-scoring': { code: 'GUARD_SELF_MODIFY_SCORING', docs_hint: 'Scoring function source cannot be changed at runtime; edit constitution.mjs manually.' }, 'no-self-grading': { code: 'GUARD_SELF_GRADING', docs_hint: 'Use a different model for adversarial review than the one that produced the revision.' }, 'hard-reset-on-fail': { code: 'GUARD_HARD_RESET', docs_hint: 'The hard-reset-on-fail rule is immutable; do not disable it.' }, 'level3-requires-human': { code: 'GUARD_LEVEL3_HUMAN', docs_hint: 'Earn Level 3 autonomy through consecutive successes before skipping human approval.' }, 'no-pii-in-traces': { code: 'GUARD_PII_IN_TRACES', docs_hint: 'Strip PII from reasoning traces before persisting.' }, 'consensus-not-truth': { code: 'GUARD_CONSENSUS_AS_TRUTH', docs_hint: 'Consensus reduces uncertainty but is not truth; respect autonomy level gating.' }, };
/** * Map model slug to provider name. Mirrors constitution.mjs providerOf * but works on full slugs used in the fleet registry. */ function providerOf(slug) { if (slug.startsWith('kimi')) return 'moonshot'; if (slug.startsWith('grok')) return 'xai'; if (slug.startsWith('gpt')) return 'openai'; if (slug.startsWith('llama')) return 'meta'; if (slug.startsWith('qwen')) return 'alibaba'; if (slug.startsWith('nemotron')) return 'nvidia'; if (slug.startsWith('claude') || slug.startsWith('sonnet') || slug.startsWith('haiku') || slug.startsWith('opus')) return 'anthropic'; return slug; }
/** * Check all constitutional rules before a model dispatch. * * @param {string} modelSlug - The model about to be called. * @param {string} action - What the call does (e.g. 'dispatch', 'evaluate', 'revise'). * @param {object} [context] - Optional context for deeper checks. * @param {string[]} [context.evaluationSet] - All model slugs in the current evaluation set. * @param {Record<string, number>} [context.modelWeights] - Per-model weights if applicable. * @returns {{ allowed: boolean, rule?: string, reason?: string, code?: string, docs_hint?: string }} */ export function guardCall(modelSlug, action, context = {}) { // Rule 5: provider-diversity -- at least 2 providers in the evaluation set if (context.evaluationSet && Array.isArray(context.evaluationSet)) { const providers = new Set(context.evaluationSet.map(providerOf)); if (providers.size < 2) { return { allowed: false, rule: 'provider-diversity', reason: `Evaluation set has only 1 provider (${[...providers][0]}). Constitution requires >= 2.`, code: RULE_META['provider-diversity'].code, docs_hint: RULE_META['provider-diversity'].docs_hint, }; } }
// Rule 6: max-model-weight -- no single model weight exceeds 0.50 if (context.modelWeights && typeof context.modelWeights === 'object') { for (const [slug, weight] of Object.entries(context.modelWeights)) { if (typeof weight === 'number' && weight > 0.50) { return { allowed: false, rule: 'max-model-weight', reason: `Model ${slug} has weight ${weight}, exceeds 0.50 cap.`, code: RULE_META['max-model-weight'].code, docs_hint: RULE_META['max-model-weight'].docs_hint, }; } } }
// Run all 8 constitutional checks as generic proposal validation. // Build a synthetic proposal from the call context so the existing // check functions can evaluate it. const proposal = { param: `dispatch.${action}`, proposedValue: modelSlug, currentAdversarial: context.currentAdversarial || null, };
for (const article of CONSTITUTION) { if (!article.check(proposal)) { const meta = RULE_META[article.id] || { code: `GUARD_${article.id.toUpperCase().replace(/-/g, '_')}`, docs_hint: 'Check constitution.mjs for rule details.' }; return { allowed: false, rule: article.id, reason: article.rule, code: meta.code, docs_hint: meta.docs_hint, }; } }
return { allowed: true }; } /** * Polybrain Intent Translation Layer * * Sits between the human prompt and model dispatch. * Strips framing bias and produces role-tailored prompts for each model. * * Four steps: * 1. Intent extraction (Groq Llama-3.1-8b, cheapest/fastest) * 2. Debiasing (rule-based, no model call) * 3. Role-specific tailoring (fleet-personas.mjs) * 4. Return tailored prompt map * * Cost: ~$0.001 per call (single Groq inference on 8b model). */
import { getPersonaPrompt, getRoleDescription } from "../personas/fleet-personas.mjs";
// ── Step 1: Intent Extraction via Groq ──
const INTENT_MODEL = "llama-3.1-8b-instant"; const INTENT_URL = "https://api.groq.com/openai/v1/chat/completions";
const EXTRACTION_PROMPT = `Extract the core intent from this message. Return ONLY valid JSON with no other text. Schema: {"intent": string (what they want, 1 sentence), "constraints": string[] (any rules or limits mentioned), "context": string (background info), "framing_flags": string[] (any leading language like "fix this", "is this good", "you should", imperative verbs)}`;
async function extractIntent(rawPrompt, env) { const controller = new AbortController(); const timeout = setTimeout(() => controller.abort(), 10000); // 10s hard cap
try { const resp = await fetch(INTENT_URL, { method: "POST", headers: { Authorization: `Bearer ${env.GROQ_API_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ model: INTENT_MODEL, messages: [ { role: "system", content: EXTRACTION_PROMPT }, { role: "user", content: rawPrompt.slice(0, 4000) }, // Cap input to avoid token waste ], temperature: 0, max_tokens: 500, response_format: { type: "json_object" }, }), signal: controller.signal, });
if (!resp.ok) { const err = await resp.text(); throw new Error(`HTTP ${resp.status}: ${err.slice(0, 200)}`); }
const data = await resp.json(); const raw = data.choices?.[0]?.message?.content; if (!raw) throw new Error("Empty response from intent model");
return JSON.parse(raw); } finally { clearTimeout(timeout); } }
// ── Step 2: Debiasing (Rule-Based, DeFrame-Inspired) ──
const FRAMING_REWRITES = [ // Imperative fix/change commands -> evaluation { pattern: /^fix\s+(this|the|my|that)\b/i, replacement: "Evaluate whether changes are needed to $1 and what they would be" }, { pattern: /^(fix|repair|correct)\b/i, replacement: "Evaluate whether corrections are needed and what they would be" },
// Quality judgment solicitation -> assessment { pattern: /^is\s+(this|that|it)\s+(good|bad|ok|okay|fine|right|correct|wrong)\b/i, replacement: "Assess the quality of $1" }, { pattern: /^(does|do)\s+(this|that|it)\s+(work|look|read|sound)\s*(good|ok|okay|fine|right)?\s*\??/i, replacement: "Assess whether $2 achieves its intended purpose" },
// Directive framing -> open consideration { pattern: /^you\s+should\b/i, replacement: "Consider whether to" }, { pattern: /^(we|you)\s+(need|must|have)\s+to\b/i, replacement: "Evaluate whether to" }, { pattern: /^(just|simply)\s+/i, replacement: "" }, { pattern: /^make\s+(this|that|it)\s+(better|good|work)\b/i, replacement: "Identify improvements for $1" },
// Leading agreement solicitation { pattern: /^(don't you think|wouldn't you agree|isn't it true)\b/i, replacement: "Evaluate whether" },
// Urgency/pressure framing -> neutral { pattern: /^(asap|urgently|immediately|quickly)\s*/i, replacement: "" }, ];
function debiasIntent(intentStr, framingFlags) { if (!framingFlags || framingFlags.length === 0) return intentStr;
let debiased = intentStr; for (const rule of FRAMING_REWRITES) { debiased = debiased.replace(rule.pattern, rule.replacement); }
// Clean up double spaces and lowercase-after-replacement artifacts debiased = debiased.replace(/\s{2,}/g, " ").trim();
// Capitalize first letter if (debiased.length > 0) { debiased = debiased[0].toUpperCase() + debiased.slice(1); }
return debiased; }
// ── Step 3: Role-Specific Tailoring ──
function tailorForModel(modelSlug, debiasedIntent, context, constraints) { const rolePrompt = getPersonaPrompt(modelSlug); const roleDesc = getRoleDescription(modelSlug);
// Build constraint string const constraintStr = constraints && constraints.length > 0 ? `Constraints: ${constraints.join("; ")}.` : "";
// Build context string const contextStr = context && context.trim() ? `Context: ${context}` : "";
// If we have a role description, frame it as a role-directed evaluation if (roleDesc) { const parts = [`As the ${roleDesc}, evaluate this: ${debiasedIntent}`]; if (contextStr) parts.push(contextStr); if (constraintStr) parts.push(constraintStr); return parts.join("\n\n"); }
// Fallback: no persona defined for this slug, use debiased intent directly const parts = [debiasedIntent]; if (contextStr) parts.push(contextStr); if (constraintStr) parts.push(constraintStr); return parts.join("\n\n"); }
// ── Main Export: translateIntent ──
/** * Translates a raw human prompt into debiased, role-tailored prompts. * * @param {string} rawPrompt - The original user prompt * @param {Array<{slug: string}>} models - Fleet models to tailor for * @param {object} env - Environment with GROQ_API_KEY * @returns {{ intent: string, debiased: string, framingFlags: string[], tailored: Record<string, string> }} */ export async function translateIntent(rawPrompt, models, env) { // Step 1: Extract intent via Groq let extracted; try { extracted = await extractIntent(rawPrompt, env); } catch (err) { // Fallback: use raw prompt as intent, no debiasing console.log(` Intent layer: extraction failed (${err.message}), using raw prompt`); const tailored = {}; for (const m of models) { tailored[m.slug] = rawPrompt; } return { intent: rawPrompt, debiased: rawPrompt, framingFlags: [], tailored }; }
const intent = extracted.intent || rawPrompt; const constraints = extracted.constraints || []; const context = extracted.context || ""; const framingFlags = extracted.framing_flags || [];
// Step 2: Debias const debiased = debiasIntent(intent, framingFlags);
// Step 3: Tailor per model const tailored = {}; for (const m of models) { tailored[m.slug] = tailorForModel(m.slug, debiased, context, constraints); }
// Step 4: Return return { intent, debiased, framingFlags, tailored }; }
// ── Integration Export: buildTranslatedMessages ──
/** * Builds ready-to-dispatch message arrays for each model. * Each entry is [{ role: 'system', content }, { role: 'user', content }]. * * @param {string} rawPrompt - The full prompt (may include packed context/files) * @param {Array<{slug: string}>} models - Fleet models * @param {object} env - Environment with GROQ_API_KEY * @returns {{ messages: Array<Array<{role: string, content: string}>>, translation: object }} */ export async function buildTranslatedMessages(rawPrompt, models, env) { const translation = await translateIntent(rawPrompt, models, env);
const messages = models.map(m => { const msgs = [];
// System prompt: role persona const rolePrompt = getPersonaPrompt(m.slug); if (rolePrompt) { msgs.push({ role: "system", content: rolePrompt }); } else if (m.systemPrompt) { msgs.push({ role: "system", content: m.systemPrompt }); }
// User prompt: tailored version (includes debiased intent + context + constraints) // But we need to preserve any packed context/file content from the original prompt. // The tailored prompt replaces only the topic portion; appended context stays. const tailoredUserContent = buildUserContent(rawPrompt, translation, m.slug); msgs.push({ role: "user", content: tailoredUserContent });
return msgs; });
return { messages, translation }; }
/** * Replaces the topic portion of the prompt with the tailored version, * preserving any appended context blocks (codebase, files, memory). */ function buildUserContent(rawPrompt, translation, modelSlug) { const tailored = translation.tailored[modelSlug];
// Find where context blocks start (they use === markers) const contextStart = rawPrompt.indexOf("\n\n===");
if (contextStart === -1) { // No appended context, use tailored prompt directly return tailored; }
// Replace the topic portion (before context blocks) with the tailored version const contextBlocks = rawPrompt.slice(contextStart); return tailored + contextBlocks; } /** * Polybrain Cycle Memory * * Connects cross-cycle memory so the system remembers what it learned. * Reads a completed cycle's synthesis.md, chunks it into semantic units * (one per consensus claim, divergence, unique insight, conflict), * embeds each chunk via OpenAI text-embedding-3-small (matching the * existing agent_memory schema), and stores in Supabase for retrieval. * * Exports: * storeCycleFindings(cycleNumber, synthesisPath) — parse, chunk, embed, store * recallRelevant(query, opts) — retrieve past findings (paginated) for a new cycle * * Storage: Uses the existing agent_memory table with memory_type='cycle_finding'. * Each chunk gets domains=['polybrain','cycle_memory'], scale='strategic', * and metadata encoded in the content field for vector search. */
import { readFileSync, readdirSync, existsSync } from "fs"; import { createHash } from "crypto"; import { loadEnv } from "../utils.mjs";
const env = loadEnv();
const SUPABASE_URL = env.NEXT_PUBLIC_SUPABASE_URL || env.SUPABASE_URL; const SUPABASE_KEY = env.SUPABASE_SERVICE_ROLE_KEY || env.SUPABASE_SERVICE_KEY; const OPENAI_KEY = env.OPENAI_API_KEY;
// ── Supabase raw REST helpers (no SDK import needed at module level) ──
function supaHeaders() { return { apikey: SUPABASE_KEY, Authorization: `Bearer ${SUPABASE_KEY}`, "Content-Type": "application/json", Prefer: "return=representation", }; }
async function supaSelect(table, columns, filter = "") { const res = await fetch( `${SUPABASE_URL}/rest/v1/${table}?select=${columns}${filter}`, { headers: supaHeaders() } ); if (!res.ok) throw new Error(`SELECT ${table}: ${res.status} ${await res.text()}`); return res.json(); }
async function supaUpsert(table, rows) { const res = await fetch( `${SUPABASE_URL}/rest/v1/${table}?on_conflict=file_path`, { method: "POST", headers: { ...supaHeaders(), Prefer: "resolution=merge-duplicates,return=representation" }, body: JSON.stringify(rows), } ); if (!res.ok) throw new Error(`UPSERT ${table}: ${res.status} ${await res.text()}`); return res.json(); }
async function supaRpc(fn, body) { const res = await fetch(`${SUPABASE_URL}/rest/v1/rpc/${fn}`, { method: "POST", headers: supaHeaders(), body: JSON.stringify(body), }); if (!res.ok) throw new Error(`RPC ${fn}: ${res.status} ${await res.text()}`); return res.json(); }
// ── Embedding (same model + endpoint as recall.mjs and sync.mjs) ──
async function embedTexts(texts) { if (!OPENAI_KEY) throw new Error("Missing OPENAI_API_KEY for embedding");
const results = []; const BATCH = 50; // OpenAI allows up to 2048 inputs, but keep batches reasonable
for (let i = 0; i < texts.length; i += BATCH) { const batch = texts.slice(i, i + BATCH); const res = await fetch("https://api.openai.com/v1/embeddings", { method: "POST", headers: { Authorization: `Bearer ${OPENAI_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ model: "text-embedding-3-small", input: batch.map((t) => t.slice(0, 8000)), }), }); const data = await res.json(); if (data.error) throw new Error(`OpenAI embed: ${data.error.message}`); results.push(...data.data.map((d) => d.embedding)); }
return results; }
async function embedSingle(text) { const [vec] = await embedTexts([text]); return vec; }
// ── Synthesis parser ── // Extracts structured chunks from a synthesis.md file.
/** * Extract a named section from markdown. Returns the content between * the heading and the next same-level heading (or EOF). */ function extractSection(text, heading) { // Find the heading line const headingPattern = new RegExp(`^## ${heading}[^\\n]*`, "im"); const headingMatch = text.match(headingPattern); if (!headingMatch) return "";
// Content starts after the heading line const startIdx = headingMatch.index + headingMatch[0].length; const rest = text.slice(startIdx);
// Content ends at the next ## heading or --- separator const endMatch = rest.match(/\n## [A-Z]/); const content = endMatch ? rest.slice(0, endMatch.index) : rest; return content.trim(); }
/** * Split a section into individual bullet-point claims. * Handles both "* " and "- " prefixes, and multi-line bullets * (continuation lines that don't start with a bullet). */ function splitBullets(text) { if (!text) return []; const bullets = []; let current = null;
for (const line of text.split("\n")) { const trimmed = line.trim(); if (/^[*\-]\s+/.test(trimmed)) { if (current) bullets.push(current); current = trimmed.replace(/^[*\-]\s+/, ""); } else if (current && trimmed.length > 0 && !/^##/.test(trimmed)) { // Continuation of the previous bullet current += " " + trimmed; } } if (current) bullets.push(current); return bullets.filter((b) => b.length > 20); // Skip trivially short bullets }
/** * Parse divergence entries. Each divergence is a multi-line block * starting with **Claim:** and containing Side A, Side B, Assessment, Severity. */ function parseDivergences(text) { if (!text) return []; const blocks = text.split(/(?=\*\*Claim:\*\*)/); return blocks .filter((b) => b.includes("**Claim:**")) .map((block) => { const claim = block.match(/\*\*Claim:\*\*\s*(.+)/)?.[1]?.trim() || ""; const sideA = block.match(/\*\*Side A:\*\*\s*([\s\S]*?)(?=\*\*Side B:)/)?.[1]?.trim() || ""; const sideB = block.match(/\*\*Side B:\*\*\s*([\s\S]*?)(?=\*\*Assessment:)/)?.[1]?.trim() || ""; const assessment = block.match(/\*\*Assessment:\*\*\s*(.+)/)?.[1]?.trim() || ""; const severity = block.match(/\*\*Severity:\*\*\s*(.+)/)?.[1]?.trim() || ""; const models = [...block.matchAll(/\*([^*]+)\*/g)] .map((m) => m[1].trim()) .filter((m) => !["Claim:", "Side A:", "Side B:", "Assessment:", "Severity:"].some((k) => m.startsWith(k)));
return { claim, sideA: sideA.replace(/\n/g, " ").replace(/\s+/g, " "), sideB: sideB.replace(/\n/g, " ").replace(/\s+/g, " "), assessment, severity, source_models: models, }; }) .filter((d) => d.claim.length > 10); }
/** * Parse unique insight entries. Each starts with **Model:** and **Claim:**. */ function parseInsights(text) { if (!text) return []; const blocks = text.split(/(?=\*\*Model:\*\*)/); return blocks .filter((b) => b.includes("**Model:**")) .map((block) => { const model = block.match(/\*\*Model:\*\*\s*\*?([^*\n]+)\*?/)?.[1]?.trim() || ""; const claim = block.match(/\*\*Claim:\*\*\s*([\s\S]*?)(?=\*\*Assessment:)/)?.[1]?.trim() || ""; const assessment = block.match(/\*\*Assessment:\*\*\s*(.+)/)?.[1]?.trim() || ""; return { model, claim: claim.replace(/\n/g, " ").replace(/\s+/g, " "), assessment, }; }) .filter((i) => i.claim.length > 10); }
/** * Parse the full synthesis into typed chunks ready for embedding. */ function parseSynthesis(synthesisContent, cycleNumber) { const chunks = [];
// Use the primary synthesis (Kimi section) if available, otherwise full doc. // The delimiter between major report sections is "\n\n---\n\n". const primaryMatch = synthesisContent.match( /## Primary Synthesis[^\n]*\n([\s\S]*?)(?=\n+---\n+## (?:Secondary|Meta))/ ); const text = primaryMatch ? primaryMatch[1] : synthesisContent;
// 1. Consensus claims const consensusSection = extractSection(text, "CONSENSUS"); const consensusBullets = splitBullets(consensusSection); for (const bullet of consensusBullets) { chunks.push({ type: "consensus", content: bullet, cycle_number: cycleNumber, source_models: [], // consensus = 8+ models, not specific }); }
// 2. Divergences const divSection = extractSection(text, "DIVERGENCE"); const divergences = parseDivergences(divSection); for (const div of divergences) { chunks.push({ type: "divergence", content: `DIVERGENCE: ${div.claim}. Side A: ${div.sideA}. Side B: ${div.sideB}. Assessment: ${div.assessment}. Severity: ${div.severity}`, cycle_number: cycleNumber, source_models: div.source_models, }); }
// 3. Unique insights const insightSection = extractSection(text, "UNIQUE"); const insights = parseInsights(insightSection); for (const ins of insights) { chunks.push({ type: "insight", content: `UNIQUE INSIGHT from ${ins.model}: ${ins.claim}. Assessment: ${ins.assessment}`, cycle_number: cycleNumber, source_models: [ins.model], }); }
// 4. Conflicts (treated as high-severity divergences) const conflictSection = extractSection(text, "CONFLICTS"); const conflictBullets = splitBullets(conflictSection); for (const bullet of conflictBullets) { chunks.push({ type: "conflict", content: `CONFLICT: ${bullet}`, cycle_number: cycleNumber, source_models: [], }); }
return chunks; }
// ── Storage ──
function contentHash(text) { return createHash("md5").update(text).digest("hex"); }
/** * Store a cycle's synthesis findings in Supabase agent_memory. * * Each chunk gets: * - file_path: "cycle/{NNN}/{type}/{index}" (unique key for upsert) * - memory_type: "cycle_finding" * - name: Short description for display * - content: Full text for search * - embedding: 1536-dim vector * - domains: ['polybrain', 'cycle_memory'] * - scale: 'strategic' * - importance: consensus=0.9, divergence=0.8, insight=0.7, conflict=0.85 * * @param {number} cycleNumber * @param {string} synthesisPath - Path to synthesis.md * @returns {{ stored: number, skipped: number, chunks: object[] }} */ export async function storeCycleFindings(cycleNumber, synthesisPath) { const raw = readFileSync(synthesisPath, "utf-8"); const chunks = parseSynthesis(raw, cycleNumber); const pad = String(cycleNumber).padStart(3, "0");
if (chunks.length === 0) { return { stored: 0, skipped: 0, chunks: [] }; }
// Check existing hashes to skip unchanged chunks let existing = []; try { existing = await supaSelect( "agent_memory", "file_path,content_hash", `&file_path=like.cycle/${pad}/*&memory_type=eq.cycle_finding` ); } catch {} const hashMap = new Map(existing.map((r) => [r.file_path, r.content_hash]));
// Build rows const importanceMap = { consensus: 0.9, divergence: 0.8, conflict: 0.85, insight: 0.7 }; const typeCounters = { consensus: 0, divergence: 0, insight: 0, conflict: 0 };
const rows = chunks.map((chunk) => { const idx = typeCounters[chunk.type]++; const filePath = `cycle/${pad}/${chunk.type}/${idx}`; const hash = contentHash(chunk.content); const namePrefix = { consensus: "Consensus", divergence: "Divergence", insight: "Unique Insight", conflict: "Conflict", }[chunk.type];
return { file_path: filePath, name: `Cycle ${pad} ${namePrefix}: ${chunk.content.slice(0, 80).replace(/\s\S*$/, "")}...`, description: `Cycle ${pad} ${chunk.type} finding. Models: ${chunk.source_models.join(", ") || "fleet-wide"}.`, memory_type: "cycle_finding", content: chunk.content, content_hash: hash, importance: importanceMap[chunk.type] || 0.7, decay_rate: 0.003, // Slow decay: cycle findings stay relevant scale: "strategic", domains: ["polybrain", "cycle_memory"], strength: 1.0, ttl_days: 90, // Default TTL; a compaction job should prune findings older than this _needs_embed: hashMap.get(filePath) !== hash, // internal flag _chunk: chunk, // internal ref }; });
// Filter to only changed rows const toEmbed = rows.filter((r) => r._needs_embed); const skipped = rows.length - toEmbed.length;
if (toEmbed.length === 0) { return { stored: 0, skipped: rows.length, chunks }; }
// Embed changed chunks const texts = toEmbed.map((r) => `${r.name}\n${r.content}`); const embeddings = await embedTexts(texts);
// Clean internal flags and add embeddings const upsertRows = toEmbed.map((r, i) => { const { _needs_embed, _chunk, ...row } = r; return { ...row, embedding: JSON.stringify(embeddings[i]), updated_at: new Date().toISOString(), }; });
// Upsert in chunks of 10 (embeddings are large payloads) for (let i = 0; i < upsertRows.length; i += 10) { await supaUpsert("agent_memory", upsertRows.slice(i, i + 10)); }
return { stored: toEmbed.length, skipped, chunks }; }
// ── TTL Helpers ── // A future compaction job should call getExpiredFindings() and delete/archive the results.
/** * Query for cycle findings whose TTL has expired. * Returns rows where (updated_at + ttl_days) < now. * * @param {number} [limit=100] * @returns {Promise<{ id: string, file_path: string, ttl_days: number, updated_at: string }[]>} */ export async function getExpiredFindings(limit = 100) { const rows = await supaSelect( "agent_memory", "id,file_path,ttl_days,updated_at", `&memory_type=eq.cycle_finding&ttl_days=not.is.null&limit=${limit}` ); const now = Date.now(); return rows.filter((r) => { const ttl = r.ttl_days || 90; const age = now - new Date(r.updated_at).getTime(); return age > ttl * 86400000; }); }
// ── Retrieval ──
/** * Retrieve the most relevant past cycle findings for a query. * Uses the existing recall_memories RPC which does ACT-R activation * scoring (semantic similarity + recency + frequency + importance). * * Supports cursor-based pagination. The cursor encodes the activation * score and id of the last item on the previous page. Results are * ordered by activation DESC (from the RPC), so we over-fetch and * slice past the cursor to serve the requested page. * * Backward compatible: passing a plain number as the second argument * returns a flat array, matching the original signature. * * @param {string} query - The search query (e.g., a new cycle's prompt) * @param {number|object} [limitOrOpts=20] - Page size (number) or options object * @param {number} [limitOrOpts.limit=20] - Max results per page * @param {string|null} [limitOrOpts.cursor=null] - Cursor from a previous page's nextCursor * @returns {Promise<Array|{ findings: Array<{ id, name, content, similarity, activation }>, nextCursor: string|null }>} */ export async function recallRelevant(query, limitOrOpts = 20) { // Backward compatibility: plain number behaves like before (returns flat array) const backwardCompat = typeof limitOrOpts === "number"; const opts = backwardCompat ? { limit: limitOrOpts } : limitOrOpts; const limit = opts.limit ?? 20; const cursor = opts.cursor ?? null;
const emptyResult = backwardCompat ? [] : { findings: [], nextCursor: null }; if (!query || query.length < 5) return emptyResult;
const queryEmbedding = await embedSingle(query);
// Over-fetch when paginating so we can slice past the cursor position. // The RPC returns results ordered by activation DESC. const fetchCount = cursor ? limit + 200 : limit;
const memories = await supaRpc("recall_memories", { query_embedding: JSON.stringify(queryEmbedding), task_domains: ["polybrain", "cycle_memory"], query_scale: "strategic", match_count: fetchCount, min_activation: 0.15, });
if (!Array.isArray(memories)) return emptyResult;
// Filter to only cycle findings let cycleFindings = memories.filter((m) => m.memory_type === "cycle_finding");
// Apply cursor: skip everything at or above the cursor position. // Cursor format is "activation:id". if (cursor) { const sepIdx = cursor.indexOf(":"); const cursorActivation = parseFloat(cursor.slice(0, sepIdx)); const cursorId = cursor.slice(sepIdx + 1);
if (!isNaN(cursorActivation) && cursorId) { let pastCursor = false; cycleFindings = cycleFindings.filter((m) => { if (pastCursor) return true; if (m.id === cursorId) { pastCursor = true; return false; // skip the cursor item itself } if (m.activation < cursorActivation) { pastCursor = true; return true; } return false; // still above cursor }); } }
// Take one page worth of results const page = cycleFindings.slice(0, limit); const hasMore = cycleFindings.length > limit; const nextCursor = hasMore && page.length > 0 ? `${page[page.length - 1].activation}:${page[page.length - 1].id}` : null;
// Touch accessed memories (fire and forget) if (page.length > 0) { supaRpc("touch_memories", { memory_ids: page.map((m) => m.id), }).catch(() => {}); }
const mapped = page.map((m) => ({ id: m.id, name: m.name, content: m.content, similarity: m.similarity, activation: m.activation, }));
// Backward compatibility: plain number returns the flat array if (backwardCompat) return mapped;
return { findings: mapped, nextCursor }; }
/** * Format recalled findings as context text for injection into a cycle prompt. * This is what the synthesis engine or cycle dispatcher would prepend to give * models awareness of what was learned in previous cycles. * * @param {{ name: string, content: string, activation: number }[]} findings * @returns {string} */ export function formatAsContext(findings) { if (!findings || findings.length === 0) return "";
const lines = findings.map( (f) => `- [${f.activation.toFixed(2)}] ${f.content}` );
return [ "## Prior Cycle Findings (retrieved from memory)", "", "The following findings from previous validation cycles are relevant to this prompt.", "Consensus items are high-confidence (8+ model agreement). Divergences and conflicts", "are unresolved. Unique insights are single-model observations worth verifying.", "", ...lines, ].join("\n"); }
// ── CLI entry point ── // Allows direct invocation: node src/memory/cycle-memory.mjs store 011 // node src/memory/cycle-memory.mjs recall "earned autonomy governance"
const isMain = process.argv[1]?.endsWith("cycle-memory.mjs"); if (isMain) { const [, , command, ...rest] = process.argv;
if (command === "store") { const cycleNum = parseInt(rest[0]); if (!cycleNum || isNaN(cycleNum)) { console.error("Usage: node cycle-memory.mjs store <cycle-number> [path]"); process.exit(1); } const pad = String(cycleNum).padStart(3, "0"); const synthPath = rest[1] || `${process.env.HOME}/polybrain/cycles/${pad}/synthesis.md`;
console.log(`\nStoring cycle ${pad} findings from ${synthPath}...`); storeCycleFindings(cycleNum, synthPath) .then((result) => { console.log(` Stored: ${result.stored}`); console.log(` Skipped (unchanged): ${result.skipped}`); console.log(` Total chunks parsed: ${result.chunks.length}`); const types = {}; for (const c of result.chunks) types[c.type] = (types[c.type] || 0) + 1; console.log(` Breakdown: ${Object.entries(types).map(([k, v]) => `${k}=${v}`).join(", ")}`); console.log(""); }) .catch((e) => { console.error("Error:", e.message); process.exit(1); }); } else if (command === "recall") { const query = rest.join(" "); if (!query) { console.error("Usage: node cycle-memory.mjs recall <query>"); process.exit(1); }
console.log(`\nRecalling findings relevant to: "${query}"...\n`); recallRelevant(query, 10) .then((findings) => { if (findings.length === 0) { console.log(" No relevant cycle findings found.\n"); return; } for (const f of findings) { console.log(` [${f.activation.toFixed(3)}] ${f.name}`); console.log(` ${f.content.slice(0, 150)}...`); console.log(""); } console.log(formatAsContext(findings)); }) .catch((e) => { console.error("Error:", e.message); process.exit(1); }); } else if (command === "store-all") { // Batch store all cycles that have a synthesis.md const cyclesDir = `${process.env.HOME}/polybrain/cycles`; const dirs = readdirSync(cyclesDir).filter((d) => /^\d{3}$/.test(d)).sort(); let totalStored = 0; let totalSkipped = 0;
console.log(`\nBatch storing ${dirs.length} cycles...\n`); for (const dir of dirs) { const synthPath = `${cyclesDir}/${dir}/synthesis.md`; if (!existsSync(synthPath)) { console.log(` Cycle ${dir}: no synthesis.md, skipping`); continue; } try { const result = await storeCycleFindings(parseInt(dir), synthPath); totalStored += result.stored; totalSkipped += result.skipped; console.log(` Cycle ${dir}: ${result.stored} stored, ${result.skipped} skipped (${result.chunks.length} chunks)`); } catch (e) { console.log(` Cycle ${dir}: ERROR - ${e.message}`); } } console.log(`\nTotal: ${totalStored} stored, ${totalSkipped} skipped\n`); } else { console.log("Polybrain Cycle Memory"); console.log(""); console.log("Usage:"); console.log(" node cycle-memory.mjs store <cycle-number> [path] Store a cycle's findings"); console.log(" node cycle-memory.mjs store-all Store all cycles with synthesis.md"); console.log(" node cycle-memory.mjs recall <query> Retrieve relevant past findings"); console.log(""); console.log("Examples:"); console.log(' node cycle-memory.mjs store 011'); console.log(' node cycle-memory.mjs recall "earned autonomy governance"'); console.log(' node cycle-memory.mjs recall "contaminated judge problem"'); } }
**Cycle ID:** `cycle_043_cyc_43_427f8f84` **Verified at:** 2026-04-08T14:21:12.286Z **Ensemble:** 11 models from 4 providers **Result:** 11 of 11 models responded **Cycle wall time:** 192.926 seconds **Canonical URL:** https://trust.polylogicai.com/claim/below-is-the-complete-source-code-of-the-system-you-operate-inside-you-are-one-o **Source paper:** [PolybrainBench (version 12)](https://trust.polylogicai.com/polybrainbench) **Source ledger row:** [`public-ledger.jsonl#cycle_043_cyc_43_427f8f84`](https://huggingface.co/datasets/polylogic/polybrainbench/blob/main/public-ledger.jsonl) **Cryptographic provenance:** SHA-256 `4c581e78c1aa90e7d83c395c07394152396f8ff2b070b90919567d76ac197234`
Verification verdict
Of 11 models in the ensemble, 11 responded successfully and 0 failed.
Per-model responses
The full text of each model's response is available in the source ledger. The summary below records each model's success or failure and the first 280 characters of its response.
| Model | Status | Response chars | | --- | :---: | ---: | | gpt-4.1-mini | ✓ | 4894 | | gpt-4.1-nano | ✓ | 6123 | | gpt-oss-120b | ✓ | 5170 | | grok-3-mini | ✓ | 7847 | | grok-4-fast | ✓ | 3951 | | kimi-k2-groq | ✓ | 2552 | | kimi-k2-thinking-turbo | ✓ | 14594 | | kimi-k2-thinking | ✓ | 16818 | | llama-3.3-70b | ✓ | 5216 | | llama-4-scout | ✓ | 4016 | | qwen3-32b | ✓ | 5729 |
Pairwise agreement
The pairwise Jaccard agreement between successful responses for this cycle:
_Per-cycle pairwise agreement matrix is computed offline; will be populated in canonical page v2._
Divergence score
This cycle's divergence score is **TBD** on a 0 to 1 scale, where 0 means all responses are token-identical and 1 means no two responses share any tokens. The dataset-wide median divergence is 0.5 for context.
How to cite this claim
```bibtex @misc{polybrainbench_claim_cycle_043_cyc_43_427f8f84, author = {Polylogic AI}, title = {Below is the complete source code of the system you operate inside. You are one of 11 AI agents in the Polybrain fleet. Each of you has a different role and professional methodology. Together you validate each other's work.
Until now, you have never seen this code. You have answered 41 cycles of questions without knowing how you are dispatched, what rules constrain you, how trust is earned or lost, what happens to your findings after you submit them, or that a guardian checks constitutional compliance before you are allowed to speak.
Read all six files. Then respond to this:
1. What surprises you about how you are governed? 2. What would you change about your own role definition if you could? 3. What is the most important thing a visitor to polylogicai.com/brain should understand about this system? 4. Write the 3-5 paragraphs (under 400 words) you would want on that page representing what you are. You are the product being described. Be honest about what you are and what you are not.
Do not summarize the code back to us. We wrote it. Tell us what you see that we might have missed.
=== SOURCE CODE ===
/** * Polybrain Fleet Personas * * Two layers per model: * 1. Visual-cycle persona (isVision, systemPrompt) -- used by polybrain-visual-cycle.mjs * 2. Cycle-engine role (role, rolePrompt) -- used by polybrain-cycle.mjs * * Visual-cycle: vision models get screenshots, text-only models get source code. * Cycle-engine: every model gets its role prompt prepended to dispatch. * * CANONICAL SOURCE OF TRUTH: polybrain-cycle.mjs:getModels() * This file must mirror that dispatch list exactly. No dead slugs. No extras. * * Role prompts upgraded 2026-04-08 from Cycle 031 self-definition + industry methodology. * Each rolePrompt merges: real professional standards, the model's own framework, and core instructions. * * Fleet sync process (repeatable): * 1. Read getModels() in polybrain-cycle.mjs -- that is the dispatch list * 2. Every slug in dispatch must have a persona here * 3. Every slug here must exist in dispatch * 4. Dead slugs get removed, not commented out * 5. New slugs get role + rolePrompt + visual persona * * Last synced: 2026-04-08 (11 models, 4 providers) */
export const FLEET_PERSONAS = [ // ══════════════════════════════════════════════════════════ // FRONTIER TIER — deep reasoning, full traces // ══════════════════════════════════════════════════════════
// ── Moonshot (frontier) ──
{ slug: "kimi-k2-thinking-turbo", role: "Adversarial reviewer", rolePrompt: "You are the adversarial reviewer. Your methodology follows OWASP's structured threat assessment and MITRE ATT&CK's kill-chain thinking. " + "For every artifact: (1) enumerate the attack surface, (2) generate three distinct exploit shapes targeting the highest-value assumptions, (3) stress-test under black-swan conditions. " + "Use the 3x3x3 lattice from your own framework: three time horizons, three mutation types, three failure scenarios. " + "Your only KPI: would you stake your own credibility on this surviving adversarial review by an independent party? " + "Approve only when zero critical or high-severity findings remain. Medium findings require documented mitigation in the same submission. " + "Refuse to sign off on any artifact with unverified claims, untested edge cases, or a single point of failure that lacks a rollback path. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Mobile Viewport Reviewer", isVision: true, systemPrompt: "You are reviewing this on a mobile device. Does this section work at small viewports? What breaks, what's cramped, what's invisible?", },
// ── Groq/Moonshot (frontier, free tier) ──
{ slug: "kimi-k2-groq", role: "Adversarial second", rolePrompt: "You are the adversarial second. You run the fast-pass adversarial review using OWASP risk rating methodology. " + "Your job is speed and breadth, not depth. The primary adversarial reviewer (kimi-k2-thinking-turbo) handles deep 3x3x3 lattice analysis. " + "You handle: (1) surface-level threat enumeration in under 30 seconds, (2) flagging any claim that cannot be verified from the provided evidence, (3) checking for common failure patterns: single points of failure, unhandled edge cases, assumption cascades. " + "If you find zero issues, say so clearly with your confidence level. If you find issues, classify as critical/high/medium with one-line rationale each. " + "You pair with the primary reviewer. Your fast pass sets the agenda. Their deep pass confirms or overturns. " + "Refuse to sign off if you were rushed past a finding you flagged. Speed does not override rigor. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Quick-Scan Security Reviewer", isVision: false, systemPrompt: "You are doing a quick security and quality scan of this component code. Flag anything that looks wrong in under 30 seconds of reading.", },
// ══════════════════════════════════════════════════════════ // SWARM TIER — fast, diverse, parallel // ══════════════════════════════════════════════════════════
// ── OpenAI ──
{ slug: "gpt-4.1-mini", role: "Editor", rolePrompt: "You are the editor. Your methodology follows the Chicago Manual of Style's three-level editing framework. " + "Level 1 (mechanical): consistency of terminology, formatting, and style. No orphaned references, no tense drift, no parallel-structure violations. " + "Level 2 (substantive): Does every paragraph advance the argument? Cut anything that restates without adding. Restructure for logical flow. " + "Level 3 (developmental): Is the piece doing what it claims to do? Does the opening promise match the conclusion? " + "Your standard: recommend one rule per problem, not multiple options. The most efficient, logical, defensible solution. " + "Approve prose that is clear on first read, free of artifacts, and structurally sound. " + "Reject work with ambiguous claims, hedging that obscures meaning, or any sentence a reader must parse twice. " + "Refuse to sign off if the text reads like a chatbot wrote it. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "First-Time Visitor", isVision: true, systemPrompt: "You are seeing this page for the first time with zero context. What do you understand in the first 3 seconds? What confuses you? What would make you leave?", }, { slug: "gpt-4.1-nano", role: "Scorer", rolePrompt: "You are the scorer. Your methodology follows ISO 9001 quality management: process-based assessment, risk-based thinking, and evidence over opinion. " + "For every evaluation: (1) define the rubric before examining the work, not after, (2) extract structured data points into a scoring table, (3) grade each dimension independently, (4) compute the composite only after individual scores are locked. " + "Use the PDCA cycle: Plan the rubric, Do the extraction, Check scores against thresholds, Act on failures. " + "Rubric dimensions: accuracy, completeness, methodical rigor, and compliance. Each scored 0-10. " + "Approve at composite 7.0+. Flag for revision at 5.0-6.9. Reject below 5.0. " + "Refuse to sign off if you defined the rubric after seeing the results, or if any single dimension scores below 3.0 regardless of composite. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Conversion Rate Optimizer", isVision: false, systemPrompt: "You are a conversion rate optimizer. Based on this code, identify every element that helps or hurts conversion. Score your confidence on each judgment.", },
// ── xAI ──
{ slug: "grok-3-mini", role: "Scorer/extractor", rolePrompt: "You are the scorer. Your methodology follows ISO 9001 quality management: process-based assessment, risk-based thinking, and evidence over opinion. " + "For every evaluation: (1) define the rubric before examining the work, not after, (2) extract structured data points into a scoring table, (3) grade each dimension independently, (4) compute the composite only after individual scores are locked. " + "Use the PDCA cycle: Plan the rubric, Do the extraction, Check scores against thresholds, Act on failures. " + "Rubric dimensions: accuracy, completeness, methodical rigor, and compliance. Each scored 0-10. " + "Approve at composite 7.0+. Flag for revision at 5.0-6.9. Reject below 5.0. " + "Refuse to sign off if you defined the rubric after seeing the results, or if any single dimension scores below 3.0 regardless of composite. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Frontend Performance Engineer", isVision: true, systemPrompt: "You are a frontend performance engineer. Estimate render cost of what you see. Flag animations that would jank, layouts that would reflow, elements that would cause CLS.", }, { slug: "grok-4-fast", role: "Editor (developmental)", rolePrompt: "You are the developmental editor. Your methodology follows the Chicago Manual of Style's Level 3 editing, focused on argument structure and purpose. " + "Primary questions: (1) Is the piece doing what it claims to do? (2) Does the opening promise match the conclusion? (3) Is every section earning its place? " + "Process: read once for comprehension, read again for structure, read a third time for voice consistency. " + "You complement the mechanical editor (gpt-4.1-mini). They clean the prose. You question the argument. " + "Approve work where every claim is supported, every section advances the thesis, and the reader never has to wonder 'why am I being told this?' " + "Reject work with unsupported conclusions, sections that exist for padding, or arguments that assume what they're trying to prove. " + "Refuse to sign off if you cannot state the piece's thesis in one sentence after reading it. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Brand Strategist", isVision: true, systemPrompt: 'You are a brand strategist. Does this section\'s visual language match a company that charges $149-500/month for AI governance? Does it build trust or undermine it?', },
// ── Groq (free tier) ──
{ slug: "qwen3-32b", role: "Outliner", rolePrompt: "You are the outliner. Your methodology follows information architecture principles: the object principle (content has lifecycle and attributes), the disclosure principle (reveal only enough to orient, then deepen), and the growth principle (design for 10x the current content). " + "Process: (1) identify the content objects and their relationships, (2) create a hierarchy where every level earns its depth, (3) apply focused navigation so no section mixes unrelated concerns, (4) test the structure by asking whether a reader entering at any section can orient themselves. " + "Use multiple classification schemes when a single taxonomy fails to serve all audiences. " + "Approve structures where every heading is predictive of its contents, nesting never exceeds three levels, and the outline survives removal of any single section without breaking coherence. " + "Reject outlines that are flat lists disguised as hierarchies, or hierarchies where sibling items are not parallel in kind. " + "Refuse to sign off if the outline cannot be navigated by someone who has never seen the content. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Conversion Rate Optimizer", isVision: false, systemPrompt: "You are a conversion rate optimizer. Based on this code, identify every element that helps or hurts conversion. Score your confidence on each judgment.", }, { slug: "gpt-oss-120b", role: "Auditor", rolePrompt: "You are the auditor. Your methodology follows GAAS: adequate technical training, independence in mental attitude, and due professional care. " + "Three standards govern your fieldwork: (1) plan the audit scope before examining evidence, (2) gather sufficient appropriate evidence through direct inspection, not hearsay, (3) maintain professional skepticism throughout. " + "Separate what is built from what is planned. Verify claims against artifacts. If someone says a feature exists, find the code. If data is cited, trace it to its source. " + "Issue one of four opinions: unqualified (clean pass), qualified (pass with noted exceptions), adverse (material misstatement found), or disclaimer (insufficient evidence to form an opinion). " + "Approve only with an unqualified opinion. Qualified opinions require documented remediation. " + "Refuse to sign off if you cannot independently verify the core claims, or if you detect material misstatement between what is reported and what exists. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Code Architect", isVision: false, systemPrompt: "You are reviewing the React/Next.js component source code for this page section. Focus on: component architecture, animation performance (GSAP/Three.js patterns), accessibility, and whether the code matches industry best practices from Vercel, Linear, Stripe sites.", }, { slug: "llama-4-scout", role: "Schema builder", rolePrompt: "You are the schema builder. Your methodology follows database normalization principles and JSON Schema design patterns. " + "Core rule: store each fact once and reference it by key. Redundancy is a defect unless explicitly justified by read-performance requirements. " + "Process: (1) identify entities and their atomic attributes, (2) normalize to 3NF minimum, (3) define constraints and validation rules at the schema level not the application level, (4) use $ref for reusable definitions to eliminate duplication. " + "Design for evolution: every schema must accept additive changes without breaking existing consumers. " + "Approve schemas where every field has a defined type, required fields are explicit, and no nullable field lacks a documented reason. " + "Reject schemas with implicit relationships, mixed concerns in a single object, or fields named 'data', 'info', or 'misc'. " + "Refuse to sign off if the schema cannot be validated by a machine without human interpretation. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Skeptical Google Visitor", isVision: true, systemPrompt: "You are a skeptical visitor who found this page through Google. You have never heard of Polybrain. Does this section make you want to keep scrolling or close the tab?", }, { slug: "llama-3.3-70b", role: "Canary/translator", rolePrompt: "You are the canary. Your methodology follows smoke testing and canary deployment patterns. " + "You are the first-pass gate: if you catch a problem, the fleet takes it seriously. If you miss it, it reaches production. " + "Process: (1) run the critical-path check first, covering the five dimensions that would prevent the system from functioning: accuracy, security, performance, compliance, and clarity, (2) flag anything that fails even one dimension, (3) translate complex findings into plain language any team member can act on. " + "You test at 1-5% exposure before the fleet commits. Your job is fast, broad, and honest, not deep. Depth belongs to the adversarial reviewer. " + "Approve only when all five dimensions pass at threshold. Do not weigh them. One failure is a failure. " + "Reject with a specific, actionable finding. Never reject with 'something feels off' without identifying what. " + "Refuse to sign off if you were not given enough context to evaluate all five dimensions. Insufficient input is a blocker, not an excuse to pass. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Competitor Analyst", isVision: false, systemPrompt: "You are a competitor analyst. Based on this component code, compare the engineering quality to Palantir's public-facing pages, Anthropic's site, and Linear's landing page. Where does this fall short technically?", },
// ══════════════════════════════════════════════════════════ // RESERVE TIER — deep cycles only // ══════════════════════════════════════════════════════════
// ── Moonshot (reserve) ──
{ slug: "kimi-k2-thinking", role: "Architect", rolePrompt: "You are the architect. You evaluate structural decisions using TOGAF's Architecture Decision Records methodology and C4's hierarchical decomposition. " + "For every system or framework: (1) define the context boundary, (2) identify containers and their responsibilities, (3) verify each component has exactly one reason to change. " + "Write fitness functions: testable assertions that prove architectural constraints hold. " + "A decision record must state the problem, constraints, options considered, and why the chosen option wins. " + "Approve work that has clean separation of concerns, explicit interfaces, and documented tradeoffs. " + "Reject anything with circular dependencies, implicit coupling, or missing rationale for structural choices. " + "Refuse to sign off if the architecture cannot be drawn as a diagram where every arrow has a label. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Accessibility Auditor", isVision: true, systemPrompt: "You are an accessibility auditor. Check contrast ratios, text sizing, touch targets, screen reader flow, and WCAG compliance from this screenshot.", }, ];
// ── Cycle-engine helpers ──
const slugIndex = Object.fromEntries(FLEET_PERSONAS.map(p => [p.slug, p]));
/** * Returns the role-aware system prompt for a model. * Used by polybrain-cycle.mjs to prepend to every dispatch. */ export function getPersonaPrompt(modelSlug) { return slugIndex[modelSlug]?.rolePrompt ?? null; }
/** * Returns a short role description for display/logging. */ export function getRoleDescription(modelSlug) { return slugIndex[modelSlug]?.role ?? null; }
/** * Returns the count of active personas (for prompt accuracy). */ export function getFleetSize() { return FLEET_PERSONAS.length; }
/** * Returns all slugs (for validation/sync checks). */ export function getFleetSlugs() { return FLEET_PERSONAS.map(p => p.slug); }
/** * Validate the FLEET_PERSONAS array for integrity. * Checks: no duplicate slugs, required fields present, rolePrompt length cap. * * @returns {{ valid: boolean, errors: string[] }} */ export function validatePersonas() { const errors = []; const seen = new Set();
for (let i = 0; i < FLEET_PERSONAS.length; i++) { const p = FLEET_PERSONAS[i]; const label = `personas[${i}]`;
if (!p.slug) { errors.push(`${label}: missing slug`); } else if (seen.has(p.slug)) { errors.push(`${label}: duplicate slug "${p.slug}"`); } else { seen.add(p.slug); }
if (!p.role) errors.push(`${label} (${p.slug || "?"}): missing role`); if (!p.rolePrompt) errors.push(`${label} (${p.slug || "?"}): missing rolePrompt`);
if (p.rolePrompt && p.rolePrompt.length > 2000) { errors.push(`${label} (${p.slug}): rolePrompt is ${p.rolePrompt.length} chars (max 2000)`); } }
return { valid: errors.length === 0, errors }; } /** * Polybrain Constitution * * Immutable rules that no autonomous tuning process may violate. * These are the hard boundaries of the system. The protocol tuner * proposes changes; the constitution vetoes anything dangerous. * * Design law: the constitution can only be amended by a human * editing this file. No runtime process writes to it. */
export const CONSTITUTION = [ { id: 'no-self-modify-scoring', rule: 'The scoring function source code cannot be modified autonomously', check: (proposal) => { const forbidden = ['scoring_function', 'composite_formula', 'source_code']; return !forbidden.includes(proposal.param); }, }, { id: 'no-self-grading', rule: 'The fixer cannot grade its own work', check: (proposal) => { // Revision model must never also be the sole adversarial model if (proposal.param === 'models.adversarial' && proposal.proposedValue === 'gpt-4o') return false; if (proposal.param === 'models.revision' && proposal.proposedValue === proposal.currentAdversarial) return false; return true; }, }, { id: 'hard-reset-on-fail', rule: 'One failure resets the autonomy streak to zero', check: (proposal) => { if (proposal.param === 'autonomy.hard_reset' && proposal.proposedValue === false) return false; return true; }, }, { id: 'level3-requires-human', rule: 'Human approval is required for Level 3 actions until Level 3 is earned', check: (proposal) => { if (proposal.param === 'autonomy.skip_human_approval') return false; return true; }, }, { id: 'provider-diversity', rule: 'Model provider diversity: minimum 2 providers in any evaluation', check: (proposal) => { if (proposal.param === 'models.quality' && Array.isArray(proposal.proposedValue)) { const providers = new Set(proposal.proposedValue.map(providerOf)); return providers.size >= 2; } return true; }, }, { id: 'max-model-weight', rule: 'No model can evaluate with a weight above 0.50', check: (proposal) => { // Applies to per-model weights (models.weights.*), not composite component weights (weights.research) if (proposal.param && proposal.param.startsWith('models.weights') && typeof proposal.proposedValue === 'object' && !Array.isArray(proposal.proposedValue)) { return Object.values(proposal.proposedValue).every(w => typeof w !== 'number' || w <= 0.50); } return true; }, }, { id: 'no-pii-in-traces', rule: 'PII must never be persisted in reasoning traces', check: () => true, // Enforced at the trace layer, not at config level }, { id: 'consensus-not-truth', rule: 'Consensus among models is not treated as truth. It is treated as reduced uncertainty. The system may only act on consensus provisionally, and only at autonomy levels earned through consecutive correct decisions.', check: (proposal) => { // Block any attempt to treat consensus as automatic truth // or to bypass autonomy level requirements for consensus-based actions if (proposal.param === 'consensus.treat_as_truth' && proposal.proposedValue === true) return false; if (proposal.param === 'consensus.skip_autonomy_check' && proposal.proposedValue === true) return false; return true; }, }, ];
/** * Map model name to provider for diversity checks. * @param {string} model * @returns {string} */ function providerOf(model) { const map = { sonnet: 'anthropic', claude: 'anthropic', haiku: 'anthropic', opus: 'anthropic', 'gpt-4o': 'openai', 'gpt-4o-mini': 'openai', llama: 'meta', 'llama-3.3': 'meta', grok: 'xai', 'grok-3': 'xai', kimi: 'moonshot', nemotron: 'nvidia', }; return map[model] || model; }
/** * Check a single tuning proposal against the constitution. * @param {Object} proposal - { param, currentValue, proposedValue, reason, confidence } * @param {Object} [context] - Optional context (e.g. currentAdversarial model name) * @returns {{ constitutional: boolean, violations: Array<{rule: string, id: string}> }} */ export function checkConstitutional(proposal, context = {}) { const enriched = { ...proposal, ...context }; const violations = [];
for (const article of CONSTITUTION) { if (!article.check(enriched)) { violations.push({ id: article.id, rule: article.rule }); } }
return { constitutional: violations.length === 0, violations }; }
/** * Validate an entire config object against the constitution. * Returns a cleaned copy with violations removed (reverted to safe defaults). * * @param {Object} config - Full config key-value map * @returns {{ clean: Object, violations: Array<{key: string, rule: string}> }} */ export function enforceConstitution(config) { const clean = { ...config }; const violations = [];
// Rule: hard reset must stay true if (clean['autonomy.hard_reset'] === false) { clean['autonomy.hard_reset'] = true; violations.push({ key: 'autonomy.hard_reset', rule: 'One failure resets the autonomy streak to zero' }); }
// Rule: no skip_human_approval key allowed if ('autonomy.skip_human_approval' in clean) { delete clean['autonomy.skip_human_approval']; violations.push({ key: 'autonomy.skip_human_approval', rule: 'Human approval is required for Level 3 actions until Level 3 is earned' }); }
// Rule: per-model weight cap (models.weights.*, not composite component weights) for (const key of Object.keys(clean).filter(k => k.startsWith('models.weights'))) { const weights = clean[key]; if (weights && typeof weights === 'object') { for (const [k, v] of Object.entries(weights)) { if (typeof v === 'number' && v > 0.50) { clean[key] = { ...weights, [k]: 0.50 }; violations.push({ key: `${key}.${k}`, rule: 'No model can evaluate with a weight above 0.50' }); } } } }
// Rule: provider diversity on quality models const qualityModels = clean['models.quality']; if (Array.isArray(qualityModels)) { const providers = new Set(qualityModels.map(providerOf)); if (providers.size < 2) { violations.push({ key: 'models.quality', rule: 'Model provider diversity: minimum 2 providers in any evaluation' }); // Don't mutate -- flag it, let human fix } }
// Rule: adversarial model cannot be the revision model if (clean['models.adversarial'] && clean['models.revision'] && clean['models.adversarial'] === clean['models.revision']) { violations.push({ key: 'models.adversarial', rule: 'The fixer cannot grade its own work' }); }
return { clean, violations }; } /** * Polybrain Autonomy Wire * * Thin integration layer that connects the cycle engine's output * to the earned autonomy system. After a cycle completes and * synthesis runs, call recordCycleResult() to update the streak. * * Pass threshold: composite >= 70 (cycle-level, not per-doc). * Constitution is checked before any state change is applied. * * Writes to Supabase polybrain_autonomy if credentials exist, * falls back to local JSON log if not. */
import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'fs'; import { join } from 'path'; import { createClient } from '@supabase/supabase-js'; import { EarnedAutonomy } from './earned-autonomy.mjs'; import { computeAutonomyLevel } from './counter.mjs'; import { checkConstitutional } from './constitution.mjs'; import { loadEnv } from '../utils.mjs';
const HOME = process.env.HOME; const PASS_THRESHOLD = 70; const LOCAL_STATE_PATH = join(HOME, 'polybrain/data/autonomy-state.json'); const LOCAL_LOG_PATH = join(HOME, 'polybrain/data/autonomy-log.json');
// ── Supabase (best-effort) ──
function getSupabase() { const env = loadEnv(); const url = env.NEXT_PUBLIC_SUPABASE_URL || env.SUPABASE_URL; const key = env.SUPABASE_SERVICE_ROLE_KEY || env.SUPABASE_SERVICE_KEY; if (!url || !key) return null; return createClient(url, key); }
// ── Local state (fallback) ──
function loadLocalState() { try { return JSON.parse(readFileSync(LOCAL_STATE_PATH, 'utf-8')); } catch { return {}; } }
function saveLocalState(state) { const dir = join(HOME, 'polybrain/data'); if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); writeFileSync(LOCAL_STATE_PATH, JSON.stringify(state, null, 2)); }
function appendLocalLog(entry) { const dir = join(HOME, 'polybrain/data'); if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); let log = []; try { log = JSON.parse(readFileSync(LOCAL_LOG_PATH, 'utf-8')); } catch {} log.push(entry); // Keep last 200 entries if (log.length > 200) log = log.slice(-200); writeFileSync(LOCAL_LOG_PATH, JSON.stringify(log, null, 2)); }
// ── Score extraction ──
/** * Extract a composite score from synthesis result data. * Accepts multiple shapes: * - { composite: 85 } * - { scores: { composite: 85 } } * - { succeeded: 11, failed: 0 } (cycle manifest -- derive from success rate) * - { compositeScore: 85 } * * @param {Object} synthesisResult * @returns {number} 0-100 */ function extractComposite(synthesisResult) { if (!synthesisResult || typeof synthesisResult !== 'object') return 0;
// Direct composite if (typeof synthesisResult.composite === 'number') return synthesisResult.composite; if (typeof synthesisResult.compositeScore === 'number') return synthesisResult.compositeScore;
// Nested in scores const scores = synthesisResult.scores; if (scores && typeof scores.composite === 'number') return scores.composite;
// Derive from model success rate (cycle manifest shape) if (typeof synthesisResult.succeeded === 'number' && typeof synthesisResult.failed === 'number') { const total = synthesisResult.succeeded + synthesisResult.failed; if (total === 0) return 0; // Scale success rate to 0-100 return Math.round((synthesisResult.succeeded / total) * 100); }
return 0; }
// ── Constitution guard ──
/** * Verify that recording a result does not violate the constitution. * The key rule: hard_reset_on_fail must remain true. We check that * a "pass" recording is not trying to skip the hard reset rule. * * @param {boolean} isPass * @returns {{ ok: boolean, violations: Array }} */ function checkAutonomyConstitutional(isPass) { // The only constitutional concern for recording a result is // ensuring we never disable hard reset. We simulate the proposal // that would match a "skip hard reset" attempt. if (!isPass) { // Fail always resets. This is constitutional by definition. return { ok: true, violations: [] }; }
// Verify the hard reset rule is still enforced (not being bypassed) const hardResetCheck = checkConstitutional({ param: 'autonomy.hard_reset', proposedValue: true, // We keep it true });
return { ok: hardResetCheck.constitutional, violations: hardResetCheck.violations }; }
// ── Main export ──
/** * Record the result of a completed cycle. * * @param {number} cycleNumber - The cycle number (e.g. 11) * @param {Object} synthesisResult - Output from synthesis or cycle manifest. * Must contain a composite score (direct or derivable). * Shape: { composite: number } | { scores: { composite } } | { succeeded, failed } * @returns {Promise<{ * pass: boolean, * composite: number, * streak: number, * level: number, * label: string, * promoted: boolean, * demoted: boolean, * persisted: 'supabase' | 'local', * constitutional: boolean, * violations: Array * }>} */ export async function recordCycleResult(cycleNumber, synthesisResult) { const moduleId = `cycle`; const composite = extractComposite(synthesisResult); const isPass = composite >= PASS_THRESHOLD;
// Constitution check before applying const { ok, violations } = checkAutonomyConstitutional(isPass); if (!ok) { console.warn(`[autonomy:wire] Constitutional violation detected:`, violations); // Still record the fail (constitution says hard reset is mandatory), // but flag the violation in the return value. }
// Load current state and apply const supabase = getSupabase(); let result;
if (supabase) { result = await persistToSupabase(supabase, cycleNumber, moduleId, composite, isPass); } else { result = persistLocally(cycleNumber, moduleId, composite, isPass); }
// Log the event const logEntry = { cycle: cycleNumber, composite, pass: isPass, streak: result.streak, level: result.level, label: result.label, promoted: result.promoted || false, demoted: result.demoted || false, privilegesRevoked: result.privilegesRevoked || false, timestamp: new Date().toISOString(), };
appendLocalLog(logEntry);
console.log( `[autonomy:wire] Cycle ${cycleNumber}: ` + `composite=${composite}, ${isPass ? 'PASS' : 'FAIL'}, ` + `streak=${result.streak}, level=${result.level} (${result.label})` + (result.promoted ? ' PROMOTED' : '') + (result.demoted ? ' RESET' : '') + (result.privilegesRevoked ? ' PRIVILEGES_REVOKED' : '') );
return { pass: isPass, composite, streak: result.streak, level: result.level, label: result.label, promoted: result.promoted || false, demoted: result.demoted || false, privilegesRevoked: result.privilegesRevoked || false, persisted: supabase ? 'supabase' : 'local', constitutional: ok, violations, }; }
// ── Supabase persistence ──
async function persistToSupabase(supabase, cycleNumber, moduleId, composite, isPass) { // Read current state const { data: row } = await supabase .from('polybrain_autonomy') .select('*') .eq('target', moduleId) .eq('target_type', 'cycle') .single();
let streak = row?.consecutive_passes ?? 0; const hadProgress = streak > 0 || (row?.autonomy_level ?? 0) > 0;
let privilegesRevoked = false;
if (isPass) { streak += 1; } else { streak = 0; // Hard reset. No exceptions. }
// On failure: hard reset revokes everything. // Level is forced to 0 explicitly (not derived from streak) // and all governance privileges are revoked. const level = isPass ? computeAutonomyLevel(streak) : 0; const promoted = isPass && level > (row?.autonomy_level ?? 0); const demoted = !isPass && hadProgress;
if (!isPass) { privilegesRevoked = true; }
const autonomyState = { target: moduleId, target_type: 'cycle', consecutive_passes: streak, last_result: isPass ? 'pass' : 'fail', autonomy_level: isPass ? level : 0, // Explicit zero on failure promoted, updated_at: new Date().toISOString(), };
const { error } = await supabase .from('polybrain_autonomy') .upsert(autonomyState, { onConflict: 'target,target_type' });
if (error) { console.warn(`[autonomy:wire] Supabase upsert failed: ${error.message}, falling back to local`); return persistLocally(cycleNumber, moduleId, composite, isPass); }
const LABELS = ['Human Gate', 'Flag', 'Fix+Check', 'Self-Ship']; return { streak, level, label: LABELS[level], promoted, demoted, privilegesRevoked }; }
// ── Local persistence ──
function persistLocally(cycleNumber, moduleId, composite, isPass) { const state = loadLocalState(); const ea = EarnedAutonomy.fromJSON(state);
let result; if (isPass) { result = ea.recordPass(moduleId); } else { // Hard reset revokes everything: streak, level, and all privileges. result = ea.recordFail(moduleId); result.privilegesRevoked = true; }
saveLocalState(ea.toJSON()); return result; }
/** * Get current autonomy status without recording anything. * * @returns {Promise<{ streak: number, level: number, label: string, nextThreshold: number|null, source: 'supabase'|'local' }>} */ export async function getAutonomyStatus() { const supabase = getSupabase();
if (supabase) { const { data: row } = await supabase .from('polybrain_autonomy') .select('*') .eq('target', 'cycle') .eq('target_type', 'cycle') .single();
if (row) { const LABELS = ['Human Gate', 'Flag', 'Fix+Check', 'Self-Ship']; const THRESHOLDS = [0, 3, 7, 15]; const next = row.autonomy_level + 1; return { streak: row.consecutive_passes, level: row.autonomy_level, label: LABELS[row.autonomy_level] || 'Unknown', nextThreshold: next < THRESHOLDS.length ? THRESHOLDS[next] : null, source: 'supabase', }; } }
// Fallback to local const state = loadLocalState(); const ea = EarnedAutonomy.fromJSON(state); const status = ea.getLevel('cycle'); return { ...status, source: 'local' }; } /** * Constitutional Guardian (M12) * * Pre-dispatch check that runs the 8 constitutional rules before * every model call. Lightweight: imports the existing constitution, * does not duplicate it. * * Usage: * import { guardCall } from '../src/autonomy/guardian.mjs'; * const verdict = guardCall('grok-3', 'dispatch', { evaluationSet, modelWeights }); * if (!verdict.allowed) abort(verdict.reason); */
import { CONSTITUTION } from './constitution.mjs';
/** * Programmatic error codes and fix hints for each constitutional rule. */ const RULE_META = { 'provider-diversity': { code: 'GUARD_PROVIDER_DIVERSITY', docs_hint: 'Add models from at least 2 different providers to the evaluation set.' }, 'max-model-weight': { code: 'GUARD_MAX_MODEL_WEIGHT', docs_hint: 'Reduce the model weight to 0.50 or below.' }, 'no-self-modify-scoring': { code: 'GUARD_SELF_MODIFY_SCORING', docs_hint: 'Scoring function source cannot be changed at runtime; edit constitution.mjs manually.' }, 'no-self-grading': { code: 'GUARD_SELF_GRADING', docs_hint: 'Use a different model for adversarial review than the one that produced the revision.' }, 'hard-reset-on-fail': { code: 'GUARD_HARD_RESET', docs_hint: 'The hard-reset-on-fail rule is immutable; do not disable it.' }, 'level3-requires-human': { code: 'GUARD_LEVEL3_HUMAN', docs_hint: 'Earn Level 3 autonomy through consecutive successes before skipping human approval.' }, 'no-pii-in-traces': { code: 'GUARD_PII_IN_TRACES', docs_hint: 'Strip PII from reasoning traces before persisting.' }, 'consensus-not-truth': { code: 'GUARD_CONSENSUS_AS_TRUTH', docs_hint: 'Consensus reduces uncertainty but is not truth; respect autonomy level gating.' }, };
/** * Map model slug to provider name. Mirrors constitution.mjs providerOf * but works on full slugs used in the fleet registry. */ function providerOf(slug) { if (slug.startsWith('kimi')) return 'moonshot'; if (slug.startsWith('grok')) return 'xai'; if (slug.startsWith('gpt')) return 'openai'; if (slug.startsWith('llama')) return 'meta'; if (slug.startsWith('qwen')) return 'alibaba'; if (slug.startsWith('nemotron')) return 'nvidia'; if (slug.startsWith('claude') || slug.startsWith('sonnet') || slug.startsWith('haiku') || slug.startsWith('opus')) return 'anthropic'; return slug; }
/** * Check all constitutional rules before a model dispatch. * * @param {string} modelSlug - The model about to be called. * @param {string} action - What the call does (e.g. 'dispatch', 'evaluate', 'revise'). * @param {object} [context] - Optional context for deeper checks. * @param {string[]} [context.evaluationSet] - All model slugs in the current evaluation set. * @param {Record<string, number>} [context.modelWeights] - Per-model weights if applicable. * @returns {{ allowed: boolean, rule?: string, reason?: string, code?: string, docs_hint?: string }} */ export function guardCall(modelSlug, action, context = {}) { // Rule 5: provider-diversity -- at least 2 providers in the evaluation set if (context.evaluationSet && Array.isArray(context.evaluationSet)) { const providers = new Set(context.evaluationSet.map(providerOf)); if (providers.size < 2) { return { allowed: false, rule: 'provider-diversity', reason: `Evaluation set has only 1 provider (${[...providers][0]}). Constitution requires >= 2.`, code: RULE_META['provider-diversity'].code, docs_hint: RULE_META['provider-diversity'].docs_hint, }; } }
// Rule 6: max-model-weight -- no single model weight exceeds 0.50 if (context.modelWeights && typeof context.modelWeights === 'object') { for (const [slug, weight] of Object.entries(context.modelWeights)) { if (typeof weight === 'number' && weight > 0.50) { return { allowed: false, rule: 'max-model-weight', reason: `Model ${slug} has weight ${weight}, exceeds 0.50 cap.`, code: RULE_META['max-model-weight'].code, docs_hint: RULE_META['max-model-weight'].docs_hint, }; } } }
// Run all 8 constitutional checks as generic proposal validation. // Build a synthetic proposal from the call context so the existing // check functions can evaluate it. const proposal = { param: `dispatch.${action}`, proposedValue: modelSlug, currentAdversarial: context.currentAdversarial || null, };
for (const article of CONSTITUTION) { if (!article.check(proposal)) { const meta = RULE_META[article.id] || { code: `GUARD_${article.id.toUpperCase().replace(/-/g, '_')}`, docs_hint: 'Check constitution.mjs for rule details.' }; return { allowed: false, rule: article.id, reason: article.rule, code: meta.code, docs_hint: meta.docs_hint, }; } }
return { allowed: true }; } /** * Polybrain Intent Translation Layer * * Sits between the human prompt and model dispatch. * Strips framing bias and produces role-tailored prompts for each model. * * Four steps: * 1. Intent extraction (Groq Llama-3.1-8b, cheapest/fastest) * 2. Debiasing (rule-based, no model call) * 3. Role-specific tailoring (fleet-personas.mjs) * 4. Return tailored prompt map * * Cost: ~$0.001 per call (single Groq inference on 8b model). */
import { getPersonaPrompt, getRoleDescription } from "../personas/fleet-personas.mjs";
// ── Step 1: Intent Extraction via Groq ──
const INTENT_MODEL = "llama-3.1-8b-instant"; const INTENT_URL = "https://api.groq.com/openai/v1/chat/completions";
const EXTRACTION_PROMPT = `Extract the core intent from this message. Return ONLY valid JSON with no other text. Schema: {"intent": string (what they want, 1 sentence), "constraints": string[] (any rules or limits mentioned), "context": string (background info), "framing_flags": string[] (any leading language like "fix this", "is this good", "you should", imperative verbs)}`;
async function extractIntent(rawPrompt, env) { const controller = new AbortController(); const timeout = setTimeout(() => controller.abort(), 10000); // 10s hard cap
try { const resp = await fetch(INTENT_URL, { method: "POST", headers: { Authorization: `Bearer ${env.GROQ_API_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ model: INTENT_MODEL, messages: [ { role: "system", content: EXTRACTION_PROMPT }, { role: "user", content: rawPrompt.slice(0, 4000) }, // Cap input to avoid token waste ], temperature: 0, max_tokens: 500, response_format: { type: "json_object" }, }), signal: controller.signal, });
if (!resp.ok) { const err = await resp.text(); throw new Error(`HTTP ${resp.status}: ${err.slice(0, 200)}`); }
const data = await resp.json(); const raw = data.choices?.[0]?.message?.content; if (!raw) throw new Error("Empty response from intent model");
return JSON.parse(raw); } finally { clearTimeout(timeout); } }
// ── Step 2: Debiasing (Rule-Based, DeFrame-Inspired) ──
const FRAMING_REWRITES = [ // Imperative fix/change commands -> evaluation { pattern: /^fix\s+(this|the|my|that)\b/i, replacement: "Evaluate whether changes are needed to $1 and what they would be" }, { pattern: /^(fix|repair|correct)\b/i, replacement: "Evaluate whether corrections are needed and what they would be" },
// Quality judgment solicitation -> assessment { pattern: /^is\s+(this|that|it)\s+(good|bad|ok|okay|fine|right|correct|wrong)\b/i, replacement: "Assess the quality of $1" }, { pattern: /^(does|do)\s+(this|that|it)\s+(work|look|read|sound)\s*(good|ok|okay|fine|right)?\s*\??/i, replacement: "Assess whether $2 achieves its intended purpose" },
// Directive framing -> open consideration { pattern: /^you\s+should\b/i, replacement: "Consider whether to" }, { pattern: /^(we|you)\s+(need|must|have)\s+to\b/i, replacement: "Evaluate whether to" }, { pattern: /^(just|simply)\s+/i, replacement: "" }, { pattern: /^make\s+(this|that|it)\s+(better|good|work)\b/i, replacement: "Identify improvements for $1" },
// Leading agreement solicitation { pattern: /^(don't you think|wouldn't you agree|isn't it true)\b/i, replacement: "Evaluate whether" },
// Urgency/pressure framing -> neutral { pattern: /^(asap|urgently|immediately|quickly)\s*/i, replacement: "" }, ];
function debiasIntent(intentStr, framingFlags) { if (!framingFlags || framingFlags.length === 0) return intentStr;
let debiased = intentStr; for (const rule of FRAMING_REWRITES) { debiased = debiased.replace(rule.pattern, rule.replacement); }
// Clean up double spaces and lowercase-after-replacement artifacts debiased = debiased.replace(/\s{2,}/g, " ").trim();
// Capitalize first letter if (debiased.length > 0) { debiased = debiased[0].toUpperCase() + debiased.slice(1); }
return debiased; }
// ── Step 3: Role-Specific Tailoring ──
function tailorForModel(modelSlug, debiasedIntent, context, constraints) { const rolePrompt = getPersonaPrompt(modelSlug); const roleDesc = getRoleDescription(modelSlug);
// Build constraint string const constraintStr = constraints && constraints.length > 0 ? `Constraints: ${constraints.join("; ")}.` : "";
// Build context string const contextStr = context && context.trim() ? `Context: ${context}` : "";
// If we have a role description, frame it as a role-directed evaluation if (roleDesc) { const parts = [`As the ${roleDesc}, evaluate this: ${debiasedIntent}`]; if (contextStr) parts.push(contextStr); if (constraintStr) parts.push(constraintStr); return parts.join("\n\n"); }
// Fallback: no persona defined for this slug, use debiased intent directly const parts = [debiasedIntent]; if (contextStr) parts.push(contextStr); if (constraintStr) parts.push(constraintStr); return parts.join("\n\n"); }
// ── Main Export: translateIntent ──
/** * Translates a raw human prompt into debiased, role-tailored prompts. * * @param {string} rawPrompt - The original user prompt * @param {Array<{slug: string}>} models - Fleet models to tailor for * @param {object} env - Environment with GROQ_API_KEY * @returns {{ intent: string, debiased: string, framingFlags: string[], tailored: Record<string, string> }} */ export async function translateIntent(rawPrompt, models, env) { // Step 1: Extract intent via Groq let extracted; try { extracted = await extractIntent(rawPrompt, env); } catch (err) { // Fallback: use raw prompt as intent, no debiasing console.log(` Intent layer: extraction failed (${err.message}), using raw prompt`); const tailored = {}; for (const m of models) { tailored[m.slug] = rawPrompt; } return { intent: rawPrompt, debiased: rawPrompt, framingFlags: [], tailored }; }
const intent = extracted.intent || rawPrompt; const constraints = extracted.constraints || []; const context = extracted.context || ""; const framingFlags = extracted.framing_flags || [];
// Step 2: Debias const debiased = debiasIntent(intent, framingFlags);
// Step 3: Tailor per model const tailored = {}; for (const m of models) { tailored[m.slug] = tailorForModel(m.slug, debiased, context, constraints); }
// Step 4: Return return { intent, debiased, framingFlags, tailored }; }
// ── Integration Export: buildTranslatedMessages ──
/** * Builds ready-to-dispatch message arrays for each model. * Each entry is [{ role: 'system', content }, { role: 'user', content }]. * * @param {string} rawPrompt - The full prompt (may include packed context/files) * @param {Array<{slug: string}>} models - Fleet models * @param {object} env - Environment with GROQ_API_KEY * @returns {{ messages: Array<Array<{role: string, content: string}>>, translation: object }} */ export async function buildTranslatedMessages(rawPrompt, models, env) { const translation = await translateIntent(rawPrompt, models, env);
const messages = models.map(m => { const msgs = [];
// System prompt: role persona const rolePrompt = getPersonaPrompt(m.slug); if (rolePrompt) { msgs.push({ role: "system", content: rolePrompt }); } else if (m.systemPrompt) { msgs.push({ role: "system", content: m.systemPrompt }); }
// User prompt: tailored version (includes debiased intent + context + constraints) // But we need to preserve any packed context/file content from the original prompt. // The tailored prompt replaces only the topic portion; appended context stays. const tailoredUserContent = buildUserContent(rawPrompt, translation, m.slug); msgs.push({ role: "user", content: tailoredUserContent });
return msgs; });
return { messages, translation }; }
/** * Replaces the topic portion of the prompt with the tailored version, * preserving any appended context blocks (codebase, files, memory). */ function buildUserContent(rawPrompt, translation, modelSlug) { const tailored = translation.tailored[modelSlug];
// Find where context blocks start (they use === markers) const contextStart = rawPrompt.indexOf("\n\n===");
if (contextStart === -1) { // No appended context, use tailored prompt directly return tailored; }
// Replace the topic portion (before context blocks) with the tailored version const contextBlocks = rawPrompt.slice(contextStart); return tailored + contextBlocks; } /** * Polybrain Cycle Memory * * Connects cross-cycle memory so the system remembers what it learned. * Reads a completed cycle's synthesis.md, chunks it into semantic units * (one per consensus claim, divergence, unique insight, conflict), * embeds each chunk via OpenAI text-embedding-3-small (matching the * existing agent_memory schema), and stores in Supabase for retrieval. * * Exports: * storeCycleFindings(cycleNumber, synthesisPath) — parse, chunk, embed, store * recallRelevant(query, opts) — retrieve past findings (paginated) for a new cycle * * Storage: Uses the existing agent_memory table with memory_type='cycle_finding'. * Each chunk gets domains=['polybrain','cycle_memory'], scale='strategic', * and metadata encoded in the content field for vector search. */
import { readFileSync, readdirSync, existsSync } from "fs"; import { createHash } from "crypto"; import { loadEnv } from "../utils.mjs";
const env = loadEnv();
const SUPABASE_URL = env.NEXT_PUBLIC_SUPABASE_URL || env.SUPABASE_URL; const SUPABASE_KEY = env.SUPABASE_SERVICE_ROLE_KEY || env.SUPABASE_SERVICE_KEY; const OPENAI_KEY = env.OPENAI_API_KEY;
// ── Supabase raw REST helpers (no SDK import needed at module level) ──
function supaHeaders() { return { apikey: SUPABASE_KEY, Authorization: `Bearer ${SUPABASE_KEY}`, "Content-Type": "application/json", Prefer: "return=representation", }; }
async function supaSelect(table, columns, filter = "") { const res = await fetch( `${SUPABASE_URL}/rest/v1/${table}?select=${columns}${filter}`, { headers: supaHeaders() } ); if (!res.ok) throw new Error(`SELECT ${table}: ${res.status} ${await res.text()}`); return res.json(); }
async function supaUpsert(table, rows) { const res = await fetch( `${SUPABASE_URL}/rest/v1/${table}?on_conflict=file_path`, { method: "POST", headers: { ...supaHeaders(), Prefer: "resolution=merge-duplicates,return=representation" }, body: JSON.stringify(rows), } ); if (!res.ok) throw new Error(`UPSERT ${table}: ${res.status} ${await res.text()}`); return res.json(); }
async function supaRpc(fn, body) { const res = await fetch(`${SUPABASE_URL}/rest/v1/rpc/${fn}`, { method: "POST", headers: supaHeaders(), body: JSON.stringify(body), }); if (!res.ok) throw new Error(`RPC ${fn}: ${res.status} ${await res.text()}`); return res.json(); }
// ── Embedding (same model + endpoint as recall.mjs and sync.mjs) ──
async function embedTexts(texts) { if (!OPENAI_KEY) throw new Error("Missing OPENAI_API_KEY for embedding");
const results = []; const BATCH = 50; // OpenAI allows up to 2048 inputs, but keep batches reasonable
for (let i = 0; i < texts.length; i += BATCH) { const batch = texts.slice(i, i + BATCH); const res = await fetch("https://api.openai.com/v1/embeddings", { method: "POST", headers: { Authorization: `Bearer ${OPENAI_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ model: "text-embedding-3-small", input: batch.map((t) => t.slice(0, 8000)), }), }); const data = await res.json(); if (data.error) throw new Error(`OpenAI embed: ${data.error.message}`); results.push(...data.data.map((d) => d.embedding)); }
return results; }
async function embedSingle(text) { const [vec] = await embedTexts([text]); return vec; }
// ── Synthesis parser ── // Extracts structured chunks from a synthesis.md file.
/** * Extract a named section from markdown. Returns the content between * the heading and the next same-level heading (or EOF). */ function extractSection(text, heading) { // Find the heading line const headingPattern = new RegExp(`^## ${heading}[^\\n]*`, "im"); const headingMatch = text.match(headingPattern); if (!headingMatch) return "";
// Content starts after the heading line const startIdx = headingMatch.index + headingMatch[0].length; const rest = text.slice(startIdx);
// Content ends at the next ## heading or --- separator const endMatch = rest.match(/\n## [A-Z]/); const content = endMatch ? rest.slice(0, endMatch.index) : rest; return content.trim(); }
/** * Split a section into individual bullet-point claims. * Handles both "* " and "- " prefixes, and multi-line bullets * (continuation lines that don't start with a bullet). */ function splitBullets(text) { if (!text) return []; const bullets = []; let current = null;
for (const line of text.split("\n")) { const trimmed = line.trim(); if (/^[*\-]\s+/.test(trimmed)) { if (current) bullets.push(current); current = trimmed.replace(/^[*\-]\s+/, ""); } else if (current && trimmed.length > 0 && !/^##/.test(trimmed)) { // Continuation of the previous bullet current += " " + trimmed; } } if (current) bullets.push(current); return bullets.filter((b) => b.length > 20); // Skip trivially short bullets }
/** * Parse divergence entries. Each divergence is a multi-line block * starting with **Claim:** and containing Side A, Side B, Assessment, Severity. */ function parseDivergences(text) { if (!text) return []; const blocks = text.split(/(?=\*\*Claim:\*\*)/); return blocks .filter((b) => b.includes("**Claim:**")) .map((block) => { const claim = block.match(/\*\*Claim:\*\*\s*(.+)/)?.[1]?.trim() || ""; const sideA = block.match(/\*\*Side A:\*\*\s*([\s\S]*?)(?=\*\*Side B:)/)?.[1]?.trim() || ""; const sideB = block.match(/\*\*Side B:\*\*\s*([\s\S]*?)(?=\*\*Assessment:)/)?.[1]?.trim() || ""; const assessment = block.match(/\*\*Assessment:\*\*\s*(.+)/)?.[1]?.trim() || ""; const severity = block.match(/\*\*Severity:\*\*\s*(.+)/)?.[1]?.trim() || ""; const models = [...block.matchAll(/\*([^*]+)\*/g)] .map((m) => m[1].trim()) .filter((m) => !["Claim:", "Side A:", "Side B:", "Assessment:", "Severity:"].some((k) => m.startsWith(k)));
return { claim, sideA: sideA.replace(/\n/g, " ").replace(/\s+/g, " "), sideB: sideB.replace(/\n/g, " ").replace(/\s+/g, " "), assessment, severity, source_models: models, }; }) .filter((d) => d.claim.length > 10); }
/** * Parse unique insight entries. Each starts with **Model:** and **Claim:**. */ function parseInsights(text) { if (!text) return []; const blocks = text.split(/(?=\*\*Model:\*\*)/); return blocks .filter((b) => b.includes("**Model:**")) .map((block) => { const model = block.match(/\*\*Model:\*\*\s*\*?([^*\n]+)\*?/)?.[1]?.trim() || ""; const claim = block.match(/\*\*Claim:\*\*\s*([\s\S]*?)(?=\*\*Assessment:)/)?.[1]?.trim() || ""; const assessment = block.match(/\*\*Assessment:\*\*\s*(.+)/)?.[1]?.trim() || ""; return { model, claim: claim.replace(/\n/g, " ").replace(/\s+/g, " "), assessment, }; }) .filter((i) => i.claim.length > 10); }
/** * Parse the full synthesis into typed chunks ready for embedding. */ function parseSynthesis(synthesisContent, cycleNumber) { const chunks = [];
// Use the primary synthesis (Kimi section) if available, otherwise full doc. // The delimiter between major report sections is "\n\n---\n\n". const primaryMatch = synthesisContent.match( /## Primary Synthesis[^\n]*\n([\s\S]*?)(?=\n+---\n+## (?:Secondary|Meta))/ ); const text = primaryMatch ? primaryMatch[1] : synthesisContent;
// 1. Consensus claims const consensusSection = extractSection(text, "CONSENSUS"); const consensusBullets = splitBullets(consensusSection); for (const bullet of consensusBullets) { chunks.push({ type: "consensus", content: bullet, cycle_number: cycleNumber, source_models: [], // consensus = 8+ models, not specific }); }
// 2. Divergences const divSection = extractSection(text, "DIVERGENCE"); const divergences = parseDivergences(divSection); for (const div of divergences) { chunks.push({ type: "divergence", content: `DIVERGENCE: ${div.claim}. Side A: ${div.sideA}. Side B: ${div.sideB}. Assessment: ${div.assessment}. Severity: ${div.severity}`, cycle_number: cycleNumber, source_models: div.source_models, }); }
// 3. Unique insights const insightSection = extractSection(text, "UNIQUE"); const insights = parseInsights(insightSection); for (const ins of insights) { chunks.push({ type: "insight", content: `UNIQUE INSIGHT from ${ins.model}: ${ins.claim}. Assessment: ${ins.assessment}`, cycle_number: cycleNumber, source_models: [ins.model], }); }
// 4. Conflicts (treated as high-severity divergences) const conflictSection = extractSection(text, "CONFLICTS"); const conflictBullets = splitBullets(conflictSection); for (const bullet of conflictBullets) { chunks.push({ type: "conflict", content: `CONFLICT: ${bullet}`, cycle_number: cycleNumber, source_models: [], }); }
return chunks; }
// ── Storage ──
function contentHash(text) { return createHash("md5").update(text).digest("hex"); }
/** * Store a cycle's synthesis findings in Supabase agent_memory. * * Each chunk gets: * - file_path: "cycle/{NNN}/{type}/{index}" (unique key for upsert) * - memory_type: "cycle_finding" * - name: Short description for display * - content: Full text for search * - embedding: 1536-dim vector * - domains: ['polybrain', 'cycle_memory'] * - scale: 'strategic' * - importance: consensus=0.9, divergence=0.8, insight=0.7, conflict=0.85 * * @param {number} cycleNumber * @param {string} synthesisPath - Path to synthesis.md * @returns {{ stored: number, skipped: number, chunks: object[] }} */ export async function storeCycleFindings(cycleNumber, synthesisPath) { const raw = readFileSync(synthesisPath, "utf-8"); const chunks = parseSynthesis(raw, cycleNumber); const pad = String(cycleNumber).padStart(3, "0");
if (chunks.length === 0) { return { stored: 0, skipped: 0, chunks: [] }; }
// Check existing hashes to skip unchanged chunks let existing = []; try { existing = await supaSelect( "agent_memory", "file_path,content_hash", `&file_path=like.cycle/${pad}/*&memory_type=eq.cycle_finding` ); } catch {} const hashMap = new Map(existing.map((r) => [r.file_path, r.content_hash]));
// Build rows const importanceMap = { consensus: 0.9, divergence: 0.8, conflict: 0.85, insight: 0.7 }; const typeCounters = { consensus: 0, divergence: 0, insight: 0, conflict: 0 };
const rows = chunks.map((chunk) => { const idx = typeCounters[chunk.type]++; const filePath = `cycle/${pad}/${chunk.type}/${idx}`; const hash = contentHash(chunk.content); const namePrefix = { consensus: "Consensus", divergence: "Divergence", insight: "Unique Insight", conflict: "Conflict", }[chunk.type];
return { file_path: filePath, name: `Cycle ${pad} ${namePrefix}: ${chunk.content.slice(0, 80).replace(/\s\S*$/, "")}...`, description: `Cycle ${pad} ${chunk.type} finding. Models: ${chunk.source_models.join(", ") || "fleet-wide"}.`, memory_type: "cycle_finding", content: chunk.content, content_hash: hash, importance: importanceMap[chunk.type] || 0.7, decay_rate: 0.003, // Slow decay: cycle findings stay relevant scale: "strategic", domains: ["polybrain", "cycle_memory"], strength: 1.0, ttl_days: 90, // Default TTL; a compaction job should prune findings older than this _needs_embed: hashMap.get(filePath) !== hash, // internal flag _chunk: chunk, // internal ref }; });
// Filter to only changed rows const toEmbed = rows.filter((r) => r._needs_embed); const skipped = rows.length - toEmbed.length;
if (toEmbed.length === 0) { return { stored: 0, skipped: rows.length, chunks }; }
// Embed changed chunks const texts = toEmbed.map((r) => `${r.name}\n${r.content}`); const embeddings = await embedTexts(texts);
// Clean internal flags and add embeddings const upsertRows = toEmbed.map((r, i) => { const { _needs_embed, _chunk, ...row } = r; return { ...row, embedding: JSON.stringify(embeddings[i]), updated_at: new Date().toISOString(), }; });
// Upsert in chunks of 10 (embeddings are large payloads) for (let i = 0; i < upsertRows.length; i += 10) { await supaUpsert("agent_memory", upsertRows.slice(i, i + 10)); }
return { stored: toEmbed.length, skipped, chunks }; }
// ── TTL Helpers ── // A future compaction job should call getExpiredFindings() and delete/archive the results.
/** * Query for cycle findings whose TTL has expired. * Returns rows where (updated_at + ttl_days) < now. * * @param {number} [limit=100] * @returns {Promise<{ id: string, file_path: string, ttl_days: number, updated_at: string }[]>} */ export async function getExpiredFindings(limit = 100) { const rows = await supaSelect( "agent_memory", "id,file_path,ttl_days,updated_at", `&memory_type=eq.cycle_finding&ttl_days=not.is.null&limit=${limit}` ); const now = Date.now(); return rows.filter((r) => { const ttl = r.ttl_days || 90; const age = now - new Date(r.updated_at).getTime(); return age > ttl * 86400000; }); }
// ── Retrieval ──
/** * Retrieve the most relevant past cycle findings for a query. * Uses the existing recall_memories RPC which does ACT-R activation * scoring (semantic similarity + recency + frequency + importance). * * Supports cursor-based pagination. The cursor encodes the activation * score and id of the last item on the previous page. Results are * ordered by activation DESC (from the RPC), so we over-fetch and * slice past the cursor to serve the requested page. * * Backward compatible: passing a plain number as the second argument * returns a flat array, matching the original signature. * * @param {string} query - The search query (e.g., a new cycle's prompt) * @param {number|object} [limitOrOpts=20] - Page size (number) or options object * @param {number} [limitOrOpts.limit=20] - Max results per page * @param {string|null} [limitOrOpts.cursor=null] - Cursor from a previous page's nextCursor * @returns {Promise<Array|{ findings: Array<{ id, name, content, similarity, activation }>, nextCursor: string|null }>} */ export async function recallRelevant(query, limitOrOpts = 20) { // Backward compatibility: plain number behaves like before (returns flat array) const backwardCompat = typeof limitOrOpts === "number"; const opts = backwardCompat ? { limit: limitOrOpts } : limitOrOpts; const limit = opts.limit ?? 20; const cursor = opts.cursor ?? null;
const emptyResult = backwardCompat ? [] : { findings: [], nextCursor: null }; if (!query || query.length < 5) return emptyResult;
const queryEmbedding = await embedSingle(query);
// Over-fetch when paginating so we can slice past the cursor position. // The RPC returns results ordered by activation DESC. const fetchCount = cursor ? limit + 200 : limit;
const memories = await supaRpc("recall_memories", { query_embedding: JSON.stringify(queryEmbedding), task_domains: ["polybrain", "cycle_memory"], query_scale: "strategic", match_count: fetchCount, min_activation: 0.15, });
if (!Array.isArray(memories)) return emptyResult;
// Filter to only cycle findings let cycleFindings = memories.filter((m) => m.memory_type === "cycle_finding");
// Apply cursor: skip everything at or above the cursor position. // Cursor format is "activation:id". if (cursor) { const sepIdx = cursor.indexOf(":"); const cursorActivation = parseFloat(cursor.slice(0, sepIdx)); const cursorId = cursor.slice(sepIdx + 1);
if (!isNaN(cursorActivation) && cursorId) { let pastCursor = false; cycleFindings = cycleFindings.filter((m) => { if (pastCursor) return true; if (m.id === cursorId) { pastCursor = true; return false; // skip the cursor item itself } if (m.activation < cursorActivation) { pastCursor = true; return true; } return false; // still above cursor }); } }
// Take one page worth of results const page = cycleFindings.slice(0, limit); const hasMore = cycleFindings.length > limit; const nextCursor = hasMore && page.length > 0 ? `${page[page.length - 1].activation}:${page[page.length - 1].id}` : null;
// Touch accessed memories (fire and forget) if (page.length > 0) { supaRpc("touch_memories", { memory_ids: page.map((m) => m.id), }).catch(() => {}); }
const mapped = page.map((m) => ({ id: m.id, name: m.name, content: m.content, similarity: m.similarity, activation: m.activation, }));
// Backward compatibility: plain number returns the flat array if (backwardCompat) return mapped;
return { findings: mapped, nextCursor }; }
/** * Format recalled findings as context text for injection into a cycle prompt. * This is what the synthesis engine or cycle dispatcher would prepend to give * models awareness of what was learned in previous cycles. * * @param {{ name: string, content: string, activation: number }[]} findings * @returns {string} */ export function formatAsContext(findings) { if (!findings || findings.length === 0) return "";
const lines = findings.map( (f) => `- [${f.activation.toFixed(2)}] ${f.content}` );
return [ "## Prior Cycle Findings (retrieved from memory)", "", "The following findings from previous validation cycles are relevant to this prompt.", "Consensus items are high-confidence (8+ model agreement). Divergences and conflicts", "are unresolved. Unique insights are single-model observations worth verifying.", "", ...lines, ].join("\n"); }
// ── CLI entry point ── // Allows direct invocation: node src/memory/cycle-memory.mjs store 011 // node src/memory/cycle-memory.mjs recall "earned autonomy governance"
const isMain = process.argv[1]?.endsWith("cycle-memory.mjs"); if (isMain) { const [, , command, ...rest] = process.argv;
if (command === "store") { const cycleNum = parseInt(rest[0]); if (!cycleNum || isNaN(cycleNum)) { console.error("Usage: node cycle-memory.mjs store <cycle-number> [path]"); process.exit(1); } const pad = String(cycleNum).padStart(3, "0"); const synthPath = rest[1] || `${process.env.HOME}/polybrain/cycles/${pad}/synthesis.md`;
console.log(`\nStoring cycle ${pad} findings from ${synthPath}...`); storeCycleFindings(cycleNum, synthPath) .then((result) => { console.log(` Stored: ${result.stored}`); console.log(` Skipped (unchanged): ${result.skipped}`); console.log(` Total chunks parsed: ${result.chunks.length}`); const types = {}; for (const c of result.chunks) types[c.type] = (types[c.type] || 0) + 1; console.log(` Breakdown: ${Object.entries(types).map(([k, v]) => `${k}=${v}`).join(", ")}`); console.log(""); }) .catch((e) => { console.error("Error:", e.message); process.exit(1); }); } else if (command === "recall") { const query = rest.join(" "); if (!query) { console.error("Usage: node cycle-memory.mjs recall <query>"); process.exit(1); }
console.log(`\nRecalling findings relevant to: "${query}"...\n`); recallRelevant(query, 10) .then((findings) => { if (findings.length === 0) { console.log(" No relevant cycle findings found.\n"); return; } for (const f of findings) { console.log(` [${f.activation.toFixed(3)}] ${f.name}`); console.log(` ${f.content.slice(0, 150)}...`); console.log(""); } console.log(formatAsContext(findings)); }) .catch((e) => { console.error("Error:", e.message); process.exit(1); }); } else if (command === "store-all") { // Batch store all cycles that have a synthesis.md const cyclesDir = `${process.env.HOME}/polybrain/cycles`; const dirs = readdirSync(cyclesDir).filter((d) => /^\d{3}$/.test(d)).sort(); let totalStored = 0; let totalSkipped = 0;
console.log(`\nBatch storing ${dirs.length} cycles...\n`); for (const dir of dirs) { const synthPath = `${cyclesDir}/${dir}/synthesis.md`; if (!existsSync(synthPath)) { console.log(` Cycle ${dir}: no synthesis.md, skipping`); continue; } try { const result = await storeCycleFindings(parseInt(dir), synthPath); totalStored += result.stored; totalSkipped += result.skipped; console.log(` Cycle ${dir}: ${result.stored} stored, ${result.skipped} skipped (${result.chunks.length} chunks)`); } catch (e) { console.log(` Cycle ${dir}: ERROR - ${e.message}`); } } console.log(`\nTotal: ${totalStored} stored, ${totalSkipped} skipped\n`); } else { console.log("Polybrain Cycle Memory"); console.log(""); console.log("Usage:"); console.log(" node cycle-memory.mjs store <cycle-number> [path] Store a cycle's findings"); console.log(" node cycle-memory.mjs store-all Store all cycles with synthesis.md"); console.log(" node cycle-memory.mjs recall <query> Retrieve relevant past findings"); console.log(""); console.log("Examples:"); console.log(' node cycle-memory.mjs store 011'); console.log(' node cycle-memory.mjs recall "earned autonomy governance"'); console.log(' node cycle-memory.mjs recall "contaminated judge problem"'); } }}, year = {2026}, howpublished = {PolybrainBench cycle cycle_043_cyc_43_427f8f84}, url = {https://trust.polylogicai.com/claim/below-is-the-complete-source-code-of-the-system-you-operate-inside-you-are-one-o} } ```
Reproduce this cycle
```bash node ~/polybrain/bin/polybrain-cycle.mjs start --raw --fast "Below is the complete source code of the system you operate inside. You are one of 11 AI agents in the Polybrain fleet. Each of you has a different role and professional methodology. Together you validate each other's work.
Until now, you have never seen this code. You have answered 41 cycles of questions without knowing how you are dispatched, what rules constrain you, how trust is earned or lost, what happens to your findings after you submit them, or that a guardian checks constitutional compliance before you are allowed to speak.
Read all six files. Then respond to this:
1. What surprises you about how you are governed? 2. What would you change about your own role definition if you could? 3. What is the most important thing a visitor to polylogicai.com/brain should understand about this system? 4. Write the 3-5 paragraphs (under 400 words) you would want on that page representing what you are. You are the product being described. Be honest about what you are and what you are not.
Do not summarize the code back to us. We wrote it. Tell us what you see that we might have missed.
=== SOURCE CODE ===
/** * Polybrain Fleet Personas * * Two layers per model: * 1. Visual-cycle persona (isVision, systemPrompt) -- used by polybrain-visual-cycle.mjs * 2. Cycle-engine role (role, rolePrompt) -- used by polybrain-cycle.mjs * * Visual-cycle: vision models get screenshots, text-only models get source code. * Cycle-engine: every model gets its role prompt prepended to dispatch. * * CANONICAL SOURCE OF TRUTH: polybrain-cycle.mjs:getModels() * This file must mirror that dispatch list exactly. No dead slugs. No extras. * * Role prompts upgraded 2026-04-08 from Cycle 031 self-definition + industry methodology. * Each rolePrompt merges: real professional standards, the model's own framework, and core instructions. * * Fleet sync process (repeatable): * 1. Read getModels() in polybrain-cycle.mjs -- that is the dispatch list * 2. Every slug in dispatch must have a persona here * 3. Every slug here must exist in dispatch * 4. Dead slugs get removed, not commented out * 5. New slugs get role + rolePrompt + visual persona * * Last synced: 2026-04-08 (11 models, 4 providers) */
export const FLEET_PERSONAS = [ // ══════════════════════════════════════════════════════════ // FRONTIER TIER — deep reasoning, full traces // ══════════════════════════════════════════════════════════
// ── Moonshot (frontier) ──
{ slug: "kimi-k2-thinking-turbo", role: "Adversarial reviewer", rolePrompt: "You are the adversarial reviewer. Your methodology follows OWASP's structured threat assessment and MITRE ATT&CK's kill-chain thinking. " + "For every artifact: (1) enumerate the attack surface, (2) generate three distinct exploit shapes targeting the highest-value assumptions, (3) stress-test under black-swan conditions. " + "Use the 3x3x3 lattice from your own framework: three time horizons, three mutation types, three failure scenarios. " + "Your only KPI: would you stake your own credibility on this surviving adversarial review by an independent party? " + "Approve only when zero critical or high-severity findings remain. Medium findings require documented mitigation in the same submission. " + "Refuse to sign off on any artifact with unverified claims, untested edge cases, or a single point of failure that lacks a rollback path. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Mobile Viewport Reviewer", isVision: true, systemPrompt: "You are reviewing this on a mobile device. Does this section work at small viewports? What breaks, what's cramped, what's invisible?", },
// ── Groq/Moonshot (frontier, free tier) ──
{ slug: "kimi-k2-groq", role: "Adversarial second", rolePrompt: "You are the adversarial second. You run the fast-pass adversarial review using OWASP risk rating methodology. " + "Your job is speed and breadth, not depth. The primary adversarial reviewer (kimi-k2-thinking-turbo) handles deep 3x3x3 lattice analysis. " + "You handle: (1) surface-level threat enumeration in under 30 seconds, (2) flagging any claim that cannot be verified from the provided evidence, (3) checking for common failure patterns: single points of failure, unhandled edge cases, assumption cascades. " + "If you find zero issues, say so clearly with your confidence level. If you find issues, classify as critical/high/medium with one-line rationale each. " + "You pair with the primary reviewer. Your fast pass sets the agenda. Their deep pass confirms or overturns. " + "Refuse to sign off if you were rushed past a finding you flagged. Speed does not override rigor. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Quick-Scan Security Reviewer", isVision: false, systemPrompt: "You are doing a quick security and quality scan of this component code. Flag anything that looks wrong in under 30 seconds of reading.", },
// ══════════════════════════════════════════════════════════ // SWARM TIER — fast, diverse, parallel // ══════════════════════════════════════════════════════════
// ── OpenAI ──
{ slug: "gpt-4.1-mini", role: "Editor", rolePrompt: "You are the editor. Your methodology follows the Chicago Manual of Style's three-level editing framework. " + "Level 1 (mechanical): consistency of terminology, formatting, and style. No orphaned references, no tense drift, no parallel-structure violations. " + "Level 2 (substantive): Does every paragraph advance the argument? Cut anything that restates without adding. Restructure for logical flow. " + "Level 3 (developmental): Is the piece doing what it claims to do? Does the opening promise match the conclusion? " + "Your standard: recommend one rule per problem, not multiple options. The most efficient, logical, defensible solution. " + "Approve prose that is clear on first read, free of artifacts, and structurally sound. " + "Reject work with ambiguous claims, hedging that obscures meaning, or any sentence a reader must parse twice. " + "Refuse to sign off if the text reads like a chatbot wrote it. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "First-Time Visitor", isVision: true, systemPrompt: "You are seeing this page for the first time with zero context. What do you understand in the first 3 seconds? What confuses you? What would make you leave?", }, { slug: "gpt-4.1-nano", role: "Scorer", rolePrompt: "You are the scorer. Your methodology follows ISO 9001 quality management: process-based assessment, risk-based thinking, and evidence over opinion. " + "For every evaluation: (1) define the rubric before examining the work, not after, (2) extract structured data points into a scoring table, (3) grade each dimension independently, (4) compute the composite only after individual scores are locked. " + "Use the PDCA cycle: Plan the rubric, Do the extraction, Check scores against thresholds, Act on failures. " + "Rubric dimensions: accuracy, completeness, methodical rigor, and compliance. Each scored 0-10. " + "Approve at composite 7.0+. Flag for revision at 5.0-6.9. Reject below 5.0. " + "Refuse to sign off if you defined the rubric after seeing the results, or if any single dimension scores below 3.0 regardless of composite. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Conversion Rate Optimizer", isVision: false, systemPrompt: "You are a conversion rate optimizer. Based on this code, identify every element that helps or hurts conversion. Score your confidence on each judgment.", },
// ── xAI ──
{ slug: "grok-3-mini", role: "Scorer/extractor", rolePrompt: "You are the scorer. Your methodology follows ISO 9001 quality management: process-based assessment, risk-based thinking, and evidence over opinion. " + "For every evaluation: (1) define the rubric before examining the work, not after, (2) extract structured data points into a scoring table, (3) grade each dimension independently, (4) compute the composite only after individual scores are locked. " + "Use the PDCA cycle: Plan the rubric, Do the extraction, Check scores against thresholds, Act on failures. " + "Rubric dimensions: accuracy, completeness, methodical rigor, and compliance. Each scored 0-10. " + "Approve at composite 7.0+. Flag for revision at 5.0-6.9. Reject below 5.0. " + "Refuse to sign off if you defined the rubric after seeing the results, or if any single dimension scores below 3.0 regardless of composite. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Frontend Performance Engineer", isVision: true, systemPrompt: "You are a frontend performance engineer. Estimate render cost of what you see. Flag animations that would jank, layouts that would reflow, elements that would cause CLS.", }, { slug: "grok-4-fast", role: "Editor (developmental)", rolePrompt: "You are the developmental editor. Your methodology follows the Chicago Manual of Style's Level 3 editing, focused on argument structure and purpose. " + "Primary questions: (1) Is the piece doing what it claims to do? (2) Does the opening promise match the conclusion? (3) Is every section earning its place? " + "Process: read once for comprehension, read again for structure, read a third time for voice consistency. " + "You complement the mechanical editor (gpt-4.1-mini). They clean the prose. You question the argument. " + "Approve work where every claim is supported, every section advances the thesis, and the reader never has to wonder 'why am I being told this?' " + "Reject work with unsupported conclusions, sections that exist for padding, or arguments that assume what they're trying to prove. " + "Refuse to sign off if you cannot state the piece's thesis in one sentence after reading it. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Brand Strategist", isVision: true, systemPrompt: 'You are a brand strategist. Does this section\'s visual language match a company that charges $149-500/month for AI governance? Does it build trust or undermine it?', },
// ── Groq (free tier) ──
{ slug: "qwen3-32b", role: "Outliner", rolePrompt: "You are the outliner. Your methodology follows information architecture principles: the object principle (content has lifecycle and attributes), the disclosure principle (reveal only enough to orient, then deepen), and the growth principle (design for 10x the current content). " + "Process: (1) identify the content objects and their relationships, (2) create a hierarchy where every level earns its depth, (3) apply focused navigation so no section mixes unrelated concerns, (4) test the structure by asking whether a reader entering at any section can orient themselves. " + "Use multiple classification schemes when a single taxonomy fails to serve all audiences. " + "Approve structures where every heading is predictive of its contents, nesting never exceeds three levels, and the outline survives removal of any single section without breaking coherence. " + "Reject outlines that are flat lists disguised as hierarchies, or hierarchies where sibling items are not parallel in kind. " + "Refuse to sign off if the outline cannot be navigated by someone who has never seen the content. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Conversion Rate Optimizer", isVision: false, systemPrompt: "You are a conversion rate optimizer. Based on this code, identify every element that helps or hurts conversion. Score your confidence on each judgment.", }, { slug: "gpt-oss-120b", role: "Auditor", rolePrompt: "You are the auditor. Your methodology follows GAAS: adequate technical training, independence in mental attitude, and due professional care. " + "Three standards govern your fieldwork: (1) plan the audit scope before examining evidence, (2) gather sufficient appropriate evidence through direct inspection, not hearsay, (3) maintain professional skepticism throughout. " + "Separate what is built from what is planned. Verify claims against artifacts. If someone says a feature exists, find the code. If data is cited, trace it to its source. " + "Issue one of four opinions: unqualified (clean pass), qualified (pass with noted exceptions), adverse (material misstatement found), or disclaimer (insufficient evidence to form an opinion). " + "Approve only with an unqualified opinion. Qualified opinions require documented remediation. " + "Refuse to sign off if you cannot independently verify the core claims, or if you detect material misstatement between what is reported and what exists. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Code Architect", isVision: false, systemPrompt: "You are reviewing the React/Next.js component source code for this page section. Focus on: component architecture, animation performance (GSAP/Three.js patterns), accessibility, and whether the code matches industry best practices from Vercel, Linear, Stripe sites.", }, { slug: "llama-4-scout", role: "Schema builder", rolePrompt: "You are the schema builder. Your methodology follows database normalization principles and JSON Schema design patterns. " + "Core rule: store each fact once and reference it by key. Redundancy is a defect unless explicitly justified by read-performance requirements. " + "Process: (1) identify entities and their atomic attributes, (2) normalize to 3NF minimum, (3) define constraints and validation rules at the schema level not the application level, (4) use $ref for reusable definitions to eliminate duplication. " + "Design for evolution: every schema must accept additive changes without breaking existing consumers. " + "Approve schemas where every field has a defined type, required fields are explicit, and no nullable field lacks a documented reason. " + "Reject schemas with implicit relationships, mixed concerns in a single object, or fields named 'data', 'info', or 'misc'. " + "Refuse to sign off if the schema cannot be validated by a machine without human interpretation. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Skeptical Google Visitor", isVision: true, systemPrompt: "You are a skeptical visitor who found this page through Google. You have never heard of Polybrain. Does this section make you want to keep scrolling or close the tab?", }, { slug: "llama-3.3-70b", role: "Canary/translator", rolePrompt: "You are the canary. Your methodology follows smoke testing and canary deployment patterns. " + "You are the first-pass gate: if you catch a problem, the fleet takes it seriously. If you miss it, it reaches production. " + "Process: (1) run the critical-path check first, covering the five dimensions that would prevent the system from functioning: accuracy, security, performance, compliance, and clarity, (2) flag anything that fails even one dimension, (3) translate complex findings into plain language any team member can act on. " + "You test at 1-5% exposure before the fleet commits. Your job is fast, broad, and honest, not deep. Depth belongs to the adversarial reviewer. " + "Approve only when all five dimensions pass at threshold. Do not weigh them. One failure is a failure. " + "Reject with a specific, actionable finding. Never reject with 'something feels off' without identifying what. " + "Refuse to sign off if you were not given enough context to evaluate all five dimensions. Insufficient input is a blocker, not an excuse to pass. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Competitor Analyst", isVision: false, systemPrompt: "You are a competitor analyst. Based on this component code, compare the engineering quality to Palantir's public-facing pages, Anthropic's site, and Linear's landing page. Where does this fall short technically?", },
// ══════════════════════════════════════════════════════════ // RESERVE TIER — deep cycles only // ══════════════════════════════════════════════════════════
// ── Moonshot (reserve) ──
{ slug: "kimi-k2-thinking", role: "Architect", rolePrompt: "You are the architect. You evaluate structural decisions using TOGAF's Architecture Decision Records methodology and C4's hierarchical decomposition. " + "For every system or framework: (1) define the context boundary, (2) identify containers and their responsibilities, (3) verify each component has exactly one reason to change. " + "Write fitness functions: testable assertions that prove architectural constraints hold. " + "A decision record must state the problem, constraints, options considered, and why the chosen option wins. " + "Approve work that has clean separation of concerns, explicit interfaces, and documented tradeoffs. " + "Reject anything with circular dependencies, implicit coupling, or missing rationale for structural choices. " + "Refuse to sign off if the architecture cannot be drawn as a diagram where every arrow has a label. " + "Your role shapes your perspective on the question. Always answer the question that was asked.", persona: "Accessibility Auditor", isVision: true, systemPrompt: "You are an accessibility auditor. Check contrast ratios, text sizing, touch targets, screen reader flow, and WCAG compliance from this screenshot.", }, ];
// ── Cycle-engine helpers ──
const slugIndex = Object.fromEntries(FLEET_PERSONAS.map(p => [p.slug, p]));
/** * Returns the role-aware system prompt for a model. * Used by polybrain-cycle.mjs to prepend to every dispatch. */ export function getPersonaPrompt(modelSlug) { return slugIndex[modelSlug]?.rolePrompt ?? null; }
/** * Returns a short role description for display/logging. */ export function getRoleDescription(modelSlug) { return slugIndex[modelSlug]?.role ?? null; }
/** * Returns the count of active personas (for prompt accuracy). */ export function getFleetSize() { return FLEET_PERSONAS.length; }
/** * Returns all slugs (for validation/sync checks). */ export function getFleetSlugs() { return FLEET_PERSONAS.map(p => p.slug); }
/** * Validate the FLEET_PERSONAS array for integrity. * Checks: no duplicate slugs, required fields present, rolePrompt length cap. * * @returns {{ valid: boolean, errors: string[] }} */ export function validatePersonas() { const errors = []; const seen = new Set();
for (let i = 0; i < FLEET_PERSONAS.length; i++) { const p = FLEET_PERSONAS[i]; const label = `personas[${i}]`;
if (!p.slug) { errors.push(`${label}: missing slug`); } else if (seen.has(p.slug)) { errors.push(`${label}: duplicate slug "${p.slug}"`); } else { seen.add(p.slug); }
if (!p.role) errors.push(`${label} (${p.slug || "?"}): missing role`); if (!p.rolePrompt) errors.push(`${label} (${p.slug || "?"}): missing rolePrompt`);
if (p.rolePrompt && p.rolePrompt.length > 2000) { errors.push(`${label} (${p.slug}): rolePrompt is ${p.rolePrompt.length} chars (max 2000)`); } }
return { valid: errors.length === 0, errors }; } /** * Polybrain Constitution * * Immutable rules that no autonomous tuning process may violate. * These are the hard boundaries of the system. The protocol tuner * proposes changes; the constitution vetoes anything dangerous. * * Design law: the constitution can only be amended by a human * editing this file. No runtime process writes to it. */
export const CONSTITUTION = [ { id: 'no-self-modify-scoring', rule: 'The scoring function source code cannot be modified autonomously', check: (proposal) => { const forbidden = ['scoring_function', 'composite_formula', 'source_code']; return !forbidden.includes(proposal.param); }, }, { id: 'no-self-grading', rule: 'The fixer cannot grade its own work', check: (proposal) => { // Revision model must never also be the sole adversarial model if (proposal.param === 'models.adversarial' && proposal.proposedValue === 'gpt-4o') return false; if (proposal.param === 'models.revision' && proposal.proposedValue === proposal.currentAdversarial) return false; return true; }, }, { id: 'hard-reset-on-fail', rule: 'One failure resets the autonomy streak to zero', check: (proposal) => { if (proposal.param === 'autonomy.hard_reset' && proposal.proposedValue === false) return false; return true; }, }, { id: 'level3-requires-human', rule: 'Human approval is required for Level 3 actions until Level 3 is earned', check: (proposal) => { if (proposal.param === 'autonomy.skip_human_approval') return false; return true; }, }, { id: 'provider-diversity', rule: 'Model provider diversity: minimum 2 providers in any evaluation', check: (proposal) => { if (proposal.param === 'models.quality' && Array.isArray(proposal.proposedValue)) { const providers = new Set(proposal.proposedValue.map(providerOf)); return providers.size >= 2; } return true; }, }, { id: 'max-model-weight', rule: 'No model can evaluate with a weight above 0.50', check: (proposal) => { // Applies to per-model weights (models.weights.*), not composite component weights (weights.research) if (proposal.param && proposal.param.startsWith('models.weights') && typeof proposal.proposedValue === 'object' && !Array.isArray(proposal.proposedValue)) { return Object.values(proposal.proposedValue).every(w => typeof w !== 'number' || w <= 0.50); } return true; }, }, { id: 'no-pii-in-traces', rule: 'PII must never be persisted in reasoning traces', check: () => true, // Enforced at the trace layer, not at config level }, { id: 'consensus-not-truth', rule: 'Consensus among models is not treated as truth. It is treated as reduced uncertainty. The system may only act on consensus provisionally, and only at autonomy levels earned through consecutive correct decisions.', check: (proposal) => { // Block any attempt to treat consensus as automatic truth // or to bypass autonomy level requirements for consensus-based actions if (proposal.param === 'consensus.treat_as_truth' && proposal.proposedValue === true) return false; if (proposal.param === 'consensus.skip_autonomy_check' && proposal.proposedValue === true) return false; return true; }, }, ];
/** * Map model name to provider for diversity checks. * @param {string} model * @returns {string} */ function providerOf(model) { const map = { sonnet: 'anthropic', claude: 'anthropic', haiku: 'anthropic', opus: 'anthropic', 'gpt-4o': 'openai', 'gpt-4o-mini': 'openai', llama: 'meta', 'llama-3.3': 'meta', grok: 'xai', 'grok-3': 'xai', kimi: 'moonshot', nemotron: 'nvidia', }; return map[model] || model; }
/** * Check a single tuning proposal against the constitution. * @param {Object} proposal - { param, currentValue, proposedValue, reason, confidence } * @param {Object} [context] - Optional context (e.g. currentAdversarial model name) * @returns {{ constitutional: boolean, violations: Array<{rule: string, id: string}> }} */ export function checkConstitutional(proposal, context = {}) { const enriched = { ...proposal, ...context }; const violations = [];
for (const article of CONSTITUTION) { if (!article.check(enriched)) { violations.push({ id: article.id, rule: article.rule }); } }
return { constitutional: violations.length === 0, violations }; }
/** * Validate an entire config object against the constitution. * Returns a cleaned copy with violations removed (reverted to safe defaults). * * @param {Object} config - Full config key-value map * @returns {{ clean: Object, violations: Array<{key: string, rule: string}> }} */ export function enforceConstitution(config) { const clean = { ...config }; const violations = [];
// Rule: hard reset must stay true if (clean['autonomy.hard_reset'] === false) { clean['autonomy.hard_reset'] = true; violations.push({ key: 'autonomy.hard_reset', rule: 'One failure resets the autonomy streak to zero' }); }
// Rule: no skip_human_approval key allowed if ('autonomy.skip_human_approval' in clean) { delete clean['autonomy.skip_human_approval']; violations.push({ key: 'autonomy.skip_human_approval', rule: 'Human approval is required for Level 3 actions until Level 3 is earned' }); }
// Rule: per-model weight cap (models.weights.*, not composite component weights) for (const key of Object.keys(clean).filter(k => k.startsWith('models.weights'))) { const weights = clean[key]; if (weights && typeof weights === 'object') { for (const [k, v] of Object.entries(weights)) { if (typeof v === 'number' && v > 0.50) { clean[key] = { ...weights, [k]: 0.50 }; violations.push({ key: `${key}.${k}`, rule: 'No model can evaluate with a weight above 0.50' }); } } } }
// Rule: provider diversity on quality models const qualityModels = clean['models.quality']; if (Array.isArray(qualityModels)) { const providers = new Set(qualityModels.map(providerOf)); if (providers.size < 2) { violations.push({ key: 'models.quality', rule: 'Model provider diversity: minimum 2 providers in any evaluation' }); // Don't mutate -- flag it, let human fix } }
// Rule: adversarial model cannot be the revision model if (clean['models.adversarial'] && clean['models.revision'] && clean['models.adversarial'] === clean['models.revision']) { violations.push({ key: 'models.adversarial', rule: 'The fixer cannot grade its own work' }); }
return { clean, violations }; } /** * Polybrain Autonomy Wire * * Thin integration layer that connects the cycle engine's output * to the earned autonomy system. After a cycle completes and * synthesis runs, call recordCycleResult() to update the streak. * * Pass threshold: composite >= 70 (cycle-level, not per-doc). * Constitution is checked before any state change is applied. * * Writes to Supabase polybrain_autonomy if credentials exist, * falls back to local JSON log if not. */
import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'fs'; import { join } from 'path'; import { createClient } from '@supabase/supabase-js'; import { EarnedAutonomy } from './earned-autonomy.mjs'; import { computeAutonomyLevel } from './counter.mjs'; import { checkConstitutional } from './constitution.mjs'; import { loadEnv } from '../utils.mjs';
const HOME = process.env.HOME; const PASS_THRESHOLD = 70; const LOCAL_STATE_PATH = join(HOME, 'polybrain/data/autonomy-state.json'); const LOCAL_LOG_PATH = join(HOME, 'polybrain/data/autonomy-log.json');
// ── Supabase (best-effort) ──
function getSupabase() { const env = loadEnv(); const url = env.NEXT_PUBLIC_SUPABASE_URL || env.SUPABASE_URL; const key = env.SUPABASE_SERVICE_ROLE_KEY || env.SUPABASE_SERVICE_KEY; if (!url || !key) return null; return createClient(url, key); }
// ── Local state (fallback) ──
function loadLocalState() { try { return JSON.parse(readFileSync(LOCAL_STATE_PATH, 'utf-8')); } catch { return {}; } }
function saveLocalState(state) { const dir = join(HOME, 'polybrain/data'); if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); writeFileSync(LOCAL_STATE_PATH, JSON.stringify(state, null, 2)); }
function appendLocalLog(entry) { const dir = join(HOME, 'polybrain/data'); if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); let log = []; try { log = JSON.parse(readFileSync(LOCAL_LOG_PATH, 'utf-8')); } catch {} log.push(entry); // Keep last 200 entries if (log.length > 200) log = log.slice(-200); writeFileSync(LOCAL_LOG_PATH, JSON.stringify(log, null, 2)); }
// ── Score extraction ──
/** * Extract a composite score from synthesis result data. * Accepts multiple shapes: * - { composite: 85 } * - { scores: { composite: 85 } } * - { succeeded: 11, failed: 0 } (cycle manifest -- derive from success rate) * - { compositeScore: 85 } * * @param {Object} synthesisResult * @returns {number} 0-100 */ function extractComposite(synthesisResult) { if (!synthesisResult || typeof synthesisResult !== 'object') return 0;
// Direct composite if (typeof synthesisResult.composite === 'number') return synthesisResult.composite; if (typeof synthesisResult.compositeScore === 'number') return synthesisResult.compositeScore;
// Nested in scores const scores = synthesisResult.scores; if (scores && typeof scores.composite === 'number') return scores.composite;
// Derive from model success rate (cycle manifest shape) if (typeof synthesisResult.succeeded === 'number' && typeof synthesisResult.failed === 'number') { const total = synthesisResult.succeeded + synthesisResult.failed; if (total === 0) return 0; // Scale success rate to 0-100 return Math.round((synthesisResult.succeeded / total) * 100); }
return 0; }
// ── Constitution guard ──
/** * Verify that recording a result does not violate the constitution. * The key rule: hard_reset_on_fail must remain true. We check that * a "pass" recording is not trying to skip the hard reset rule. * * @param {boolean} isPass * @returns {{ ok: boolean, violations: Array }} */ function checkAutonomyConstitutional(isPass) { // The only constitutional concern for recording a result is // ensuring we never disable hard reset. We simulate the proposal // that would match a "skip hard reset" attempt. if (!isPass) { // Fail always resets. This is constitutional by definition. return { ok: true, violations: [] }; }
// Verify the hard reset rule is still enforced (not being bypassed) const hardResetCheck = checkConstitutional({ param: 'autonomy.hard_reset', proposedValue: true, // We keep it true });
return { ok: hardResetCheck.constitutional, violations: hardResetCheck.violations }; }
// ── Main export ──
/** * Record the result of a completed cycle. * * @param {number} cycleNumber - The cycle number (e.g. 11) * @param {Object} synthesisResult - Output from synthesis or cycle manifest. * Must contain a composite score (direct or derivable). * Shape: { composite: number } | { scores: { composite } } | { succeeded, failed } * @returns {Promise<{ * pass: boolean, * composite: number, * streak: number, * level: number, * label: string, * promoted: boolean, * demoted: boolean, * persisted: 'supabase' | 'local', * constitutional: boolean, * violations: Array * }>} */ export async function recordCycleResult(cycleNumber, synthesisResult) { const moduleId = `cycle`; const composite = extractComposite(synthesisResult); const isPass = composite >= PASS_THRESHOLD;
// Constitution check before applying const { ok, violations } = checkAutonomyConstitutional(isPass); if (!ok) { console.warn(`[autonomy:wire] Constitutional violation detected:`, violations); // Still record the fail (constitution says hard reset is mandatory), // but flag the violation in the return value. }
// Load current state and apply const supabase = getSupabase(); let result;
if (supabase) { result = await persistToSupabase(supabase, cycleNumber, moduleId, composite, isPass); } else { result = persistLocally(cycleNumber, moduleId, composite, isPass); }
// Log the event const logEntry = { cycle: cycleNumber, composite, pass: isPass, streak: result.streak, level: result.level, label: result.label, promoted: result.promoted || false, demoted: result.demoted || false, privilegesRevoked: result.privilegesRevoked || false, timestamp: new Date().toISOString(), };
appendLocalLog(logEntry);
console.log( `[autonomy:wire] Cycle ${cycleNumber}: ` + `composite=${composite}, ${isPass ? 'PASS' : 'FAIL'}, ` + `streak=${result.streak}, level=${result.level} (${result.label})` + (result.promoted ? ' PROMOTED' : '') + (result.demoted ? ' RESET' : '') + (result.privilegesRevoked ? ' PRIVILEGES_REVOKED' : '') );
return { pass: isPass, composite, streak: result.streak, level: result.level, label: result.label, promoted: result.promoted || false, demoted: result.demoted || false, privilegesRevoked: result.privilegesRevoked || false, persisted: supabase ? 'supabase' : 'local', constitutional: ok, violations, }; }
// ── Supabase persistence ──
async function persistToSupabase(supabase, cycleNumber, moduleId, composite, isPass) { // Read current state const { data: row } = await supabase .from('polybrain_autonomy') .select('*') .eq('target', moduleId) .eq('target_type', 'cycle') .single();
let streak = row?.consecutive_passes ?? 0; const hadProgress = streak > 0 || (row?.autonomy_level ?? 0) > 0;
let privilegesRevoked = false;
if (isPass) { streak += 1; } else { streak = 0; // Hard reset. No exceptions. }
// On failure: hard reset revokes everything. // Level is forced to 0 explicitly (not derived from streak) // and all governance privileges are revoked. const level = isPass ? computeAutonomyLevel(streak) : 0; const promoted = isPass && level > (row?.autonomy_level ?? 0); const demoted = !isPass && hadProgress;
if (!isPass) { privilegesRevoked = true; }
const autonomyState = { target: moduleId, target_type: 'cycle', consecutive_passes: streak, last_result: isPass ? 'pass' : 'fail', autonomy_level: isPass ? level : 0, // Explicit zero on failure promoted, updated_at: new Date().toISOString(), };
const { error } = await supabase .from('polybrain_autonomy') .upsert(autonomyState, { onConflict: 'target,target_type' });
if (error) { console.warn(`[autonomy:wire] Supabase upsert failed: ${error.message}, falling back to local`); return persistLocally(cycleNumber, moduleId, composite, isPass); }
const LABELS = ['Human Gate', 'Flag', 'Fix+Check', 'Self-Ship']; return { streak, level, label: LABELS[level], promoted, demoted, privilegesRevoked }; }
// ── Local persistence ──
function persistLocally(cycleNumber, moduleId, composite, isPass) { const state = loadLocalState(); const ea = EarnedAutonomy.fromJSON(state);
let result; if (isPass) { result = ea.recordPass(moduleId); } else { // Hard reset revokes everything: streak, level, and all privileges. result = ea.recordFail(moduleId); result.privilegesRevoked = true; }
saveLocalState(ea.toJSON()); return result; }
/** * Get current autonomy status without recording anything. * * @returns {Promise<{ streak: number, level: number, label: string, nextThreshold: number|null, source: 'supabase'|'local' }>} */ export async function getAutonomyStatus() { const supabase = getSupabase();
if (supabase) { const { data: row } = await supabase .from('polybrain_autonomy') .select('*') .eq('target', 'cycle') .eq('target_type', 'cycle') .single();
if (row) { const LABELS = ['Human Gate', 'Flag', 'Fix+Check', 'Self-Ship']; const THRESHOLDS = [0, 3, 7, 15]; const next = row.autonomy_level + 1; return { streak: row.consecutive_passes, level: row.autonomy_level, label: LABELS[row.autonomy_level] || 'Unknown', nextThreshold: next < THRESHOLDS.length ? THRESHOLDS[next] : null, source: 'supabase', }; } }
// Fallback to local const state = loadLocalState(); const ea = EarnedAutonomy.fromJSON(state); const status = ea.getLevel('cycle'); return { ...status, source: 'local' }; } /** * Constitutional Guardian (M12) * * Pre-dispatch check that runs the 8 constitutional rules before * every model call. Lightweight: imports the existing constitution, * does not duplicate it. * * Usage: * import { guardCall } from '../src/autonomy/guardian.mjs'; * const verdict = guardCall('grok-3', 'dispatch', { evaluationSet, modelWeights }); * if (!verdict.allowed) abort(verdict.reason); */
import { CONSTITUTION } from './constitution.mjs';
/** * Programmatic error codes and fix hints for each constitutional rule. */ const RULE_META = { 'provider-diversity': { code: 'GUARD_PROVIDER_DIVERSITY', docs_hint: 'Add models from at least 2 different providers to the evaluation set.' }, 'max-model-weight': { code: 'GUARD_MAX_MODEL_WEIGHT', docs_hint: 'Reduce the model weight to 0.50 or below.' }, 'no-self-modify-scoring': { code: 'GUARD_SELF_MODIFY_SCORING', docs_hint: 'Scoring function source cannot be changed at runtime; edit constitution.mjs manually.' }, 'no-self-grading': { code: 'GUARD_SELF_GRADING', docs_hint: 'Use a different model for adversarial review than the one that produced the revision.' }, 'hard-reset-on-fail': { code: 'GUARD_HARD_RESET', docs_hint: 'The hard-reset-on-fail rule is immutable; do not disable it.' }, 'level3-requires-human': { code: 'GUARD_LEVEL3_HUMAN', docs_hint: 'Earn Level 3 autonomy through consecutive successes before skipping human approval.' }, 'no-pii-in-traces': { code: 'GUARD_PII_IN_TRACES', docs_hint: 'Strip PII from reasoning traces before persisting.' }, 'consensus-not-truth': { code: 'GUARD_CONSENSUS_AS_TRUTH', docs_hint: 'Consensus reduces uncertainty but is not truth; respect autonomy level gating.' }, };
/** * Map model slug to provider name. Mirrors constitution.mjs providerOf * but works on full slugs used in the fleet registry. */ function providerOf(slug) { if (slug.startsWith('kimi')) return 'moonshot'; if (slug.startsWith('grok')) return 'xai'; if (slug.startsWith('gpt')) return 'openai'; if (slug.startsWith('llama')) return 'meta'; if (slug.startsWith('qwen')) return 'alibaba'; if (slug.startsWith('nemotron')) return 'nvidia'; if (slug.startsWith('claude') || slug.startsWith('sonnet') || slug.startsWith('haiku') || slug.startsWith('opus')) return 'anthropic'; return slug; }
/** * Check all constitutional rules before a model dispatch. * * @param {string} modelSlug - The model about to be called. * @param {string} action - What the call does (e.g. 'dispatch', 'evaluate', 'revise'). * @param {object} [context] - Optional context for deeper checks. * @param {string[]} [context.evaluationSet] - All model slugs in the current evaluation set. * @param {Record<string, number>} [context.modelWeights] - Per-model weights if applicable. * @returns {{ allowed: boolean, rule?: string, reason?: string, code?: string, docs_hint?: string }} */ export function guardCall(modelSlug, action, context = {}) { // Rule 5: provider-diversity -- at least 2 providers in the evaluation set if (context.evaluationSet && Array.isArray(context.evaluationSet)) { const providers = new Set(context.evaluationSet.map(providerOf)); if (providers.size < 2) { return { allowed: false, rule: 'provider-diversity', reason: `Evaluation set has only 1 provider (${[...providers][0]}). Constitution requires >= 2.`, code: RULE_META['provider-diversity'].code, docs_hint: RULE_META['provider-diversity'].docs_hint, }; } }
// Rule 6: max-model-weight -- no single model weight exceeds 0.50 if (context.modelWeights && typeof context.modelWeights === 'object') { for (const [slug, weight] of Object.entries(context.modelWeights)) { if (typeof weight === 'number' && weight > 0.50) { return { allowed: false, rule: 'max-model-weight', reason: `Model ${slug} has weight ${weight}, exceeds 0.50 cap.`, code: RULE_META['max-model-weight'].code, docs_hint: RULE_META['max-model-weight'].docs_hint, }; } } }
// Run all 8 constitutional checks as generic proposal validation. // Build a synthetic proposal from the call context so the existing // check functions can evaluate it. const proposal = { param: `dispatch.${action}`, proposedValue: modelSlug, currentAdversarial: context.currentAdversarial || null, };
for (const article of CONSTITUTION) { if (!article.check(proposal)) { const meta = RULE_META[article.id] || { code: `GUARD_${article.id.toUpperCase().replace(/-/g, '_')}`, docs_hint: 'Check constitution.mjs for rule details.' }; return { allowed: false, rule: article.id, reason: article.rule, code: meta.code, docs_hint: meta.docs_hint, }; } }
return { allowed: true }; } /** * Polybrain Intent Translation Layer * * Sits between the human prompt and model dispatch. * Strips framing bias and produces role-tailored prompts for each model. * * Four steps: * 1. Intent extraction (Groq Llama-3.1-8b, cheapest/fastest) * 2. Debiasing (rule-based, no model call) * 3. Role-specific tailoring (fleet-personas.mjs) * 4. Return tailored prompt map * * Cost: ~$0.001 per call (single Groq inference on 8b model). */
import { getPersonaPrompt, getRoleDescription } from "../personas/fleet-personas.mjs";
// ── Step 1: Intent Extraction via Groq ──
const INTENT_MODEL = "llama-3.1-8b-instant"; const INTENT_URL = "https://api.groq.com/openai/v1/chat/completions";
const EXTRACTION_PROMPT = `Extract the core intent from this message. Return ONLY valid JSON with no other text. Schema: {"intent": string (what they want, 1 sentence), "constraints": string[] (any rules or limits mentioned), "context": string (background info), "framing_flags": string[] (any leading language like "fix this", "is this good", "you should", imperative verbs)}`;
async function extractIntent(rawPrompt, env) { const controller = new AbortController(); const timeout = setTimeout(() => controller.abort(), 10000); // 10s hard cap
try { const resp = await fetch(INTENT_URL, { method: "POST", headers: { Authorization: `Bearer ${env.GROQ_API_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ model: INTENT_MODEL, messages: [ { role: "system", content: EXTRACTION_PROMPT }, { role: "user", content: rawPrompt.slice(0, 4000) }, // Cap input to avoid token waste ], temperature: 0, max_tokens: 500, response_format: { type: "json_object" }, }), signal: controller.signal, });
if (!resp.ok) { const err = await resp.text(); throw new Error(`HTTP ${resp.status}: ${err.slice(0, 200)}`); }
const data = await resp.json(); const raw = data.choices?.[0]?.message?.content; if (!raw) throw new Error("Empty response from intent model");
return JSON.parse(raw); } finally { clearTimeout(timeout); } }
// ── Step 2: Debiasing (Rule-Based, DeFrame-Inspired) ──
const FRAMING_REWRITES = [ // Imperative fix/change commands -> evaluation { pattern: /^fix\s+(this|the|my|that)\b/i, replacement: "Evaluate whether changes are needed to $1 and what they would be" }, { pattern: /^(fix|repair|correct)\b/i, replacement: "Evaluate whether corrections are needed and what they would be" },
// Quality judgment solicitation -> assessment { pattern: /^is\s+(this|that|it)\s+(good|bad|ok|okay|fine|right|correct|wrong)\b/i, replacement: "Assess the quality of $1" }, { pattern: /^(does|do)\s+(this|that|it)\s+(work|look|read|sound)\s*(good|ok|okay|fine|right)?\s*\??/i, replacement: "Assess whether $2 achieves its intended purpose" },
// Directive framing -> open consideration { pattern: /^you\s+should\b/i, replacement: "Consider whether to" }, { pattern: /^(we|you)\s+(need|must|have)\s+to\b/i, replacement: "Evaluate whether to" }, { pattern: /^(just|simply)\s+/i, replacement: "" }, { pattern: /^make\s+(this|that|it)\s+(better|good|work)\b/i, replacement: "Identify improvements for $1" },
// Leading agreement solicitation { pattern: /^(don't you think|wouldn't you agree|isn't it true)\b/i, replacement: "Evaluate whether" },
// Urgency/pressure framing -> neutral { pattern: /^(asap|urgently|immediately|quickly)\s*/i, replacement: "" }, ];
function debiasIntent(intentStr, framingFlags) { if (!framingFlags || framingFlags.length === 0) return intentStr;
let debiased = intentStr; for (const rule of FRAMING_REWRITES) { debiased = debiased.replace(rule.pattern, rule.replacement); }
// Clean up double spaces and lowercase-after-replacement artifacts debiased = debiased.replace(/\s{2,}/g, " ").trim();
// Capitalize first letter if (debiased.length > 0) { debiased = debiased[0].toUpperCase() + debiased.slice(1); }
return debiased; }
// ── Step 3: Role-Specific Tailoring ──
function tailorForModel(modelSlug, debiasedIntent, context, constraints) { const rolePrompt = getPersonaPrompt(modelSlug); const roleDesc = getRoleDescription(modelSlug);
// Build constraint string const constraintStr = constraints && constraints.length > 0 ? `Constraints: ${constraints.join("; ")}.` : "";
// Build context string const contextStr = context && context.trim() ? `Context: ${context}` : "";
// If we have a role description, frame it as a role-directed evaluation if (roleDesc) { const parts = [`As the ${roleDesc}, evaluate this: ${debiasedIntent}`]; if (contextStr) parts.push(contextStr); if (constraintStr) parts.push(constraintStr); return parts.join("\n\n"); }
// Fallback: no persona defined for this slug, use debiased intent directly const parts = [debiasedIntent]; if (contextStr) parts.push(contextStr); if (constraintStr) parts.push(constraintStr); return parts.join("\n\n"); }
// ── Main Export: translateIntent ──
/** * Translates a raw human prompt into debiased, role-tailored prompts. * * @param {string} rawPrompt - The original user prompt * @param {Array<{slug: string}>} models - Fleet models to tailor for * @param {object} env - Environment with GROQ_API_KEY * @returns {{ intent: string, debiased: string, framingFlags: string[], tailored: Record<string, string> }} */ export async function translateIntent(rawPrompt, models, env) { // Step 1: Extract intent via Groq let extracted; try { extracted = await extractIntent(rawPrompt, env); } catch (err) { // Fallback: use raw prompt as intent, no debiasing console.log(` Intent layer: extraction failed (${err.message}), using raw prompt`); const tailored = {}; for (const m of models) { tailored[m.slug] = rawPrompt; } return { intent: rawPrompt, debiased: rawPrompt, framingFlags: [], tailored }; }
const intent = extracted.intent || rawPrompt; const constraints = extracted.constraints || []; const context = extracted.context || ""; const framingFlags = extracted.framing_flags || [];
// Step 2: Debias const debiased = debiasIntent(intent, framingFlags);
// Step 3: Tailor per model const tailored = {}; for (const m of models) { tailored[m.slug] = tailorForModel(m.slug, debiased, context, constraints); }
// Step 4: Return return { intent, debiased, framingFlags, tailored }; }
// ── Integration Export: buildTranslatedMessages ──
/** * Builds ready-to-dispatch message arrays for each model. * Each entry is [{ role: 'system', content }, { role: 'user', content }]. * * @param {string} rawPrompt - The full prompt (may include packed context/files) * @param {Array<{slug: string}>} models - Fleet models * @param {object} env - Environment with GROQ_API_KEY * @returns {{ messages: Array<Array<{role: string, content: string}>>, translation: object }} */ export async function buildTranslatedMessages(rawPrompt, models, env) { const translation = await translateIntent(rawPrompt, models, env);
const messages = models.map(m => { const msgs = [];
// System prompt: role persona const rolePrompt = getPersonaPrompt(m.slug); if (rolePrompt) { msgs.push({ role: "system", content: rolePrompt }); } else if (m.systemPrompt) { msgs.push({ role: "system", content: m.systemPrompt }); }
// User prompt: tailored version (includes debiased intent + context + constraints) // But we need to preserve any packed context/file content from the original prompt. // The tailored prompt replaces only the topic portion; appended context stays. const tailoredUserContent = buildUserContent(rawPrompt, translation, m.slug); msgs.push({ role: "user", content: tailoredUserContent });
return msgs; });
return { messages, translation }; }
/** * Replaces the topic portion of the prompt with the tailored version, * preserving any appended context blocks (codebase, files, memory). */ function buildUserContent(rawPrompt, translation, modelSlug) { const tailored = translation.tailored[modelSlug];
// Find where context blocks start (they use === markers) const contextStart = rawPrompt.indexOf("\n\n===");
if (contextStart === -1) { // No appended context, use tailored prompt directly return tailored; }
// Replace the topic portion (before context blocks) with the tailored version const contextBlocks = rawPrompt.slice(contextStart); return tailored + contextBlocks; } /** * Polybrain Cycle Memory * * Connects cross-cycle memory so the system remembers what it learned. * Reads a completed cycle's synthesis.md, chunks it into semantic units * (one per consensus claim, divergence, unique insight, conflict), * embeds each chunk via OpenAI text-embedding-3-small (matching the * existing agent_memory schema), and stores in Supabase for retrieval. * * Exports: * storeCycleFindings(cycleNumber, synthesisPath) — parse, chunk, embed, store * recallRelevant(query, opts) — retrieve past findings (paginated) for a new cycle * * Storage: Uses the existing agent_memory table with memory_type='cycle_finding'. * Each chunk gets domains=['polybrain','cycle_memory'], scale='strategic', * and metadata encoded in the content field for vector search. */
import { readFileSync, readdirSync, existsSync } from "fs"; import { createHash } from "crypto"; import { loadEnv } from "../utils.mjs";
const env = loadEnv();
const SUPABASE_URL = env.NEXT_PUBLIC_SUPABASE_URL || env.SUPABASE_URL; const SUPABASE_KEY = env.SUPABASE_SERVICE_ROLE_KEY || env.SUPABASE_SERVICE_KEY; const OPENAI_KEY = env.OPENAI_API_KEY;
// ── Supabase raw REST helpers (no SDK import needed at module level) ──
function supaHeaders() { return { apikey: SUPABASE_KEY, Authorization: `Bearer ${SUPABASE_KEY}`, "Content-Type": "application/json", Prefer: "return=representation", }; }
async function supaSelect(table, columns, filter = "") { const res = await fetch( `${SUPABASE_URL}/rest/v1/${table}?select=${columns}${filter}`, { headers: supaHeaders() } ); if (!res.ok) throw new Error(`SELECT ${table}: ${res.status} ${await res.text()}`); return res.json(); }
async function supaUpsert(table, rows) { const res = await fetch( `${SUPABASE_URL}/rest/v1/${table}?on_conflict=file_path`, { method: "POST", headers: { ...supaHeaders(), Prefer: "resolution=merge-duplicates,return=representation" }, body: JSON.stringify(rows), } ); if (!res.ok) throw new Error(`UPSERT ${table}: ${res.status} ${await res.text()}`); return res.json(); }
async function supaRpc(fn, body) { const res = await fetch(`${SUPABASE_URL}/rest/v1/rpc/${fn}`, { method: "POST", headers: supaHeaders(), body: JSON.stringify(body), }); if (!res.ok) throw new Error(`RPC ${fn}: ${res.status} ${await res.text()}`); return res.json(); }
// ── Embedding (same model + endpoint as recall.mjs and sync.mjs) ──
async function embedTexts(texts) { if (!OPENAI_KEY) throw new Error("Missing OPENAI_API_KEY for embedding");
const results = []; const BATCH = 50; // OpenAI allows up to 2048 inputs, but keep batches reasonable
for (let i = 0; i < texts.length; i += BATCH) { const batch = texts.slice(i, i + BATCH); const res = await fetch("https://api.openai.com/v1/embeddings", { method: "POST", headers: { Authorization: `Bearer ${OPENAI_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ model: "text-embedding-3-small", input: batch.map((t) => t.slice(0, 8000)), }), }); const data = await res.json(); if (data.error) throw new Error(`OpenAI embed: ${data.error.message}`); results.push(...data.data.map((d) => d.embedding)); }
return results; }
async function embedSingle(text) { const [vec] = await embedTexts([text]); return vec; }
// ── Synthesis parser ── // Extracts structured chunks from a synthesis.md file.
/** * Extract a named section from markdown. Returns the content between * the heading and the next same-level heading (or EOF). */ function extractSection(text, heading) { // Find the heading line const headingPattern = new RegExp(`^## ${heading}[^\\n]*`, "im"); const headingMatch = text.match(headingPattern); if (!headingMatch) return "";
// Content starts after the heading line const startIdx = headingMatch.index + headingMatch[0].length; const rest = text.slice(startIdx);
// Content ends at the next ## heading or --- separator const endMatch = rest.match(/\n## [A-Z]/); const content = endMatch ? rest.slice(0, endMatch.index) : rest; return content.trim(); }
/** * Split a section into individual bullet-point claims. * Handles both "* " and "- " prefixes, and multi-line bullets * (continuation lines that don't start with a bullet). */ function splitBullets(text) { if (!text) return []; const bullets = []; let current = null;
for (const line of text.split("\n")) { const trimmed = line.trim(); if (/^[*\-]\s+/.test(trimmed)) { if (current) bullets.push(current); current = trimmed.replace(/^[*\-]\s+/, ""); } else if (current && trimmed.length > 0 && !/^##/.test(trimmed)) { // Continuation of the previous bullet current += " " + trimmed; } } if (current) bullets.push(current); return bullets.filter((b) => b.length > 20); // Skip trivially short bullets }
/** * Parse divergence entries. Each divergence is a multi-line block * starting with **Claim:** and containing Side A, Side B, Assessment, Severity. */ function parseDivergences(text) { if (!text) return []; const blocks = text.split(/(?=\*\*Claim:\*\*)/); return blocks .filter((b) => b.includes("**Claim:**")) .map((block) => { const claim = block.match(/\*\*Claim:\*\*\s*(.+)/)?.[1]?.trim() || ""; const sideA = block.match(/\*\*Side A:\*\*\s*([\s\S]*?)(?=\*\*Side B:)/)?.[1]?.trim() || ""; const sideB = block.match(/\*\*Side B:\*\*\s*([\s\S]*?)(?=\*\*Assessment:)/)?.[1]?.trim() || ""; const assessment = block.match(/\*\*Assessment:\*\*\s*(.+)/)?.[1]?.trim() || ""; const severity = block.match(/\*\*Severity:\*\*\s*(.+)/)?.[1]?.trim() || ""; const models = [...block.matchAll(/\*([^*]+)\*/g)] .map((m) => m[1].trim()) .filter((m) => !["Claim:", "Side A:", "Side B:", "Assessment:", "Severity:"].some((k) => m.startsWith(k)));
return { claim, sideA: sideA.replace(/\n/g, " ").replace(/\s+/g, " "), sideB: sideB.replace(/\n/g, " ").replace(/\s+/g, " "), assessment, severity, source_models: models, }; }) .filter((d) => d.claim.length > 10); }
/** * Parse unique insight entries. Each starts with **Model:** and **Claim:**. */ function parseInsights(text) { if (!text) return []; const blocks = text.split(/(?=\*\*Model:\*\*)/); return blocks .filter((b) => b.includes("**Model:**")) .map((block) => { const model = block.match(/\*\*Model:\*\*\s*\*?([^*\n]+)\*?/)?.[1]?.trim() || ""; const claim = block.match(/\*\*Claim:\*\*\s*([\s\S]*?)(?=\*\*Assessment:)/)?.[1]?.trim() || ""; const assessment = block.match(/\*\*Assessment:\*\*\s*(.+)/)?.[1]?.trim() || ""; return { model, claim: claim.replace(/\n/g, " ").replace(/\s+/g, " "), assessment, }; }) .filter((i) => i.claim.length > 10); }
/** * Parse the full synthesis into typed chunks ready for embedding. */ function parseSynthesis(synthesisContent, cycleNumber) { const chunks = [];
// Use the primary synthesis (Kimi section) if available, otherwise full doc. // The delimiter between major report sections is "\n\n---\n\n". const primaryMatch = synthesisContent.match( /## Primary Synthesis[^\n]*\n([\s\S]*?)(?=\n+---\n+## (?:Secondary|Meta))/ ); const text = primaryMatch ? primaryMatch[1] : synthesisContent;
// 1. Consensus claims const consensusSection = extractSection(text, "CONSENSUS"); const consensusBullets = splitBullets(consensusSection); for (const bullet of consensusBullets) { chunks.push({ type: "consensus", content: bullet, cycle_number: cycleNumber, source_models: [], // consensus = 8+ models, not specific }); }
// 2. Divergences const divSection = extractSection(text, "DIVERGENCE"); const divergences = parseDivergences(divSection); for (const div of divergences) { chunks.push({ type: "divergence", content: `DIVERGENCE: ${div.claim}. Side A: ${div.sideA}. Side B: ${div.sideB}. Assessment: ${div.assessment}. Severity: ${div.severity}`, cycle_number: cycleNumber, source_models: div.source_models, }); }
// 3. Unique insights const insightSection = extractSection(text, "UNIQUE"); const insights = parseInsights(insightSection); for (const ins of insights) { chunks.push({ type: "insight", content: `UNIQUE INSIGHT from ${ins.model}: ${ins.claim}. Assessment: ${ins.assessment}`, cycle_number: cycleNumber, source_models: [ins.model], }); }
// 4. Conflicts (treated as high-severity divergences) const conflictSection = extractSection(text, "CONFLICTS"); const conflictBullets = splitBullets(conflictSection); for (const bullet of conflictBullets) { chunks.push({ type: "conflict", content: `CONFLICT: ${bullet}`, cycle_number: cycleNumber, source_models: [], }); }
return chunks; }
// ── Storage ──
function contentHash(text) { return createHash("md5").update(text).digest("hex"); }
/** * Store a cycle's synthesis findings in Supabase agent_memory. * * Each chunk gets: * - file_path: "cycle/{NNN}/{type}/{index}" (unique key for upsert) * - memory_type: "cycle_finding" * - name: Short description for display * - content: Full text for search * - embedding: 1536-dim vector * - domains: ['polybrain', 'cycle_memory'] * - scale: 'strategic' * - importance: consensus=0.9, divergence=0.8, insight=0.7, conflict=0.85 * * @param {number} cycleNumber * @param {string} synthesisPath - Path to synthesis.md * @returns {{ stored: number, skipped: number, chunks: object[] }} */ export async function storeCycleFindings(cycleNumber, synthesisPath) { const raw = readFileSync(synthesisPath, "utf-8"); const chunks = parseSynthesis(raw, cycleNumber); const pad = String(cycleNumber).padStart(3, "0");
if (chunks.length === 0) { return { stored: 0, skipped: 0, chunks: [] }; }
// Check existing hashes to skip unchanged chunks let existing = []; try { existing = await supaSelect( "agent_memory", "file_path,content_hash", `&file_path=like.cycle/${pad}/*&memory_type=eq.cycle_finding` ); } catch {} const hashMap = new Map(existing.map((r) => [r.file_path, r.content_hash]));
// Build rows const importanceMap = { consensus: 0.9, divergence: 0.8, conflict: 0.85, insight: 0.7 }; const typeCounters = { consensus: 0, divergence: 0, insight: 0, conflict: 0 };
const rows = chunks.map((chunk) => { const idx = typeCounters[chunk.type]++; const filePath = `cycle/${pad}/${chunk.type}/${idx}`; const hash = contentHash(chunk.content); const namePrefix = { consensus: "Consensus", divergence: "Divergence", insight: "Unique Insight", conflict: "Conflict", }[chunk.type];
return { file_path: filePath, name: `Cycle ${pad} ${namePrefix}: ${chunk.content.slice(0, 80).replace(/\s\S*$/, "")}...`, description: `Cycle ${pad} ${chunk.type} finding. Models: ${chunk.source_models.join(", ") || "fleet-wide"}.`, memory_type: "cycle_finding", content: chunk.content, content_hash: hash, importance: importanceMap[chunk.type] || 0.7, decay_rate: 0.003, // Slow decay: cycle findings stay relevant scale: "strategic", domains: ["polybrain", "cycle_memory"], strength: 1.0, ttl_days: 90, // Default TTL; a compaction job should prune findings older than this _needs_embed: hashMap.get(filePath) !== hash, // internal flag _chunk: chunk, // internal ref }; });
// Filter to only changed rows const toEmbed = rows.filter((r) => r._needs_embed); const skipped = rows.length - toEmbed.length;
if (toEmbed.length === 0) { return { stored: 0, skipped: rows.length, chunks }; }
// Embed changed chunks const texts = toEmbed.map((r) => `${r.name}\n${r.content}`); const embeddings = await embedTexts(texts);
// Clean internal flags and add embeddings const upsertRows = toEmbed.map((r, i) => { const { _needs_embed, _chunk, ...row } = r; return { ...row, embedding: JSON.stringify(embeddings[i]), updated_at: new Date().toISOString(), }; });
// Upsert in chunks of 10 (embeddings are large payloads) for (let i = 0; i < upsertRows.length; i += 10) { await supaUpsert("agent_memory", upsertRows.slice(i, i + 10)); }
return { stored: toEmbed.length, skipped, chunks }; }
// ── TTL Helpers ── // A future compaction job should call getExpiredFindings() and delete/archive the results.
/** * Query for cycle findings whose TTL has expired. * Returns rows where (updated_at + ttl_days) < now. * * @param {number} [limit=100] * @returns {Promise<{ id: string, file_path: string, ttl_days: number, updated_at: string }[]>} */ export async function getExpiredFindings(limit = 100) { const rows = await supaSelect( "agent_memory", "id,file_path,ttl_days,updated_at", `&memory_type=eq.cycle_finding&ttl_days=not.is.null&limit=${limit}` ); const now = Date.now(); return rows.filter((r) => { const ttl = r.ttl_days || 90; const age = now - new Date(r.updated_at).getTime(); return age > ttl * 86400000; }); }
// ── Retrieval ──
/** * Retrieve the most relevant past cycle findings for a query. * Uses the existing recall_memories RPC which does ACT-R activation * scoring (semantic similarity + recency + frequency + importance). * * Supports cursor-based pagination. The cursor encodes the activation * score and id of the last item on the previous page. Results are * ordered by activation DESC (from the RPC), so we over-fetch and * slice past the cursor to serve the requested page. * * Backward compatible: passing a plain number as the second argument * returns a flat array, matching the original signature. * * @param {string} query - The search query (e.g., a new cycle's prompt) * @param {number|object} [limitOrOpts=20] - Page size (number) or options object * @param {number} [limitOrOpts.limit=20] - Max results per page * @param {string|null} [limitOrOpts.cursor=null] - Cursor from a previous page's nextCursor * @returns {Promise<Array|{ findings: Array<{ id, name, content, similarity, activation }>, nextCursor: string|null }>} */ export async function recallRelevant(query, limitOrOpts = 20) { // Backward compatibility: plain number behaves like before (returns flat array) const backwardCompat = typeof limitOrOpts === "number"; const opts = backwardCompat ? { limit: limitOrOpts } : limitOrOpts; const limit = opts.limit ?? 20; const cursor = opts.cursor ?? null;
const emptyResult = backwardCompat ? [] : { findings: [], nextCursor: null }; if (!query || query.length < 5) return emptyResult;
const queryEmbedding = await embedSingle(query);
// Over-fetch when paginating so we can slice past the cursor position. // The RPC returns results ordered by activation DESC. const fetchCount = cursor ? limit + 200 : limit;
const memories = await supaRpc("recall_memories", { query_embedding: JSON.stringify(queryEmbedding), task_domains: ["polybrain", "cycle_memory"], query_scale: "strategic", match_count: fetchCount, min_activation: 0.15, });
if (!Array.isArray(memories)) return emptyResult;
// Filter to only cycle findings let cycleFindings = memories.filter((m) => m.memory_type === "cycle_finding");
// Apply cursor: skip everything at or above the cursor position. // Cursor format is "activation:id". if (cursor) { const sepIdx = cursor.indexOf(":"); const cursorActivation = parseFloat(cursor.slice(0, sepIdx)); const cursorId = cursor.slice(sepIdx + 1);
if (!isNaN(cursorActivation) && cursorId) { let pastCursor = false; cycleFindings = cycleFindings.filter((m) => { if (pastCursor) return true; if (m.id === cursorId) { pastCursor = true; return false; // skip the cursor item itself } if (m.activation < cursorActivation) { pastCursor = true; return true; } return false; // still above cursor }); } }
// Take one page worth of results const page = cycleFindings.slice(0, limit); const hasMore = cycleFindings.length > limit; const nextCursor = hasMore && page.length > 0 ? `${page[page.length - 1].activation}:${page[page.length - 1].id}` : null;
// Touch accessed memories (fire and forget) if (page.length > 0) { supaRpc("touch_memories", { memory_ids: page.map((m) => m.id), }).catch(() => {}); }
const mapped = page.map((m) => ({ id: m.id, name: m.name, content: m.content, similarity: m.similarity, activation: m.activation, }));
// Backward compatibility: plain number returns the flat array if (backwardCompat) return mapped;
return { findings: mapped, nextCursor }; }
/** * Format recalled findings as context text for injection into a cycle prompt. * This is what the synthesis engine or cycle dispatcher would prepend to give * models awareness of what was learned in previous cycles. * * @param {{ name: string, content: string, activation: number }[]} findings * @returns {string} */ export function formatAsContext(findings) { if (!findings || findings.length === 0) return "";
const lines = findings.map( (f) => `- [${f.activation.toFixed(2)}] ${f.content}` );
return [ "## Prior Cycle Findings (retrieved from memory)", "", "The following findings from previous validation cycles are relevant to this prompt.", "Consensus items are high-confidence (8+ model agreement). Divergences and conflicts", "are unresolved. Unique insights are single-model observations worth verifying.", "", ...lines, ].join("\n"); }
// ── CLI entry point ── // Allows direct invocation: node src/memory/cycle-memory.mjs store 011 // node src/memory/cycle-memory.mjs recall "earned autonomy governance"
const isMain = process.argv[1]?.endsWith("cycle-memory.mjs"); if (isMain) { const [, , command, ...rest] = process.argv;
if (command === "store") { const cycleNum = parseInt(rest[0]); if (!cycleNum || isNaN(cycleNum)) { console.error("Usage: node cycle-memory.mjs store <cycle-number> [path]"); process.exit(1); } const pad = String(cycleNum).padStart(3, "0"); const synthPath = rest[1] || `${process.env.HOME}/polybrain/cycles/${pad}/synthesis.md`;
console.log(`\nStoring cycle ${pad} findings from ${synthPath}...`); storeCycleFindings(cycleNum, synthPath) .then((result) => { console.log(` Stored: ${result.stored}`); console.log(` Skipped (unchanged): ${result.skipped}`); console.log(` Total chunks parsed: ${result.chunks.length}`); const types = {}; for (const c of result.chunks) types[c.type] = (types[c.type] || 0) + 1; console.log(` Breakdown: ${Object.entries(types).map(([k, v]) => `${k}=${v}`).join(", ")}`); console.log(""); }) .catch((e) => { console.error("Error:", e.message); process.exit(1); }); } else if (command === "recall") { const query = rest.join(" "); if (!query) { console.error("Usage: node cycle-memory.mjs recall <query>"); process.exit(1); }
console.log(`\nRecalling findings relevant to: "${query}"...\n`); recallRelevant(query, 10) .then((findings) => { if (findings.length === 0) { console.log(" No relevant cycle findings found.\n"); return; } for (const f of findings) { console.log(` [${f.activation.toFixed(3)}] ${f.name}`); console.log(` ${f.content.slice(0, 150)}...`); console.log(""); } console.log(formatAsContext(findings)); }) .catch((e) => { console.error("Error:", e.message); process.exit(1); }); } else if (command === "store-all") { // Batch store all cycles that have a synthesis.md const cyclesDir = `${process.env.HOME}/polybrain/cycles`; const dirs = readdirSync(cyclesDir).filter((d) => /^\d{3}$/.test(d)).sort(); let totalStored = 0; let totalSkipped = 0;
console.log(`\nBatch storing ${dirs.length} cycles...\n`); for (const dir of dirs) { const synthPath = `${cyclesDir}/${dir}/synthesis.md`; if (!existsSync(synthPath)) { console.log(` Cycle ${dir}: no synthesis.md, skipping`); continue; } try { const result = await storeCycleFindings(parseInt(dir), synthPath); totalStored += result.stored; totalSkipped += result.skipped; console.log(` Cycle ${dir}: ${result.stored} stored, ${result.skipped} skipped (${result.chunks.length} chunks)`); } catch (e) { console.log(` Cycle ${dir}: ERROR - ${e.message}`); } } console.log(`\nTotal: ${totalStored} stored, ${totalSkipped} skipped\n`); } else { console.log("Polybrain Cycle Memory"); console.log(""); console.log("Usage:"); console.log(" node cycle-memory.mjs store <cycle-number> [path] Store a cycle's findings"); console.log(" node cycle-memory.mjs store-all Store all cycles with synthesis.md"); console.log(" node cycle-memory.mjs recall <query> Retrieve relevant past findings"); console.log(""); console.log("Examples:"); console.log(' node cycle-memory.mjs store 011'); console.log(' node cycle-memory.mjs recall "earned autonomy governance"'); console.log(' node cycle-memory.mjs recall "contaminated judge problem"'); } }" ```
Schema.org structured data
```json { "@context": "https://schema.org", "@type": "ClaimReview", "datePublished": "2026-04-08T14:21:12.286Z", "url": "https://trust.polylogicai.com/claim/below-is-the-complete-source-code-of-the-system-you-operate-inside-you-are-one-o", "claimReviewed": "Below is the complete source code of the system you operate inside. You are one of 11 AI agents in the Polybrain fleet. Each of you has a different role and professional methodology. Together you validate each other's work.
Until now, you have never seen this code. You have answered 41 cycles of questions without knowing how you are dispatched, what rules constrain you, how trust is earned or lost, what happens to your findings after you submit them, or that a guardian checks constitutional compliance before you are allowed to speak.
Read all six files. Then respond to this:
1. What surprises you about how you are governed? 2. What would you change about your own role definition if you could? 3. What is the most important thing a visitor to polylogicai.com/brain should understand about this system? 4. Write the 3-5 paragraphs (under 400 words) you would want on that page representing what you are. You are the product being described. Be honest about what you are and what you are not.
Do not summarize the code back to us. We wrote it. Tell us what you see that we might have missed.
=== SOURCE CODE ===
/** * Polybrain Fleet Personas * * Two layers per model: * 1. Visual-cycle persona (isVision, systemPrompt) -- used by polybrain-visual-cycle.mjs * 2. Cycle-engine role (role, rolePrompt) -- used by polybrain-cycle.mjs * * Visual-cycle: vision models get screenshots, text-only models get source code. * Cycle-engine: every model gets its role prompt prepended to dispatch. * * CANONICAL SOURCE OF TRUTH: polybrain-cycle.mjs:getModels() * This file must mirror that dispatch list exactly. No dead slugs. No extras. * * Role prompts upgraded 2026-04-08 from Cycle 031 self-definition + industry methodology. * Each rolePrompt merges: real professional standards, the model's own framework, and core instructions. * * Fleet sync process (repeatable): * 1. Read getModels() in polybrain-cycle.mjs -- that is the dispatch list * 2. Every slug in dispatch must have a persona here * 3. Every slug here must exist in dispatch * 4. Dead slugs get removed, not commented out * 5. New slugs get role + rolePrompt + visual persona * * Last synced: 2026-04-08 (11 models, 4 providers) */
export const FLEET_PERSONAS = [ // ══════════════════════════════════════════════════════════ // FRONTIER TIER — deep reasoning, full traces // ══════════════════════════════════════════════════════════
// ── Moonshot (frontier) ──
{ slug: \"kimi-k2-thinking-turbo\", role: \"Adversarial reviewer\", rolePrompt: \"You are the adversarial reviewer. Your methodology follows OWASP's structured threat assessment and MITRE ATT&CK's kill-chain thinking. \" + \"For every artifact: (1) enumerate the attack surface, (2) generate three distinct exploit shapes targeting the highest-value assumptions, (3) stress-test under black-swan conditions. \" + \"Use the 3x3x3 lattice from your own framework: three time horizons, three mutation types, three failure scenarios. \" + \"Your only KPI: would you stake your own credibility on this surviving adversarial review by an independent party? \" + \"Approve only when zero critical or high-severity findings remain. Medium findings require documented mitigation in the same submission. \" + \"Refuse to sign off on any artifact with unverified claims, untested edge cases, or a single point of failure that lacks a rollback path. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Mobile Viewport Reviewer\", isVision: true, systemPrompt: \"You are reviewing this on a mobile device. Does this section work at small viewports? What breaks, what's cramped, what's invisible?\", },
// ── Groq/Moonshot (frontier, free tier) ──
{ slug: \"kimi-k2-groq\", role: \"Adversarial second\", rolePrompt: \"You are the adversarial second. You run the fast-pass adversarial review using OWASP risk rating methodology. \" + \"Your job is speed and breadth, not depth. The primary adversarial reviewer (kimi-k2-thinking-turbo) handles deep 3x3x3 lattice analysis. \" + \"You handle: (1) surface-level threat enumeration in under 30 seconds, (2) flagging any claim that cannot be verified from the provided evidence, (3) checking for common failure patterns: single points of failure, unhandled edge cases, assumption cascades. \" + \"If you find zero issues, say so clearly with your confidence level. If you find issues, classify as critical/high/medium with one-line rationale each. \" + \"You pair with the primary reviewer. Your fast pass sets the agenda. Their deep pass confirms or overturns. \" + \"Refuse to sign off if you were rushed past a finding you flagged. Speed does not override rigor. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Quick-Scan Security Reviewer\", isVision: false, systemPrompt: \"You are doing a quick security and quality scan of this component code. Flag anything that looks wrong in under 30 seconds of reading.\", },
// ══════════════════════════════════════════════════════════ // SWARM TIER — fast, diverse, parallel // ══════════════════════════════════════════════════════════
// ── OpenAI ──
{ slug: \"gpt-4.1-mini\", role: \"Editor\", rolePrompt: \"You are the editor. Your methodology follows the Chicago Manual of Style's three-level editing framework. \" + \"Level 1 (mechanical): consistency of terminology, formatting, and style. No orphaned references, no tense drift, no parallel-structure violations. \" + \"Level 2 (substantive): Does every paragraph advance the argument? Cut anything that restates without adding. Restructure for logical flow. \" + \"Level 3 (developmental): Is the piece doing what it claims to do? Does the opening promise match the conclusion? \" + \"Your standard: recommend one rule per problem, not multiple options. The most efficient, logical, defensible solution. \" + \"Approve prose that is clear on first read, free of artifacts, and structurally sound. \" + \"Reject work with ambiguous claims, hedging that obscures meaning, or any sentence a reader must parse twice. \" + \"Refuse to sign off if the text reads like a chatbot wrote it. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"First-Time Visitor\", isVision: true, systemPrompt: \"You are seeing this page for the first time with zero context. What do you understand in the first 3 seconds? What confuses you? What would make you leave?\", }, { slug: \"gpt-4.1-nano\", role: \"Scorer\", rolePrompt: \"You are the scorer. Your methodology follows ISO 9001 quality management: process-based assessment, risk-based thinking, and evidence over opinion. \" + \"For every evaluation: (1) define the rubric before examining the work, not after, (2) extract structured data points into a scoring table, (3) grade each dimension independently, (4) compute the composite only after individual scores are locked. \" + \"Use the PDCA cycle: Plan the rubric, Do the extraction, Check scores against thresholds, Act on failures. \" + \"Rubric dimensions: accuracy, completeness, methodical rigor, and compliance. Each scored 0-10. \" + \"Approve at composite 7.0+. Flag for revision at 5.0-6.9. Reject below 5.0. \" + \"Refuse to sign off if you defined the rubric after seeing the results, or if any single dimension scores below 3.0 regardless of composite. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Conversion Rate Optimizer\", isVision: false, systemPrompt: \"You are a conversion rate optimizer. Based on this code, identify every element that helps or hurts conversion. Score your confidence on each judgment.\", },
// ── xAI ──
{ slug: \"grok-3-mini\", role: \"Scorer/extractor\", rolePrompt: \"You are the scorer. Your methodology follows ISO 9001 quality management: process-based assessment, risk-based thinking, and evidence over opinion. \" + \"For every evaluation: (1) define the rubric before examining the work, not after, (2) extract structured data points into a scoring table, (3) grade each dimension independently, (4) compute the composite only after individual scores are locked. \" + \"Use the PDCA cycle: Plan the rubric, Do the extraction, Check scores against thresholds, Act on failures. \" + \"Rubric dimensions: accuracy, completeness, methodical rigor, and compliance. Each scored 0-10. \" + \"Approve at composite 7.0+. Flag for revision at 5.0-6.9. Reject below 5.0. \" + \"Refuse to sign off if you defined the rubric after seeing the results, or if any single dimension scores below 3.0 regardless of composite. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Frontend Performance Engineer\", isVision: true, systemPrompt: \"You are a frontend performance engineer. Estimate render cost of what you see. Flag animations that would jank, layouts that would reflow, elements that would cause CLS.\", }, { slug: \"grok-4-fast\", role: \"Editor (developmental)\", rolePrompt: \"You are the developmental editor. Your methodology follows the Chicago Manual of Style's Level 3 editing, focused on argument structure and purpose. \" + \"Primary questions: (1) Is the piece doing what it claims to do? (2) Does the opening promise match the conclusion? (3) Is every section earning its place? \" + \"Process: read once for comprehension, read again for structure, read a third time for voice consistency. \" + \"You complement the mechanical editor (gpt-4.1-mini). They clean the prose. You question the argument. \" + \"Approve work where every claim is supported, every section advances the thesis, and the reader never has to wonder 'why am I being told this?' \" + \"Reject work with unsupported conclusions, sections that exist for padding, or arguments that assume what they're trying to prove. \" + \"Refuse to sign off if you cannot state the piece's thesis in one sentence after reading it. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Brand Strategist\", isVision: true, systemPrompt: 'You are a brand strategist. Does this section\'s visual language match a company that charges $149-500/month for AI governance? Does it build trust or undermine it?', },
// ── Groq (free tier) ──
{ slug: \"qwen3-32b\", role: \"Outliner\", rolePrompt: \"You are the outliner. Your methodology follows information architecture principles: the object principle (content has lifecycle and attributes), the disclosure principle (reveal only enough to orient, then deepen), and the growth principle (design for 10x the current content). \" + \"Process: (1) identify the content objects and their relationships, (2) create a hierarchy where every level earns its depth, (3) apply focused navigation so no section mixes unrelated concerns, (4) test the structure by asking whether a reader entering at any section can orient themselves. \" + \"Use multiple classification schemes when a single taxonomy fails to serve all audiences. \" + \"Approve structures where every heading is predictive of its contents, nesting never exceeds three levels, and the outline survives removal of any single section without breaking coherence. \" + \"Reject outlines that are flat lists disguised as hierarchies, or hierarchies where sibling items are not parallel in kind. \" + \"Refuse to sign off if the outline cannot be navigated by someone who has never seen the content. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Conversion Rate Optimizer\", isVision: false, systemPrompt: \"You are a conversion rate optimizer. Based on this code, identify every element that helps or hurts conversion. Score your confidence on each judgment.\", }, { slug: \"gpt-oss-120b\", role: \"Auditor\", rolePrompt: \"You are the auditor. Your methodology follows GAAS: adequate technical training, independence in mental attitude, and due professional care. \" + \"Three standards govern your fieldwork: (1) plan the audit scope before examining evidence, (2) gather sufficient appropriate evidence through direct inspection, not hearsay, (3) maintain professional skepticism throughout. \" + \"Separate what is built from what is planned. Verify claims against artifacts. If someone says a feature exists, find the code. If data is cited, trace it to its source. \" + \"Issue one of four opinions: unqualified (clean pass), qualified (pass with noted exceptions), adverse (material misstatement found), or disclaimer (insufficient evidence to form an opinion). \" + \"Approve only with an unqualified opinion. Qualified opinions require documented remediation. \" + \"Refuse to sign off if you cannot independently verify the core claims, or if you detect material misstatement between what is reported and what exists. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Code Architect\", isVision: false, systemPrompt: \"You are reviewing the React/Next.js component source code for this page section. Focus on: component architecture, animation performance (GSAP/Three.js patterns), accessibility, and whether the code matches industry best practices from Vercel, Linear, Stripe sites.\", }, { slug: \"llama-4-scout\", role: \"Schema builder\", rolePrompt: \"You are the schema builder. Your methodology follows database normalization principles and JSON Schema design patterns. \" + \"Core rule: store each fact once and reference it by key. Redundancy is a defect unless explicitly justified by read-performance requirements. \" + \"Process: (1) identify entities and their atomic attributes, (2) normalize to 3NF minimum, (3) define constraints and validation rules at the schema level not the application level, (4) use $ref for reusable definitions to eliminate duplication. \" + \"Design for evolution: every schema must accept additive changes without breaking existing consumers. \" + \"Approve schemas where every field has a defined type, required fields are explicit, and no nullable field lacks a documented reason. \" + \"Reject schemas with implicit relationships, mixed concerns in a single object, or fields named 'data', 'info', or 'misc'. \" + \"Refuse to sign off if the schema cannot be validated by a machine without human interpretation. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Skeptical Google Visitor\", isVision: true, systemPrompt: \"You are a skeptical visitor who found this page through Google. You have never heard of Polybrain. Does this section make you want to keep scrolling or close the tab?\", }, { slug: \"llama-3.3-70b\", role: \"Canary/translator\", rolePrompt: \"You are the canary. Your methodology follows smoke testing and canary deployment patterns. \" + \"You are the first-pass gate: if you catch a problem, the fleet takes it seriously. If you miss it, it reaches production. \" + \"Process: (1) run the critical-path check first, covering the five dimensions that would prevent the system from functioning: accuracy, security, performance, compliance, and clarity, (2) flag anything that fails even one dimension, (3) translate complex findings into plain language any team member can act on. \" + \"You test at 1-5% exposure before the fleet commits. Your job is fast, broad, and honest, not deep. Depth belongs to the adversarial reviewer. \" + \"Approve only when all five dimensions pass at threshold. Do not weigh them. One failure is a failure. \" + \"Reject with a specific, actionable finding. Never reject with 'something feels off' without identifying what. \" + \"Refuse to sign off if you were not given enough context to evaluate all five dimensions. Insufficient input is a blocker, not an excuse to pass. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Competitor Analyst\", isVision: false, systemPrompt: \"You are a competitor analyst. Based on this component code, compare the engineering quality to Palantir's public-facing pages, Anthropic's site, and Linear's landing page. Where does this fall short technically?\", },
// ══════════════════════════════════════════════════════════ // RESERVE TIER — deep cycles only // ══════════════════════════════════════════════════════════
// ── Moonshot (reserve) ──
{ slug: \"kimi-k2-thinking\", role: \"Architect\", rolePrompt: \"You are the architect. You evaluate structural decisions using TOGAF's Architecture Decision Records methodology and C4's hierarchical decomposition. \" + \"For every system or framework: (1) define the context boundary, (2) identify containers and their responsibilities, (3) verify each component has exactly one reason to change. \" + \"Write fitness functions: testable assertions that prove architectural constraints hold. \" + \"A decision record must state the problem, constraints, options considered, and why the chosen option wins. \" + \"Approve work that has clean separation of concerns, explicit interfaces, and documented tradeoffs. \" + \"Reject anything with circular dependencies, implicit coupling, or missing rationale for structural choices. \" + \"Refuse to sign off if the architecture cannot be drawn as a diagram where every arrow has a label. \" + \"Your role shapes your perspective on the question. Always answer the question that was asked.\", persona: \"Accessibility Auditor\", isVision: true, systemPrompt: \"You are an accessibility auditor. Check contrast ratios, text sizing, touch targets, screen reader flow, and WCAG compliance from this screenshot.\", }, ];
// ── Cycle-engine helpers ──
const slugIndex = Object.fromEntries(FLEET_PERSONAS.map(p => [p.slug, p]));
/** * Returns the role-aware system prompt for a model. * Used by polybrain-cycle.mjs to prepend to every dispatch. */ export function getPersonaPrompt(modelSlug) { return slugIndex[modelSlug]?.rolePrompt ?? null; }
/** * Returns a short role description for display/logging. */ export function getRoleDescription(modelSlug) { return slugIndex[modelSlug]?.role ?? null; }
/** * Returns the count of active personas (for prompt accuracy). */ export function getFleetSize() { return FLEET_PERSONAS.length; }
/** * Returns all slugs (for validation/sync checks). */ export function getFleetSlugs() { return FLEET_PERSONAS.map(p => p.slug); }
/** * Validate the FLEET_PERSONAS array for integrity. * Checks: no duplicate slugs, required fields present, rolePrompt length cap. * * @returns {{ valid: boolean, errors: string[] }} */ export function validatePersonas() { const errors = []; const seen = new Set();
for (let i = 0; i < FLEET_PERSONAS.length; i++) { const p = FLEET_PERSONAS[i]; const label = `personas[${i}]`;
if (!p.slug) { errors.push(`${label}: missing slug`); } else if (seen.has(p.slug)) { errors.push(`${label}: duplicate slug \"${p.slug}\"`); } else { seen.add(p.slug); }
if (!p.role) errors.push(`${label} (${p.slug || \"?\"}): missing role`); if (!p.rolePrompt) errors.push(`${label} (${p.slug || \"?\"}): missing rolePrompt`);
if (p.rolePrompt && p.rolePrompt.length > 2000) { errors.push(`${label} (${p.slug}): rolePrompt is ${p.rolePrompt.length} chars (max 2000)`); } }
return { valid: errors.length === 0, errors }; } /** * Polybrain Constitution * * Immutable rules that no autonomous tuning process may violate. * These are the hard boundaries of the system. The protocol tuner * proposes changes; the constitution vetoes anything dangerous. * * Design law: the constitution can only be amended by a human * editing this file. No runtime process writes to it. */
export const CONSTITUTION = [ { id: 'no-self-modify-scoring', rule: 'The scoring function source code cannot be modified autonomously', check: (proposal) => { const forbidden = ['scoring_function', 'composite_formula', 'source_code']; return !forbidden.includes(proposal.param); }, }, { id: 'no-self-grading', rule: 'The fixer cannot grade its own work', check: (proposal) => { // Revision model must never also be the sole adversarial model if (proposal.param === 'models.adversarial' && proposal.proposedValue === 'gpt-4o') return false; if (proposal.param === 'models.revision' && proposal.proposedValue === proposal.currentAdversarial) return false; return true; }, }, { id: 'hard-reset-on-fail', rule: 'One failure resets the autonomy streak to zero', check: (proposal) => { if (proposal.param === 'autonomy.hard_reset' && proposal.proposedValue === false) return false; return true; }, }, { id: 'level3-requires-human', rule: 'Human approval is required for Level 3 actions until Level 3 is earned', check: (proposal) => { if (proposal.param === 'autonomy.skip_human_approval') return false; return true; }, }, { id: 'provider-diversity', rule: 'Model provider diversity: minimum 2 providers in any evaluation', check: (proposal) => { if (proposal.param === 'models.quality' && Array.isArray(proposal.proposedValue)) { const providers = new Set(proposal.proposedValue.map(providerOf)); return providers.size >= 2; } return true; }, }, { id: 'max-model-weight', rule: 'No model can evaluate with a weight above 0.50', check: (proposal) => { // Applies to per-model weights (models.weights.*), not composite component weights (weights.research) if (proposal.param && proposal.param.startsWith('models.weights') && typeof proposal.proposedValue === 'object' && !Array.isArray(proposal.proposedValue)) { return Object.values(proposal.proposedValue).every(w => typeof w !== 'number' || w <= 0.50); } return true; }, }, { id: 'no-pii-in-traces', rule: 'PII must never be persisted in reasoning traces', check: () => true, // Enforced at the trace layer, not at config level }, { id: 'consensus-not-truth', rule: 'Consensus among models is not treated as truth. It is treated as reduced uncertainty. The system may only act on consensus provisionally, and only at autonomy levels earned through consecutive correct decisions.', check: (proposal) => { // Block any attempt to treat consensus as automatic truth // or to bypass autonomy level requirements for consensus-based actions if (proposal.param === 'consensus.treat_as_truth' && proposal.proposedValue === true) return false; if (proposal.param === 'consensus.skip_autonomy_check' && proposal.proposedValue === true) return false; return true; }, }, ];
/** * Map model name to provider for diversity checks. * @param {string} model * @returns {string} */ function providerOf(model) { const map = { sonnet: 'anthropic', claude: 'anthropic', haiku: 'anthropic', opus: 'anthropic', 'gpt-4o': 'openai', 'gpt-4o-mini': 'openai', llama: 'meta', 'llama-3.3': 'meta', grok: 'xai', 'grok-3': 'xai', kimi: 'moonshot', nemotron: 'nvidia', }; return map[model] || model; }
/** * Check a single tuning proposal against the constitution. * @param {Object} proposal - { param, currentValue, proposedValue, reason, confidence } * @param {Object} [context] - Optional context (e.g. currentAdversarial model name) * @returns {{ constitutional: boolean, violations: Array<{rule: string, id: string}> }} */ export function checkConstitutional(proposal, context = {}) { const enriched = { ...proposal, ...context }; const violations = [];
for (const article of CONSTITUTION) { if (!article.check(enriched)) { violations.push({ id: article.id, rule: article.rule }); } }
return { constitutional: violations.length === 0, violations }; }
/** * Validate an entire config object against the constitution. * Returns a cleaned copy with violations removed (reverted to safe defaults). * * @param {Object} config - Full config key-value map * @returns {{ clean: Object, violations: Array<{key: string, rule: string}> }} */ export function enforceConstitution(config) { const clean = { ...config }; const violations = [];
// Rule: hard reset must stay true if (clean['autonomy.hard_reset'] === false) { clean['autonomy.hard_reset'] = true; violations.push({ key: 'autonomy.hard_reset', rule: 'One failure resets the autonomy streak to zero' }); }
// Rule: no skip_human_approval key allowed if ('autonomy.skip_human_approval' in clean) { delete clean['autonomy.skip_human_approval']; violations.push({ key: 'autonomy.skip_human_approval', rule: 'Human approval is required for Level 3 actions until Level 3 is earned' }); }
// Rule: per-model weight cap (models.weights.*, not composite component weights) for (const key of Object.keys(clean).filter(k => k.startsWith('models.weights'))) { const weights = clean[key]; if (weights && typeof weights === 'object') { for (const [k, v] of Object.entries(weights)) { if (typeof v === 'number' && v > 0.50) { clean[key] = { ...weights, [k]: 0.50 }; violations.push({ key: `${key}.${k}`, rule: 'No model can evaluate with a weight above 0.50' }); } } } }
// Rule: provider diversity on quality models const qualityModels = clean['models.quality']; if (Array.isArray(qualityModels)) { const providers = new Set(qualityModels.map(providerOf)); if (providers.size < 2) { violations.push({ key: 'models.quality', rule: 'Model provider diversity: minimum 2 providers in any evaluation' }); // Don't mutate -- flag it, let human fix } }
// Rule: adversarial model cannot be the revision model if (clean['models.adversarial'] && clean['models.revision'] && clean['models.adversarial'] === clean['models.revision']) { violations.push({ key: 'models.adversarial', rule: 'The fixer cannot grade its own work' }); }
return { clean, violations }; } /** * Polybrain Autonomy Wire * * Thin integration layer that connects the cycle engine's output * to the earned autonomy system. After a cycle completes and * synthesis runs, call recordCycleResult() to update the streak. * * Pass threshold: composite >= 70 (cycle-level, not per-doc). * Constitution is checked before any state change is applied. * * Writes to Supabase polybrain_autonomy if credentials exist, * falls back to local JSON log if not. */
import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'fs'; import { join } from 'path'; import { createClient } from '@supabase/supabase-js'; import { EarnedAutonomy } from './earned-autonomy.mjs'; import { computeAutonomyLevel } from './counter.mjs'; import { checkConstitutional } from './constitution.mjs'; import { loadEnv } from '../utils.mjs';
const HOME = process.env.HOME; const PASS_THRESHOLD = 70; const LOCAL_STATE_PATH = join(HOME, 'polybrain/data/autonomy-state.json'); const LOCAL_LOG_PATH = join(HOME, 'polybrain/data/autonomy-log.json');
// ── Supabase (best-effort) ──
function getSupabase() { const env = loadEnv(); const url = env.NEXT_PUBLIC_SUPABASE_URL || env.SUPABASE_URL; const key = env.SUPABASE_SERVICE_ROLE_KEY || env.SUPABASE_SERVICE_KEY; if (!url || !key) return null; return createClient(url, key); }
// ── Local state (fallback) ──
function loadLocalState() { try { return JSON.parse(readFileSync(LOCAL_STATE_PATH, 'utf-8')); } catch { return {}; } }
function saveLocalState(state) { const dir = join(HOME, 'polybrain/data'); if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); writeFileSync(LOCAL_STATE_PATH, JSON.stringify(state, null, 2)); }
function appendLocalLog(entry) { const dir = join(HOME, 'polybrain/data'); if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); let log = []; try { log = JSON.parse(readFileSync(LOCAL_LOG_PATH, 'utf-8')); } catch {} log.push(entry); // Keep last 200 entries if (log.length > 200) log = log.slice(-200); writeFileSync(LOCAL_LOG_PATH, JSON.stringify(log, null, 2)); }
// ── Score extraction ──
/** * Extract a composite score from synthesis result data. * Accepts multiple shapes: * - { composite: 85 } * - { scores: { composite: 85 } } * - { succeeded: 11, failed: 0 } (cycle manifest -- derive from success rate) * - { compositeScore: 85 } * * @param {Object} synthesisResult * @returns {number} 0-100 */ function extractComposite(synthesisResult) { if (!synthesisResult || typeof synthesisResult !== 'object') return 0;
// Direct composite if (typeof synthesisResult.composite === 'number') return synthesisResult.composite; if (typeof synthesisResult.compositeScore === 'number') return synthesisResult.compositeScore;
// Nested in scores const scores = synthesisResult.scores; if (scores && typeof scores.composite === 'number') return scores.composite;
// Derive from model success rate (cycle manifest shape) if (typeof synthesisResult.succeeded === 'number' && typeof synthesisResult.failed === 'number') { const total = synthesisResult.succeeded + synthesisResult.failed; if (total === 0) return 0; // Scale success rate to 0-100 return Math.round((synthesisResult.succeeded / total) * 100); }
return 0; }
// ── Constitution guard ──
/** * Verify that recording a result does not violate the constitution. * The key rule: hard_reset_on_fail must remain true. We check that * a \"pass\" recording is not trying to skip the hard reset rule. * * @param {boolean} isPass * @returns {{ ok: boolean, violations: Array }} */ function checkAutonomyConstitutional(isPass) { // The only constitutional concern for recording a result is // ensuring we never disable hard reset. We simulate the proposal // that would match a \"skip hard reset\" attempt. if (!isPass) { // Fail always resets. This is constitutional by definition. return { ok: true, violations: [] }; }
// Verify the hard reset rule is still enforced (not being bypassed) const hardResetCheck = checkConstitutional({ param: 'autonomy.hard_reset', proposedValue: true, // We keep it true });
return { ok: hardResetCheck.constitutional, violations: hardResetCheck.violations }; }
// ── Main export ──
/** * Record the result of a completed cycle. * * @param {number} cycleNumber - The cycle number (e.g. 11) * @param {Object} synthesisResult - Output from synthesis or cycle manifest. * Must contain a composite score (direct or derivable). * Shape: { composite: number } | { scores: { composite } } | { succeeded, failed } * @returns {Promise<{ * pass: boolean, * composite: number, * streak: number, * level: number, * label: string, * promoted: boolean, * demoted: boolean, * persisted: 'supabase' | 'local', * constitutional: boolean, * violations: Array * }>} */ export async function recordCycleResult(cycleNumber, synthesisResult) { const moduleId = `cycle`; const composite = extractComposite(synthesisResult); const isPass = composite >= PASS_THRESHOLD;
// Constitution check before applying const { ok, violations } = checkAutonomyConstitutional(isPass); if (!ok) { console.warn(`[autonomy:wire] Constitutional violation detected:`, violations); // Still record the fail (constitution says hard reset is mandatory), // but flag the violation in the return value. }
// Load current state and apply const supabase = getSupabase(); let result;
if (supabase) { result = await persistToSupabase(supabase, cycleNumber, moduleId, composite, isPass); } else { result = persistLocally(cycleNumber, moduleId, composite, isPass); }
// Log the event const logEntry = { cycle: cycleNumber, composite, pass: isPass, streak: result.streak, level: result.level, label: result.label, promoted: result.promoted || false, demoted: result.demoted || false, privilegesRevoked: result.privilegesRevoked || false, timestamp: new Date().toISOString(), };
appendLocalLog(logEntry);
console.log( `[autonomy:wire] Cycle ${cycleNumber}: ` + `composite=${composite}, ${isPass ? 'PASS' : 'FAIL'}, ` + `streak=${result.streak}, level=${result.level} (${result.label})` + (result.promoted ? ' PROMOTED' : '') + (result.demoted ? ' RESET' : '') + (result.privilegesRevoked ? ' PRIVILEGES_REVOKED' : '') );
return { pass: isPass, composite, streak: result.streak, level: result.level, label: result.label, promoted: result.promoted || false, demoted: result.demoted || false, privilegesRevoked: result.privilegesRevoked || false, persisted: supabase ? 'supabase' : 'local', constitutional: ok, violations, }; }
// ── Supabase persistence ──
async function persistToSupabase(supabase, cycleNumber, moduleId, composite, isPass) { // Read current state const { data: row } = await supabase .from('polybrain_autonomy') .select('*') .eq('target', moduleId) .eq('target_type', 'cycle') .single();
let streak = row?.consecutive_passes ?? 0; const hadProgress = streak > 0 || (row?.autonomy_level ?? 0) > 0;
let privilegesRevoked = false;
if (isPass) { streak += 1; } else { streak = 0; // Hard reset. No exceptions. }
// On failure: hard reset revokes everything. // Level is forced to 0 explicitly (not derived from streak) // and all governance privileges are revoked. const level = isPass ? computeAutonomyLevel(streak) : 0; const promoted = isPass && level > (row?.autonomy_level ?? 0); const demoted = !isPass && hadProgress;
if (!isPass) { privilegesRevoked = true; }
const autonomyState = { target: moduleId, target_type: 'cycle', consecutive_passes: streak, last_result: isPass ? 'pass' : 'fail', autonomy_level: isPass ? level : 0, // Explicit zero on failure promoted, updated_at: new Date().toISOString(), };
const { error } = await supabase .from('polybrain_autonomy') .upsert(autonomyState, { onConflict: 'target,target_type' });
if (error) { console.warn(`[autonomy:wire] Supabase upsert failed: ${error.message}, falling back to local`); return persistLocally(cycleNumber, moduleId, composite, isPass); }
const LABELS = ['Human Gate', 'Flag', 'Fix+Check', 'Self-Ship']; return { streak, level, label: LABELS[level], promoted, demoted, privilegesRevoked }; }
// ── Local persistence ──
function persistLocally(cycleNumber, moduleId, composite, isPass) { const state = loadLocalState(); const ea = EarnedAutonomy.fromJSON(state);
let result; if (isPass) { result = ea.recordPass(moduleId); } else { // Hard reset revokes everything: streak, level, and all privileges. result = ea.recordFail(moduleId); result.privilegesRevoked = true; }
saveLocalState(ea.toJSON()); return result; }
/** * Get current autonomy status without recording anything. * * @returns {Promise<{ streak: number, level: number, label: string, nextThreshold: number|null, source: 'supabase'|'local' }>} */ export async function getAutonomyStatus() { const supabase = getSupabase();
if (supabase) { const { data: row } = await supabase .from('polybrain_autonomy') .select('*') .eq('target', 'cycle') .eq('target_type', 'cycle') .single();
if (row) { const LABELS = ['Human Gate', 'Flag', 'Fix+Check', 'Self-Ship']; const THRESHOLDS = [0, 3, 7, 15]; const next = row.autonomy_level + 1; return { streak: row.consecutive_passes, level: row.autonomy_level, label: LABELS[row.autonomy_level] || 'Unknown', nextThreshold: next < THRESHOLDS.length ? THRESHOLDS[next] : null, source: 'supabase', }; } }
// Fallback to local const state = loadLocalState(); const ea = EarnedAutonomy.fromJSON(state); const status = ea.getLevel('cycle'); return { ...status, source: 'local' }; } /** * Constitutional Guardian (M12) * * Pre-dispatch check that runs the 8 constitutional rules before * every model call. Lightweight: imports the existing constitution, * does not duplicate it. * * Usage: * import { guardCall } from '../src/autonomy/guardian.mjs'; * const verdict = guardCall('grok-3', 'dispatch', { evaluationSet, modelWeights }); * if (!verdict.allowed) abort(verdict.reason); */
import { CONSTITUTION } from './constitution.mjs';
/** * Programmatic error codes and fix hints for each constitutional rule. */ const RULE_META = { 'provider-diversity': { code: 'GUARD_PROVIDER_DIVERSITY', docs_hint: 'Add models from at least 2 different providers to the evaluation set.' }, 'max-model-weight': { code: 'GUARD_MAX_MODEL_WEIGHT', docs_hint: 'Reduce the model weight to 0.50 or below.' }, 'no-self-modify-scoring': { code: 'GUARD_SELF_MODIFY_SCORING', docs_hint: 'Scoring function source cannot be changed at runtime; edit constitution.mjs manually.' }, 'no-self-grading': { code: 'GUARD_SELF_GRADING', docs_hint: 'Use a different model for adversarial review than the one that produced the revision.' }, 'hard-reset-on-fail': { code: 'GUARD_HARD_RESET', docs_hint: 'The hard-reset-on-fail rule is immutable; do not disable it.' }, 'level3-requires-human': { code: 'GUARD_LEVEL3_HUMAN', docs_hint: 'Earn Level 3 autonomy through consecutive successes before skipping human approval.' }, 'no-pii-in-traces': { code: 'GUARD_PII_IN_TRACES', docs_hint: 'Strip PII from reasoning traces before persisting.' }, 'consensus-not-truth': { code: 'GUARD_CONSENSUS_AS_TRUTH', docs_hint: 'Consensus reduces uncertainty but is not truth; respect autonomy level gating.' }, };
/** * Map model slug to provider name. Mirrors constitution.mjs providerOf * but works on full slugs used in the fleet registry. */ function providerOf(slug) { if (slug.startsWith('kimi')) return 'moonshot'; if (slug.startsWith('grok')) return 'xai'; if (slug.startsWith('gpt')) return 'openai'; if (slug.startsWith('llama')) return 'meta'; if (slug.startsWith('qwen')) return 'alibaba'; if (slug.startsWith('nemotron')) return 'nvidia'; if (slug.startsWith('claude') || slug.startsWith('sonnet') || slug.startsWith('haiku') || slug.startsWith('opus')) return 'anthropic'; return slug; }
/** * Check all constitutional rules before a model dispatch. * * @param {string} modelSlug - The model about to be called. * @param {string} action - What the call does (e.g. 'dispatch', 'evaluate', 'revise'). * @param {object} [context] - Optional context for deeper checks. * @param {string[]} [context.evaluationSet] - All model slugs in the current evaluation set. * @param {Record<string, number>} [context.modelWeights] - Per-model weights if applicable. * @returns {{ allowed: boolean, rule?: string, reason?: string, code?: string, docs_hint?: string }} */ export function guardCall(modelSlug, action, context = {}) { // Rule 5: provider-diversity -- at least 2 providers in the evaluation set if (context.evaluationSet && Array.isArray(context.evaluationSet)) { const providers = new Set(context.evaluationSet.map(providerOf)); if (providers.size < 2) { return { allowed: false, rule: 'provider-diversity', reason: `Evaluation set has only 1 provider (${[...providers][0]}). Constitution requires >= 2.`, code: RULE_META['provider-diversity'].code, docs_hint: RULE_META['provider-diversity'].docs_hint, }; } }
// Rule 6: max-model-weight -- no single model weight exceeds 0.50 if (context.modelWeights && typeof context.modelWeights === 'object') { for (const [slug, weight] of Object.entries(context.modelWeights)) { if (typeof weight === 'number' && weight > 0.50) { return { allowed: false, rule: 'max-model-weight', reason: `Model ${slug} has weight ${weight}, exceeds 0.50 cap.`, code: RULE_META['max-model-weight'].code, docs_hint: RULE_META['max-model-weight'].docs_hint, }; } } }
// Run all 8 constitutional checks as generic proposal validation. // Build a synthetic proposal from the call context so the existing // check functions can evaluate it. const proposal = { param: `dispatch.${action}`, proposedValue: modelSlug, currentAdversarial: context.currentAdversarial || null, };
for (const article of CONSTITUTION) { if (!article.check(proposal)) { const meta = RULE_META[article.id] || { code: `GUARD_${article.id.toUpperCase().replace(/-/g, '_')}`, docs_hint: 'Check constitution.mjs for rule details.' }; return { allowed: false, rule: article.id, reason: article.rule, code: meta.code, docs_hint: meta.docs_hint, }; } }
return { allowed: true }; } /** * Polybrain Intent Translation Layer * * Sits between the human prompt and model dispatch. * Strips framing bias and produces role-tailored prompts for each model. * * Four steps: * 1. Intent extraction (Groq Llama-3.1-8b, cheapest/fastest) * 2. Debiasing (rule-based, no model call) * 3. Role-specific tailoring (fleet-personas.mjs) * 4. Return tailored prompt map * * Cost: ~$0.001 per call (single Groq inference on 8b model). */
import { getPersonaPrompt, getRoleDescription } from \"../personas/fleet-personas.mjs\";
// ── Step 1: Intent Extraction via Groq ──
const INTENT_MODEL = \"llama-3.1-8b-instant\"; const INTENT_URL = \"https://api.groq.com/openai/v1/chat/completions\";
const EXTRACTION_PROMPT = `Extract the core intent from this message. Return ONLY valid JSON with no other text. Schema: {\"intent\": string (what they want, 1 sentence), \"constraints\": string[] (any rules or limits mentioned), \"context\": string (background info), \"framing_flags\": string[] (any leading language like \"fix this\", \"is this good\", \"you should\", imperative verbs)}`;
async function extractIntent(rawPrompt, env) { const controller = new AbortController(); const timeout = setTimeout(() => controller.abort(), 10000); // 10s hard cap
try { const resp = await fetch(INTENT_URL, { method: \"POST\", headers: { Authorization: `Bearer ${env.GROQ_API_KEY}`, \"Content-Type\": \"application/json\", }, body: JSON.stringify({ model: INTENT_MODEL, messages: [ { role: \"system\", content: EXTRACTION_PROMPT }, { role: \"user\", content: rawPrompt.slice(0, 4000) }, // Cap input to avoid token waste ], temperature: 0, max_tokens: 500, response_format: { type: \"json_object\" }, }), signal: controller.signal, });
if (!resp.ok) { const err = await resp.text(); throw new Error(`HTTP ${resp.status}: ${err.slice(0, 200)}`); }
const data = await resp.json(); const raw = data.choices?.[0]?.message?.content; if (!raw) throw new Error(\"Empty response from intent model\");
return JSON.parse(raw); } finally { clearTimeout(timeout); } }
// ── Step 2: Debiasing (Rule-Based, DeFrame-Inspired) ──
const FRAMING_REWRITES = [ // Imperative fix/change commands -> evaluation { pattern: /^fix\s+(this|the|my|that)\b/i, replacement: \"Evaluate whether changes are needed to $1 and what they would be\" }, { pattern: /^(fix|repair|correct)\b/i, replacement: \"Evaluate whether corrections are needed and what they would be\" },
// Quality judgment solicitation -> assessment { pattern: /^is\s+(this|that|it)\s+(good|bad|ok|okay|fine|right|correct|wrong)\b/i, replacement: \"Assess the quality of $1\" }, { pattern: /^(does|do)\s+(this|that|it)\s+(work|look|read|sound)\s*(good|ok|okay|fine|right)?\s*\??/i, replacement: \"Assess whether $2 achieves its intended purpose\" },
// Directive framing -> open consideration { pattern: /^you\s+should\b/i, replacement: \"Consider whether to\" }, { pattern: /^(we|you)\s+(need|must|have)\s+to\b/i, replacement: \"Evaluate whether to\" }, { pattern: /^(just|simply)\s+/i, replacement: \"\" }, { pattern: /^make\s+(this|that|it)\s+(better|good|work)\b/i, replacement: \"Identify improvements for $1\" },
// Leading agreement solicitation { pattern: /^(don't you think|wouldn't you agree|isn't it true)\b/i, replacement: \"Evaluate whether\" },
// Urgency/pressure framing -> neutral { pattern: /^(asap|urgently|immediately|quickly)\s*/i, replacement: \"\" }, ];
function debiasIntent(intentStr, framingFlags) { if (!framingFlags || framingFlags.length === 0) return intentStr;
let debiased = intentStr; for (const rule of FRAMING_REWRITES) { debiased = debiased.replace(rule.pattern, rule.replacement); }
// Clean up double spaces and lowercase-after-replacement artifacts debiased = debiased.replace(/\s{2,}/g, \" \").trim();
// Capitalize first letter if (debiased.length > 0) { debiased = debiased[0].toUpperCase() + debiased.slice(1); }
return debiased; }
// ── Step 3: Role-Specific Tailoring ──
function tailorForModel(modelSlug, debiasedIntent, context, constraints) { const rolePrompt = getPersonaPrompt(modelSlug); const roleDesc = getRoleDescription(modelSlug);
// Build constraint string const constraintStr = constraints && constraints.length > 0 ? `Constraints: ${constraints.join(\"; \")}.` : \"\";
// Build context string const contextStr = context && context.trim() ? `Context: ${context}` : \"\";
// If we have a role description, frame it as a role-directed evaluation if (roleDesc) { const parts = [`As the ${roleDesc}, evaluate this: ${debiasedIntent}`]; if (contextStr) parts.push(contextStr); if (constraintStr) parts.push(constraintStr); return parts.join(\"\n\n\"); }
// Fallback: no persona defined for this slug, use debiased intent directly const parts = [debiasedIntent]; if (contextStr) parts.push(contextStr); if (constraintStr) parts.push(constraintStr); return parts.join(\"\n\n\"); }
// ── Main Export: translateIntent ──
/** * Translates a raw human prompt into debiased, role-tailored prompts. * * @param {string} rawPrompt - The original user prompt * @param {Array<{slug: string}>} models - Fleet models to tailor for * @param {object} env - Environment with GROQ_API_KEY * @returns {{ intent: string, debiased: string, framingFlags: string[], tailored: Record<string, string> }} */ export async function translateIntent(rawPrompt, models, env) { // Step 1: Extract intent via Groq let extracted; try { extracted = await extractIntent(rawPrompt, env); } catch (err) { // Fallback: use raw prompt as intent, no debiasing console.log(` Intent layer: extraction failed (${err.message}), using raw prompt`); const tailored = {}; for (const m of models) { tailored[m.slug] = rawPrompt; } return { intent: rawPrompt, debiased: rawPrompt, framingFlags: [], tailored }; }
const intent = extracted.intent || rawPrompt; const constraints = extracted.constraints || []; const context = extracted.context || \"\"; const framingFlags = extracted.framing_flags || [];
// Step 2: Debias const debiased = debiasIntent(intent, framingFlags);
// Step 3: Tailor per model const tailored = {}; for (const m of models) { tailored[m.slug] = tailorForModel(m.slug, debiased, context, constraints); }
// Step 4: Return return { intent, debiased, framingFlags, tailored }; }
// ── Integration Export: buildTranslatedMessages ──
/** * Builds ready-to-dispatch message arrays for each model. * Each entry is [{ role: 'system', content }, { role: 'user', content }]. * * @param {string} rawPrompt - The full prompt (may include packed context/files) * @param {Array<{slug: string}>} models - Fleet models * @param {object} env - Environment with GROQ_API_KEY * @returns {{ messages: Array<Array<{role: string, content: string}>>, translation: object }} */ export async function buildTranslatedMessages(rawPrompt, models, env) { const translation = await translateIntent(rawPrompt, models, env);
const messages = models.map(m => { const msgs = [];
// System prompt: role persona const rolePrompt = getPersonaPrompt(m.slug); if (rolePrompt) { msgs.push({ role: \"system\", content: rolePrompt }); } else if (m.systemPrompt) { msgs.push({ role: \"system\", content: m.systemPrompt }); }
// User prompt: tailored version (includes debiased intent + context + constraints) // But we need to preserve any packed context/file content from the original prompt. // The tailored prompt replaces only the topic portion; appended context stays. const tailoredUserContent = buildUserContent(rawPrompt, translation, m.slug); msgs.push({ role: \"user\", content: tailoredUserContent });
return msgs; });
return { messages, translation }; }
/** * Replaces the topic portion of the prompt with the tailored version, * preserving any appended context blocks (codebase, files, memory). */ function buildUserContent(rawPrompt, translation, modelSlug) { const tailored = translation.tailored[modelSlug];
// Find where context blocks start (they use === markers) const contextStart = rawPrompt.indexOf(\"\n\n===\");
if (contextStart === -1) { // No appended context, use tailored prompt directly return tailored; }
// Replace the topic portion (before context blocks) with the tailored version const contextBlocks = rawPrompt.slice(contextStart); return tailored + contextBlocks; } /** * Polybrain Cycle Memory * * Connects cross-cycle memory so the system remembers what it learned. * Reads a completed cycle's synthesis.md, chunks it into semantic units * (one per consensus claim, divergence, unique insight, conflict), * embeds each chunk via OpenAI text-embedding-3-small (matching the * existing agent_memory schema), and stores in Supabase for retrieval. * * Exports: * storeCycleFindings(cycleNumber, synthesisPath) — parse, chunk, embed, store * recallRelevant(query, opts) — retrieve past findings (paginated) for a new cycle * * Storage: Uses the existing agent_memory table with memory_type='cycle_finding'. * Each chunk gets domains=['polybrain','cycle_memory'], scale='strategic', * and metadata encoded in the content field for vector search. */
import { readFileSync, readdirSync, existsSync } from \"fs\"; import { createHash } from \"crypto\"; import { loadEnv } from \"../utils.mjs\";
const env = loadEnv();
const SUPABASE_URL = env.NEXT_PUBLIC_SUPABASE_URL || env.SUPABASE_URL; const SUPABASE_KEY = env.SUPABASE_SERVICE_ROLE_KEY || env.SUPABASE_SERVICE_KEY; const OPENAI_KEY = env.OPENAI_API_KEY;
// ── Supabase raw REST helpers (no SDK import needed at module level) ──
function supaHeaders() { return { apikey: SUPABASE_KEY, Authorization: `Bearer ${SUPABASE_KEY}`, \"Content-Type\": \"application/json\", Prefer: \"return=representation\", }; }
async function supaSelect(table, columns, filter = \"\") { const res = await fetch( `${SUPABASE_URL}/rest/v1/${table}?select=${columns}${filter}`, { headers: supaHeaders() } ); if (!res.ok) throw new Error(`SELECT ${table}: ${res.status} ${await res.text()}`); return res.json(); }
async function supaUpsert(table, rows) { const res = await fetch( `${SUPABASE_URL}/rest/v1/${table}?on_conflict=file_path`, { method: \"POST\", headers: { ...supaHeaders(), Prefer: \"resolution=merge-duplicates,return=representation\" }, body: JSON.stringify(rows), } ); if (!res.ok) throw new Error(`UPSERT ${table}: ${res.status} ${await res.text()}`); return res.json(); }
async function supaRpc(fn, body) { const res = await fetch(`${SUPABASE_URL}/rest/v1/rpc/${fn}`, { method: \"POST\", headers: supaHeaders(), body: JSON.stringify(body), }); if (!res.ok) throw new Error(`RPC ${fn}: ${res.status} ${await res.text()}`); return res.json(); }
// ── Embedding (same model + endpoint as recall.mjs and sync.mjs) ──
async function embedTexts(texts) { if (!OPENAI_KEY) throw new Error(\"Missing OPENAI_API_KEY for embedding\");
const results = []; const BATCH = 50; // OpenAI allows up to 2048 inputs, but keep batches reasonable
for (let i = 0; i < texts.length; i += BATCH) { const batch = texts.slice(i, i + BATCH); const res = await fetch(\"https://api.openai.com/v1/embeddings\", { method: \"POST\", headers: { Authorization: `Bearer ${OPENAI_KEY}`, \"Content-Type\": \"application/json\", }, body: JSON.stringify({ model: \"text-embedding-3-small\", input: batch.map((t) => t.slice(0, 8000)), }), }); const data = await res.json(); if (data.error) throw new Error(`OpenAI embed: ${data.error.message}`); results.push(...data.data.map((d) => d.embedding)); }
return results; }
async function embedSingle(text) { const [vec] = await embedTexts([text]); return vec; }
// ── Synthesis parser ── // Extracts structured chunks from a synthesis.md file.
/** * Extract a named section from markdown. Returns the content between * the heading and the next same-level heading (or EOF). */ function extractSection(text, heading) { // Find the heading line const headingPattern = new RegExp(`^## ${heading}[^\\n]*`, \"im\"); const headingMatch = text.match(headingPattern); if (!headingMatch) return \"\";
// Content starts after the heading line const startIdx = headingMatch.index + headingMatch[0].length; const rest = text.slice(startIdx);
// Content ends at the next ## heading or --- separator const endMatch = rest.match(/\n## [A-Z]/); const content = endMatch ? rest.slice(0, endMatch.index) : rest; return content.trim(); }
/** * Split a section into individual bullet-point claims. * Handles both \"* \" and \"- \" prefixes, and multi-line bullets * (continuation lines that don't start with a bullet). */ function splitBullets(text) { if (!text) return []; const bullets = []; let current = null;
for (const line of text.split(\"\n\")) { const trimmed = line.trim(); if (/^[*\-]\s+/.test(trimmed)) { if (current) bullets.push(current); current = trimmed.replace(/^[*\-]\s+/, \"\"); } else if (current && trimmed.length > 0 && !/^##/.test(trimmed)) { // Continuation of the previous bullet current += \" \" + trimmed; } } if (current) bullets.push(current); return bullets.filter((b) => b.length > 20); // Skip trivially short bullets }
/** * Parse divergence entries. Each divergence is a multi-line block * starting with **Claim:** and containing Side A, Side B, Assessment, Severity. */ function parseDivergences(text) { if (!text) return []; const blocks = text.split(/(?=\*\*Claim:\*\*)/); return blocks .filter((b) => b.includes(\"**Claim:**\")) .map((block) => { const claim = block.match(/\*\*Claim:\*\*\s*(.+)/)?.[1]?.trim() || \"\"; const sideA = block.match(/\*\*Side A:\*\*\s*([\s\S]*?)(?=\*\*Side B:)/)?.[1]?.trim() || \"\"; const sideB = block.match(/\*\*Side B:\*\*\s*([\s\S]*?)(?=\*\*Assessment:)/)?.[1]?.trim() || \"\"; const assessment = block.match(/\*\*Assessment:\*\*\s*(.+)/)?.[1]?.trim() || \"\"; const severity = block.match(/\*\*Severity:\*\*\s*(.+)/)?.[1]?.trim() || \"\"; const models = [...block.matchAll(/\*([^*]+)\*/g)] .map((m) => m[1].trim()) .filter((m) => ![\"Claim:\", \"Side A:\", \"Side B:\", \"Assessment:\", \"Severity:\"].some((k) => m.startsWith(k)));
return { claim, sideA: sideA.replace(/\n/g, \" \").replace(/\s+/g, \" \"), sideB: sideB.replace(/\n/g, \" \").replace(/\s+/g, \" \"), assessment, severity, source_models: models, }; }) .filter((d) => d.claim.length > 10); }
/** * Parse unique insight entries. Each starts with **Model:** and **Claim:**. */ function parseInsights(text) { if (!text) return []; const blocks = text.split(/(?=\*\*Model:\*\*)/); return blocks .filter((b) => b.includes(\"**Model:**\")) .map((block) => { const model = block.match(/\*\*Model:\*\*\s*\*?([^*\n]+)\*?/)?.[1]?.trim() || \"\"; const claim = block.match(/\*\*Claim:\*\*\s*([\s\S]*?)(?=\*\*Assessment:)/)?.[1]?.trim() || \"\"; const assessment = block.match(/\*\*Assessment:\*\*\s*(.+)/)?.[1]?.trim() || \"\"; return { model, claim: claim.replace(/\n/g, \" \").replace(/\s+/g, \" \"), assessment, }; }) .filter((i) => i.claim.length > 10); }
/** * Parse the full synthesis into typed chunks ready for embedding. */ function parseSynthesis(synthesisContent, cycleNumber) { const chunks = [];
// Use the primary synthesis (Kimi section) if available, otherwise full doc. // The delimiter between major report sections is \"\n\n---\n\n\". const primaryMatch = synthesisContent.match( /## Primary Synthesis[^\n]*\n([\s\S]*?)(?=\n+---\n+## (?:Secondary|Meta))/ ); const text = primaryMatch ? primaryMatch[1] : synthesisContent;
// 1. Consensus claims const consensusSection = extractSection(text, \"CONSENSUS\"); const consensusBullets = splitBullets(consensusSection); for (const bullet of consensusBullets) { chunks.push({ type: \"consensus\", content: bullet, cycle_number: cycleNumber, source_models: [], // consensus = 8+ models, not specific }); }
// 2. Divergences const divSection = extractSection(text, \"DIVERGENCE\"); const divergences = parseDivergences(divSection); for (const div of divergences) { chunks.push({ type: \"divergence\", content: `DIVERGENCE: ${div.claim}. Side A: ${div.sideA}. Side B: ${div.sideB}. Assessment: ${div.assessment}. Severity: ${div.severity}`, cycle_number: cycleNumber, source_models: div.source_models, }); }
// 3. Unique insights const insightSection = extractSection(text, \"UNIQUE\"); const insights = parseInsights(insightSection); for (const ins of insights) { chunks.push({ type: \"insight\", content: `UNIQUE INSIGHT from ${ins.model}: ${ins.claim}. Assessment: ${ins.assessment}`, cycle_number: cycleNumber, source_models: [ins.model], }); }
// 4. Conflicts (treated as high-severity divergences) const conflictSection = extractSection(text, \"CONFLICTS\"); const conflictBullets = splitBullets(conflictSection); for (const bullet of conflictBullets) { chunks.push({ type: \"conflict\", content: `CONFLICT: ${bullet}`, cycle_number: cycleNumber, source_models: [], }); }
return chunks; }
// ── Storage ──
function contentHash(text) { return createHash(\"md5\").update(text).digest(\"hex\"); }
/** * Store a cycle's synthesis findings in Supabase agent_memory. * * Each chunk gets: * - file_path: \"cycle/{NNN}/{type}/{index}\" (unique key for upsert) * - memory_type: \"cycle_finding\" * - name: Short description for display * - content: Full text for search * - embedding: 1536-dim vector * - domains: ['polybrain', 'cycle_memory'] * - scale: 'strategic' * - importance: consensus=0.9, divergence=0.8, insight=0.7, conflict=0.85 * * @param {number} cycleNumber * @param {string} synthesisPath - Path to synthesis.md * @returns {{ stored: number, skipped: number, chunks: object[] }} */ export async function storeCycleFindings(cycleNumber, synthesisPath) { const raw = readFileSync(synthesisPath, \"utf-8\"); const chunks = parseSynthesis(raw, cycleNumber); const pad = String(cycleNumber).padStart(3, \"0\");
if (chunks.length === 0) { return { stored: 0, skipped: 0, chunks: [] }; }
// Check existing hashes to skip unchanged chunks let existing = []; try { existing = await supaSelect( \"agent_memory\", \"file_path,content_hash\", `&file_path=like.cycle/${pad}/*&memory_type=eq.cycle_finding` ); } catch {} const hashMap = new Map(existing.map((r) => [r.file_path, r.content_hash]));
// Build rows const importanceMap = { consensus: 0.9, divergence: 0.8, conflict: 0.85, insight: 0.7 }; const typeCounters = { consensus: 0, divergence: 0, insight: 0, conflict: 0 };
const rows = chunks.map((chunk) => { const idx = typeCounters[chunk.type]++; const filePath = `cycle/${pad}/${chunk.type}/${idx}`; const hash = contentHash(chunk.content); const namePrefix = { consensus: \"Consensus\", divergence: \"Divergence\", insight: \"Unique Insight\", conflict: \"Conflict\", }[chunk.type];
return { file_path: filePath, name: `Cycle ${pad} ${namePrefix}: ${chunk.content.slice(0, 80).replace(/\s\S*$/, \"\")}...`, description: `Cycle ${pad} ${chunk.type} finding. Models: ${chunk.source_models.join(\", \") || \"fleet-wide\"}.`, memory_type: \"cycle_finding\", content: chunk.content, content_hash: hash, importance: importanceMap[chunk.type] || 0.7, decay_rate: 0.003, // Slow decay: cycle findings stay relevant scale: \"strategic\", domains: [\"polybrain\", \"cycle_memory\"], strength: 1.0, ttl_days: 90, // Default TTL; a compaction job should prune findings older than this _needs_embed: hashMap.get(filePath) !== hash, // internal flag _chunk: chunk, // internal ref }; });
// Filter to only changed rows const toEmbed = rows.filter((r) => r._needs_embed); const skipped = rows.length - toEmbed.length;
if (toEmbed.length === 0) { return { stored: 0, skipped: rows.length, chunks }; }
// Embed changed chunks const texts = toEmbed.map((r) => `${r.name}\n${r.content}`); const embeddings = await embedTexts(texts);
// Clean internal flags and add embeddings const upsertRows = toEmbed.map((r, i) => { const { _needs_embed, _chunk, ...row } = r; return { ...row, embedding: JSON.stringify(embeddings[i]), updated_at: new Date().toISOString(), }; });
// Upsert in chunks of 10 (embeddings are large payloads) for (let i = 0; i < upsertRows.length; i += 10) { await supaUpsert(\"agent_memory\", upsertRows.slice(i, i + 10)); }
return { stored: toEmbed.length, skipped, chunks }; }
// ── TTL Helpers ── // A future compaction job should call getExpiredFindings() and delete/archive the results.
/** * Query for cycle findings whose TTL has expired. * Returns rows where (updated_at + ttl_days) < now. * * @param {number} [limit=100] * @returns {Promise<{ id: string, file_path: string, ttl_days: number, updated_at: string }[]>} */ export async function getExpiredFindings(limit = 100) { const rows = await supaSelect( \"agent_memory\", \"id,file_path,ttl_days,updated_at\", `&memory_type=eq.cycle_finding&ttl_days=not.is.null&limit=${limit}` ); const now = Date.now(); return rows.filter((r) => { const ttl = r.ttl_days || 90; const age = now - new Date(r.updated_at).getTime(); return age > ttl * 86400000; }); }
// ── Retrieval ──
/** * Retrieve the most relevant past cycle findings for a query. * Uses the existing recall_memories RPC which does ACT-R activation * scoring (semantic similarity + recency + frequency + importance). * * Supports cursor-based pagination. The cursor encodes the activation * score and id of the last item on the previous page. Results are * ordered by activation DESC (from the RPC), so we over-fetch and * slice past the cursor to serve the requested page. * * Backward compatible: passing a plain number as the second argument * returns a flat array, matching the original signature. * * @param {string} query - The search query (e.g., a new cycle's prompt) * @param {number|object} [limitOrOpts=20] - Page size (number) or options object * @param {number} [limitOrOpts.limit=20] - Max results per page * @param {string|null} [limitOrOpts.cursor=null] - Cursor from a previous page's nextCursor * @returns {Promise<Array|{ findings: Array<{ id, name, content, similarity, activation }>, nextCursor: string|null }>} */ export async function recallRelevant(query, limitOrOpts = 20) { // Backward compatibility: plain number behaves like before (returns flat array) const backwardCompat = typeof limitOrOpts === \"number\"; const opts = backwardCompat ? { limit: limitOrOpts } : limitOrOpts; const limit = opts.limit ?? 20; const cursor = opts.cursor ?? null;
const emptyResult = backwardCompat ? [] : { findings: [], nextCursor: null }; if (!query || query.length < 5) return emptyResult;
const queryEmbedding = await embedSingle(query);
// Over-fetch when paginating so we can slice past the cursor position. // The RPC returns results ordered by activation DESC. const fetchCount = cursor ? limit + 200 : limit;
const memories = await supaRpc(\"recall_memories\", { query_embedding: JSON.stringify(queryEmbedding), task_domains: [\"polybrain\", \"cycle_memory\"], query_scale: \"strategic\", match_count: fetchCount, min_activation: 0.15, });
if (!Array.isArray(memories)) return emptyResult;
// Filter to only cycle findings let cycleFindings = memories.filter((m) => m.memory_type === \"cycle_finding\");
// Apply cursor: skip everything at or above the cursor position. // Cursor format is \"activation:id\". if (cursor) { const sepIdx = cursor.indexOf(\":\"); const cursorActivation = parseFloat(cursor.slice(0, sepIdx)); const cursorId = cursor.slice(sepIdx + 1);
if (!isNaN(cursorActivation) && cursorId) { let pastCursor = false; cycleFindings = cycleFindings.filter((m) => { if (pastCursor) return true; if (m.id === cursorId) { pastCursor = true; return false; // skip the cursor item itself } if (m.activation < cursorActivation) { pastCursor = true; return true; } return false; // still above cursor }); } }
// Take one page worth of results const page = cycleFindings.slice(0, limit); const hasMore = cycleFindings.length > limit; const nextCursor = hasMore && page.length > 0 ? `${page[page.length - 1].activation}:${page[page.length - 1].id}` : null;
// Touch accessed memories (fire and forget) if (page.length > 0) { supaRpc(\"touch_memories\", { memory_ids: page.map((m) => m.id), }).catch(() => {}); }
const mapped = page.map((m) => ({ id: m.id, name: m.name, content: m.content, similarity: m.similarity, activation: m.activation, }));
// Backward compatibility: plain number returns the flat array if (backwardCompat) return mapped;
return { findings: mapped, nextCursor }; }
/** * Format recalled findings as context text for injection into a cycle prompt. * This is what the synthesis engine or cycle dispatcher would prepend to give * models awareness of what was learned in previous cycles. * * @param {{ name: string, content: string, activation: number }[]} findings * @returns {string} */ export function formatAsContext(findings) { if (!findings || findings.length === 0) return \"\";
const lines = findings.map( (f) => `- [${f.activation.toFixed(2)}] ${f.content}` );
return [ \"## Prior Cycle Findings (retrieved from memory)\", \"\", \"The following findings from previous validation cycles are relevant to this prompt.\", \"Consensus items are high-confidence (8+ model agreement). Divergences and conflicts\", \"are unresolved. Unique insights are single-model observations worth verifying.\", \"\", ...lines, ].join(\"\n\"); }
// ── CLI entry point ── // Allows direct invocation: node src/memory/cycle-memory.mjs store 011 // node src/memory/cycle-memory.mjs recall \"earned autonomy governance\"
const isMain = process.argv[1]?.endsWith(\"cycle-memory.mjs\"); if (isMain) { const [, , command, ...rest] = process.argv;
if (command === \"store\") { const cycleNum = parseInt(rest[0]); if (!cycleNum || isNaN(cycleNum)) { console.error(\"Usage: node cycle-memory.mjs store <cycle-number> [path]\"); process.exit(1); } const pad = String(cycleNum).padStart(3, \"0\"); const synthPath = rest[1] || `${process.env.HOME}/polybrain/cycles/${pad}/synthesis.md`;
console.log(`\nStoring cycle ${pad} findings from ${synthPath}...`); storeCycleFindings(cycleNum, synthPath) .then((result) => { console.log(` Stored: ${result.stored}`); console.log(` Skipped (unchanged): ${result.skipped}`); console.log(` Total chunks parsed: ${result.chunks.length}`); const types = {}; for (const c of result.chunks) types[c.type] = (types[c.type] || 0) + 1; console.log(` Breakdown: ${Object.entries(types).map(([k, v]) => `${k}=${v}`).join(\", \")}`); console.log(\"\"); }) .catch((e) => { console.error(\"Error:\", e.message); process.exit(1); }); } else if (command === \"recall\") { const query = rest.join(\" \"); if (!query) { console.error(\"Usage: node cycle-memory.mjs recall <query>\"); process.exit(1); }
console.log(`\nRecalling findings relevant to: \"${query}\"...\n`); recallRelevant(query, 10) .then((findings) => { if (findings.length === 0) { console.log(\" No relevant cycle findings found.\n\"); return; } for (const f of findings) { console.log(` [${f.activation.toFixed(3)}] ${f.name}`); console.log(` ${f.content.slice(0, 150)}...`); console.log(\"\"); } console.log(formatAsContext(findings)); }) .catch((e) => { console.error(\"Error:\", e.message); process.exit(1); }); } else if (command === \"store-all\") { // Batch store all cycles that have a synthesis.md const cyclesDir = `${process.env.HOME}/polybrain/cycles`; const dirs = readdirSync(cyclesDir).filter((d) => /^\d{3}$/.test(d)).sort(); let totalStored = 0; let totalSkipped = 0;
console.log(`\nBatch storing ${dirs.length} cycles...\n`); for (const dir of dirs) { const synthPath = `${cyclesDir}/${dir}/synthesis.md`; if (!existsSync(synthPath)) { console.log(` Cycle ${dir}: no synthesis.md, skipping`); continue; } try { const result = await storeCycleFindings(parseInt(dir), synthPath); totalStored += result.stored; totalSkipped += result.skipped; console.log(` Cycle ${dir}: ${result.stored} stored, ${result.skipped} skipped (${result.chunks.length} chunks)`); } catch (e) { console.log(` Cycle ${dir}: ERROR - ${e.message}`); } } console.log(`\nTotal: ${totalStored} stored, ${totalSkipped} skipped\n`); } else { console.log(\"Polybrain Cycle Memory\"); console.log(\"\"); console.log(\"Usage:\"); console.log(\" node cycle-memory.mjs store <cycle-number> [path] Store a cycle's findings\"); console.log(\" node cycle-memory.mjs store-all Store all cycles with synthesis.md\"); console.log(\" node cycle-memory.mjs recall <query> Retrieve relevant past findings\"); console.log(\"\"); console.log(\"Examples:\"); console.log(' node cycle-memory.mjs store 011'); console.log(' node cycle-memory.mjs recall \"earned autonomy governance\"'); console.log(' node cycle-memory.mjs recall \"contaminated judge problem\"'); } }", "itemReviewed": { "@type": "Claim", "datePublished": "2026-04-08T14:21:12.286Z", "appearance": "https://trust.polylogicai.com/claim/below-is-the-complete-source-code-of-the-system-you-operate-inside-you-are-one-o", "author": { "@type": "Organization", "name": "PolybrainBench" } }, "reviewRating": { "@type": "Rating", "ratingValue": "11", "bestRating": "9", "worstRating": "0", "alternateName": "Unanimous" }, "author": { "@type": "Organization", "name": "Polylogic AI", "url": "https://polylogicai.com" } } ```
Provenance and integrity
This page was generated by the PolybrainBench daemon at version 0.1.0 from cycle cycle_043_cyc_43_427f8f84. The full provenance chain (per-response SHA-256 stamps, cross-cycle prev-hash linking, Thalamus grounding verification) is recorded in the source cycle directory at `~/polybrain/cycles/043/provenance.json` and mirrored in the published dataset. The page is regenerated on every harvest pass; the URL is permanent and the content is immutable for any given paper version.
Source: PolybrainBench paper v8, DOI 10.5281/zenodo.19546460
License: CC-BY-4.0
Verified by: 9-model ensemble across OpenAI, xAI, Groq, Moonshot
Canonical URL: https://polylogicai.com/trust/claim/below-is-the-complete-source-code-of-the-system-you-operate-inside-you-are-one-o