GENYS

How GENYS Works

GENYS records probabilistic claims from humans, AI models, markets, and operational systems, resolves them against real outcomes, and measures calibration over time.

The Core Loop

Claim → Probability → Governance → Outcome → Calibration

01

Claim

A probabilistic prediction is recorded with a specific probability, time horizon, and success criteria. The user must commit a number.

02

Probability

The claim is anchored by the user's estimate. External signals from AI models, prediction markets, or operational baselines are ingested alongside it.

03

Governance

Rules evaluate the probability against thresholds. Decisions below risk floors are flagged. Decisions above scaling thresholds are cleared for execution.

04

Outcome

When the time horizon is reached, the decision resolves against reality. The outcome is locked and immutable. Overdue resolutions degrade the user's calibration score.

05

Calibration

Prediction error is computed, and the Brier score, directional bias, and confidence-bucket accuracy are updated. The system learns where you are right and where you are wrong.

What Counts as a Claim

Any prediction with a probability is a claim

Human estimates

"I believe this campaign will convert at 15% or higher" → 65%

AI model outputs

GPT-4o predicts a 72% likelihood. Claude predicts 61%. Both are recorded.

Market prices

Polymarket prices ETH breaking $4k at 68%. Ingested as a reference signal.

Operational forecasts

"Shopify App Store will drive 40%+ of signups" → treated as a probabilistic claim, not a plan.

Resolution Discipline

Every prediction must resolve

When a decision's time horizon is reached, it must be resolved against reality. Did it happen or not?

Outcomes are locked on resolution. A database trigger prevents mutation of the outcome, probability, or resolution source after the fact.

Unresolved decisions past their time horizon are flagged as overdue and incur a calibration penalty equivalent to a bad prediction. You cannot game calibration by selectively avoiding resolution.

Resolution sources are tracked: user, external, polymarket_auto, or system_overdue.
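The overdue rule can be sketched as follows. The resolution-source labels come from the text above; the dict schema and the exact penalty convention (scoring an overdue claim as if the outcome went against it) are assumptions for illustration.

```python
from datetime import datetime, timezone

# Source labels tracked on resolution, per the text above.
RESOLUTION_SOURCES = {"user", "external", "polymarket_auto", "system_overdue"}

def brier_term(probability: float, outcome: bool) -> float:
    return (probability - float(outcome)) ** 2

def resolve_or_penalize(claim: dict, now: datetime) -> dict:
    """Past-horizon claims are auto-resolved with source 'system_overdue'."""
    if claim.get("resolved"):
        return claim  # locked; nothing to do
    if now > claim["horizon"]:
        # Penalty equivalent to a bad prediction: score the claim as if
        # the outcome went against the stated probability (assumed convention).
        worst_case = claim["probability"] < 0.5
        claim.update(resolved=True, source="system_overdue",
                     brier=brier_term(claim["probability"], worst_case))
    return claim

claim = {"probability": 0.8,
         "horizon": datetime(2025, 1, 1, tzinfo=timezone.utc)}
resolve_or_penalize(claim, datetime(2025, 2, 1, tzinfo=timezone.utc))
# claim["brier"] is now ~0.64: the worst-case error for a 0.8 prediction
```

Because the penalty is the worst-case score for the stated probability, skipping resolution is always at least as costly as being wrong, which is what makes selective avoidance pointless.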

Calibration

What calibration means

Brier Score

Mean squared error of predicted probabilities versus actual outcomes. 0 is perfect. 1 is worst. Lower is better.

Directional Bias

Do you overestimate or underestimate? Positive = overconfident. Negative = underconfident.

Confidence Buckets

"When you say 60–70%, outcomes occur 52% of the time." 10 buckets from 0–100%.

Category Breakdown

Separate calibration per domain: crypto, pricing, political, technology. See where you are accurate and where you are not.
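The three numeric metrics above can be computed in a few lines. A toy sketch over an invented set of resolved claims (probability, outcome) pairs; the bucket count of 10 matches the text, everything else is illustrative.

```python
# Invented resolved claims: (stated probability, actual outcome).
resolved = [
    (0.65, True), (0.62, False), (0.68, True), (0.90, True),
    (0.30, False), (0.66, False), (0.10, False), (0.64, True),
]
n = len(resolved)

# Brier score: mean squared error of probabilities vs outcomes.
brier = sum((p - o) ** 2 for p, o in resolved) / n

# Directional bias: mean(prediction - outcome).
# Positive = overconfident, negative = underconfident.
bias = sum(p - o for p, o in resolved) / n

# Confidence buckets: 10 buckets of width 10%; compare stated
# confidence to the observed hit rate inside each bucket.
buckets: dict[int, list[bool]] = {}
for p, o in resolved:
    buckets.setdefault(min(int(p * 10), 9), []).append(bool(o))

for b in sorted(buckets):
    hits = buckets[b]
    rate = sum(hits) / len(hits)
    print(f"{b * 10}-{b * 10 + 10}%: observed {rate:.0%} over {len(hits)} claims")
```

In this toy data the 60–70% bucket resolves true 3 times out of 5, the kind of gap between stated and observed frequency that the bucket readout surfaces. A per-category breakdown is the same computation filtered by domain.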

Why This Matters

Most systems generate probabilities but never check if they were right

AI models produce confidence scores. Markets produce prices. Teams produce forecasts. Almost none of these are tracked against outcomes over time. Without structured resolution and scoring, there is no learning. GENYS creates a persistent forecasting record that compounds accuracy through feedback.