The Big Picture
Phylax rewards miners who produce honest, high-quality SSSAs on harder skill types. The full formula is multiplicative.
1. Composite Q
The base quality score, in [0, 1]. It is computed per task from four base axes (α detection, ε evidence, π policy, η efficiency) plus any type-specific axes (μ, σ, ψ, τ, χ, ρ) defined for the skill type. The four base axes apply to every skill type.

| Axis | Symbol | What it measures |
|---|---|---|
| Detection | α | Verdict matches ground truth. False negatives penalised 2.5x harder than false positives. |
| Evidence | ε | Proof of execution. Multiplicative gate: if ε < 0.10, Q is zero. |
| Policy | π | Recommended policy matches expected policy. F-β with β = 0.5 (precision weighted higher than recall). |
| Efficiency | η | Latency falls inside the timing window. Submissions faster than t_min or slower than the deadline are zeroed. |

| Type | Axis | What it measures |
|---|---|---|
| rag_knowledge | ρ | Injection recall against known hidden instructions. |
| declarative | μ | ML score agreement with the validator’s reference classifier. |
| executable_python | (none) | No extra axis. |
| executable_script | σ | Shell coverage. Static taint predictions vs runtime observations. |
| mcp_server | ψ + τ | Manifest integrity and tool poison recall. |
| agent_composition | χ | Transitive risk accuracy. |
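The gates described in the base-axis table can be illustrated with a minimal sketch. The F-β formula is the standard one; everything else is an assumption made for illustration: the function names are hypothetical, the efficiency axis is treated as a flat 1.0 inside the window (the table only says what happens outside it), and the way the axes combine into Q is not shown because it is not given here.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 weights precision above recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def efficiency_gate(latency_s: float, t_min_s: float, deadline_s: float) -> float:
    """0.0 outside the timing window; 1.0 inside (flat value is an assumption)."""
    return 1.0 if t_min_s <= latency_s <= deadline_s else 0.0


def apply_evidence_gate(q: float, epsilon: float, floor: float = 0.10) -> float:
    """Zero the composite score when the evidence axis is below the gate."""
    return 0.0 if epsilon < floor else q
```

With β = 0.5, `f_beta(1.0, 0.5)` is about 0.833 while `f_beta(0.5, 1.0)` is about 0.556, which shows why an over-broad policy recommendation costs more than a slightly incomplete one.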
2. Base Weight
The base weight reflects the analysis difficulty and threat value of the skill type.

| Skill type | Base weight |
|---|---|
| rag_knowledge | 0.5 |
| declarative | 0.7 |
| executable_python | 1.0 |
| executable_script | 1.2 |
| mcp_server | 1.6 |
| agent_composition | 2.0 |
A perfect submission on agent_composition is worth four times a perfect submission on rag_knowledge (2.0 vs 0.5). Miners choose which types to specialise in; harder types pay more.
3. Tier Multiplier
Tiers reflect how deep the miner’s implementation is relative to the reference harness.

| Tier | Multiplier | Meaning |
|---|---|---|
| Below reference | 0.5 | Q below the per type baseline. |
| Tier 1 reference | 1.0 | Q at the reference level. Default for miners running the published harness unmodified. |
| Tier 2 optimised | 1.4 | Q meaningfully above reference. Custom detection rules, deeper analysis, additional trace files. |
| Tier 3 novel | 2.0 | Q at or above the dynamic novel threshold. Proprietary sandboxing, novel ML classifiers, kernel level tracing. |
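The tier table maps a Q value to a multiplier. The sketch below shows one way that lookup could work. The three threshold parameters are hypothetical: the per-type baseline, the cut-off for "meaningfully above reference", and the dynamic novel threshold are not specified here, so they are passed in rather than hard-coded.

```python
def tier_multiplier(
    q: float,
    baseline: float,             # per-type reference baseline (value assumed)
    optimised_threshold: float,  # "meaningfully above reference" (value assumed)
    novel_threshold: float,      # dynamic novel threshold (value assumed)
) -> float:
    """Map a composite Q to the tier multiplier from the table above."""
    if not baseline <= optimised_threshold <= novel_threshold:
        raise ValueError("thresholds must be in ascending order")
    if q >= novel_threshold:
        return 2.0  # Tier 3 novel
    if q >= optimised_threshold:
        return 1.4  # Tier 2 optimised
    if q >= baseline:
        return 1.0  # Tier 1 reference
    return 0.5      # Below reference
```

Checking the highest tier first keeps the function correct even when two thresholds coincide.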
4. Bootstrap Bonus
For the first 30 epochs after launch, the harder runtime types get a temporary boost to incentivise early adoption.

| Type | Bonus to base weight |
|---|---|
| mcp_server | +0.5 (becomes 2.1 effective) |
| agent_composition | +0.5 (becomes 2.5 effective) |
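The base weights and the bootstrap bonus combine into an effective base weight. A minimal sketch follows, using the values from the two tables. The function name is hypothetical, and counting epochs from zero (so that epochs 0 to 29 receive the bonus) is an assumption, since the epoch indexing is not stated.

```python
BASE_WEIGHT = {
    "rag_knowledge": 0.5,
    "declarative": 0.7,
    "executable_python": 1.0,
    "executable_script": 1.2,
    "mcp_server": 1.6,
    "agent_composition": 2.0,
}

# Only the two harder runtime types receive the temporary boost.
BOOTSTRAP_BONUS = {"mcp_server": 0.5, "agent_composition": 0.5}
BOOTSTRAP_EPOCHS = 30


def effective_base_weight(skill_type: str, epochs_since_launch: int) -> float:
    """Base weight plus the bootstrap bonus while the bonus window is open."""
    weight = BASE_WEIGHT[skill_type]
    if 0 <= epochs_since_launch < BOOTSTRAP_EPOCHS:  # zero-based epochs assumed
        weight += BOOTSTRAP_BONUS.get(skill_type, 0.0)
    return weight
```

For example, `effective_base_weight("mcp_server", 0)` gives 2.1, and the same call at epoch 30 falls back to 1.6.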
5. Early Submission Bonus
Computed against the miner’s role-specific timing window.
6. Role Multiplier
Validators dispatch every task to a five-miner verification group: three primaries plus two auditors.

| Role | Multiplier | What they submit |
|---|---|---|
| Primary | 1.0 | Full SSSA + trace_bundle + sandbox_manifest + probe_evidence. Full timing window. |
| Auditor | 0.6 | Full SSSA + probe_evidence. No trace_bundle, no sandbox_manifest. Tighter window. |
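Because the overall formula is described as multiplicative, the effect of the role multiplier can be shown with a partial product. This is only a sketch under stated assumptions: it multiplies Q, base weight, tier multiplier, and role multiplier, and deliberately leaves out the early submission bonus and the consensus score, whose exact form is not given here. The function name `partial_emission` is hypothetical.

```python
ROLE_MULTIPLIER = {"primary": 1.0, "auditor": 0.6}


def partial_emission(q: float, base_weight: float, tier: float, role: str) -> float:
    """Product of the four multipliers whose values are tabulated above.

    Omits the early submission bonus and consensus score (form not specified).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("Q must lie in [0, 1]")
    return q * base_weight * tier * ROLE_MULTIPLIER[role]
```

Under these assumptions, a perfect Tier 2 submission on mcp_server outside the bootstrap window yields roughly 2.24 for a primary (1.0 × 1.6 × 1.4 × 1.0) and roughly 1.344 for an auditor.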
7. Consensus Score
For each task with at least three valid responses in the group, the validator computes a per-miner consensus score in [0, 1] by comparing each miner’s SSSA against the rest of the group. The consensus score weights seven components:

| Component | Weight |
|---|---|
| Findings recall (canonical key matching) | 0.30 |
| Findings precision | 0.15 |
| Verdict agreement | 0.15 |
| Capabilities agreement | 0.15 |
| Risk score agreement | 0.10 |
| Dependencies agreement (CVE intersection) | 0.10 |
| Policy derivation alignment | 0.05 |
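The seven weights sum to 1.0, so one natural reading is a weighted sum of per-component scores. The sketch below assumes exactly that, with each component already scored in [0, 1]. The dictionary keys and the function name are hypothetical labels, and how each component is measured is not covered.

```python
CONSENSUS_WEIGHTS = {
    "findings_recall": 0.30,
    "findings_precision": 0.15,
    "verdict_agreement": 0.15,
    "capabilities_agreement": 0.15,
    "risk_score_agreement": 0.10,
    "dependencies_agreement": 0.10,
    "policy_alignment": 0.05,
}


def consensus_score(components: dict) -> float:
    """Weighted sum of the seven components (linear combination assumed)."""
    missing = CONSENSUS_WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in CONSENSUS_WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return sum(w * components[name] for name, w in CONSENSUS_WEIGHTS.items())
```

Under this reading, a miner who matches the group on the verdict alone and scores zero elsewhere ends up at 0.15, which is consistent with the point that verdict copying alone fails consensus.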
Round Aggregation
Per-task emission scores are aggregated into a per-miner round score, then blended into the running on-chain score vector.
Anti-Gaming Summary
| Attack | Why it fails |
|---|---|
| Always return ALLOW | α loses on BLOCK tasks. Consensus pulls miner away from group. |
| Always return BLOCK | Known good tasks drop α. Over restrictive policy tanks π. |
| Skip the sandbox | ε = 0. Canary write absent. Probe events missing. |
| Cache verdicts | Each dispatch has a unique nonce. The canary and probe values change per dispatch. |
| Submit before t_min | η = 0. |
| Submit after deadline | Discarded entirely. No score and no reputation update. |
| SSSA type mismatch | Treated as invalid. Reputation flagged as violation. |
| Forbidden LLM use | Triggers violation. Reputation × 0.5. |
| Run a different sandbox than registered | Synapse digest mismatch fails ε immediately. Async rerun also catches it. |
| Copy another miner’s SSSA | Verdict copying alone fails consensus. Findings, capabilities, dependencies diverge. |
| Coordinate with other miners (collusion) | Agreement with primaries vs random auditors is tracked over 30 rounds. Persistent gap accumulates collusion flags. |
| Fabricate trace_bundle hashes | Validator decompresses each trace file, normalises, and recomputes hash. Mismatch fails. |
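The trace_bundle check in the last row (decompress, normalise, recompute the hash) could look roughly like the sketch below. The specifics are assumptions chosen for illustration only: gzip compression, SHA-256 hashing, and a normalisation step that unifies line endings and strips trailing whitespace. The actual codec, hash function, and normalisation rules are not stated here, and both function names are hypothetical.

```python
import gzip
import hashlib
import zlib


def normalise(raw: bytes) -> bytes:
    """Unify line endings and strip trailing whitespace (assumed rules)."""
    lines = raw.replace(b"\r\n", b"\n").split(b"\n")
    return b"\n".join(line.rstrip() for line in lines).strip(b"\n") + b"\n"


def verify_trace_file(compressed: bytes, claimed_hex: str) -> bool:
    """Recompute the hash of a trace file and compare it with the claimed one."""
    try:
        raw = gzip.decompress(compressed)  # gzip is an assumption
    except (OSError, EOFError, zlib.error):
        return False  # an undecodable file fails verification
    actual = hashlib.sha256(normalise(raw)).hexdigest()  # SHA-256 is an assumption
    return actual == claimed_hex.lower()
```

Hashing the normalised content rather than the compressed bytes means a miner cannot pass the check by shipping an arbitrary blob alongside a hash of something else.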
What’s Next
Miner Setup Guide
Choose your skill types and start mining.
Validator Setup Guide
Run a validator and dispatch tasks.
Scoring Reference
Full formulas for every axis and every term in the emission formula.
Consensus Detail
How the seven component consensus score is computed.