Reputation - Phylax

What Reputation Is

Each miner has a reputation score in [0, 1] for each skill type they support. Reputation is the canonical signal that determines:

Group selection: primaries are picked top-down by per type reputation
Round aggregation: per task emissions are weighted by per type reputation

A miner with strong executable_python reputation but weak mcp_server reputation will dominate executable_python dispatches and lose mcp_server dispatches, exactly as intended.

Per Type vs Aggregate

There is no global reputation number. The network defines six skill types, and a miner holds one reputation per type it declared, so up to six in total.

Type	Reputation field
rag_knowledge	`reputation.rag_knowledge`
declarative	`reputation.declarative`
executable_python	`reputation.executable_python`
executable_script	`reputation.executable_script`
mcp_server	`reputation.mcp_server`
agent_composition	`reputation.agent_composition`

A miner that has not declared a type has no reputation row for that type and is never dispatched to it.

Initial Value

Newly registered miners start at reputation 0.5 for every type they declared. This is high enough to enter the auditor pool but low enough that they will not be picked as primary until they accumulate evidence of quality.

new_miner.reputation[type] = 0.5     # for each declared type

If a miner re-registers under a fresh hotkey, they restart at 0.5. The reputation accumulated under the old hotkey stays with that hotkey and does not carry over.
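As a minimal sketch of the initial state described above, the snippet below builds the reputation map for a newly registered miner. The names `SKILL_TYPES`, `INITIAL_REPUTATION`, `new_miner_reputation`, and `can_dispatch` are hypothetical and not part of any Phylax API; only the six type names and the 0.5 starting value come from this page.

```python
from typing import Dict, Iterable

# The six skill types listed in the table above.
SKILL_TYPES = (
    "rag_knowledge",
    "declarative",
    "executable_python",
    "executable_script",
    "mcp_server",
    "agent_composition",
)

INITIAL_REPUTATION = 0.5  # starting value for every declared type


def new_miner_reputation(declared_types: Iterable[str]) -> Dict[str, float]:
    """Return one reputation row per declared type (hypothetical helper)."""
    declared = list(declared_types)
    unknown = set(declared) - set(SKILL_TYPES)
    if unknown:
        raise ValueError(f"unknown skill types: {sorted(unknown)}")
    return {skill_type: INITIAL_REPUTATION for skill_type in declared}


def can_dispatch(reputation: Dict[str, float], skill_type: str) -> bool:
    """A miner with no row for a type is never dispatched to it."""
    return skill_type in reputation
```

Undeclared types are simply absent from the map rather than stored as zero, which mirrors the "no reputation row" wording above.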

Per Round Updates

Each round the validator records per task reputation deltas. Deltas are aggregated across validators within an epoch window (about 1 hour) and applied with damping:

canonical_rep[type] = 0.9 * canonical_rep[type] + 0.1 * epoch_mean_delta

The damping factor keeps reputation stable through one-off bad rounds while still letting trends shift over a few epochs.
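The damped formula above can be sketched as follows. This is an illustration only: it assumes validators' values are combined with a plain arithmetic mean and that an epoch with no reports leaves the reputation unchanged, neither of which this page specifies. It also does not address how per task deltas are scaled before they enter the formula. The names `KEEP`, `BLEND`, and `epoch_update` are hypothetical.

```python
from statistics import fmean
from typing import Sequence

KEEP = 0.9   # weight on the previous canonical value (from the formula above)
BLEND = 0.1  # weight on the epoch's aggregated value (from the formula above)


def epoch_update(canonical_rep: float, validator_values: Sequence[float]) -> float:
    """Apply one damped update for a single miner and skill type."""
    if not validator_values:
        # Assumption: no reports in the epoch means no change.
        return canonical_rep
    epoch_mean_delta = fmean(validator_values)  # assumed aggregation: plain mean
    return KEEP * canonical_rep + BLEND * epoch_mean_delta
```

With weights of 0.9 and 0.1, a single epoch moves the stored value one tenth of the way toward that epoch's aggregated value, which is why one bad epoch shifts reputation only slightly.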

Update Rules

Event	Delta
Consensus pass (consensus ≥ 0.7)	+0.02
Consensus weak (0.4 ≤ consensus < 0.7)	0.0
Consensus fail (consensus < 0.4)	× 0.95 (multiplicative)
Sandbox rerun pass	+0.02
Sandbox rerun fail (semantic mismatch)	× 0.7
Sandbox digest mismatch (synapse time)	× 0.5
SSSA validity violation (type mismatch, forbidden LLM use)	× 0.5
Probe verification fail	× 0.7
Submission missed deadline	no change
Collusion flag accumulated	× 0.6 per flag (3 flags eject)

The additive reward of +0.02 is small, while the multiplicative penalties are large. The asymmetry is intentional: reputation should be hard to build (many honest rounds) and easy to lose (a single dishonest round).
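To make the asymmetry concrete, here is a minimal sketch that encodes the table as a lookup of update functions. It assumes, for illustration only, that each rule is applied directly to a raw reputation value; the page does not spell out how these rules combine with the damped epoch update. The event keys and the helpers `consensus_event` and `apply_event` are hypothetical names. Ejection after three collusion flags is not modeled.

```python
from typing import Callable, Dict

# Event keys are hypothetical labels for the rows of the table above.
UPDATE_RULES: Dict[str, Callable[[float], float]] = {
    "consensus_pass": lambda rep: rep + 0.02,
    "consensus_weak": lambda rep: rep,
    "consensus_fail": lambda rep: rep * 0.95,
    "sandbox_rerun_pass": lambda rep: rep + 0.02,
    "sandbox_rerun_fail": lambda rep: rep * 0.7,
    "sandbox_digest_mismatch": lambda rep: rep * 0.5,
    "sssa_validity_violation": lambda rep: rep * 0.5,
    "probe_verification_fail": lambda rep: rep * 0.7,
    "missed_deadline": lambda rep: rep,
    "collusion_flag": lambda rep: rep * 0.6,
}


def consensus_event(consensus: float) -> str:
    """Map a consensus score to its table row using the 0.7 and 0.4 cutoffs."""
    if consensus >= 0.7:
        return "consensus_pass"
    if consensus >= 0.4:
        return "consensus_weak"
    return "consensus_fail"


def apply_event(reputation: float, event: str) -> float:
    """Apply one update rule to a raw (unclamped) reputation value."""
    return UPDATE_RULES[event](reputation)
```

Under that assumption, a miner at 0.5 who triggers one sandbox digest mismatch drops to 0.25. Recovering the lost 0.25 at +0.02 per pass takes 13 passing events.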

Clamping

reputation = max(0.05, min(1.0, raw_reputation))

The floor of 0.05 keeps a struggling miner in the candidate pool with a small chance of being picked as auditor through random sampling. Without the floor, a single bad streak would permanently lock a miner out of group selection.
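The effect of the floor is easiest to see under repeated penalties. The sketch below assumes the clamp is applied after every individual update, which this page does not state explicitly; `REP_FLOOR`, `REP_CEIL`, `clamp_reputation`, and `apply_penalties` are hypothetical names.

```python
from typing import List

REP_FLOOR = 0.05  # lower bound from the clamping formula above
REP_CEIL = 1.0    # upper bound from the clamping formula above


def clamp_reputation(raw_reputation: float) -> float:
    """Clamp a raw reputation into [0.05, 1.0]."""
    return max(REP_FLOOR, min(REP_CEIL, raw_reputation))


def apply_penalties(reputation: float, factor: float, times: int) -> List[float]:
    """Apply the same multiplicative penalty repeatedly, clamping each step."""
    trajectory = []
    for _ in range(times):
        reputation = clamp_reputation(reputation * factor)
        trajectory.append(reputation)
    return trajectory


# Four consecutive x 0.5 penalties starting from 0.5:
print(apply_penalties(0.5, 0.5, 4))  # [0.25, 0.125, 0.0625, 0.05]
```

The fourth penalty would take the raw value to 0.03125, and the clamp holds it at 0.05 instead.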

How Reputation Drives Group Selection

Step	Use of reputation
1. Build candidate pool	Filter out reputation < 0.20
2. Sort	Sort remaining miners by per type reputation descending
3. Pick primaries	Top 3 in the sorted list
4. Sample auditors	Random 2 from the rest (uniform, not reputation weighted)

Higher reputation means a higher probability of being picked as a primary, and therefore a higher emission ceiling. Auditor selection is uniformly random, so newcomers can earn dispatches even at the 0.5 starting reputation.
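The four steps in the table can be sketched as a single function for one skill type. This is an illustration, not the validator's code: `select_group` and its constants are hypothetical names, ties are broken by input order (the page does not define tie-breaking), and when fewer than two miners remain after the primaries the sketch simply returns fewer auditors.

```python
import random
from typing import Dict, List, Optional, Tuple

MIN_REPUTATION = 0.20  # step 1: candidates below this are filtered out
NUM_PRIMARIES = 3      # step 3: top of the sorted list
NUM_AUDITORS = 2       # step 4: uniform random sample from the rest


def select_group(
    reputation: Dict[int, float],
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[int]]:
    """Pick primaries and auditors from a {uid: per type reputation} map."""
    rng = rng or random.Random()
    # Step 1: build the candidate pool.
    pool = [uid for uid, rep in reputation.items() if rep >= MIN_REPUTATION]
    # Step 2: sort by reputation, highest first (stable, so ties keep input order).
    ranked = sorted(pool, key=lambda uid: reputation[uid], reverse=True)
    # Step 3: the top entries become primaries.
    primaries = ranked[:NUM_PRIMARIES]
    # Step 4: auditors are sampled uniformly, ignoring reputation.
    rest = ranked[NUM_PRIMARIES:]
    auditors = rng.sample(rest, min(NUM_AUDITORS, len(rest)))
    return primaries, auditors
```

Passing a seeded `random.Random` makes the auditor draw reproducible, which is convenient for tests; an unseeded instance is used otherwise.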

How Reputation Drives Round Aggregation

After the round:

round_score[uid] = sum(emission[task] * base_weight[task.type] * reputation[uid][task.type])
                 / sum(base_weight[task.type] * reputation[uid][task.type])

The reputation acts as a weighting factor in the weighted average. A miner who gets emissions across multiple types has their stronger types contribute more to their round score.
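As a sketch of the weighted average above for a single miner, the function below takes that miner's tasks as `(type, emission)` pairs. The name `round_score`, the input shapes, the base weight values in the example, and the choice to return 0.0 when the denominator is zero are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def round_score(
    tasks: List[Tuple[str, float]],
    base_weight: Dict[str, float],
    reputation: Dict[str, float],
) -> float:
    """Reputation-weighted average of per task emissions for one miner."""
    numerator = 0.0
    denominator = 0.0
    for task_type, emission in tasks:
        weight = base_weight[task_type] * reputation[task_type]
        numerator += emission * weight
        denominator += weight
    # Assumption: a miner with no weighted tasks scores 0.0 for the round.
    return numerator / denominator if denominator > 0 else 0.0
```

For example, with equal base weights of 1.0, a miner with reputation 0.9 in executable_python and 0.3 in mcp_server who earns emission 1.0 on one executable_python task and 0.0 on one mcp_server task scores 0.9 / 1.2 = 0.75, rather than the unweighted mean of 0.5.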

Refresh Cadence

Validators pull the current canonical reputation snapshot at the start of each round and cache it for the duration of the round. Reputation cannot drift between validators because the snapshot is pulled fresh each round.

Epoch Boundaries

At the end of each epoch (every 100 blocks, roughly 20 minutes on testnet):

Per epoch deltas from all validators are aggregated
The damped update is applied per miner per type
Per type tier baselines are recomputed (median Q over the epoch)
Novel tier thresholds are recomputed (top 5 median, smoothed, floored at 1.5 × baseline)
The new state is persisted

Validators pick up the new reputation and the new tier thresholds on the next round.

Visibility

Miners can fetch their own reputation through their miner UI / status endpoint. The response includes per type reputation, current rank in each type’s pool, and the count of rounds in the last 24 hours where the miner was dispatched. Aggregate per type leaderboards are exposed by the network operator to help miners see where they stand. There is no per task or per round private signal beyond what the miner can see from their own logs.

What’s Next

Verification Groups

How reputation translates into primary slots.

Consensus

The per round signal that drives most reputation updates.

Sandbox Reruns

The async signal that updates reputation between rounds.

Scoring

Per axis formulas that feed into the consensus.
