Reliability, Coverage & Method

1. Purpose

This note explains what the public dataset is, how to interpret its reliability metrics, and how coverage should be read.

It focuses on the parts that matter to a technical buyer: product scope, coverage, market surface, and reliability signals.

2. What the product is

Vector Neutral publishes one official daily Excel workbook for the active UTC+0 date.

The product is not a pick sheet, not a betting advisory product, and not an automated execution layer. It is a structured pre-match probability surface intended for independent analysis, modeling, and downstream system building.

3. Coverage model

Coverage is not defined by a raw upstream feed alone. The public universe starts from the daily pre-match fixture intake and then passes through product-date filtering in UTC+0, semantic deduplication, probabilistic consistency checks, and publication quality control.

Reference coverage: the cleaned reference coverage file currently contains 183 geography or competition-group rows and 643 unique league or competition references across domestic, regional, and international groups.

Important boundary: this reference describes the upstream coverage universe, not a publication guarantee. The public dataset includes only matches that pass the product-date filter, deduplication, probabilistic consistency checks, and publication QC before release.
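As an illustration of the first two steps in this section, the sketch below applies a UTC+0 product-date filter and then a naive duplicate check. It is not the publisher's pipeline: the fixture fields (home, away, kickoff), the function names, and the lowercase-and-trim duplicate key are all assumptions made for the example, and real semantic deduplication is likely more involved.

```python
from datetime import date, datetime, timedelta, timezone


def product_date_utc(kickoff: datetime) -> date:
    """Return the UTC+0 calendar date of a timezone-aware kickoff time."""
    if kickoff.tzinfo is None:
        raise ValueError("kickoff must be timezone-aware")
    return kickoff.astimezone(timezone.utc).date()


def filter_and_dedupe(fixtures, active_date):
    """Keep fixtures on the active UTC+0 date, dropping naive duplicates.

    The duplicate key (normalized team names plus UTC date) is a
    hypothetical stand-in for semantic deduplication.
    """
    seen = set()
    kept = []
    for fx in fixtures:
        fx_date = product_date_utc(fx["kickoff"])
        if fx_date != active_date:
            continue  # outside the product date in UTC+0
        key = (fx["home"].strip().lower(), fx["away"].strip().lower(), fx_date)
        if key in seen:
            continue  # same match seen under a differently formatted name
        seen.add(key)
        kept.append(fx)
    return kept


# Hypothetical fixtures, for illustration only.
fixtures = [
    {"home": "Alpha FC", "away": "Beta United",
     "kickoff": datetime(2026, 4, 25, 19, 0, tzinfo=timezone.utc)},
    # Same match with different casing and trailing whitespace.
    {"home": "alpha fc ", "away": "Beta United",
     "kickoff": datetime(2026, 4, 25, 19, 0, tzinfo=timezone.utc)},
    # 22:30 local at UTC-3 is 01:30 on 2026-04-26 in UTC+0, so it is excluded.
    {"home": "Gamma", "away": "Delta",
     "kickoff": datetime(2026, 4, 25, 22, 30,
                         tzinfo=timezone(timedelta(hours=-3)))},
]

kept = filter_and_dedupe(fixtures, date(2026, 4, 25))
print(len(kept), [fx["home"] for fx in kept])
```

The third fixture shows why the date boundary matters: a match that kicks off late in the evening local time can belong to the next product date once converted to UTC+0.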

4. Current public reliability snapshot

Latest audited reliability snapshot: 2026-04-25. Counts are shown as operational references, not perpetual guarantees.

370,998 matches in the audited probability snapshot

2,135 incoming matches on the audited intake date

1,091 publishable matches after semantic deduplication

This section tracks the latest public reliability evidence. It is separate from the commercial active date shown at checkout, which controls the workbook a buyer receives.
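For readers who want to relate the intake and publishable counts, the short calculation below derives the retained share. It assumes both counts refer to the same audited intake date, which the snapshot suggests but does not state outright; the variable names are illustrative.

```python
# Counts taken from the audited snapshot in this section.
incoming = 2135      # incoming matches on the audited intake date
publishable = 1091   # publishable matches after semantic deduplication

removed = incoming - publishable
retained_share = publishable / incoming

print(f"removed before publication: {removed}")
print(f"retained share: {retained_share:.1%}")
```

Under that assumption, roughly half of the incoming matches reach publication, which is consistent with reading the coverage reference as an upstream universe rather than a publication guarantee.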

5. Public market surface

The public workbook is designed around a coherent market surface per match.

1X2: Home, draw, or away result.

DC: Double Chance, two result outcomes grouped together.

OU: Over/Under totals by line.

BTS: Both Teams to Score.

AH: Asian Handicap by line.

Binary Bundle: Curated yes/no market combinations.

Visible scope: 1X2, DC, OU, BTS, AH, Binary Bundle
Core families: 1X2, DC, BTS
Complementary families: OU, AH
Curated layer: Binary Bundle is a curated public layer, not an exhaustive combinatorial universe

6. How probabilities are synthesized

At a high level, the public probabilities are produced through four stages:

pre-match probabilistic inference
calibration
structural synthesis across related market families
coherence checks before publication

This means the workbook should be read as a unified probabilistic surface by fixture, not as a pile of unrelated scores.
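A buyer can test the "unified surface" reading independently. The sketch below checks three identities that follow from the market definitions themselves: 1X2 probabilities sum to 1, each Double Chance probability equals the sum of its two 1X2 outcomes, and the BTS yes/no pair sums to 1. The dictionary keys, the tolerance, and the function name are assumptions for illustration; they are not the workbook's column names and not the publisher's internal checks.

```python
import math


def check_surface(p, tol=1e-6):
    """Return a list of coherence problems found in one match's probabilities."""
    problems = []

    # 1X2 outcomes are mutually exclusive and exhaustive.
    if not math.isclose(p["1"] + p["X"] + p["2"], 1.0, abs_tol=tol):
        problems.append("1X2 does not sum to 1")

    # Each Double Chance market groups two 1X2 outcomes.
    expected_dc = {
        "1X": p["1"] + p["X"],
        "12": p["1"] + p["2"],
        "X2": p["X"] + p["2"],
    }
    for market, expected in expected_dc.items():
        if not math.isclose(p[market], expected, abs_tol=tol):
            problems.append(f"DC {market} inconsistent with 1X2")

    # Both Teams to Score is a yes/no pair.
    if not math.isclose(p["BTS_yes"] + p["BTS_no"], 1.0, abs_tol=tol):
        problems.append("BTS yes/no does not sum to 1")

    return problems


# Hypothetical probabilities for one fixture.
coherent = {
    "1": 0.45, "X": 0.28, "2": 0.27,
    "1X": 0.73, "12": 0.72, "X2": 0.55,
    "BTS_yes": 0.52, "BTS_no": 0.48,
}
broken = dict(coherent, **{"1X": 0.80})

print(check_surface(coherent))
print(check_surface(broken))
```

Line-based families such as OU and AH admit similar checks (for example, over and under on the same line summing to 1 when no push is possible), but those depend on how lines are represented, so they are left out of this sketch.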

7. Why Brier, Log-loss, and ECE are shown together

Each metric captures a different failure mode:

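Brier Score: mean squared difference between predicted probabilities and realized outcomes
Log-loss: negative log of the probability assigned to the realized outcome, so confident misses are penalized heavily
ECE (Expected Calibration Error): average gap between predicted confidence and realized frequency across confidence bins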

Lower is better in all three. Showing the three together is more informative than showing only one, because a model can look acceptable on one metric while still being poorly calibrated or overconfident.
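To make the three metrics concrete, here is a minimal sketch of their standard definitions for binary (yes/no) events. It is not the publisher's evaluation code: the equal-width binning, the choice of 10 bins, the clipping constant, and the sample data are all assumptions, and multi-outcome markets such as 1X2 would need the multiclass forms.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted average gap between mean prediction and observed frequency per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(mean_pred - observed)
    return ece


# Hypothetical predictions and realized outcomes, for illustration only.
probs = [0.9, 0.8, 0.7, 0.3, 0.2]
outcomes = [1, 1, 0, 0, 0]

print("Brier:", brier_score(probs, outcomes))
print("Log-loss:", log_loss(probs, outcomes))
print("ECE:", expected_calibration_error(probs, outcomes))
```

With very small samples the ECE value is dominated by sparsely populated bins, which is one reason it is more informative alongside Brier Score and Log-loss than on its own.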

8. What the metrics do not mean

They do not imply guaranteed profitability.
They do not imply guaranteed downstream edge.
They do not imply guaranteed future performance.
They do not imply guaranteed suitability for a specific staking or trading method.

They are validation signals about probability quality, not commercial promises.

9. Publication discipline

One official workbook per UTC+0 date
All buyers of that date receive the same official edition
The public workbook excludes internal technical metadata
The public workbook is separated from internal control artifacts

10. Method boundary

This public note explains product behavior, coverage, market definitions, and reliability signals.

It does not document implementation internals, operational controls, or proprietary decision rules.