Seller Metrics Engine
YAML-declared metrics and dimensions, executed as a planner/fetcher/joiner/aggregator pipeline over rpt_* report tables and dynamic_attributes_values. Surfaced to the LLM via metrics-engine-local-claude-code (and prod) MCP servers.
Where things live
| Path | Purpose |
|---|---|
mcp_server/seller_metrics_engine/definitions/metrics.yaml | Every metric — source/formula/grain/category |
mcp_server/seller_metrics_engine/definitions/dimensions.yaml | Every dimension — table_column_map, optional classification or enricher binding |
mcp_server/seller_metrics_engine/definitions/__init__.py | YAML loader — loaded once at import |
mcp_server/seller_metrics_engine/tools/query.py | MCP tool entrypoints (query_metrics, compare_periods) |
app/services/seller_metrics_engine/engine/ | Resolver, planner, joiner, aggregator, executor |
app/services/seller_metrics_engine/engine/enricher/ | Stage-typed enricher framework — see Enrichers |
app/services/seller_metrics_engine/strategies/ | report_table.py (rpt_* fetch) + dynamic_attribute.py (EAV pivot) |
Pipeline at a glance
Every stage transition triggers run_enrichers_at(stage, ctx) — see Enrichers for hooks.
Required-metric validation
A universal pipeline step (between post_join enrichers and the internal-dim collapse). It scans resolved.all_metrics for any with required: true and checks the joined DataFrame for null rows in the metric's column. If any are present, it raises RequiredMetricMissingError naming the affected entities (ASIN / SKU / parent ASIN — whichever ID column the DF carries).
The step runs at source-row grain — before the collapse step drops the asin column — so the error can identify which specific entities are missing data.
# In the executor pipeline, after post_join enrichers:
self._validate_required_metrics(ctx.df, resolved)
The MCP layer maps the exception to a structured required_metric_missing error code with metric + missing_entities fields, so the LLM can tell the user precisely what needs to be configured (COGS uploaded for those ASINs, MB commission configured, etc.) instead of returning a silent null cascade through margin formulas.
To make a metric required, add required: true to its YAML entry. No registration step — the validator is part of the engine, not an enricher.
Defining a metric
Two flavors:
L1 — sourced directly from a column
- key: pnl_sales
display_name: 'PNL: Sales'
description: Gross sales totals from SKU Economics.
sources:
- alias: pnl_sales
table: rpt_pnl_sku_economics
column: sales
metric_refs: null
formula: null
supported_granularities: [weekly, monthly, quarterly]
aggregation_rule: SUM
required: true
format: $
category: PNL
hierarchy_level: L1
L1 — SQL expression (computed per source row)
For per-row arithmetic that must aggregate as SUM(a*b) rather than SUM(a)*SUM(b):
sources:
- alias: pnl_landed_cogs_amt
table: rpt_pnl_sku_economics
expr: cost_of_goods_sold_per_unit * units_sold
Engine emits SUM(cost_of_goods_sold_per_unit * units_sold) in the SELECT, so per-row arithmetic is preserved before aggregation. YAML is trusted — same blast radius as Python formula: strings.
L2/L3 — computed from other metrics
- key: pnl_pre_ad_margin
sources: null
metric_refs: [pnl_sales, pnl_landed_cogs_amt, pnl_referral_fee_amt, pnl_fba_fee, pnl_mb_commission_amt]
formula: pnl_sales - pnl_landed_cogs_amt - pnl_referral_fee_amt - pnl_fba_fee - pnl_mb_commission_amt
aggregation_rule: Recalculate
Recalculate means: SUM the source components first during aggregation, then re-evaluate the formula. Use it for any ratio/total whose pieces must be summed before division (ACoS, margin %, etc.).
Computed by enricher (no formula, no source)
- key: pnl_mb_commission_amt
sources: null
metric_refs: [pnl_sales] # forces rpt_pnl_sku_economics fetch
formula: null
enricher: mb_commission # ← Python-implemented
aggregation_rule: SUM
required: true
See Enrichers for how this is wired.
Field reference
| Field | Purpose |
|---|---|
key | Unique identifier; what the LLM passes in metrics: [...] |
display_name, description | LLM-facing |
sources | List of {alias, table, column} or {alias, table, expr} |
metric_refs | Transitive deps — pulled into the plan even if not user-requested |
formula | Python expression evaluated on the polars DF (column refs by alias) |
enricher | Name of a registered Python enricher that fills this metric |
aggregation_rule | SUM / Average / Recalculate / null |
required | If true, the engine raises RequiredMetricMissingError on null source data — see Required-metric validation |
supported_granularities | Which granularity query values are valid |
format | $ / % / int — output formatting |
category, hierarchy_level | Browsing/grouping in list_metrics |
Defining a dimension
- key: sku
display_name: SKU
description: Seller-assigned SKU. Column name varies per report — engine resolves the right one for each table.
table_column_map:
- {table: rpt_pnl_sku_economics, column: msku}
- {table: rpt_restock_inventory, column: msku}
- {table: rpt_br_detail_page_sales_traffic, column: sku}
- {table: rpt_all_listings, column: seller_sku}
- {table: rpt_sponsored_products_advertised_product, column: advertised_sku}
data_type: string
category: Product
One dim, N tables. The planner picks the right column per table.
Three resolution paths (in order)
- Explicit YAML —
table_column_mapentry matches a queried table. - Enricher — dim is registered with an
Enrichersubclass that adds the column post-fetch (e.g.parent_asinlooked up vianew_asins). - Native column fallback — dim key is not in YAML at all, but a queried table has a column with that exact name. Engine accepts it implicitly. Example:
dimensions=["seller_id"]works on everyrpt_*table without a YAML entry.
If none match, the planner raises DimensionTableMismatchError with the queried table list.
Classification dimensions (derived from a formula)
- key: campaign_type
description: Auto vs Manual based on targeting type.
table_column_map:
- {table: rpt_sponsored_products_targeting, column: targeting}
classification_formula: "'Auto' if targeting in ('close-match','loose-match','substitutes','complements') else 'Manual'"
Aggregator evaluates the formula on the raw column post-fetch, then re-aggregates by the derived value.
Query flow
QueryRequest(
metrics=["pnl_sales", "pnl_pre_ad_margin", "pnl_pre_ad_margin_pct"],
dimensions=["sku"],
filters={"child_asin": ["B0X...", "B0Y..."]},
granularity="monthly",
date_range_start="2026-04-01", date_range_end="2026-04-30",
seller_id="AGBDRQ8IA12WY",
marketplace="US",
)
The MCP wrapper validates seller authorization, parses the date range (date strings, never inferred), and hands off to QueryExecutor.execute(request). Result envelope:
{
"results": {"format": "csv", "schema": {...}, "data": "<csv>"},
"trace": {
"tables_queried": "...",
"sql_queries": {csv-envelope},
"rows_scanned": {csv-envelope},
"aggregation_rules": {csv-envelope},
"formulas_applied": {csv-envelope},
"warnings": "pipe-joined"
}
}
Conservation of mass
Every dimension grain should return the same total for additive metrics. Quick sanity check when adding metrics or enrichers:
dimensions= | rows | total pnl_sales |
|---|---|---|
[] | 1 | 12,809.59 |
[seller_id] | 1 | 12,809.59 |
[parent_asin] | 5 | 12,809.59 |
[child_asin] | 9 | 12,809.59 |
[sku] | 15 | 12,809.59 |
If a new metric breaks this, suspect the aggregator: the enricher is probably running post-collapse instead of pre-collapse, or a formula is summing a non-additive value. For a deeper walkthrough of how enrichers preserve this property — including a worked example through SKU/ASIN/seller grain — see Enrichers — How dimensionality is handled.
Adding a metric — checklist
- Pick the right hierarchy level (L1 source-backed / L2 formula / L3 with ad spend).
- Decide aggregation rule: raw
SUMfor L1 totals,Recalculatefor L2/L3 ratios. - Mark
required: trueif downstream margins would silently break with null inputs. - Add to
metrics.yaml; restart the MCP server (YAML loads once at import). - Smoke-test conservation across at least three grains.
Adding a dimension — checklist
- Identify the column name on every relevant report table.
- If the column name is identical across tables, consider relying on native-column fallback instead of YAML.
- Otherwise add a
table_column_mapentry per table. - If the dim needs a lookup (e.g. parent ASIN from child), add an enricher and bind it with
enricher: <name>.
Related
- Metrics Engine Enrichers — stage-typed framework for custom column computation
- PnL Metrics — PnL-specific source mapping, formulas, and required-data semantics
- Reporting Structure — upstream rpt_* table architecture