user_pnl_summary¶
One row per user. Wide table holding the user's terminal PnL across five methodological variants — base, resolved-only, no-fee, spread-adjusted, spread-adjusted resolved-only — each with a total and a per-category breakdown over the seven canonical categories.
This is the right table when you need a single number per user.
Layout¶
| Path | Format |
|---|---|
user_pnl_summary.parquet |
Single parquet file |
Load¶
Variants¶
| Prefix | Filter | Spread-adjusted? |
|---|---|---|
pnl_* |
Base — all trades, all markets | No |
pnl_resolved_* |
Markets settled by sample end (winner not null) |
No |
pnl_no_fee_* |
Markets with has_fee = False (equivalent to taker_base_fee == 0) |
No |
pnl_spread_adj_* |
Base | Yes — half-spread 0.005 |
pnl_spread_adj_resolved_* |
Resolved | Yes — half-spread 0.005 |
For each variant, you get a _total and a per-category column.
Columns¶
user_address
pnl_total, portfolio_value, usdc_balance pnl_resolved_total
pnl_sports pnl_resolved_sports
pnl_crypto pnl_resolved_crypto
pnl_finance pnl_resolved_finance
pnl_politics pnl_resolved_politics
pnl_tech pnl_resolved_tech
pnl_culture pnl_resolved_culture
pnl_weather pnl_resolved_weather
pnl_no_fee_total pnl_spread_adj_total
pnl_no_fee_sports pnl_spread_adj_sports
pnl_no_fee_crypto pnl_spread_adj_crypto
pnl_no_fee_finance pnl_spread_adj_finance
pnl_no_fee_politics pnl_spread_adj_politics
pnl_no_fee_tech pnl_spread_adj_tech
pnl_no_fee_culture pnl_spread_adj_culture
pnl_no_fee_weather pnl_spread_adj_weather
pnl_spread_adj_resolved_total
pnl_spread_adj_resolved_sports
pnl_spread_adj_resolved_crypto
pnl_spread_adj_resolved_finance
pnl_spread_adj_resolved_politics
pnl_spread_adj_resolved_tech
pnl_spread_adj_resolved_culture
pnl_spread_adj_resolved_weather
Base-variant PnL decomposition¶
The base variant additionally carries:
| Column | Description |
|---|---|
portfolio_value |
Mark-to-market value of the user's open positions at the sample end |
usdc_balance |
The user's USDC cash account at the sample end |
By construction portfolio_value + usdc_balance == pnl_total. This lets
you split realized (usdc_balance) from unrealized (portfolio_value)
without joining against another table. The decomposition is only
emitted for the base variant; the resolved and spread-adjusted variants
would carry differently-defined values that don't recombine cleanly to
the corresponding totals.
Filling semantics¶
A user absent from a given variant (e.g., never traded a resolved market)
gets 0.0 in that variant's columns — not null. This is also how
"untagged" trading shows up: pnl_total can exceed pnl_sports + … +
pnl_weather because untagged markets are counted in the total but not in
any category.
Examples¶
import polars as pl
users = pl.read_parquet("user_pnl_summary.parquet")
# Top 10 by base terminal PnL
users.sort("pnl_total", descending=True).head(10)
# How much PnL is from sports vs everything else, on resolved markets only?
users.select(
pl.col("pnl_resolved_sports").sum().alias("sports"),
(pl.col("pnl_resolved_total") - pl.col("pnl_resolved_sports")).sum().alias("non_sports"),
)
# Spread cost: gross vs net PnL on resolved markets
users.with_columns(
(pl.col("pnl_resolved_total") - pl.col("pnl_spread_adj_resolved_total"))
.alias("spread_cost")
).head(10)