Skip to content

user_pnl_summary

One row per user. Wide table holding the user's terminal PnL across five methodological variants — base, resolved-only, no-fee, spread-adjusted, spread-adjusted resolved-only — each with a total and a per-category breakdown over the seven canonical categories.

This is the right table when you need a single number per user.

Layout

Path Format
user_pnl_summary.parquet Single parquet file

Load

import polars as pl
users = pl.read_parquet("user_pnl_summary.parquet")

Variants

Prefix Filter Spread-adjusted?
pnl_* Base — all trades, all markets No
pnl_resolved_* Markets settled by sample end (winner not null) No
pnl_no_fee_* Markets with has_fee = False (equivalent to taker_base_fee == 0) No
pnl_spread_adj_* Base Yes — half-spread 0.005
pnl_spread_adj_resolved_* Resolved Yes — half-spread 0.005

For each variant, you get a _total and a per-category column.

Columns

user_address
pnl_total, portfolio_value, usdc_balance     pnl_resolved_total
pnl_sports                                   pnl_resolved_sports
pnl_crypto                                   pnl_resolved_crypto
pnl_finance                                  pnl_resolved_finance
pnl_politics                                 pnl_resolved_politics
pnl_tech                                     pnl_resolved_tech
pnl_culture                                  pnl_resolved_culture
pnl_weather                                  pnl_resolved_weather
pnl_no_fee_total                             pnl_spread_adj_total
pnl_no_fee_sports                            pnl_spread_adj_sports
pnl_no_fee_crypto                            pnl_spread_adj_crypto
pnl_no_fee_finance                           pnl_spread_adj_finance
pnl_no_fee_politics                          pnl_spread_adj_politics
pnl_no_fee_tech                              pnl_spread_adj_tech
pnl_no_fee_culture                           pnl_spread_adj_culture
pnl_no_fee_weather                           pnl_spread_adj_weather
pnl_spread_adj_resolved_total
pnl_spread_adj_resolved_sports
pnl_spread_adj_resolved_crypto
pnl_spread_adj_resolved_finance
pnl_spread_adj_resolved_politics
pnl_spread_adj_resolved_tech
pnl_spread_adj_resolved_culture
pnl_spread_adj_resolved_weather

Base-variant PnL decomposition

The base variant additionally carries:

Column Description
portfolio_value Mark-to-market value of the user's open positions at the sample end
usdc_balance The user's USDC cash account at the sample end

By construction portfolio_value + usdc_balance == pnl_total. This lets you split realized (usdc_balance) from unrealized (portfolio_value) without joining against another table. The decomposition is only emitted for the base variant; the resolved and spread-adjusted variants would carry differently-defined values that don't recombine cleanly to the corresponding totals.

Filling semantics

A user absent from a given variant (e.g., never traded a resolved market) gets 0.0 in that variant's columns — not null. This is also how "untagged" trading shows up: pnl_total can exceed pnl_sports + … + pnl_weather because untagged markets are counted in the total but not in any category.

Examples

import polars as pl
users = pl.read_parquet("user_pnl_summary.parquet")

# Top 10 by base terminal PnL
users.sort("pnl_total", descending=True).head(10)

# How much PnL is from sports vs everything else, on resolved markets only?
users.select(
    pl.col("pnl_resolved_sports").sum().alias("sports"),
    (pl.col("pnl_resolved_total") - pl.col("pnl_resolved_sports")).sum().alias("non_sports"),
)

# Spread cost: gross vs net PnL on resolved markets
users.with_columns(
    (pl.col("pnl_resolved_total") - pl.col("pnl_spread_adj_resolved_total"))
    .alias("spread_cost")
).head(10)