user_features

One row per user, with ~87 behavioral features computed over the full sample. The features cover activity, maker/taker share, category concentration, trading regularity, holding duration, and position management style.

Layout

Path Format
user_features.parquet Single parquet file

Load

import polars as pl
features = pl.read_parquet("user_features.parquet")

Schema

Identifier

Column Type Description
user_address str End-user wallet (reconciled from on-chain proxy/safe pattern)

Activity counts

Column Type Description
n_trades uint32 Lifetime trade count
n_transactions uint32 Lifetime transaction count (one transaction can produce several fills)
n_maker_trades int64 Trades in which the user was the maker
n_maker_transactions uint32 Transactions in which the user was the maker
n_markets uint32 Distinct markets the user traded
n_events uint32 Distinct parent events the user traded
n_categories uint32 Distinct canonical categories the user traded
n_counterparties uint32 Distinct trading counterparties

Volume and trade-size distribution

Column Type Description
total_volume float64 Lifetime trading volume (USDC contract notional)
avg_trade_volume float64 Mean trade size (USDC)
median_trade_volume float64 Median trade size (USDC)
max_trade_volume float64 Largest single trade (USDC)
volume_std float64 Standard deviation of trade sizes (USDC)
trade_size_cv float64 Coefficient of variation of trade sizes (volume_std / avg_trade_volume)
volume_gini float64 Gini coefficient of the user's trade-volume distribution
max_trade_frac float64 Largest single trade as a fraction of total_volume
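
The ratio-style columns above can be reproduced from a user's raw trade sizes. The following is a minimal sketch, not the dataset's actual pipeline: the function name `trade_size_stats` is hypothetical, the population standard deviation is an assumption (the source does not say whether a sample or population estimator is used), and the Gini coefficient uses the common sorted-rank formula, which may differ from the one behind `volume_gini`.

```python
from statistics import mean, pstdev


def trade_size_stats(sizes):
    """Return (cv, max_frac, gini) for a list of positive trade sizes in USDC.

    Hypothetical helper for illustration only.
    """
    total = sum(sizes)
    avg = mean(sizes)

    # trade_size_cv = volume_std / avg_trade_volume
    # (population std is an assumption, see the lead-in)
    cv = pstdev(sizes) / avg

    # max_trade_frac = largest single trade / total_volume
    max_frac = max(sizes) / total

    # Gini via the sorted-rank formula: 0 = all trades equal,
    # values near 1 = volume concentrated in a few trades.
    xs = sorted(sizes)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    gini = 2 * weighted / (n * total) - (n + 1) / n

    return cv, max_frac, gini


equal = trade_size_stats([10.0, 10.0, 10.0, 10.0])   # four identical trades
skewed = trade_size_stats([1.0, 1.0, 1.0, 97.0])     # one trade dominates
```

With identical trade sizes all three measures sit at their minimum (a CV and Gini of zero, and a `max_trade_frac` of 1/n); when one large trade dominates, all three rise together.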

Price behaviour

Column Type Description
avg_price_traded float64 Volume-weighted mean trade price
price_std float64 Standard deviation of trade prices
frac_extreme_price float64 Fraction of trades at extreme prices (close to 0 or 1)
frac_midrange float64 Fraction of trades at mid-range prices
frac_longshot float64 Fraction of trades on the long-shot side (very low priced outcome)
frac_sureshot float64 Fraction of trades on the sure-shot side (very high priced outcome)

Maker / taker split

Column Type Description
frac_maker float64 Fraction of trades in which the user was the maker
maker_volume float64 Lifetime volume traded as maker (USDC)
frac_maker_volume float64 Maker share of lifetime volume
maker_txn_to_trade_ratio float64 Ratio of maker transactions to maker trades

Buy / sell split

Column Type Description
frac_buys float64 Fraction of trades that were buys (taker bought conditional tokens)
buy_volume float64 Lifetime buy volume (USDC)
sell_volume float64 Lifetime sell volume (USDC)
net_volume float64 buy_volume - sell_volume
buy_sell_ratio float64 buy_volume / sell_volume
avg_trade_imbalance float64 Mean per-trade buy/sell imbalance

Activity dates and span

Column Type Description
first_trade datetime[ns, UTC] Timestamp of the user's first trade
last_trade datetime[ns, UTC] Timestamp of the user's last trade
first_trade_date date Calendar date of first_trade (UTC)
last_trade_date date Calendar date of last_trade (UTC)
active_days uint32 Number of distinct UTC calendar days on which the user traded
trading_span_days int64 Days between first_trade_date and last_trade_date
active_day_ratio float64 active_days / trading_span_days
trades_per_week float64 Lifetime trades per week (over the active span)
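
As an illustration of how the span-based columns relate to each other, here is a rough sketch computed from a list of trade dates. It rests on several assumptions that the source does not confirm: `trading_span_days` is taken as a plain (exclusive) date difference, `trades_per_week` is taken as `n_trades / (trading_span_days / 7)`, and users whose trades all fall on one day get `None` for both ratios. The helper name `activity_ratios` is hypothetical.

```python
from datetime import date


def activity_ratios(trade_dates, n_trades):
    """Return (active_days, trading_span_days, active_day_ratio, trades_per_week).

    Hypothetical helper; the zero-span handling is an assumption.
    """
    active_days = len(set(trade_dates))                    # distinct calendar days
    span = (max(trade_dates) - min(trade_dates)).days      # last date minus first date
    if span == 0:
        # Ratios are undefined when all trades fall on a single day.
        return active_days, span, None, None
    return active_days, span, active_days / span, n_trades / (span / 7)


dates = [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 8), date(2024, 1, 15)]
result = activity_ratios(dates, n_trades=len(dates))
single_day = activity_ratios([date(2024, 1, 1)], n_trades=1)
```

Because the zero-span case is not documented, it is worth checking how single-day users are encoded in the file before dividing by `trading_span_days`.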

Schedule / hour of day

Column Type Description
frac_night_trading float64 Fraction of trades during nighttime hours (UTC)
frac_weekend_trading float64 Fraction of trades on Saturday or Sunday (UTC)
frac_day_trading float64 Fraction of trades during daytime hours (UTC)
peak_trading_hour int8 Hour of day with the most trades (0–23 UTC)

Trading rhythm

Column Type Description
median_time_between_trades float64 Median seconds between consecutive trades
trading_regularity float64 Measure of how evenly spaced trades are (higher = more regular)
burst_trading_score float64 Burstiness score — higher means trades cluster into short bursts

Market-life timing

Column Type Description
avg_time_to_resolution float64 Mean seconds from trade timestamp to the parent market's resolution
avg_market_age_at_trade float64 Mean seconds since market start at trade time
frac_early_trader float64 Fraction of trades placed in the early life of the market
frac_late_trader float64 Fraction of trades placed late in the market's life
frac_new_market_trades float64 Fraction of trades in markets that opened recently
frac_closing_market_trades float64 Fraction of trades in markets close to resolution

Market mix

Column Type Description
frac_binary_markets float64 Fraction of trades in binary markets
frac_multi_outcome float64 Fraction of trades in markets with more than two outcomes
avg_market_volume float64 Mean total volume of the markets the user traded
avg_market_liquidity float64 Mean liquidity proxy of the markets the user traded
avg_trades_per_market float64 Mean number of the user's trades per traded market
avg_volume_per_market float64 Mean of the user's volume per traded market (USDC)

Concentration (Herfindahl indices)

Column Type Description
category_hhi float64 HHI over the user's category volume shares
category_entropy float64 Shannon entropy of the user's category volume distribution
category_hhi_inverse float64 1 / category_hhi (effective number of categories)
counterparty_hhi float64 HHI over the user's counterparty volume shares
counterparty_hhi_inverse float64 1 / counterparty_hhi
counterparty_ratio float64 Ratio of counterparties to trades
repeat_counterparty_rate float64 Fraction of trades against a previously-seen counterparty
market_hhi float64 HHI over the user's market volume shares
market_hhi_inverse float64 1 / market_hhi
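
For readers unfamiliar with these measures, the sketch below computes an HHI, its inverse, and a Shannon entropy from a list of per-bucket volumes (categories, counterparties, or markets). The function name `concentration` is hypothetical, and the natural logarithm is an assumption: the source does not state which log base `category_entropy` uses.

```python
import math


def concentration(volumes):
    """Return (hhi, entropy, hhi_inverse) for per-bucket volumes.

    Hypothetical helper for illustration only.
    """
    total = sum(volumes)
    # Volume share of each bucket; empty buckets contribute nothing.
    shares = [v / total for v in volumes if v > 0]

    # HHI: sum of squared shares. 1.0 = all volume in one bucket,
    # 1/k = volume spread evenly over k buckets.
    hhi = sum(s * s for s in shares)

    # Shannon entropy of the share distribution (natural log assumed).
    entropy = -sum(s * math.log(s) for s in shares)

    # 1 / HHI reads as the effective number of buckets.
    return hhi, entropy, 1.0 / hhi


even = concentration([25.0, 25.0, 25.0, 25.0])   # four equal buckets
single = concentration([100.0])                  # everything in one bucket
```

Four equal buckets give an HHI of 0.25 and an effective count of 4; a single bucket gives an HHI of 1 and an entropy of 0.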

Position management

Column Type Description
oneshot_ratio float64 Fraction of market positions opened and closed in a single trade
frac_both_sides float64 Fraction of markets in which the user traded both Yes and No
avg_positions_per_market float64 Mean number of distinct open positions per market
position_turnover float64 Volume-weighted measure of how quickly positions are turned over
round_trip_rate float64 Fraction of trades that complete a round trip (open + close)
frac_held_to_resolution float64 Fraction of positions held until the market resolved
avg_holding_duration float64 Mean seconds a position is held before being closed or resolved

Per-category activity

Column Type Description
frac_politics float64 Volume share traded in Politics
frac_sports float64 Volume share traded in Sports
frac_crypto float64 Volume share traded in Crypto
frac_other_categories float64 Volume share in categories other than the three above
traded_sports int8 1 if the user has any trades in Sports, else 0
traded_crypto int8 1 if the user has any trades in Crypto, else 0
traded_finance int8 1 if the user has any trades in Finance, else 0
traded_politics int8 1 if the user has any trades in Politics, else 0
traded_tech int8 1 if the user has any trades in Tech, else 0
traded_culture int8 1 if the user has any trades in Culture, else 0
traded_weather int8 1 if the user has any trades in Weather, else 0

Wash-trader interaction

Wash trading is detected via counterparty HHI (see Methodology), but the features here are computed over all activity, including suspected wash trades. To exclude suspected wash traders, drop users whose counterparty_hhi is at or above a chosen threshold (the paper uses >= 0.5).