user_features
One row per user, with ~87 behavioral features computed over the full
sample. The features cover activity, maker/taker share, category concentration,
trading regularity, holding duration, and position management style.
Layout
| Path | Format |
| --- | --- |
| user_features.parquet | Single parquet file |
Load
import polars as pl
features = pl.read_parquet("user_features.parquet")
Schema
Identifier
| Column | Type | Description |
| --- | --- | --- |
| user_address | str | End-user wallet (reconciled from on-chain proxy/safe pattern) |
Activity counts
| Column | Type | Description |
| --- | --- | --- |
| n_trades | uint32 | Lifetime trade count |
| n_transactions | uint32 | Lifetime transaction count (one transaction can produce several fills) |
| n_maker_trades | int64 | Trades in which the user was the maker |
| n_maker_transactions | uint32 | Transactions in which the user was the maker |
| n_markets | uint32 | Distinct markets the user traded |
| n_events | uint32 | Distinct parent events the user traded |
| n_categories | uint32 | Distinct canonical categories the user traded |
| n_counterparties | uint32 | Distinct trading counterparties |
Volume and trade-size distribution
| Column | Type | Description |
| --- | --- | --- |
| total_volume | float64 | Lifetime trading volume (USDC contract notional) |
| avg_trade_volume | float64 | Mean trade size (USDC) |
| median_trade_volume | float64 | Median trade size (USDC) |
| max_trade_volume | float64 | Largest single trade (USDC) |
| volume_std | float64 | Standard deviation of trade sizes |
| trade_size_cv | float64 | Coefficient of variation of trade sizes (volume_std / avg_trade_volume) |
| volume_gini | float64 | Gini coefficient of the user's trade-volume distribution |
| max_trade_frac | float64 | Largest single trade as a fraction of total_volume |
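The dispersion columns above can be reproduced from a list of one user's trade sizes. The following is a minimal sketch rather than the dataset's actual pipeline: the helper names `gini` and `trade_size_stats` are hypothetical, the Gini coefficient uses the standard sorted-rank formula, and the sample standard deviation (ddof=1) is an assumption, since the table does not state which estimator was used.

```python
import statistics


def gini(values: list[float]) -> float:
    """Gini coefficient via the sorted-rank formula (0 = all trades equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * ranked_sum / (n * total) - (n + 1) / n


def trade_size_stats(sizes: list[float]) -> dict[str, float]:
    """Summarise one user's non-empty list of trade sizes (USDC)."""
    total = sum(sizes)
    mean = statistics.fmean(sizes)
    # Assumption: sample standard deviation (ddof=1); the dataset may use ddof=0.
    std = statistics.stdev(sizes) if len(sizes) > 1 else 0.0
    return {
        "avg_trade_volume": mean,
        "volume_std": std,
        "trade_size_cv": std / mean if mean else float("nan"),
        "max_trade_frac": max(sizes) / total if total else float("nan"),
        "volume_gini": gini(sizes),
    }


# Made-up trade sizes for illustration only.
stats = trade_size_stats([10.0, 20.0, 30.0, 40.0])
```

A user whose trades are all the same size gets `trade_size_cv` and `volume_gini` of 0, while a user with one dominant trade pushes `max_trade_frac` and `volume_gini` towards 1.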
Price behaviour
| Column | Type | Description |
| --- | --- | --- |
| avg_price_traded | float64 | Volume-weighted mean trade price |
| price_std | float64 | Standard deviation of trade prices |
| frac_extreme_price | float64 | Fraction of trades at extreme prices (close to 0 or 1) |
| frac_midrange | float64 | Fraction of trades at mid-range prices |
| frac_longshot | float64 | Fraction of trades on the long-shot side (very low priced outcome) |
| frac_sureshot | float64 | Fraction of trades on the sure-shot side (very high priced outcome) |
Maker / taker split
| Column | Type | Description |
| --- | --- | --- |
| frac_maker | float64 | Fraction of trades in which the user was the maker |
| maker_volume | float64 | Lifetime volume traded as maker (USDC) |
| frac_maker_volume | float64 | Maker share of lifetime volume |
| maker_txn_to_trade_ratio | float64 | Ratio of maker transactions to maker trades |
Buy / sell split
| Column | Type | Description |
| --- | --- | --- |
| frac_buys | float64 | Fraction of trades that were buys (taker bought conditional tokens) |
| buy_volume | float64 | Lifetime buy volume (USDC) |
| sell_volume | float64 | Lifetime sell volume (USDC) |
| net_volume | float64 | buy_volume - sell_volume |
| buy_sell_ratio | float64 | buy_volume / sell_volume |
| avg_trade_imbalance | float64 | Mean per-trade buy/sell imbalance |
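The two derived columns follow directly from `buy_volume` and `sell_volume`. The sketch below uses a hypothetical helper, `buy_sell_summary`, to show the arithmetic and the one edge case worth checking before using `buy_sell_ratio`: a user who never sold has no finite ratio. Returning `None` in that case is an assumption made for illustration; how the parquet file encodes it (null, infinity, or NaN) is not stated here, so it is worth inspecting the column before aggregating.

```python
from typing import Optional


def buy_sell_summary(buy_volume: float, sell_volume: float) -> dict[str, Optional[float]]:
    """Derive net_volume and buy_sell_ratio from the two volume columns."""
    net_volume = buy_volume - sell_volume
    # Assumption: the ratio is treated as missing when nothing was sold.
    ratio = buy_volume / sell_volume if sell_volume > 0 else None
    return {"net_volume": net_volume, "buy_sell_ratio": ratio}


# Made-up volumes for illustration only.
balanced = buy_sell_summary(buy_volume=300.0, sell_volume=100.0)
buy_only = buy_sell_summary(buy_volume=250.0, sell_volume=0.0)
```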
Activity dates and span
| Column | Type | Description |
| --- | --- | --- |
| first_trade | datetime[ns, UTC] | Timestamp of the user's first trade |
| last_trade | datetime[ns, UTC] | Timestamp of the user's last trade |
| first_trade_date | date | Calendar date of first_trade (UTC) |
| last_trade_date | date | Calendar date of last_trade (UTC) |
| active_days | uint32 | Number of distinct UTC calendar days on which the user traded |
| trading_span_days | int64 | Days between first_trade_date and last_trade_date |
| active_day_ratio | float64 | active_days / trading_span_days |
| trades_per_week | float64 | Lifetime trades per week (over the active span) |
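As a worked example of the span columns, the sketch below derives them from a list of UTC trade dates, one entry per trade. The helper name `activity_span` is hypothetical, and two choices are assumptions rather than documented behaviour: `trades_per_week` is taken as the trade count divided by `trading_span_days / 7`, and both ratios are left as `None` when the first and last trade fall on the same day (a span of zero days).

```python
from datetime import date
from typing import Optional


def activity_span(trade_dates: list[date]) -> dict[str, Optional[float]]:
    """Derive span features from one UTC calendar date per trade."""
    first, last = min(trade_dates), max(trade_dates)
    active_days = len(set(trade_dates))
    span_days = (last - first).days
    if span_days == 0:
        # Assumption: ratios are undefined for single-day traders.
        ratio, per_week = None, None
    else:
        ratio = active_days / span_days
        per_week = len(trade_dates) / (span_days / 7)
    return {
        "active_days": active_days,
        "trading_span_days": span_days,
        "active_day_ratio": ratio,
        "trades_per_week": per_week,
    }


# Made-up dates: four trades on three distinct days across a 14-day span.
span = activity_span(
    [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 15)]
)
```

Under these assumptions `active_day_ratio` can exceed 1 for very short spans (two active days one day apart give 2 / 1), so it is worth checking the column's range before treating it as a proportion.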
Schedule / hour of day
| Column | Type | Description |
| --- | --- | --- |
| frac_night_trading | float64 | Fraction of trades during nighttime hours (UTC) |
| frac_weekend_trading | float64 | Fraction of trades on Saturday or Sunday (UTC) |
| frac_day_trading | float64 | Fraction of trades during daytime hours (UTC) |
| peak_trading_hour | int8 | Hour of day with the most trades (0–23 UTC) |
Trading rhythm
| Column | Type | Description |
| --- | --- | --- |
| median_time_between_trades | float64 | Median seconds between consecutive trades |
| trading_regularity | float64 | Measure of how evenly spaced trades are (higher = more regular) |
| burst_trading_score | float64 | Burstiness score; higher means trades cluster into short bursts |
Market-life timing
| Column | Type | Description |
| --- | --- | --- |
| avg_time_to_resolution | float64 | Mean seconds from trade timestamp to the parent market's resolution |
| avg_market_age_at_trade | float64 | Mean seconds since market start at trade time |
| frac_early_trader | float64 | Fraction of trades placed in the early life of the market |
| frac_late_trader | float64 | Fraction of trades placed late in the market's life |
| frac_new_market_trades | float64 | Fraction of trades in markets that opened recently |
| frac_closing_market_trades | float64 | Fraction of trades in markets close to resolution |
Market mix
| Column | Type | Description |
| --- | --- | --- |
| frac_binary_markets | float64 | Fraction of trades in binary markets |
| frac_multi_outcome | float64 | Fraction of trades in markets with more than two outcomes |
| avg_market_volume | float64 | Mean total volume of the markets the user traded |
| avg_market_liquidity | float64 | Mean liquidity proxy of the markets the user traded |
| avg_trades_per_market | float64 | Mean number of the user's trades per traded market |
| avg_volume_per_market | float64 | Mean of the user's volume per traded market (USDC) |
Concentration (Herfindahl indices)
| Column | Type | Description |
| --- | --- | --- |
| category_hhi | float64 | HHI over the user's category volume shares |
| category_entropy | float64 | Shannon entropy of the user's category volume distribution |
| category_hhi_inverse | float64 | 1 / category_hhi (effective number of categories) |
| counterparty_hhi | float64 | HHI over the user's counterparty volume shares |
| counterparty_hhi_inverse | float64 | 1 / counterparty_hhi |
| counterparty_ratio | float64 | Counterparties per trade ratio |
| repeat_counterparty_rate | float64 | Fraction of trades against a previously-seen counterparty |
| market_hhi | float64 | HHI over the user's market volume shares |
| market_hhi_inverse | float64 | 1 / market_hhi |
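To make the concentration columns concrete, the sketch below computes an HHI, its inverse, and a Shannon entropy from a mapping of group (category, counterparty, or market) to the user's volume in that group. The helper name `concentration` is hypothetical, and the natural logarithm is an assumption: the table does not say which log base `category_entropy` uses.

```python
import math


def concentration(volume_by_group: dict[str, float]) -> dict[str, float]:
    """HHI, inverse HHI, and Shannon entropy of a user's volume shares."""
    total = sum(volume_by_group.values())
    # Groups with zero volume contribute nothing to either measure.
    shares = [v / total for v in volume_by_group.values() if v > 0]
    hhi = sum(s * s for s in shares)
    # Assumption: natural log; the dataset may use base 2 instead.
    entropy = -sum(s * math.log(s) for s in shares)
    return {"hhi": hhi, "hhi_inverse": 1 / hhi, "entropy": entropy}


# Made-up category volumes (USDC) for illustration only.
mixed = concentration({"Politics": 50.0, "Sports": 25.0, "Crypto": 25.0})
even = concentration({"Politics": 10.0, "Sports": 10.0, "Crypto": 10.0, "Tech": 10.0})
```

HHI ranges from 1 / k for volume spread evenly over k groups up to 1 for a single group, which is why the inverse reads as an effective number of groups: the evenly split four-category user above gets an inverse of 4.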
Position management
| Column | Type | Description |
| --- | --- | --- |
| oneshot_ratio | float64 | Fraction of market positions opened and closed in a single trade |
| frac_both_sides | float64 | Fraction of markets in which the user traded both Yes and No |
| avg_positions_per_market | float64 | Mean number of distinct open positions per market |
| position_turnover | float64 | Volume-weighted measure of how quickly positions are turned over |
| round_trip_rate | float64 | Fraction of trades that complete a round trip (open + close) |
| frac_held_to_resolution | float64 | Fraction of positions held until the market resolved |
| avg_holding_duration | float64 | Mean seconds a position is held before being closed or resolved |
Per-category activity
| Column | Type | Description |
| --- | --- | --- |
| frac_politics | float64 | Volume share traded in Politics |
| frac_sports | float64 | Volume share traded in Sports |
| frac_crypto | float64 | Volume share traded in Crypto |
| frac_other_categories | float64 | Volume share in categories other than the three above |
| traded_sports | int8 | 1 if the user has any trades in Sports, else 0 |
| traded_crypto | int8 | 1 if the user has any trades in Crypto, else 0 |
| traded_finance | int8 | 1 if the user has any trades in Finance, else 0 |
| traded_politics | int8 | 1 if the user has any trades in Politics, else 0 |
| traded_tech | int8 | 1 if the user has any trades in Tech, else 0 |
| traded_culture | int8 | 1 if the user has any trades in Culture, else 0 |
| traded_weather | int8 | 1 if the user has any trades in Weather, else 0 |
Wash-trader interaction
Wash trading is detected via counterparty HHI (see
Methodology), but the features are computed over all
activity, including suspected wash trades. To exclude
suspected wash traders, filter on a counterparty_hhi threshold
(the paper uses >= 0.5).