markets

One row per Polymarket market, where a "market" is a single question with one or more outcome tokens. Each row carries cleaned metadata: classifier-derived category, fee flags, resolution outcome, and lifecycle timestamps.

Layout

| Path | Format |
| --- | --- |
| markets.parquet | Single parquet file |

Load

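# Option 1: read a local copy of the file with Polars
import polars as pl
markets = pl.read_parquet("markets.parquet")

# Option 2: load the "markets" configuration with Hugging Face datasets
from datasets import load_dataset
markets = load_dataset("vgregoire/polymarket-users", "markets")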

Schema

| Column | Type | Description |
| --- | --- | --- |
| market_id | int64 | Market identifier (matches predictions.market_id and trades.market_id) |
| question | str | Market question text as displayed on Polymarket |
| category | str | Canonical category (one of Sports, Crypto, Finance, Politics, Tech, Culture, Weather, or Untagged). Tag-based mapping inherited from the parent event |
| category_original | str | Original Polymarket platform tag before mapping to the canonical set |
| event_id | str | Parent event identifier (matches events.event_id) |
| outcome_yes | bool | True for binary markets where this row represents the "Yes" side |
| market_start_time | datetime[ns, UTC] | Trading start time |
| close_time | datetime[ns, UTC] | When trading closed (resolution time when settled) |
| expiration_time | datetime[ns, UTC] | Scheduled expiration / settlement time |
| can_close_early | bool | Whether the market is allowed to close before its scheduled expiration |
| taker_base_fee | int64 | Taker base fee in basis points (0 before the Q4 2024 fee introduction) |
| has_fee | bool | Convenience flag: True iff taker_base_fee > 0 |

Untagged markets

Markets whose original Polymarket tag did not map to one of the seven canonical categories are kept with category = "Untagged". They are not dropped from markets or trades, but are excluded from the per-category PnL tables.