Prediction Market Signals — Plan
Motivation
Political insiders — Fed chairs, cabinet members, and their associates — may be operating pseudonymous wallets on on-chain prediction markets like Polymarket. Because all Polymarket trades are recorded on the Polygon blockchain, wallet activity is fully public. A wallet that consistently wins on low-probability US geopolitical events (tariffs, sanctions, rate decisions, military actions) at above-market accuracy is statistically anomalous — and that edge is most plausibly explained by informational advantage.
This plan builds two complementary signal layers:
- Polymarket wallet tracker — identifies and scores wallets by geopolitical win rate, watches their open positions, and maps those positions to affected tickers in our universe.
- Kalshi price signal layer — monitors aggregate market price movements on Kalshi as a corroborating signal (Kalshi is CFTC-regulated and centralized, so individual account tracking is not possible — but sudden price moves are actionable on their own).
Together, these feed a new PredictionMarketSignal into the fintrack pipeline alongside existing sentiment and political trade signals.
Scope
In scope:
- Polymarket CLOB API integration: fetch markets, trades, positions by wallet
- Wallet scoring system: win rate on geopolitical markets, edge-weighted, recency-decayed
- Flagged wallet registry: persisted list of persons-of-interest wallets with score history
- Kalshi REST API integration: market price polling for geopolitical event markets
- Geopolitical market filter: classify Polymarket/Kalshi markets as geopolitical or not (LLM-assisted)
- Position → ticker mapping: map open prediction market positions to affected stocks/sectors
- Scheduler jobs for both layers
- API endpoints exposing signals and flagged wallets
- Dashboard tab: "Prediction Markets" with wallet leaderboard and signal feed
Out of scope:
- Deanonymizing wallet owners beyond behavioral scoring (we label by behavior, not identity)
- Trading automation or order execution
- Augur or other prediction markets (can add later via same abstraction)
- Manifold/Metaculus (no real money = weaker signal)
Architecture Overview
    Polymarket CLOB API ──► wallet_tracker.py ──► score wallets ──► flagged_wallets table
                                              └──► open positions ────────┐
                                                                          ▼
    Kalshi REST API ──────► kalshi_monitor.py ──► price spikes ────► position_mapper.py
                                                                             │
                                                                             ▼
                                                               PredictionMarketSignal table
                                                                             │
                                                                             ▼
                                                                       API endpoint ──► Dashboard
Approach
1. DB Models — db/models.py
PolymarketWallet — registry of scored wallets:
    class PolymarketWallet(Base):
        __tablename__ = "polymarket_wallets"

        id              = Column(Integer, primary_key=True)
        address         = Column(String(42), unique=True, nullable=False)  # 0x...
        label           = Column(String(100))          # human label if known, else None
        geo_win_rate    = Column(Float)                # win rate on geopolitical markets
        geo_edge        = Column(Float)                # avg (outcome - implied_prob) on wins
        geo_trade_count = Column(Integer, default=0)   # total geopolitical trades scored
        total_volume    = Column(Float)                # USD volume across all markets
        score           = Column(Float)                # composite score (see scoring section)
        flagged         = Column(Boolean, default=False)
        first_seen_at   = Column(DateTime, default=utcnow)
        last_scored_at  = Column(DateTime)
        notes           = Column(Text)

        __table_args__ = (Index("ix_polymarket_wallets_score", "score"),)
PredictionMarketSignal — actionable signals mapped to tickers:
    class PredictionMarketSignal(Base):
        __tablename__ = "prediction_market_signals"

        id               = Column(Integer, primary_key=True)
        source           = Column(String(20), nullable=False)   # "polymarket" | "kalshi"
        market_id        = Column(String(200), nullable=False)  # platform market ID/slug
        market_title     = Column(Text, nullable=False)
        signal_type      = Column(String(20), nullable=False)   # "wallet_open" | "price_spike" | "convergence"
        direction        = Column(String(10))                   # "yes" | "no"
        wallet_address   = Column(String(42))                   # null for Kalshi signals
        wallet_score     = Column(Float)                        # score of wallet at signal time
        market_price     = Column(Float)                        # implied probability at signal time
        price_delta      = Column(Float)                        # price change that triggered (Kalshi)
        affected_tickers = Column(ARRAY(String))                # tickers this maps to
        affected_sectors = Column(ARRAY(String))                # sectors
        conviction       = Column(String(10))                   # "low" | "medium" | "high"
        created_at       = Column(DateTime, default=utcnow)
        expires_at       = Column(DateTime)                     # market resolution date

        __table_args__ = (
            Index("ix_pms_created", "created_at"),
            Index("ix_pms_source_market", "source", "market_id"),
        )
KalshiMarket — tracked Kalshi markets cache:
    class KalshiMarket(Base):
        __tablename__ = "kalshi_markets"

        id              = Column(Integer, primary_key=True)
        market_id       = Column(String(100), unique=True, nullable=False)
        title           = Column(Text)
        category        = Column(String(50))   # "geopolitical" | "other"
        yes_price       = Column(Float)        # last known yes price (0–1)
        prev_yes_price  = Column(Float)        # price from previous poll
        volume_24h      = Column(Float)
        close_time      = Column(DateTime)
        last_fetched_at = Column(DateTime, default=utcnow)
2. Config — config.py
    # Prediction market settings
    POLYMARKET_API_BASE = "https://clob.polymarket.com"
    KALSHI_API_BASE = "https://trading-api.kalshi.com/trade-api/v2"

    # Wallet scoring thresholds
    WALLET_FLAG_SCORE_THRESHOLD = 0.65  # minimum composite score to flag
    WALLET_MIN_GEO_TRADES = 5           # ignore wallets with fewer geopolitical trades

    # Signal thresholds
    KALSHI_SPIKE_THRESHOLD = 0.08       # absolute yes-price move >= 0.08 (8 percentage points) triggers a signal
    WALLET_OPEN_MIN_SCORE = 0.60        # minimum wallet score to emit a signal

    # Geopolitical keyword filter (used to pre-filter markets before LLM classification)
    GEO_KEYWORDS = [
        "tariff", "sanction", "fed rate", "federal reserve", "nato",
        "trade war", "executive order", "china", "russia", "iran",
        "interest rate", "treasury", "secretary", "congress", "legislation",
    ]

    # Seed wallets to bootstrap scoring (known high-accuracy addresses from public research)
    POLYMARKET_SEED_WALLETS: list[str] = []  # populate via discovery job
3. Polymarket Wallet Tracker — scraper/polymarket.py
fetch_geopolitical_markets() -> list[dict]
- GET {POLYMARKET_API_BASE}/markets with pagination
- Pre-filter by GEO_KEYWORDS in title (fast, no API cost)
- LLM secondary classification via _classify_market_geo(title) — batched, cached 24h
- Return list of {market_id, title, question, end_date, yes_price}
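The keyword pre-filter step listed above could be sketched as follows. This is a minimal illustration, not the planned implementation: `prefilter_geo_markets` is a hypothetical helper, the `title` field name is an assumption about the market payload, and the keyword list is a subset of `GEO_KEYWORDS` repeated so the block runs on its own.

```python
from typing import Iterable

# Subset of GEO_KEYWORDS from config.py, repeated so the sketch is self-contained.
GEO_KEYWORDS = ["tariff", "sanction", "fed rate", "nato", "china", "interest rate"]


def prefilter_geo_markets(markets: Iterable[dict],
                          keywords: list[str] = GEO_KEYWORDS) -> list[dict]:
    """Keep markets whose title contains any keyword (case-insensitive substring match)."""
    lowered = [k.lower() for k in keywords]
    kept = []
    for market in markets:
        title = (market.get("title") or "").lower()
        if any(k in title for k in lowered):
            kept.append(market)
    return kept


# Hypothetical sample data; the "title" field name is an assumption.
sample = [
    {"market_id": "m1", "title": "Will the US raise tariffs on steel?"},
    {"market_id": "m2", "title": "Who wins the Super Bowl?"},
    {"market_id": "m3", "title": "New sanctions on Iran by June?"},
]
geo_candidates = prefilter_geo_markets(sample)
```

Plain substring matching over-matches: "nato" also matches inside "senator", for example. That is one reason to keep the LLM classification as a second pass rather than relying on keywords alone.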
fetch_market_trades(market_id: str, lookback_days: int = 90) -> list[dict]
- GET {POLYMARKET_API_BASE}/trades?market={market_id}
- Paginate; filter to lookback_days
- Return: {wallet, side, price, size, timestamp, outcome} — outcome filled in after resolution
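The paginate-then-filter behaviour described above might look like this sketch. It assumes cursor-based pagination and trades carrying a timezone-aware `timestamp`; both are assumptions, since the actual response shape of the trades endpoint is not confirmed here. The HTTP call is replaced by an injected `fetch_page` callable (hypothetical) so the logic can be exercised offline.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

# Hypothetical page fetcher: takes a cursor, returns (trades, next_cursor or None).
PageFetcher = Callable[[Optional[str]], tuple[list[dict], Optional[str]]]


def collect_recent_trades(fetch_page: PageFetcher, lookback_days: int = 90,
                          now: Optional[datetime] = None) -> list[dict]:
    """Walk every page and keep trades newer than the lookback cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=lookback_days)
    kept: list[dict] = []
    cursor: Optional[str] = None
    while True:
        trades, cursor = fetch_page(cursor)
        kept.extend(t for t in trades if t["timestamp"] >= cutoff)
        if cursor is None:  # no further pages
            break
    return kept


# Fake two-page source standing in for the HTTP call.
NOW = datetime(2025, 1, 1, tzinfo=timezone.utc)
PAGES = {
    None: ([{"wallet": "0xaaa", "timestamp": NOW - timedelta(days=10)}], "p2"),
    "p2": ([{"wallet": "0xbbb", "timestamp": NOW - timedelta(days=120)}], None),
}
recent = collect_recent_trades(lambda c: PAGES[c], lookback_days=90, now=NOW)
```

Injecting the fetcher keeps rate-limit handling (delays, retries) in one place and separate from the filtering logic.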
fetch_wallet_positions(address: str) -> list[dict]
- GET {POLYMARKET_API_BASE}/positions?user={address}
- Returns current open positions: {market_id, side, avg_price, size, market_title}
score_wallet(address: str, geo_trades: list[dict]) -> dict
- Only considers resolved geopolitical markets
- win_rate = wins / total
- edge = mean(outcome_price - entry_price) on winning trades — penalizes betting heavy favorites
- recency_weight: trades in last 30d weighted 3×, 31–90d weighted 1×
- volume_weight: log-scale total volume — avoids flagging tiny accounts
- composite_score = 0.4 * win_rate + 0.4 * edge_score + 0.2 * volume_weight
- Minimum WALLET_MIN_GEO_TRADES to produce a score
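The plan does not say exactly where the recency weights enter the score. One possible reading, sketched below, applies them as weights in the win-rate average. The helper names and the `timestamp` / `won` trade fields are hypothetical, and dropping trades older than 90 days is an assumption taken from the default `lookback_days`.

```python
from datetime import datetime, timedelta, timezone


def recency_weight(trade_time: datetime, now: datetime) -> float:
    """3x for trades in the last 30 days, 1x for 31-90 days, 0 beyond the window."""
    age_days = (now - trade_time).days
    if age_days <= 30:
        return 3.0
    if age_days <= 90:
        return 1.0
    return 0.0


def weighted_win_rate(trades: list[dict], now: datetime) -> float:
    """Recency-weighted share of winning trades; 0.0 if nothing falls in the window."""
    weights = [recency_weight(t["timestamp"], now) for t in trades]
    total = sum(weights)
    if total == 0:
        return 0.0
    won = sum(w for w, t in zip(weights, trades) if t["won"])
    return won / total


NOW = datetime(2025, 1, 1, tzinfo=timezone.utc)
trades = [
    {"timestamp": NOW - timedelta(days=5), "won": True},    # weight 3
    {"timestamp": NOW - timedelta(days=60), "won": False},  # weight 1
    {"timestamp": NOW - timedelta(days=80), "won": True},   # weight 1
]
rate = weighted_win_rate(trades, NOW)  # (3 + 1) / (3 + 1 + 1) = 0.8
```

With these three trades the unweighted win rate would be 2/3; the recent win pulls the weighted figure up to 0.8.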
discover_wallets(market_ids: list[str]) -> list[str]
- Fetches all unique wallet addresses from trades on geopolitical markets
- Feeds into the scoring pipeline — how the wallet set grows organically
upsert_wallet_score(wallet: dict, db: Session)
- Insert or update PolymarketWallet
- Set flagged = True if composite_score >= WALLET_FLAG_SCORE_THRESHOLD
4. Kalshi Monitor — scraper/kalshi.py
fetch_kalshi_markets() -> list[dict]
- GET {KALSHI_API_BASE}/markets with category=geopolitics filter
- Also fetch general markets and apply GEO_KEYWORDS filter
- Cache market metadata in KalshiMarket table
poll_kalshi_prices(db: Session) -> list[dict]
- For each tracked KalshiMarket, fetch current yes_price
- Compute delta = yes_price - prev_yes_price
- If abs(delta) >= KALSHI_SPIKE_THRESHOLD → emit a spike signal
- Update prev_yes_price in DB
- Return list of spike events
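Market data on Kalshi's public API tier appears to require no authentication; this still needs to be confirmed for the specific markets/prices endpoints used here (see Open Questions).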
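The delta-and-threshold logic of `poll_kalshi_prices` can be isolated from the HTTP and DB layers. The sketch below is a pure function over two price snapshots; `detect_spikes` and the event dict shape are hypothetical, and mapping a rising yes price to direction "yes" is an assumption.

```python
KALSHI_SPIKE_THRESHOLD = 0.08  # absolute move in yes price (0-1 scale), from config.py


def detect_spikes(prev_prices: dict[str, float], current_prices: dict[str, float],
                  threshold: float = KALSHI_SPIKE_THRESHOLD) -> list[dict]:
    """Return one event per market whose yes price moved by at least `threshold`."""
    events = []
    for market_id, price in current_prices.items():
        prev = prev_prices.get(market_id)
        if prev is None:
            continue  # first observation: nothing to compare against
        delta = price - prev
        if abs(delta) >= threshold:
            events.append({
                "market_id": market_id,
                "price_delta": round(delta, 4),
                "direction": "yes" if delta > 0 else "no",
            })
    return events


previous = {"A": 0.40, "B": 0.30, "C": 0.70}
current = {"A": 0.52, "B": 0.33, "C": 0.55, "D": 0.10}
spikes = detect_spikes(previous, current)
```

Here markets A (+0.12) and C (-0.15) cross the threshold, B (+0.03) does not, and D is skipped because it has no previous price. Moves that land almost exactly on the threshold are sensitive to floating-point rounding, so a small tolerance may be worth adding.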
5. Position → Ticker Mapper — scraper/prediction_market_mapper.py
map_market_to_tickers(market_title: str) -> dict
LLM-assisted mapping (Claude Haiku, cached per market title):
    prompt = f"""
    Given this prediction market title, list which stock tickers and sectors
    from the following universe would be most affected if this market resolves YES.
    Market: "{market_title}"
    Stock universe: {ALL_US_TICKERS + ALL_KR_TICKERS}
    Sectors: Technology, Energy, Defense, Finance, Healthcare, Consumer
    Return JSON: {{"tickers": [...], "sectors": [...], "rationale": "..."}}
    Only include tickers with a clear, direct connection. Empty list if none.
    """
Results cached in a simple JSON file model_cache/market_ticker_map.json — re-used across runs, keyed by normalized market title.
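A file-backed cache of this kind could be as small as the sketch below. The normalization rule (lowercase, collapsed whitespace, trailing punctuation dropped) is an assumption, since the plan only says "normalized market title". `cached_mapping` is a hypothetical helper, and the LLM call is replaced by a stub that returns a made-up mapping.

```python
import json
import re
import tempfile
from pathlib import Path
from typing import Callable


def normalize_title(title: str) -> str:
    """Lowercase, collapse whitespace, drop trailing punctuation so near-duplicates share a key."""
    collapsed = re.sub(r"\s+", " ", title.strip().lower())
    return collapsed.rstrip("?.! ")


def cached_mapping(title: str, cache_path: Path, compute: Callable[[str], dict]) -> dict:
    """Return the cached mapping for `title`, calling `compute` (the LLM) only on a miss."""
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    key = normalize_title(title)
    if key not in cache:
        cache[key] = compute(title)
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(json.dumps(cache, indent=2))
    return cache[key]


calls: list[str] = []


def fake_llm(title: str) -> dict:
    """Stub standing in for the Haiku call; the returned mapping is illustrative only."""
    calls.append(title)
    return {"tickers": ["LMT"], "sectors": ["Defense"], "rationale": "stub"}


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model_cache" / "market_ticker_map.json"
    first = cached_mapping("Will NATO expand?", path, fake_llm)
    second = cached_mapping("  will nato   expand ", path, fake_llm)  # cache hit
```

Rewriting the whole file on every miss is fine for a few thousand titles; concurrent scheduler jobs writing the same file would need a lock or a move to a DB table.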
build_signal(source, market, wallet, direction, db) -> PredictionMarketSignal
- Calls map_market_to_tickers()
- Sets conviction:
  - "high" if wallet score ≥ 0.75 AND Kalshi corroborates same direction
  - "medium" if wallet score ≥ 0.65 OR Kalshi spike alone
  - "low" otherwise
- Persists to prediction_market_signals
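The conviction tiers in `build_signal` reduce to a small pure function. The sketch below is one reading of the rules: `conviction_level` and its parameters are hypothetical names, and a missing wallet score (Kalshi-only signals) is treated as 0.

```python
from typing import Optional


def conviction_level(wallet_score: Optional[float], kalshi_spike: bool,
                     same_direction: bool) -> str:
    """Apply the three-tier rule; wallet_score is None for Kalshi-only signals."""
    score = wallet_score or 0.0
    if score >= 0.75 and kalshi_spike and same_direction:
        return "high"
    if score >= 0.65 or kalshi_spike:
        return "medium"
    return "low"


examples = {
    "strong wallet, corroborated": conviction_level(0.80, True, True),
    "strong wallet, spike opposes": conviction_level(0.80, True, False),
    "flagged wallet only": conviction_level(0.70, False, False),
    "kalshi spike only": conviction_level(None, True, False),
    "weak wallet only": conviction_level(0.62, False, False),
}
```

Under this reading a strong wallet whose direction is contradicted by a Kalshi spike still rates "medium"; whether such conflicts should be downgraded instead is a design choice the plan leaves open.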
6. Scheduler Jobs — scheduler/jobs.py
| Job ID | Interval | What it does |
|---|---|---|
| discover_polymarket_wallets | 24h | Crawls recent geo market trades, scores all new wallets found |
| score_polymarket_wallets | 6h | Re-scores flagged wallets + any wallet seen in last 7 days |
| fetch_polymarket_positions | 30 min | Fetches open positions for all flagged wallets → emits wallet_open signals |
| poll_kalshi_prices | 15 min | Polls Kalshi geo markets for price spikes → emits price_spike signals |
| detect_signal_convergence | 30 min | Cross-checks: if flagged wallet position AND Kalshi spike on same market → emit convergence signal (highest conviction) |
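The cross-check in `detect_signal_convergence` could be sketched as a join between the two signal types. Polymarket and Kalshi use different market identifiers, so the sketch assumes each signal has already been tagged with a shared `event_key`; that key, the function name, and the dict shapes are all hypothetical. It also requires matching direction, following the "same direction" wording in the conviction rules.

```python
def find_convergence(wallet_signals: list[dict], spike_signals: list[dict]) -> list[dict]:
    """Pair wallet_open and price_spike signals that share an event key and direction."""
    spikes_by_key = {(s["event_key"], s["direction"]): s for s in spike_signals}
    converged = []
    for w in wallet_signals:
        spike = spikes_by_key.get((w["event_key"], w["direction"]))
        if spike is not None:
            converged.append({
                "signal_type": "convergence",
                "event_key": w["event_key"],
                "direction": w["direction"],
                "wallet_address": w["wallet_address"],
                "price_delta": spike["price_delta"],
                "conviction": "high",
            })
    return converged


# Illustrative data only.
wallet_signals = [
    {"event_key": "us-steel-tariff-2025", "direction": "yes", "wallet_address": "0xabc"},
    {"event_key": "fed-cut-march", "direction": "no", "wallet_address": "0xdef"},
]
spike_signals = [
    {"event_key": "us-steel-tariff-2025", "direction": "yes", "price_delta": 0.11},
    {"event_key": "fed-cut-march", "direction": "yes", "price_delta": 0.09},
]
converged = find_convergence(wallet_signals, spike_signals)
```

Only the tariff event converges here; the Fed event has a wallet position and a spike, but in opposite directions. How the shared key is produced (manual mapping, title similarity, or an LLM match) is not specified in the plan.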
7. API Endpoints — api/routes/prediction_markets.py
    GET /api/prediction-markets/signals
        ?source=polymarket|kalshi|all    # default all
        ?conviction=high|medium|low      # filter
        ?ticker=AAPL                     # filter by affected ticker
        ?days=7                          # lookback (default 7)
        ?limit=50
    GET /api/prediction-markets/wallets
        ?flagged=true                    # only flagged wallets
        ?min_score=0.6
        ?limit=100
    GET /api/prediction-markets/wallets/{address}
        # full score history + current open positions for one wallet
Mount under api/main.py.
8. Dashboard Tab — frontend/src/
New page: Prediction Markets
Wallet Leaderboard panel (left):
- Table: Address (truncated), Geo Win Rate, Edge Score, Composite Score, # Trades, Flagged badge
- Sort by composite score desc
- Click row → expand to show open positions and which tickers they map to
Signal Feed panel (right):
- Chronological feed of recent signals
- Each card: source badge (Polymarket / Kalshi), market title, direction (YES/NO), conviction badge (color-coded), affected tickers as chips, timestamp
- convergence signals pinned to top and highlighted
Kalshi Heatmap (bottom):
- Grid of tracked Kalshi markets, colored by recent price movement
- Green = rising YES price, red = falling YES price
- Click cell → market detail with price history sparkline
Integration with existing stock detail view:
- In the ticker detail panel, add a "Prediction Market Activity" section
- Shows any active signals where this ticker is in affected_tickers
- Pulls from GET /api/prediction-markets/signals?ticker={symbol}
Wallet Scoring Deep Dive
The composite score is designed to resist two failure modes:
- Favorite-betting bias — a wallet that only bets on 90% favorites will have a high win rate but very little edge (about 0.10 per win). The edge component (avg entry-to-outcome delta) corrects for this. A wallet buying "US raises tariffs" at 0.2 and winning scores far better than one buying at 0.85.
- Small sample noise — WALLET_MIN_GEO_TRADES = 5 prevents a wallet with one lucky bet from being flagged. The volume weight further depresses scores for low-volume accounts.
Score components:
    win_rate      = resolved_wins / resolved_geo_trades            # 0–1
    edge_score    = clip(mean(win_price - entry_price), 0, 1)      # 0–1, clipped
    volume_weight = log10(total_usd_volume + 1) / log10(1e7)       # normalized to ~$10M cap
    recency_boost = upweight trades in last 30d by 3×
    composite     = 0.4 * win_rate + 0.4 * edge_score + 0.2 * volume_weight
Flag threshold 0.65 means a wallet needs both high win rate AND meaningful edge — betting $100 on favorites won't get you there.
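To check the threshold claim numerically, the sketch below evaluates the composite formula for two illustrative wallets. The wallets and their numbers are made up for the example, `composite_score` is a hypothetical helper, and capping `volume_weight` at 1.0 is an added assumption (the formula as written exceeds 1 above $10M of volume). Recency weighting is left out.

```python
import math


def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def composite_score(win_rate: float, mean_edge: float, total_usd_volume: float) -> float:
    """0.4 * win_rate + 0.4 * edge_score + 0.2 * volume_weight, per the formulas above."""
    edge_score = clip(mean_edge, 0.0, 1.0)
    # Assumption: cap at 1.0 so volume above $10M cannot push the weight past its range.
    volume_weight = clip(math.log10(total_usd_volume + 1) / math.log10(1e7), 0.0, 1.0)
    return 0.4 * win_rate + 0.4 * edge_score + 0.2 * volume_weight


FLAG_THRESHOLD = 0.65

# Favorite bettor: wins 90% of bets bought at 0.90, so edge on wins is 1.00 - 0.90 = 0.10.
favorite = composite_score(win_rate=0.90, mean_edge=0.10, total_usd_volume=100)

# Long-shot bettor: wins 80% of bets bought at 0.20, so edge on wins is 0.80.
long_shot = composite_score(win_rate=0.80, mean_edge=0.80, total_usd_volume=50_000)
```

The favorite bettor scores roughly 0.46 and stays well below the 0.65 flag threshold despite a 90% win rate, while the long-shot bettor scores roughly 0.77 and is flagged.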
Open Questions
- Polymarket API auth — the CLOB API public endpoints (markets, trades) appear unauthenticated. Verify rate limits before setting polling intervals; may need to add delays.
- LLM classification cost — classifying thousands of Polymarket markets with Haiku would cost roughly $0.01–0.05/day at current volume; cache aggressively. Consider regex-only classification as a fallback if cost becomes an issue.
- Market resolution lag — Polymarket markets can take days to resolve after the event. Wallet scoring should only count resolved markets; need to handle the unresolved state cleanly.
- Kalshi auth — the Kalshi public market data API is currently unauthenticated for reads; confirm this holds for the markets/prices endpoints we need.
- Ticker mapping accuracy — LLM mapping will have false positives (e.g. "Will NATO expand?" mapping to defense stocks broadly). Consider adding a confidence threshold from the mapping response and only including tickers with confidence >= 0.7.
- ARRAY column type — Use Column(JSON) for affected_tickers and affected_sectors instead of ARRAY(String). JSON works on both Supabase (Postgres) and SQLite local dev without any environment branching, and SQLAlchemy serializes/deserializes it transparently. No special handling needed in the API layer.
- Wallet discovery cold start — POLYMARKET_SEED_WALLETS starts empty. The discover_polymarket_wallets job will bootstrap from market trade history, but the first run will take longer. Consider seeding with known high-volume addresses from public Polymarket analytics sites.
Status
draft