
Political Insider Trades Scraper — Plan

Motivation

High-profile political insiders, particularly those in or aligned with the current administration, have a documented history of trading stocks in sectors directly affected by their policy decisions. Tracking these trades should give us an additional signal that we expect to be largely uncorrelated with our traditional sentiment and momentum features. The goal is to ingest disclosed trades from a curated watchlist of politicians and associates and surface them in the dashboard. Feeding ticker overlap into the ML model as a feature is a possible follow-on once the data is stable.

Scope

In scope:

  • Scrape Capitol Trades (capitoltrades.com) for a configurable watchlist of politicians
  • Pull SEC EDGAR Form 4 filings for non-congressional insiders (e.g. Elon Musk as Tesla filer)
  • Store trades in a new PoliticalTrade DB model
  • Scheduler job polling every 6 hours
  • New API endpoint exposing recent trades, filterable by ticker
  • Dashboard tab "Political Trades": timeline of trades, ticker overlap with our universe, trader profile
  • Flag ticker overlap: if a tracked insider buys/sells a ticker we cover, annotate it in the dashboard

Out of scope:

  • OGE executive branch disclosures (complex PDF parsing, low signal-to-noise)
  • Trading analysis or ML feature integration (follow-on task once data is stable)
  • Real-time trade alerts (the STOCK Act allows a filing lag of up to 45 days, so the data is inherently delayed)
  • Tracking trades by people not on the configured watchlist

Watchlist

Defined in config.py as POLITICAL_WATCHLIST, a list of dicts with name, role, and the source identifiers capitol_trades_slug and edgar_cik (either may be None, depending on where the person files). Starting roster:

Name                   | Role                 | Source
Nancy Pelosi           | House (D-CA)         | Capitol Trades
Paul Pelosi            | Spouse of above      | Capitol Trades (filed under Nancy)
Mitch McConnell        | Senate (R-KY)        | Capitol Trades
Marjorie Taylor Greene | House (R-GA)         | Capitol Trades
Jim Jordan             | House (R-OH)         | Capitol Trades
Mike Johnson           | House Speaker (R-LA) | Capitol Trades
Tommy Tuberville       | Senate (R-AL)        | Capitol Trades
Dan Crenshaw           | House (R-TX)         | Capitol Trades
Scott Bessent          | Treasury Secretary   | Capitol Trades (if filed as member)
Elon Musk              | DOGE / Tesla insider | SEC EDGAR Form 4 (CIK: 0001494730)

Watchlist is intentionally bipartisan (Pelosi included) — the signal is political access, not party affiliation.

Approach

1. DB model — db/models.py

Add PoliticalTrade:

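# Base and utcnow are assumed to already exist in db/models.py.
from sqlalchemy import Column, DateTime, Float, Index, Integer, String, Text, UniqueConstraint

class PoliticalTrade(Base):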
    __tablename__ = "political_trades"

    id           = Column(Integer, primary_key=True, index=True)
    trader_name  = Column(String(100), nullable=False)
    trader_role  = Column(String(100))
    ticker       = Column(String(20), nullable=False)
    market       = Column(String(5), default="US")
    trade_type   = Column(String(10), nullable=False)   # "buy" | "sell"
    amount_low   = Column(Float)                         # lower bound of disclosed range (USD)
    amount_high  = Column(Float)                         # upper bound
    traded_at    = Column(DateTime, nullable=False)      # date of trade (UTC midnight)
    filed_at     = Column(DateTime)                      # date the disclosure was filed (STOCK Act report or Form 4)
    source       = Column(String(50), nullable=False)    # "capitol_trades" | "edgar_form4"
    source_url   = Column(Text)
    fetched_at   = Column(DateTime, default=utcnow)

    __table_args__ = (
        UniqueConstraint("trader_name", "ticker", "traded_at", "trade_type", name="uq_political_trade"),
        Index("ix_political_trades_ticker_traded", "ticker", "traded_at"),
        Index("ix_political_trades_trader", "trader_name"),
    )

2. Config — config.py

Add POLITICAL_WATCHLIST list of dicts. Each entry:

{"name": "Marjorie Taylor Greene", "role": "House (R-GA)", "capitol_trades_slug": "marjorie-taylor-greene", "edgar_cik": None}

3. Scraper — scraper/political_trades.py

Two fetching functions:

fetch_capitol_trades(slug: str) -> list[dict]

  • GET https://www.capitoltrades.com/politicians/{slug} with requests + BeautifulSoup
  • Parse the trades table: date, ticker, asset name, type (buy/sell), amount range
  • Return normalized list of trade dicts
  • Respect robots.txt; wait at least 2s between requests (same minimum as in Open Questions)
  • Handle pagination (default page size is 10; iterate until empty page)
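The amount-range parsing step could look roughly like the sketch below. The exact text Capitol Trades renders in its amount column is an assumption here (something like "1K–15K" or "$15,001 - $50,000"), and parse_amount_range is a hypothetical helper name, so this needs adjusting once we have looked at the real markup.

```python
import re
from typing import Optional, Tuple

# Multipliers for the K/M suffixes we assume may appear in the amount column.
_SUFFIX = {"": 1.0, "K": 1_000.0, "M": 1_000_000.0}
# An optional "$", a number with optional thousands separators, an optional K/M suffix.
_NUMBER = re.compile(r"\$?\s*([\d,]*\.?\d+)\s*([KkMm]?)")


def parse_amount_range(text: str) -> Tuple[Optional[float], Optional[float]]:
    """Turn a disclosed range such as '1K–15K' into (amount_low, amount_high) in USD."""
    values = [
        float(digits.replace(",", "")) * _SUFFIX[suffix.upper()]
        for digits, suffix in _NUMBER.findall(text)
    ]
    if not values:
        return None, None        # nothing numeric, e.g. "N/A"
    if len(values) == 1:
        return values[0], None   # open-ended range: only one bound disclosed
    return min(values), max(values)
```

Returning None for a missing bound keeps the result compatible with the nullable amount_low and amount_high columns.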

fetch_edgar_form4(cik: str) -> list[dict]

  • GET https://efts.sec.gov/LATEST/search-index?q=%22{cik}%22&dateRange=custom&startdt={lookback}&forms=4
  • Parse XML filing for transaction date, ticker, shares, price, buy/sell indicator
  • Return normalized list
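A rough sketch of the Form 4 parsing step, using only the standard library. The element names follow the Form 4 ownership XML layout as we understand it and should be checked against a real filing; the sample document and its values are made up for illustration. Treating transaction code P as a buy and S as a sell, and storing shares × price in both amount bounds, are assumptions of this sketch rather than decisions made in this plan.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Assumed mapping of Form 4 transaction codes: P = purchase, S = sale.
_CODE_TO_TYPE = {"P": "buy", "S": "sell"}

# Made-up sample in the assumed Form 4 layout (not a real filing).
SAMPLE = """
<ownershipDocument>
  <issuer><issuerTradingSymbol>TSLA</issuerTradingSymbol></issuer>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <transactionDate><value>2024-01-02</value></transactionDate>
      <transactionCoding><transactionCode>S</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>100</value></transactionShares>
        <transactionPricePerShare><value>250.5</value></transactionPricePerShare>
      </transactionAmounts>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
</ownershipDocument>
"""


def parse_form4(xml_text: str, trader_name: str) -> list:
    """Extract non-derivative buy/sell transactions from one Form 4 XML document."""
    root = ET.fromstring(xml_text.strip())
    ticker = (root.findtext("issuer/issuerTradingSymbol") or "").strip().upper()
    trades = []
    for tx in root.iterfind("nonDerivativeTable/nonDerivativeTransaction"):
        code = (tx.findtext("transactionCoding/transactionCode") or "").strip()
        date_text = (tx.findtext("transactionDate/value") or "").strip()
        if code not in _CODE_TO_TYPE or not date_text:
            continue  # skip grants, gifts, and other non-trade transactions
        shares = float(tx.findtext("transactionAmounts/transactionShares/value") or 0)
        price = float(tx.findtext("transactionAmounts/transactionPricePerShare/value") or 0)
        traded_at = datetime.strptime(date_text[:10], "%Y-%m-%d").replace(tzinfo=timezone.utc)
        trades.append({
            "trader_name": trader_name,
            "ticker": ticker,
            "trade_type": _CODE_TO_TYPE[code],
            "amount_low": shares * price,   # Form 4 gives exact figures, so both
            "amount_high": shares * price,  # bounds hold the same value here
            "traded_at": traded_at,
            "source": "edgar_form4",
        })
    return trades
```

Calling parse_form4(SAMPLE, "Elon Musk") yields a single sell of TSLA dated 2024-01-02.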

upsert_political_trades(trades: list[dict], db: Session)

  • Insert with INSERT ... ON CONFLICT DO NOTHING using the unique constraint
  • Log count of new rows inserted
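To illustrate the insert-or-skip behaviour, here is a minimal sketch that uses the standard library's sqlite3 module as a stand-in for the SQLAlchemy Session. The trimmed-down table and the connection argument are assumptions made so the example is self-contained. ON CONFLICT DO NOTHING needs SQLite 3.24 or newer.

```python
import logging
import sqlite3

logger = logging.getLogger(__name__)

# Reduced version of the political_trades table, for illustration only.
DDL = """
CREATE TABLE IF NOT EXISTS political_trades (
    id          INTEGER PRIMARY KEY,
    trader_name TEXT NOT NULL,
    ticker      TEXT NOT NULL,
    trade_type  TEXT NOT NULL,
    traded_at   TEXT NOT NULL,
    source      TEXT NOT NULL,
    UNIQUE (trader_name, ticker, traded_at, trade_type)
)
"""

INSERT = """
INSERT INTO political_trades (trader_name, ticker, trade_type, traded_at, source)
VALUES (:trader_name, :ticker, :trade_type, :traded_at, :source)
ON CONFLICT (trader_name, ticker, traded_at, trade_type) DO NOTHING
"""


def upsert_political_trades(trades: list, conn: sqlite3.Connection) -> int:
    """Insert trades, silently skipping rows that hit the unique constraint."""
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany(INSERT, trades)
    inserted = conn.total_changes - before  # skipped rows are not counted
    logger.info("political_trades: %d new rows", inserted)
    return inserted
```

Running the same batch twice inserts the rows once and reports zero new rows the second time. With SQLAlchemy, the dialect-specific insert() constructs expose on_conflict_do_nothing() for the same purpose.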

4. Scheduler job — scheduler/jobs.py

Add job_fetch_political_trades():

  • Loops over POLITICAL_WATCHLIST
  • Calls appropriate fetch function per source
  • Calls upsert_political_trades()
  • Interval: every 6 hours (disclosures may legally lag the trade by up to 45 days, so polling doesn't need to be aggressive; 6h is fine)

Register in scheduler/jobs.py with job ID fetch_political_trades.
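One way the job body could be structured is sketched below. The fetchers and the upsert are passed in as arguments purely so the example is self-contained and testable. The real job would presumably import them and take no arguments, and the per-trader error handling is an assumption of this sketch.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)
Trade = Dict[str, object]


def job_fetch_political_trades(
    watchlist: List[dict],
    fetch_capitol_trades: Callable[[str], List[Trade]],
    fetch_edgar_form4: Callable[[str], List[Trade]],
    upsert: Callable[[List[Trade]], int],
) -> int:
    """Fetch every watchlist entry from its configured source(s); return new row count."""
    total_new = 0
    for entry in watchlist:
        trades: List[Trade] = []
        try:
            # An entry may have a Capitol Trades slug, an EDGAR CIK, or both.
            if entry.get("capitol_trades_slug"):
                trades += fetch_capitol_trades(entry["capitol_trades_slug"])
            if entry.get("edgar_cik"):
                trades += fetch_edgar_form4(entry["edgar_cik"])
        except Exception:
            # One failing trader should not abort the whole run.
            logger.exception("fetch failed for %s", entry.get("name"))
            continue
        total_new += upsert(trades)
    return total_new
```

Choosing the source from whichever identifier is set keeps the dispatch driven by config, so adding another Form 4 filer only requires a new watchlist entry.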

5. API endpoint — api/routes/political_trades.py

GET /api/political-trades
  ?ticker=AAPL           # filter by ticker (optional)
  ?trader=...            # filter by trader name (optional)
  ?days=30               # lookback window (default 30)
  ?limit=100             # maximum number of rows returned

Response shape:

[{
  "trader_name": "Marjorie Taylor Greene",
  "trader_role": "House (R-GA)",
  "ticker": "NVDA",
  "trade_type": "buy",
  "amount_low": 1000,
  "amount_high": 15000,
  "traded_at": "2026-04-10T00:00:00Z",
  "filed_at": "2026-05-01T00:00:00Z",
  "source_url": "https://..."
}]

Mount under api/main.py.
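The filtering semantics of the endpoint can be sketched framework-free as below. filter_trades and its date_field parameter are hypothetical; in the real route the same conditions would become a DB query. Applying the lookback to traded_at is an assumption. Since filings can arrive up to 45 days after the trade, a 30-day window on traded_at may hide freshly disclosed trades, so filtering on filed_at may be worth considering.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def filter_trades(
    trades: List[dict],
    ticker: Optional[str] = None,
    trader: Optional[str] = None,
    days: int = 30,
    limit: int = 100,
    now: Optional[datetime] = None,
    date_field: str = "traded_at",  # hypothetical switch: "traded_at" or "filed_at"
) -> List[dict]:
    """Apply the ?ticker, ?trader, ?days and ?limit parameters to a list of trades."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    matches = [
        t for t in trades
        if t[date_field] >= cutoff
        and (ticker is None or t["ticker"].upper() == ticker.upper())
        and (trader is None or t["trader_name"].lower() == trader.lower())
    ]
    # Newest first, matching the dashboard's trade feed ordering.
    matches.sort(key=lambda t: t[date_field], reverse=True)
    return matches[:limit]
```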

6. Dashboard tab — frontend/src/

New page: Political Trades

  • Trade feed: sorted by traded_at desc, grouped by date
  • Columns: Trader, Role, Ticker, Type (buy/sell with color), Amount Range, Trade Date, Filed Date
  • Filter bar: trader dropdown, ticker search, date range
  • "Our Universe" badge: highlight rows where ticker is in our tracked stock universe
  • Trader profile cards: photo (from Capitol Trades), total buy/sell count in last 90 days
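The "Our Universe" badge and the profile-card counts reduce to two small computations, sketched here in Python on the assumption that they could be done server-side before the data reaches the frontend. The in_universe field and both function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, Iterable, List


def annotate_universe(trades: List[dict], universe: Iterable[str]) -> List[dict]:
    """Return copies of the trades with an in_universe flag for the badge."""
    tickers = {symbol.upper() for symbol in universe}
    return [{**t, "in_universe": t["ticker"].upper() in tickers} for t in trades]


def profile_counts(
    trades: List[dict], trader_name: str, now: datetime, days: int = 90
) -> Dict[str, int]:
    """Count one trader's buys and sells within the last `days` days."""
    cutoff = now - timedelta(days=days)
    counts = Counter(
        t["trade_type"]
        for t in trades
        if t["trader_name"] == trader_name and t["traded_at"] >= cutoff
    )
    return {"buy": counts["buy"], "sell": counts["sell"]}
```

Tickers outside the universe are kept and simply flagged False, so foreign ETFs and ADRs still appear in the feed.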

7. Ticker overlap annotation (stretch)

In the existing dashboard's stock detail view, add a "Political Activity" section showing any trades in the last 90 days for that ticker. Pulled from the same API endpoint with ?ticker=SYMBOL&days=90 (the default lookback is only 30 days).

Open Questions

  1. Capitol Trades scraping stability: the site has no public API. If they block scrapers, fall back to the official disclosure sites: the House Clerk (disclosures-clerk.house.gov) publishes annual XML indexes of filings, and Senate filings are searchable at efdsearch.senate.gov. Neither is real-time.
  2. Amount ranges — STOCK Act only requires disclosure of a range (e.g. $1K–$15K, $15K–$50K). We store both bounds and display as range in UI — no point trying to infer a midpoint.
  3. Non-US tickers — some politicians hold foreign ETFs or ADRs. Store ticker as-is; only the "Our Universe" badge logic needs to check against ALL_US_TICKERS.
  4. Elon Musk / EDGAR complexity — Form 4s are filed per company. Musk files for TSLA; we may want to expand to other Form 4 filers over time. Keep edgar_cik in config to make this easy.
  5. Rate limiting — Capitol Trades is a small site. Keep inter-request delay at 2s minimum and cache pages locally for 1 hour to avoid hammering.
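The delay and caching policy from item 5 could be wrapped around whatever function performs the HTTP GET, roughly as sketched below. PoliteFetcher is a hypothetical name, and the in-memory dict stands in for whatever local cache we end up using. The clock and sleep functions are injectable only so the behaviour can be tested without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class PoliteFetcher:
    """Wrap a fetch function with a minimum inter-request delay and a TTL cache."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        min_delay: float = 2.0,   # seconds between real requests
        ttl: float = 3600.0,      # cache lifetime: 1 hour
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._min_delay = min_delay
        self._ttl = ttl
        self._clock = clock
        self._sleep = sleep
        self._cache: Dict[str, Tuple[float, str]] = {}
        self._last_request: Optional[float] = None

    def get(self, url: str) -> str:
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh cache hit: no network traffic at all
        if self._last_request is not None:
            wait = self._min_delay - (now - self._last_request)
            if wait > 0:
                self._sleep(wait)  # enforce the gap since the previous request
        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = (self._last_request, body)
        return body
```

A scraper would construct one instance per site, for example PoliteFetcher(lambda url: requests.get(url, timeout=30).text), and route every page request through get().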

Status

draft