Political Insider Trades Scraper — Plan
Motivation
High-profile political insiders, particularly those in or aligned with the current administration, have a documented history of trading stocks in sectors directly affected by their policy decisions. Tracking these trades could give us an additional signal that we expect to be largely uncorrelated with traditional sentiment or momentum features. The goal is to ingest disclosed trades from a curated watchlist of politicians and associates, surface them in the dashboard, and, as a follow-on task once the data is stable, feed ticker overlap into the ML model as a feature.
Scope
In scope:
- Scrape Capitol Trades (capitoltrades.com) for a configurable watchlist of politicians
- Pull SEC EDGAR Form 4 filings for non-congressional insiders (e.g. Elon Musk as Tesla filer)
- Store trades in a new PoliticalTrade DB model
- Scheduler job polling every 6 hours
- New API endpoint exposing recent trades, filterable by ticker
- Dashboard tab: "Political Trades" — timeline of trades, ticker overlap with our universe, trader profile
- Flag ticker overlap: if a tracked insider buys/sells a ticker we cover, annotate it in the dashboard
Out of scope:
- OGE executive branch disclosures (complex PDF parsing, low signal-to-noise)
- Trading analysis or ML feature integration (follow-on task once data is stable)
- Real-time trade alerts (the STOCK Act allows a filing lag of up to 45 days, so the data is inherently delayed)
- Tracking trades by people not on the configured watchlist
Watchlist
Defined in config.py as POLITICAL_WATCHLIST: a list of dicts with name, role, capitol_trades_slug, and optional edgar_cik. Starting roster:
| Name | Role | Source |
|---|---|---|
| Nancy Pelosi | House (D-CA) | Capitol Trades |
| Paul Pelosi | Spouse of above | Capitol Trades (filed under Nancy) |
| Mitch McConnell | Senate (R-KY) | Capitol Trades |
| Marjorie Taylor Greene | House (R-GA) | Capitol Trades |
| Jim Jordan | House (R-OH) | Capitol Trades |
| Mike Johnson | House Speaker (R-LA) | Capitol Trades |
| Tommy Tuberville | Senate (R-AL) | Capitol Trades |
| Dan Crenshaw | House (R-TX) | Capitol Trades |
| Scott Bessent | Treasury Secretary | Capitol Trades (if filed as member) |
| Elon Musk | DOGE / Tesla insider | SEC EDGAR Form 4 (CIK: 0001494730) |
Watchlist is intentionally bipartisan (Pelosi included) — the signal is political access, not party affiliation.
Approach
1. DB model — db/models.py
Add PoliticalTrade:
```python
from sqlalchemy import (
    Column, DateTime, Float, Index, Integer, String, Text, UniqueConstraint,
)

# Base and utcnow are assumed to already exist in db/models.py.


class PoliticalTrade(Base):
    __tablename__ = "political_trades"

    id = Column(Integer, primary_key=True, index=True)
    trader_name = Column(String(100), nullable=False)
    trader_role = Column(String(100))
    ticker = Column(String(20), nullable=False)
    market = Column(String(5), default="US")
    trade_type = Column(String(10), nullable=False)  # "buy" | "sell"
    amount_low = Column(Float)   # lower bound of disclosed range (USD)
    amount_high = Column(Float)  # upper bound of disclosed range (USD)
    traded_at = Column(DateTime, nullable=False)  # date of trade (UTC midnight)
    filed_at = Column(DateTime)  # date the disclosure was filed (STOCK Act report or Form 4)
    source = Column(String(50), nullable=False)  # "capitol_trades" | "edgar_form4"
    source_url = Column(Text)
    fetched_at = Column(DateTime, default=utcnow)

    __table_args__ = (
        UniqueConstraint("trader_name", "ticker", "traded_at", "trade_type", name="uq_political_trade"),
        Index("ix_political_trades_ticker_traded", "ticker", "traded_at"),
        Index("ix_political_trades_trader", "trader_name"),
    )
```
2. Config — config.py
Add POLITICAL_WATCHLIST list of dicts. Each entry:
```python
{
    "name": "Marjorie Taylor Greene",
    "role": "House (R-GA)",
    "capitol_trades_slug": "marjorie-taylor-greene",
    "edgar_cik": None,
}
```
3. Scraper — scraper/political_trades.py
Two fetching functions:
fetch_capitol_trades(slug: str) -> list[dict]
- GET https://www.capitoltrades.com/politicians/{slug} with requests + BeautifulSoup
- Parse the trades table: date, ticker, asset name, type (buy/sell), amount range
- Return normalized list of trade dicts
- Respect robots.txt; add 1–2s delay between requests
- Handle pagination (default page size is 10; iterate until empty page)
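The pagination and amount-range steps above could be sketched roughly as follows. This is a minimal, standard-library-only sketch: `fetch_all_pages`, `parse_amount_range`, and the injected `fetch_page` callable are hypothetical names, and the amount text formats (`1K–15K`, `$15,001 - $50,000`) are assumptions about how the site renders ranges, not something verified against the live page. The real HTML fetch and BeautifulSoup parsing would live inside `fetch_page`.

```python
import re
import time
from typing import Callable, Optional

_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000}
# One amount such as "$15,001", "1K" or "1.5M" (assumed formats).
_AMOUNT_RE = re.compile(r"\$?\s*(\d[\d,]*(?:\.\d+)?)\s*([KM]?)", re.IGNORECASE)


def parse_amount_range(text: str) -> tuple[Optional[float], Optional[float]]:
    """Turn a disclosed range into (low, high) in USD, or (None, None) if unparseable."""
    # Split on a hyphen or an en dash (U+2013).
    parts = re.split(r"\s*[\u2013-]\s*", text.strip())
    if len(parts) != 2:
        return (None, None)
    values = []
    for part in parts:
        match = _AMOUNT_RE.fullmatch(part)
        if match is None:
            return (None, None)
        number = float(match.group(1).replace(",", ""))
        values.append(number * _MULTIPLIERS[match.group(2).upper()])
    return (values[0], values[1])


def fetch_all_pages(
    fetch_page: Callable[[int], list[dict]],
    delay_s: float = 2.0,
    max_pages: int = 200,
) -> list[dict]:
    """Call fetch_page(1), fetch_page(2), ... until a page comes back empty."""
    trades: list[dict] = []
    for page in range(1, max_pages + 1):
        rows = fetch_page(page)
        if not rows:
            break
        trades.extend(rows)
        time.sleep(delay_s)  # be polite between page requests
    return trades
```

The `max_pages` cap is a safety net (an assumption, not part of the plan) so that a parsing bug that never yields an empty page cannot loop forever.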
fetch_edgar_form4(cik: str) -> list[dict]
- GET https://efts.sec.gov/LATEST/search-index?q=%22{cik}%22&dateRange=custom&startdt={lookback}&forms=4
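- For each matching filing, fetch the Form 4 XML document and parse it for transaction date, ticker, shares, price, and buy/sell indicator
- Send a descriptive User-Agent header and stay within the SEC's fair-access rate limit (10 requests per second)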
- Return normalized list
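As an illustration of the Form 4 parsing step, here is a hedged sketch using only `xml.etree.ElementTree`. The element names follow the SEC ownership-document XML layout as we understand it and should be checked against a real filing; `parse_form4` and `SAMPLE_FORM4` are hypothetical, and the sample values are synthetic. Two assumptions are baked in: only open-market transaction codes `P` (purchase) and `S` (sale) are kept, and because Form 4 reports exact amounts, `amount_low` and `amount_high` are both set to shares times price.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Optional

# Synthetic, heavily trimmed example of a Form 4 ownership document.
SAMPLE_FORM4 = """<ownershipDocument>
  <issuer>
    <issuerTradingSymbol>TSLA</issuerTradingSymbol>
  </issuer>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <transactionDate><value>2026-04-10</value></transactionDate>
      <transactionCoding><transactionCode>S</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>1000</value></transactionShares>
        <transactionPricePerShare><value>250.50</value></transactionPricePerShare>
      </transactionAmounts>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
</ownershipDocument>
"""

# Assumption: only open-market purchases and sales are of interest.
_CODE_TO_TYPE = {"P": "buy", "S": "sell"}


def parse_form4(xml_text: str, trader_name: str, source_url: Optional[str] = None) -> list[dict]:
    """Extract non-derivative buy/sell transactions as normalized trade dicts."""
    root = ET.fromstring(xml_text)
    ticker = (root.findtext("issuer/issuerTradingSymbol") or "").strip().upper()
    trades = []
    for txn in root.iterfind("nonDerivativeTable/nonDerivativeTransaction"):
        code = (txn.findtext("transactionCoding/transactionCode") or "").strip()
        trade_type = _CODE_TO_TYPE.get(code)
        date_text = (txn.findtext("transactionDate/value") or "").strip()
        if trade_type is None or not date_text:
            continue  # skip grants, gifts, option exercises, malformed rows
        shares = float(txn.findtext("transactionAmounts/transactionShares/value") or 0)
        price = float(txn.findtext("transactionAmounts/transactionPricePerShare/value") or 0)
        value = shares * price
        trades.append({
            "trader_name": trader_name,
            "ticker": ticker,
            "trade_type": trade_type,
            "amount_low": value,
            "amount_high": value,
            "traded_at": datetime.strptime(date_text[:10], "%Y-%m-%d").replace(tzinfo=timezone.utc),
            "source": "edgar_form4",
            "source_url": source_url,
        })
    return trades
```

Calling `parse_form4(SAMPLE_FORM4, "Elon Musk")` on the synthetic document yields a single `sell` row for `TSLA`.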
upsert_political_trades(trades: list[dict], db: Session)
- Insert with INSERT ... ON CONFLICT DO NOTHING using the unique constraint
- Log count of new rows inserted
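To make the conflict-skipping insert concrete, the sketch below uses the standard-library `sqlite3` module as a stand-in for the real SQLAlchemy `Session`, with a trimmed-down table. That substitution is an assumption for illustration only: the production version would go through the project's own engine and the full `PoliticalTrade` model. SQLite needs version 3.24 or later for the `ON CONFLICT ... DO NOTHING` clause.

```python
import sqlite3

# Trimmed-down table with the same unique constraint as the planned model.
SCHEMA = """
CREATE TABLE political_trades (
    id INTEGER PRIMARY KEY,
    trader_name TEXT NOT NULL,
    ticker TEXT NOT NULL,
    trade_type TEXT NOT NULL,
    traded_at TEXT NOT NULL,
    source TEXT NOT NULL,
    CONSTRAINT uq_political_trade UNIQUE (trader_name, ticker, traded_at, trade_type)
)
"""

INSERT_SQL = """
INSERT INTO political_trades (trader_name, ticker, trade_type, traded_at, source)
VALUES (:trader_name, :ticker, :trade_type, :traded_at, :source)
ON CONFLICT (trader_name, ticker, traded_at, trade_type) DO NOTHING
"""


def upsert_political_trades(trades: list[dict], conn: sqlite3.Connection) -> int:
    """Insert trades, silently skipping duplicates; return the number of new rows."""
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany(INSERT_SQL, trades)
    inserted = conn.total_changes - before  # skipped rows are not counted
    print(f"political_trades: inserted {inserted} new rows")
    return inserted
```

Because rows that hit the unique constraint do not count as changes, running the same batch twice reports zero new rows on the second run, which is what makes the 6-hour re-scrape safe to repeat.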
4. Scheduler job — scheduler/jobs.py
Add job_fetch_political_trades():
- Loops over POLITICAL_WATCHLIST
- Calls appropriate fetch function per source
- Calls upsert_political_trades()
- Interval: every 6 hours (STOCK Act disclosures can trail the trade by up to 45 days, so a more aggressive polling cadence would add little)
Register in scheduler/jobs.py with job ID fetch_political_trades.
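One possible shape for the per-source dispatch inside the job is sketched below. `collect_watchlist_trades` is a hypothetical helper, and the two fetchers are passed in as parameters so the loop can be exercised without network access. Catching exceptions per watchlist entry is a design assumption (one blocked page should not abort the whole run), not something the plan mandates.

```python
from typing import Callable

Fetcher = Callable[[str], list[dict]]


def collect_watchlist_trades(
    watchlist: list[dict],
    fetch_capitol_trades: Fetcher,
    fetch_edgar_form4: Fetcher,
    log: Callable[[str], None] = print,
) -> list[dict]:
    """Fetch trades for every watchlist entry from whichever sources it configures."""
    trades: list[dict] = []
    for entry in watchlist:
        sources = []
        if entry.get("capitol_trades_slug"):
            sources.append((fetch_capitol_trades, entry["capitol_trades_slug"]))
        if entry.get("edgar_cik"):
            sources.append((fetch_edgar_form4, entry["edgar_cik"]))
        for fetch, key in sources:
            try:
                rows = fetch(key)
            except Exception as exc:  # keep going if one source fails
                log(f"fetch failed for {entry['name']}: {exc}")
                continue
            for row in rows:
                # Values already present on the row win over watchlist defaults.
                trades.append({
                    "trader_name": entry["name"],
                    "trader_role": entry.get("role"),
                    **row,
                })
    return trades
```

The job body would then reduce to calling this helper with `POLITICAL_WATCHLIST` and handing the result to `upsert_political_trades()`.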
5. API endpoint — api/routes/political_trades.py
```
GET /api/political-trades
    ?ticker=AAPL    # filter by ticker (optional)
    ?trader=...     # filter by trader name (optional)
    ?days=30        # lookback window (default 30)
    ?limit=100
```
Response shape:
```json
[{
  "trader_name": "Marjorie Taylor Greene",
  "trader_role": "House (R-GA)",
  "ticker": "NVDA",
  "trade_type": "buy",
  "amount_low": 1000,
  "amount_high": 15000,
  "traded_at": "2026-04-10T00:00:00Z",
  "filed_at": "2026-05-01T00:00:00Z",
  "source_url": "https://..."
}]
```
Mount under api/main.py.
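The query-parameter semantics can be pinned down with a framework-free sketch like the one below. `filter_trades` is a hypothetical pure function operating on already-loaded rows; the real route would express the same filters as a DB query. Three points are assumptions, since the plan does not settle them: the lookback applies to `traded_at`, `trader` is a case-insensitive substring match, and results are ordered newest first.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def filter_trades(
    trades: list[dict],
    ticker: Optional[str] = None,
    trader: Optional[str] = None,
    days: int = 30,
    limit: int = 100,
    now: Optional[datetime] = None,
) -> list[dict]:
    """Apply the ticker / trader / days / limit filters to in-memory trade rows."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    selected = [
        t for t in trades
        if t["traded_at"] >= cutoff
        and (ticker is None or t["ticker"].upper() == ticker.upper())
        and (trader is None or trader.lower() in t["trader_name"].lower())
    ]
    selected.sort(key=lambda t: t["traded_at"], reverse=True)  # newest first
    return selected[:limit]
```

One thing this makes visible: with a filing lag of up to 45 days, a 30-day lookback on `traded_at` can hide trades that were only just disclosed, so applying the window to `filed_at` (or widening the default) may be worth considering.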
6. Dashboard tab — frontend/src/
New page: Political Trades
- Trade feed: sorted by traded_at desc, grouped by date
- Columns: Trader, Role, Ticker, Type (buy/sell with color), Amount Range, Trade Date, Filed Date
- Filter bar: trader dropdown, ticker search, date range
- "Our Universe" badge: highlight rows where ticker is in our tracked stock universe
- Trader profile cards: photo (from Capitol Trades), total buy/sell count in last 90 days
7. Ticker overlap annotation (stretch)
In the existing dashboard's stock detail view, add a "Political Activity" section showing any trades in the last 90 days for that ticker. Pulled from the same API endpoint with ?ticker=SYMBOL.
Open Questions
- Capitol Trades scraping stability — the site has no public API. If they block scrapers, fall back to the official House and Senate financial disclosure sites (the House Clerk publishes an annual bulk index in XML; not real-time).
- Amount ranges — STOCK Act only requires disclosure of a range (e.g. $1K–$15K, $15K–$50K). We store both bounds and display as range in UI — no point trying to infer a midpoint.
- Non-US tickers — some politicians hold foreign ETFs or ADRs. Store ticker as-is; only the "Our Universe" badge logic needs to check against ALL_US_TICKERS.
- Elon Musk / EDGAR complexity — Form 4s are filed per company. Musk files for TSLA; we may want to expand to other Form 4 filers over time. Keep edgar_cik in config to make this easy.
- Rate limiting — Capitol Trades is a small site. Keep inter-request delay at 2s minimum and cache pages locally for 1 hour to avoid hammering.
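For the rate-limiting point, a small wrapper along these lines could enforce both the minimum delay and the one-hour cache. `PoliteFetcher` is a hypothetical name, and the cache here is an in-memory dict for brevity, which is a simplification of "cache pages locally" (an on-disk cache would survive restarts). The clock and sleep functions are injectable so the behaviour can be tested without waiting.

```python
import time
from typing import Callable, Optional


class PoliteFetcher:
    """Wrap a fetch function with a minimum inter-request delay and a TTL cache."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        min_delay_s: float = 2.0,
        ttl_s: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._min_delay_s = min_delay_s
        self._ttl_s = ttl_s
        self._clock = clock
        self._sleep = sleep
        self._cache: dict[str, tuple[float, str]] = {}  # url -> (fetched_at, body)
        self._last_request: Optional[float] = None

    def get(self, url: str) -> str:
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < self._ttl_s:
            return cached[1]  # fresh enough: no network request at all
        if self._last_request is not None:
            wait = self._min_delay_s - (now - self._last_request)
            if wait > 0:
                self._sleep(wait)  # space out consecutive requests
        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = (self._last_request, body)
        return body
```

In the scraper, `fetch` would be a thin function around the HTTP GET, so pagination and retries all pass through the same throttle.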
Status
draft