System Overview — Spec
Purpose
Market Pulse is a personal stock prediction tracker that ingests price data and news for US and Korean markets, runs bilingual sentiment analysis, and produces ML-based directional predictions across three time horizons. The dashboard provides interactive exploration with bilingual UI and headline translation.
Operational Constraints
| Constraint | Rule |
|---|---|
| Timestamps | All stored as UTC — never KST or ET |
| KR tickers | Bare 6-digit KRX code only (005930, not 005930.KS) |
| Market field | "US" or "KR" on StockPrice and NewsArticle — the sole market discriminator |
| Language field | "en" or "ko" on NewsArticle — set by scraper, used for sentiment and translation |
| Database | Supabase PostgreSQL (AWS ap-northeast-1) in production; SQLite fallback when DATABASE_URL is unset (local dev only) |
| Finnhub | 60 requests/minute on free tier; all US tickers fetched in one 15-min cycle |
| StockTwits | Disabled — public API restricted/unreliable; job commented out in scheduler/jobs.py |
| Azure Translator | Free tier: 2M characters/month — batch translate to minimize calls |
| GOOGLE_API_KEY | Required for My Portfolio screenshot parsing (Gemini 2.5 Flash) |
| ANTHROPIC_API_KEY | Loaded but not currently wired to any feature |
| Feature vector | FEATURE_ORDER in predictor.py is order-critical — training and inference must match exactly |
| User sessions | 30-day TTL; stored in user_sessions table and as a browser cookie named mp_session |
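
Three of the constraints above (bare KRX codes, UTC-only timestamps, and the order-critical feature vector) can be illustrated with a minimal sketch. The helper names `normalize_kr_ticker`, `to_utc`, and `build_feature_vector` are hypothetical, and the feature names in this `FEATURE_ORDER` are placeholders; the real list lives in predictor.py and is not reproduced here.

```python
import re
from datetime import datetime, timedelta, timezone

# KST has no daylight saving time, so a fixed +09:00 offset is sufficient.
KST = timezone(timedelta(hours=9))
KRX_CODE = re.compile(r"[0-9]{6}")


def normalize_kr_ticker(raw: str) -> str:
    """Drop an exchange suffix such as '.KS' and validate the bare 6-digit code."""
    code = raw.strip().split(".", 1)[0]
    if not KRX_CODE.fullmatch(code):
        raise ValueError(f"not a 6-digit KRX code: {raw!r}")
    return code


def to_utc(ts: datetime, source_tz: timezone = KST) -> datetime:
    """Treat a naive timestamp as source_tz local time and convert it to UTC."""
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=source_tz)
    return ts.astimezone(timezone.utc)


# Placeholder feature names (assumption); the real order is defined in predictor.py.
FEATURE_ORDER = ["rsi_14", "sentiment_mean", "volume_change"]


def build_feature_vector(features: dict) -> list:
    """Emit values in FEATURE_ORDER so training and inference see identical columns."""
    missing = [name for name in FEATURE_ORDER if name not in features]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [features[name] for name in FEATURE_ORDER]
```

Building the vector from a single shared list, rather than from dictionary iteration order, is one way to keep training and inference aligned: a caller may pass features in any order and still get the same column layout.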
Sector Keywords
Six sectors tracked for trend analysis: AI, semiconductor, energy, cloud, EV_battery, manufacturing.
Both English and Korean keywords are defined in config.py:SECTOR_KEYWORDS.
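
As a rough illustration of how a bilingual keyword table could be used, the sketch below tags a headline with sectors based on its language field. The dictionary shape, the specific keywords, and the `match_sectors` helper are all assumptions for illustration; the actual structure of `SECTOR_KEYWORDS` in config.py may differ.

```python
# Illustrative shape and keywords only (assumption); see config.py for the real table.
SECTOR_KEYWORDS = {
    "AI": {"en": ["artificial intelligence", "machine learning"], "ko": ["인공지능"]},
    "semiconductor": {"en": ["semiconductor", "chip"], "ko": ["반도체"]},
    "energy": {"en": ["energy"], "ko": ["에너지"]},
    "cloud": {"en": ["cloud"], "ko": ["클라우드"]},
    "EV_battery": {"en": ["battery"], "ko": ["배터리"]},
    "manufacturing": {"en": ["manufacturing"], "ko": ["제조"]},
}


def match_sectors(headline: str, language: str) -> list:
    """Return every sector whose keywords for this language appear in the headline."""
    text = headline.lower()
    return [
        sector
        for sector, keywords in SECTOR_KEYWORDS.items()
        if any(keyword.lower() in text for keyword in keywords.get(language, []))
    ]


print(match_sectors("Samsung expands chip output", "en"))
print(match_sectors("삼성전자 반도체 투자 확대", "ko"))
```

Plain substring matching is the simplest option but can over-match in English (for example, "chip" inside an unrelated word), so a word-boundary check may be preferable there.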
Planned / Not Yet Built
- Polygon.io integration for US news backup
- Prediction accuracy tracking and backtesting against realized returns
- Monte Carlo simulation: run 10k price-path scenarios per ticker and surface only the highest win-rate moves, as a probability-distribution layer on top of the RF predictions
- Bootstrap resampling — resample historical returns (vs. assuming a distribution) for more realistic fat-tail scenario simulation
- Regime-switching simulation — simulate markets transitioning between bull/bear/sideways states; integrates with existing GARCH + K-means vol regime work
- Model retraining triggered by accuracy degradation rather than a fixed schedule
- Alerts or push notifications for high-confidence predictions
- financialdatasets.ai integration: see docs/plans/financialdatasets-integration.md
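
Since none of the simulation items are built yet, the following is only a sketch of what the bootstrap-resampling idea could look like: daily returns are drawn with replacement from history, compounded over a horizon, and summarized as a probability of an up move. The function names, the 10,000-path default, and the sample returns are hypothetical and not part of the existing codebase.

```python
import random


def bootstrap_paths(returns, horizon, n_paths, seed=None):
    """Simulate cumulative returns by resampling historical returns with replacement."""
    if not returns:
        raise ValueError("need at least one historical return")
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_paths):
        growth = 1.0
        # Each simulated day reuses an observed return, so fat tails in the
        # history carry over without assuming a parametric distribution.
        for r in rng.choices(returns, k=horizon):
            growth *= 1.0 + r
        outcomes.append(growth - 1.0)
    return outcomes


def prob_up(returns, horizon, n_paths=10_000, seed=None):
    """Fraction of simulated paths that end above the starting price."""
    outcomes = bootstrap_paths(returns, horizon, n_paths, seed)
    return sum(outcome > 0 for outcome in outcomes) / len(outcomes)


# Made-up daily returns, for illustration only.
history = [0.012, -0.008, 0.003, -0.021, 0.015, 0.001, -0.004]
print(prob_up(history, horizon=5, seed=42))
```

Sampling each day independently discards volatility clustering, which is one reason the regime-switching item on the list would be a natural complement.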