Sentiment Analysis: Spec

Overview

Bilingual (English and Korean) sentiment scoring for news articles and social posts. Both languages share a single score range.

Score Range

Range: -1.0 (very negative) to +1.0 (very positive)
Labels:
positive: score ≥ 0.15
negative: score ≤ -0.15
neutral: otherwise
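
The thresholds above can be expressed as a small mapping function. This is a minimal sketch: the function name `label_for_score` and the constant names are hypothetical, and rejecting out-of-range scores is an assumption rather than documented behavior.

```python
# Thresholds taken from the label definitions above.
POSITIVE_THRESHOLD = 0.15
NEGATIVE_THRESHOLD = -0.15


def label_for_score(score: float) -> str:
    """Map a sentiment score in [-1.0, 1.0] to a label.

    Hypothetical helper; raising on out-of-range input is an assumption.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score!r}")
    if score >= POSITIVE_THRESHOLD:
        return "positive"
    if score <= NEGATIVE_THRESHOLD:
        return "negative"
    return "neutral"
```

Both thresholds are inclusive, so a score of exactly 0.15 is labeled positive and exactly -0.15 is labeled negative.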

English Scoring

Primary model: ProsusAI/finbert (FinBERT)
Fallback: VADER + finance lexicon
model_used value: "en_finbert" (primary) or "vader_finance" (fallback)
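
One way to wire a primary model to its fallback is sketched below. The spec does not say what triggers the fallback, so this sketch assumes it is any exception raised by the primary scorer (for example, the model failing to load). The function `score_with_fallback` and the injected scorer callables are hypothetical, as is returning a score of 0.0 alongside "none" for empty text.

```python
from typing import Callable, Tuple

# A scorer takes text and returns a score in [-1.0, 1.0].
Scorer = Callable[[str], float]


def score_with_fallback(
    text: str,
    primary: Scorer,
    fallback: Scorer,
    primary_name: str,
    fallback_name: str,
) -> Tuple[float, str]:
    """Return (score, model_used), trying the primary scorer first."""
    if not text.strip():
        # Empty text is recorded as "none"; the 0.0 score is an assumption.
        return 0.0, "none"
    try:
        return primary(text), primary_name
    except Exception:
        # Assumption: any primary failure routes to the fallback scorer.
        return fallback(text), fallback_name
```

For English, the call would pass `"en_finbert"` and `"vader_finance"` as the two names, with scorers that wrap FinBERT and VADER respectively.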

Korean Scoring

Primary model: snunlp/KR-FinBert-SC
Fallback: keyword lexicon
model_used value: "kr_finbert" (primary) or "ko_lexicon" (fallback)
Confidence is evidence-damped: abs(score) × (count / (count + 5)), where count is the number of keyword matches. A low match count yields low confidence even when every match is one-sided.
model_used is "none" for empty or unscored text
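
The damping formula can be checked with a few values. The sketch below implements it directly; the function name `damped_confidence` and the `damping` parameter name are hypothetical, and rejecting negative counts is an assumption.

```python
def damped_confidence(score: float, count: int, damping: int = 5) -> float:
    """Compute abs(score) * (count / (count + damping)).

    `count` is the number of keyword matches. The default damping of 5
    comes from the formula above; the parameter name is illustrative.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return abs(score) * (count / (count + damping))
```

With a fully one-sided score of 1.0, a single match gives a confidence of 1/6 (about 0.17), five matches give 0.5, and twenty matches give 0.8. The confidence approaches abs(score) only as the match count grows large relative to 5.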

Scoring Schedule

Batch scoring runs after every news fetch job (fetch_us_news, fetch_kr_news)
Also runs inside run_predictions before training as a safety net
Because scoring is batched, articles may remain unscored for up to 15–30 minutes after ingestion

Constraints

Articles without a SentimentScore row are left unscored until score_unscored_articles() runs
Features with no sentiment data default to 0.0 (neutral) in build_features()
model_used: "none" indicates empty text — not a scoring failure
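
The neutral default for missing sentiment data might look like the sketch below. The spec only states that build_features() defaults to 0.0. The helper name `sentiment_feature` and the choice of a plain mean as the aggregation are assumptions made for illustration.

```python
from statistics import fmean
from typing import Sequence

NEUTRAL_DEFAULT = 0.0  # documented default when no sentiment data exists


def sentiment_feature(scores: Sequence[float]) -> float:
    """Aggregate article scores into one feature value.

    Hypothetical helper: the mean is an assumed aggregation. An empty
    input returns the neutral default instead of raising.
    """
    if not scores:
        return NEUTRAL_DEFAULT
    return fmean(scores)
```

Calling `sentiment_feature([])` returns 0.0, the same value a set of genuinely neutral articles would produce.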