Plan: Swap Translation Backend to Azure Translator

Context

The headline translation feature is already built and working (scraper/translator.py) and is currently wired to DeepL. This plan covers swapping the backend to Azure Translator (the service behind Bing Translator) once an API key is available.

Why Azure over DeepL:

  • 2M characters/month free, forever (not a one-time credit)
  • Same excellent EN↔KO quality
  • Credit card is needed for identity verification only; it is not charged on the F0 tier

Setup (one-time)

  1. Go to portal.azure.com — sign in with a Microsoft account
  2. Create a new resource → search "Translator"
  3. Select pricing tier F0 (Free) — 2M chars/month
  4. Once created, go to Keys and Endpoint → copy Key 1 and the Region (e.g. eastus)
  5. Add to .env:
       AZURE_TRANSLATOR_KEY=your_key_here
       AZURE_TRANSLATOR_REGION=eastus
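
Before touching the application code, it can help to confirm that the key and region work on their own. The sketch below only builds the request; the helper name build_smoke_test_request and the sample text are illustrative assumptions, not part of the existing codebase. The endpoint, headers, and api-version match the ones used in scraper/translator.py further down.

```python
AZURE_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_smoke_test_request(key: str, region: str,
                             text: str = "Hello", to_lang: str = "ko") -> dict:
    """Return keyword arguments for requests.post() that translate one string.

    Hypothetical helper for a one-off check; nothing here touches the network.
    """
    return {
        "url": AZURE_ENDPOINT,
        "params": {"api-version": "3.0", "to": to_lang},
        "headers": {
            "Ocp-Apim-Subscription-Key": key,
            "Ocp-Apim-Subscription-Region": region,
            "Content-Type": "application/json",
        },
        # The API expects a JSON array with one object per string to translate.
        "json": [{"text": text}],
        "timeout": 10,
    }
```

Passing the result to requests.post(**build_smoke_test_request(key, region)) should give a 200 response whose JSON is a list with one entry per input string. A 401 usually points to a wrong key or a region that does not match the resource.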

Code Changes

Only scraper/translator.py and config.py need to change. Everything else — DB caching, the toggle UI, _get_display_title(), load_recent_news() — stays exactly the same.

config.py

Replace:

DEEPL_API_KEY = os.getenv("DEEPL_API_KEY", "")

With:

AZURE_TRANSLATOR_KEY = os.getenv("AZURE_TRANSLATOR_KEY", "")
AZURE_TRANSLATOR_REGION = os.getenv("AZURE_TRANSLATOR_REGION", "eastus")

scraper/translator.py

Replace the entire file with the Azure Translator REST implementation:

import requests
from config import AZURE_TRANSLATOR_KEY, AZURE_TRANSLATOR_REGION

AZURE_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"
AZURE_LANG_MAP = {"en": "en", "ko": "ko"}


def _get_client():
    # Azure needs no client object; the key (or None when unset) stands in for one.
    return AZURE_TRANSLATOR_KEY or None


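def batch_translate(articles: list[dict], target_lang: str, db_session) -> list[dict]:
    """Translate missing titles into target_lang and cache them in the DB.

    Articles already in target_lang, or with a cached title_<lang>, are skipped.
    If the key is missing or the request fails, the articles are returned unchanged.
    """
    from db.models import NewsArticle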

    if not AZURE_TRANSLATOR_KEY:
        return articles

    field = f"title_{target_lang}"
    azure_lang = AZURE_LANG_MAP.get(target_lang)
    if not azure_lang:
        return articles

    to_translate = [
        a for a in articles
        if a.get("language") != target_lang and not a.get(field)
    ]

    if not to_translate:
        return articles

    headers = {
        "Ocp-Apim-Subscription-Key": AZURE_TRANSLATOR_KEY,
        "Ocp-Apim-Subscription-Region": AZURE_TRANSLATOR_REGION,
        "Content-Type": "application/json",
    }
    body = [{"text": a["title"] or ""} for a in to_translate]

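    try:
        response = requests.post(
            AZURE_ENDPOINT,
            params={"api-version": "3.0", "to": azure_lang},
            headers=headers,
            json=body,
            timeout=10,
        )
        response.raise_for_status()
        results = response.json()
        # Azure returns one result per input element, in the same order,
        # so zip() pairs each article with its own translation.
        url_to_translation = {
            a["url"]: r["translations"][0]["text"]
            for a, r in zip(to_translate, results)
        }
    except (requests.RequestException, KeyError, IndexError, TypeError, ValueError):
        # Network error, HTTP error, or malformed response:
        # silent fallback to the original titles.
        return articles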

    for a in articles:
        url = a.get("url")
        if url in url_to_translation:
            translated = url_to_translation[url]
            a[field] = translated
            db_session.query(NewsArticle).filter(
                NewsArticle.url == url
            ).update({getattr(NewsArticle, field): translated})
    db_session.commit()

    return articles
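
One caveat with sending every title in a single request: the v3 translate endpoint accepts at most 1,000 array elements and 50,000 characters per request, and a larger batch would be rejected and fall through to the silent fallback. A minimal sketch of splitting titles into compliant chunks follows. The helper name chunk_texts is an assumption and not part of the current code, and Python's len() is used as an approximation of how Azure counts characters.

```python
from typing import Iterator

MAX_ITEMS = 1000    # Azure v3 limit: array elements per request
MAX_CHARS = 50_000  # Azure v3 limit: total characters per request


def chunk_texts(texts: list[str],
                max_items: int = MAX_ITEMS,
                max_chars: int = MAX_CHARS) -> Iterator[list[str]]:
    """Yield consecutive slices of texts that stay within both limits.

    A single text longer than max_chars is still yielded on its own;
    the API would reject it, so callers may want to truncate it first.
    """
    chunk: list[str] = []
    size = 0
    for text in texts:
        # Close the current chunk if adding this text would break a limit.
        if chunk and (len(chunk) >= max_items or size + len(text) > max_chars):
            yield chunk
            chunk, size = [], 0
        chunk.append(text)
        size += len(text)
    if chunk:
        yield chunk
```

Because the chunks preserve input order, the results of each request could be concatenated and paired with to_translate exactly as the single-request version does. For a news feed of headline-length titles the limits are unlikely to be hit, so this is a safeguard rather than a required change.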

Dependencies

  • requests — already installed (used by scrapers)
  • No new packages needed (no Azure SDK required — it's a plain REST call)
  • The deepl package is no longer needed and can be removed: pip uninstall deepl

Verification

  • Add AZURE_TRANSLATOR_KEY and AZURE_TRANSLATOR_REGION to .env
  • Restart dashboard
  • Go to News Feed → toggle "Show titles in: KO" — Korean translations should appear for English articles
  • Check DB: SELECT title, title_en, title_ko FROM news_articles WHERE title_ko IS NOT NULL LIMIT 5;
  • Repeat page load — translations should be instant (served from cache)
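
The caching behaviour in the last step follows from the filter inside batch_translate: an article is sent to Azure only if it is not already in the target language and has no stored title for that language. The sketch below isolates that filter so it can be checked without a database or API key. The function name needs_translation and the sample articles are illustrative assumptions.

```python
def needs_translation(articles: list[dict], target_lang: str) -> list[dict]:
    """Return the articles that would be sent to the translation API."""
    field = f"title_{target_lang}"
    return [
        a for a in articles
        if a.get("language") != target_lang and not a.get(field)
    ]


articles = [
    {"url": "https://example.com/1", "language": "en", "title": "Markets rally"},
    {"url": "https://example.com/2", "language": "ko", "title": "시장 반등"},
]

# First page load: only the English article needs a Korean title.
first = needs_translation(articles, "ko")

# Simulate what batch_translate stores after a successful call.
first[0]["title_ko"] = "(cached translation)"

# Second page load: nothing is left to send, so no API call is made.
second = needs_translation(articles, "ko")
```

If the second load still triggers API calls, the likely causes are that the title_ko value was never committed or that load_recent_news() does not return the cached column.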