stock_chart_site/backend/app/db/migrations/002_sentiment_view.sql
tkrmagid edda01adbf feat(phase-2): KR-FinBERT sentiment scoring + daily aggregation view
- backend/app/nlp/finbert.py: adapter for snunlp/KR-FinBert-SC.
  - score = P(pos) - P(neg) ∈ [-1, +1], label = argmax (neg/neu/pos)
  - 768d mean-pooled last hidden state → stored in news.embedding (VECTOR)
  - authenticates with settings.huggingface_token; lazy singleton; cuda/cpu auto-selection
- backend/app/nlp/score_news.py: batch-scores rows in the news table where
  sentiment_score IS NULL, then UPDATEs them (... embedding=(:e)::vector).
  Optional stock filter and limit.
- backend/app/db/migrations/002_sentiment_view.sql: v_sentiment_daily view.
  Per stock and KST day: n_articles, mean_score, pos/neg/neu_ratio, weighted_score
  (naver_finance 1.0 / google_rss 0.7 / dart 0.5).
- backend/app/db/migrate.py: CLI for applying new SQL migrations to a DB that
  is already running. All SQL files are idempotent.
- refresh_one.py: at the end of a refresh, scores up to 200 articles per stock
  with finbert and adds a finbert SourceStatus to the RefreshReport.
- daily_batch.py: after all stocks are processed, mops up with score_pending_news(limit=2000).
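
The scoring rule listed for finbert.py (score = P(pos) - P(neg), label = argmax) can be illustrated without loading the model. The following is a minimal sketch, not the code in finbert.py: the function names and the logit order (negative, neutral, positive) are assumptions made for illustration, and the real order should be taken from the model's own label mapping.

```python
import math
from typing import Sequence, Tuple

# Assumed label order for this sketch only; the actual model config
# defines the real index-to-label mapping.
LABELS = ("negative", "neutral", "positive")


def softmax(logits: Sequence[float]) -> list:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_from_logits(logits: Sequence[float]) -> Tuple[float, str]:
    """Return (score, label) where score = P(pos) - P(neg) and label = argmax."""
    probs = softmax(logits)
    p_neg, _p_neu, p_pos = probs
    score = p_pos - p_neg  # always within [-1, +1]
    best = max(range(len(probs)), key=probs.__getitem__)
    return score, LABELS[best]


if __name__ == "__main__":
    print(score_from_logits([0.1, 0.2, 3.0]))
```

Because the score ignores P(neu) except through normalization, a confidently neutral article lands near 0 even when its label is neutral, which keeps mean_score comparable across articles.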

The model cache lives in the docker-compose hf_cache volume (/root/.cache/huggingface).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 15:57:34 +09:00

33 lines
1.4 KiB
SQL

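-- Phase 2: daily per-stock sentiment aggregation view.
-- weighted_score : mean of sentiment_score scaled by a per-source weight
--   naver_finance 1.0 (most direct: news from the stock's own page)
--   google_rss    0.7 (some relevance noise)
--   dart          0.5 (disclosures carry weak sentiment from a short title alone)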
\set ON_ERROR_STOP on
CREATE OR REPLACE VIEW v_sentiment_daily AS
SELECT
    code,
    (published_at AT TIME ZONE 'Asia/Seoul')::date AS date,
    COUNT(*) AS n_articles,
    AVG(sentiment_score)::REAL AS mean_score,
    AVG(CASE WHEN sentiment_label = 'positive' THEN 1.0 ELSE 0.0 END)::REAL AS pos_ratio,
    AVG(CASE WHEN sentiment_label = 'negative' THEN 1.0 ELSE 0.0 END)::REAL AS neg_ratio,
    AVG(CASE WHEN sentiment_label = 'neutral' THEN 1.0 ELSE 0.0 END)::REAL AS neu_ratio,
    AVG(
        sentiment_score * CASE source
            WHEN 'naver_finance' THEN 1.0
            WHEN 'google_rss' THEN 0.7
            WHEN 'dart' THEN 0.5
            ELSE 0.6
        END
    )::REAL AS weighted_score
FROM news
WHERE sentiment_score IS NOT NULL
  AND code IS NOT NULL
GROUP BY code, (published_at AT TIME ZONE 'Asia/Seoul')::date;
COMMENT ON VIEW v_sentiment_daily IS
    'Phase 2: aggregates KR-FinBERT scores per stock and day (KST). Used as Phase 4 LGBM features and as supporting data for UI charts.';
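
To make the view's arithmetic easier to check by hand, here is a rough Python sketch of the same aggregation over in-memory rows. It is an illustration under assumptions, not part of the migration: aggregate_daily and the row tuple layout are hypothetical, published_at is assumed to be a timezone-aware datetime, and a fixed UTC+9 offset stands in for Asia/Seoul. As in the SQL, weighted_score is the plain mean of score × weight and is not divided by the sum of the weights.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Fixed UTC+9 offset used as a stand-in for 'Asia/Seoul' (assumption for this sketch).
KST = timezone(timedelta(hours=9))

# Per-source weights copied from the view; 0.6 is the ELSE branch.
SOURCE_WEIGHTS = {"naver_finance": 1.0, "google_rss": 0.7, "dart": 0.5}
DEFAULT_WEIGHT = 0.6


def aggregate_daily(rows):
    """Group (code, published_at, source, score, label) rows by stock and KST day.

    Hypothetical helper mirroring v_sentiment_daily; rows with a missing
    code or score are skipped, matching the view's WHERE clause.
    """
    groups = defaultdict(list)
    for code, published_at, source, score, label in rows:
        if code is None or score is None:
            continue
        day = published_at.astimezone(KST).date()
        groups[(code, day)].append((source, score, label))

    result = {}
    for key, items in groups.items():
        n = len(items)
        result[key] = {
            "n_articles": n,
            "mean_score": sum(s for _, s, _ in items) / n,
            "pos_ratio": sum(lbl == "positive" for _, _, lbl in items) / n,
            "neg_ratio": sum(lbl == "negative" for _, _, lbl in items) / n,
            "neu_ratio": sum(lbl == "neutral" for _, _, lbl in items) / n,
            # Mean of weight-scaled scores, same as AVG(score * weight) in SQL.
            "weighted_score": sum(
                s * SOURCE_WEIGHTS.get(src, DEFAULT_WEIGHT) for src, s, _ in items
            ) / n,
        }
    return result


if __name__ == "__main__":
    sample = [
        ("005930", datetime(2026, 5, 19, 15, 30, tzinfo=timezone.utc), "naver_finance", 0.8, "positive"),
        ("005930", datetime(2026, 5, 20, 1, 0, tzinfo=timezone.utc), "dart", -0.4, "negative"),
    ]
    for key, stats in aggregate_daily(sample).items():
        print(key, stats)
```

Both sample rows fall on the same KST day even though their UTC dates differ, which is the reason the view groups on the time-zone-converted date rather than on the raw timestamp.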