Overview of Retrieval-Augmented Generation (RAG), VectorDB, and Embedding Models

1️⃣ What is RAG?

RAG (Retrieval-Augmented Generation) is a technique in which an LLM,
when producing an answer, retrieves up-to-date information from an external database and uses it.

💬 Put simply
“An AI that looks things up when the LLM does not know the answer.”

🔧 Why is it needed?

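Limitation of a plain LLM	How RAG addresses it
Outdated training data	Can retrieve and use up-to-date documents
Little internal company data (for security and similar reasons)	Can connect to internal documents and manuals
Hallucination	Answers are generated from actual documents as evidence
Cannot cite evidence	Can provide the sources behind an answer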

2️⃣ How it works: summary

User question → embedding (vectorization) → VectorDB search → relevant documents returned → LLM generates the answer

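Convert the question and the documents into vectors (arrays of numbers)
Search the VectorDB for documents that are semantically similar to the question
Pass the retrieved documents to the LLM together with the question in the prompt
The LLM generates an answer grounded in those documents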
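
The four steps above can be sketched end to end without any external service. This is only an illustrative sketch: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the plain list stands in for a VectorDB, and the sample documents and all function names are hypothetical. The final LLM call is left out; the sketch stops at the prompt that would be sent.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Steps 1-2: embed the question and return the k most similar documents."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Step 3: put the retrieved documents into the prompt next to the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}"
    )


# Hypothetical sample documents (the list plays the role of the VectorDB).
documents = [
    "the refund policy allows returns within 30 days",
    "the office is closed on public holidays",
]
question = "what is the refund policy"

top = retrieve(question, documents, k=1)
prompt = build_prompt(question, top)
print(prompt)  # Step 4 would send this prompt to an LLM.
```

A real system would replace `embed` with a learned embedding model, which matches on meaning rather than on shared words.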

3️⃣ What is a VectorDB?

A database that stores unstructured data (documents, images, etc.) as vectors
and lets you search it by semantic similarity.

Representative VectorDB	Characteristics
FAISS	Open-source similarity search library, fast search
Pinecone	Cloud-based managed service
Qdrant	Real-time updates, strong filtering
Weaviate	GraphQL API, hybrid search

✨ Key features

Similarity Search: search based on semantic similarity
Metadata Filtering: conditional search on fields such as source or date
Hybrid Search: combines vector search with keyword search
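
The three features above can be illustrated with a small in-memory sketch. It assumes hand-made 3-dimensional vectors and hypothetical records, and it models hybrid search as a simple weighted sum of cosine similarity and keyword overlap (the `alpha` parameter is an assumption of this sketch). Real VectorDBs use approximate nearest-neighbor indexes and more refined keyword scoring, so treat this as a way to see the idea rather than as how any particular product works.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical records with hand-made vectors (not produced by a real model).
records = [
    {"id": "a", "vector": [0.9, 0.1, 0.0], "text": "LangChain release notes",
     "metadata": {"source": "tweet"}},
    {"id": "b", "vector": [0.8, 0.2, 0.1], "text": "LangChain tutorial",
     "metadata": {"source": "news"}},
    {"id": "c", "vector": [0.0, 0.1, 0.9], "text": "Weather forecast",
     "metadata": {"source": "news"}},
]


def search(query_vector, k=2, metadata_filter=None, query_text=None, alpha=1.0):
    """Return the top-k (id, score) pairs.

    metadata_filter: keep only records whose metadata matches every key/value.
    query_text + alpha: if query_text is given, blend vector similarity with
    keyword overlap as alpha * vector_score + (1 - alpha) * keyword_score.
    """
    hits = []
    for record in records:
        # Metadata filtering: skip records that fail any condition.
        if metadata_filter and any(
            record["metadata"].get(key) != value
            for key, value in metadata_filter.items()
        ):
            continue
        # Similarity search: score by cosine similarity.
        score = cosine(query_vector, record["vector"])
        # Hybrid search: mix in the fraction of query words found in the text.
        if query_text is not None:
            q_terms = set(query_text.lower().split())
            d_terms = set(record["text"].lower().split())
            keyword = len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0
            score = alpha * score + (1 - alpha) * keyword
        hits.append((record["id"], score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:k]


query = [1.0, 0.0, 0.0]
print(search(query))                                        # similarity only
print(search(query, metadata_filter={"source": "news"}))    # with a filter
print(search(query, k=1, query_text="tutorial", alpha=0.5)) # hybrid
```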

4️⃣ Embedding models

Text → numeric vector
→ Sentences with similar meanings lie close together in vector space

Sentence	Example vector
“The puppy is cute”	[0.12, 0.85, -0.33, …]
“The cat is lovely”	[0.11, 0.80, -0.31, …]
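
"Close together" is usually measured with cosine similarity. The sketch below computes it for the two example vectors, using only the three components shown in the table (the remaining components are elided and are not guessed here). The third vector is a hypothetical, made-up example of an unrelated sentence, added only for contrast.

```python
import math


def cosine_similarity(a, b):
    """cos(a, b) = (a · b) / (|a| * |b|); 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Only the first three components from the table above.
puppy = [0.12, 0.85, -0.33]
cat = [0.11, 0.80, -0.31]
# Hypothetical vector for an unrelated sentence (made up for this sketch).
unrelated = [-0.70, 0.10, 0.60]

sim_close = cosine_similarity(puppy, cat)
sim_far = cosine_similarity(puppy, unrelated)

print(f"puppy vs cat:       {sim_close:.4f}")  # very close to 1.0
print(f"puppy vs unrelated: {sim_far:.4f}")    # much lower
```

Real embeddings have hundreds or thousands of dimensions, but the calculation is the same.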

🌐 Major embedding models

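Category	Examples	Characteristics
OpenAI commercial models	text-embedding-3-small / large	High performance, multilingual, API-based
Open-source models	all-MiniLM-L6-v2, MPNet	Free, can run locally
Domain-specific models	SciBERT, CodeBERT	Optimized for a particular field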

💡 Recommended: text-embedding-3-small
Fast, inexpensive, and multilingual
Well suited to RAG practice

5️⃣ Hands-on RAG: Pinecone + LangChain

📦 Environment setup

pip install -U pinecone langchain-pinecone langchain-openai python-dotenv

.env file example:

OPENAI_API_KEY=your_openai_key
PINECONE_API_KEY=your_pinecone_key
PINECONE_ENVIRONMENT=us-east-1-aws

⚙️ Code 1: Initialization

from dotenv import load_dotenv
from pinecone import Pinecone, ServerlessSpec
import os

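# Load environment variables from the .env file
load_dotenv()

# Initialize the Pinecone client
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))

# Create the index if it does not exist yet
# (dimension=1536 matches the output size of text-embedding-3-small)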
index_name = "example-index"

if not pc.has_index(index_name):
    pc.create_index(
        name=index_name,
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )

# Connect to the index
index = pc.Index(index_name)
# Check the index statistics (for example, the number of stored vectors)
print(index.describe_index_stats())

⚙️ Code 2: Storing documents

from langchain_core.documents import Document
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings


# Create example documents (content and metadata)
docs = [
    Document(page_content="LangChain is great for building AI apps.", metadata={"source": "tweet"}),
    Document(page_content="Tomorrow will be cloudy with a high of 62 degrees.", metadata={"source": "news"}),
    Document(page_content="Building an exciting new project with LangChain - come check it out!", metadata={"source": "tweet"}),
    Document(page_content="The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.", metadata={"source": "news"})
]

# OpenAI embedding model
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Initialize the vector store (connect the Pinecone index and the embedding object to PineconeVectorStore)
vector_store = PineconeVectorStore(index=index, embedding=embeddings)

# Embed the documents and store the vectors in Pinecone
vector_store.add_documents(docs)

⚙️ Code 3: Search

# Query 1: search the tweets for LangChain-related content
query1 = "LangChain provides abstractions to make working with LLMs easy"
results1 = vector_store.similarity_search_with_score(query1, k=2, filter={"source": "tweet"})
print("Query 1 Results:")
for res, score in results1:
    print(f"* [SIM={score:.6f}] {res.page_content} [{res.metadata}]")

# Query 2: search the news for weather-related content
query2 = "Will it be hot tomorrow?"
results2 = vector_store.similarity_search_with_score(query2, k=2, filter={"source": "news"})
print("\nQuery 2 Results:")
for res, score in results2:
    print(f"* [SIM={score:.6f}] {res.page_content} [{res.metadata}]")

Expected output (exact scores may vary)

Query 1 Results:
* [SIM=0.605957] LangChain is great for building AI apps. [{'source': 'tweet'}]
* [SIM=0.428503] Building an exciting new project with LangChain - come check it out! [{'source': 'tweet'}]

Query 2 Results:
* [SIM=0.581758] Tomorrow will be cloudy with a high of 62 degrees. [{'source': 'news'}]
* [SIM=0.569310] The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees. [{'source': 'news'}]
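
The example above stops at retrieval. The remaining step, handing the retrieved documents to the LLM together with their sources, could look roughly like the sketch below. It is an assumption-laden illustration: `Doc` is a minimal stand-in for LangChain's `Document` so that the block runs on its own, `build_grounded_prompt` and the `min_score` threshold of 0.5 are hypothetical, and the second sample result with score 0.12 is made up to show the filtering. The actual LLM call is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_grounded_prompt(question, scored_docs, min_score=0.5):
    """Turn (document, score) pairs into a prompt with numbered, sourced context.

    Results scoring below min_score (a hypothetical threshold) are dropped
    so that weakly related text does not reach the LLM.
    """
    kept = [(doc, score) for doc, score in scored_docs if score >= min_score]
    lines = [
        f"[{i}] ({doc.metadata.get('source', 'unknown')}) {doc.page_content}"
        for i, (doc, _) in enumerate(kept, start=1)
    ]
    context = "\n".join(lines) if lines else "(no relevant documents found)"
    return (
        "Answer the question using only the numbered context. "
        "Cite the numbers you used. "
        "If the context is insufficient, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Same shape as the output of similarity_search_with_score.
results = [
    (Doc("Tomorrow will be cloudy with a high of 62 degrees.",
         {"source": "news"}), 0.581758),
    # Made-up low-scoring result, included only to show the threshold at work.
    (Doc("LangChain is great for building AI apps.",
         {"source": "tweet"}), 0.12),
]

prompt = build_grounded_prompt("Will it be hot tomorrow?", results)
print(prompt)
```

Numbering the context entries and including each `source` value is what lets the answer point back to its evidence.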

6️⃣ Summary

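Item	Description
Core flow	Embedding → VectorDB → retrieval → generation
Benefits	Reflects up-to-date information, reduces hallucination, provides evidence
Tools	LangChain + Pinecone + OpenAI
Recommended model	text-embedding-3-small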

🧠 “RAG is a technique that keeps an LLM from pretending to know what it does not know.”