RAG Prompt Engineering Experiments

An interactive RAG chatbot that lets users upload PDF documents and ask questions about them. The system retrieves relevant chunks with FAISS vector search and generates answers with a local TinyLlama model. It supports multiple prompt strategies, real-time chat interaction, and document-grounded responses with source tracking.
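
The retrieve-then-generate flow described above can be sketched in a few lines. This is a dependency-free illustration, not the project's code: the bag-of-words `embed` function stands in for a MiniLM sentence embedding, the linear scan in `retrieve` stands in for a FAISS index, and the chunk fields (`text`, `source`, `page`) and prompt wording are assumptions made for the example.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a MiniLM sentence embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first.

    A real pipeline would query a FAISS index instead of scanning every chunk.
    """
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query, hits):
    """Assemble a grounded prompt that keeps each chunk's source visible."""
    context = "\n".join(f"[{h['source']}, p.{h['page']}] {h['text']}" for h in hits)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


# Hypothetical chunks, as they might look after PDF parsing and splitting.
chunks = [
    {"text": "FAISS performs fast similarity search over vectors", "source": "guide.pdf", "page": 1},
    {"text": "Streamlit renders the chat interface", "source": "guide.pdf", "page": 2},
    {"text": "TinyLlama generates the final answer locally", "source": "guide.pdf", "page": 3},
]

question = "which component performs similarity search"
hits = retrieve(question, chunks, k=1)
prompt = build_prompt(question, hits)
```

The resulting `prompt` string is what would be passed to the local model, and `hits` carries the source and page metadata needed to show where an answer came from.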

Key Features

PDF ingestion and parsing using PyPDFLoader
Text chunking with overlap for better context retrieval
Vector embeddings using sentence-transformers (MiniLM)
FAISS vector database for fast similarity search
Retrieval-Augmented Generation (RAG) pipeline
Local LLM inference using TinyLlama (no API required)
Multiple prompt strategies (basic, few-shot, guardrail)
Interactive chat UI built with Streamlit
Conversation memory using session state
Response latency tracking
Source document highlighting for transparency
Expandable UI to inspect retrieved chunks
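
Three of the features above (overlapping chunks, switchable prompt strategies, and latency tracking) are simple enough to sketch without any libraries. Everything here is illustrative: `chunk_text`, `PROMPTS`, `render_prompt`, and `timed` are hypothetical names, the chunk sizes are arbitrary, and the template wording is an assumption rather than the prompts this project actually uses.

```python
import time


def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


# One template per strategy; the wording is assumed for illustration.
PROMPTS = {
    "basic": "Context:\n{context}\n\nQuestion: {question}\nAnswer:",
    "few_shot": (
        "Example\n"
        "Context: The sky is blue.\n"
        "Question: What color is the sky?\n"
        "Answer: Blue.\n\n"
        "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
    "guardrail": (
        "Answer only from the context. "
        "If the answer is not in the context, say \"I don't know.\"\n\n"
        "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
}


def render_prompt(strategy, context, question):
    """Fill the chosen strategy's template with the retrieved context and question."""
    return PROMPTS[strategy].format(context=context, question=question)


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) for latency tracking."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


pieces = chunk_text("abcdefghij", chunk_size=4, overlap=1)
prompt, seconds = timed(render_prompt, "guardrail", pieces[0], "What comes first?")
```

With `chunk_size=4` and `overlap=1`, each chunk begins with the last character of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.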

Tech Stack

Python
LangChain
FAISS
HuggingFace Transformers
TinyLlama
Sentence-Transformers
Streamlit
PyTorch

Screenshots