Building a RAG-Based Subsidy Matching System from Scratch with Python
What I Built A RAG (Retrieval-Augmented Generation) system that helps Japanese small businesses find government subsidies. Users describe their business situation, and the system retrieves relevant...

Source: DEV Community
What I Built A RAG (Retrieval-Augmented Generation) system that helps Japanese small businesses find government subsidies. Users describe their business situation, and the system retrieves relevant subsidies from a vector database, then generates a detailed answer using Claude API. Why RAG Instead of Just Using an LLM? LLMs alone have three problems for this use case: Hallucination — They confidently make up subsidy details that don't exist Stale data — Subsidy information updates frequently; LLM training data can't keep up No sources — Users need to verify the information, but LLMs don't cite where it came from RAG solves all three by retrieving real data first, then passing it to the LLM as context. The LLM generates answers grounded in actual documents, not its training data. Architecture User Query │ ▼ ┌─────────────┐ ┌──────────────┐ │ Retriever │────▶│ Generator │ │ (embed + │ │ (Claude API) │ │ search) │ └──────┬───────┘ └──────┬──────┘ │ │ ▼ ┌──────▼──────┐ AI Answer │ ChromaDB