Building a RAG System for PDF Chatting with Local DeepSeek-R1 LLM

Falah Gatea
Jan 29, 2025


Introduction

In today’s rapidly evolving world of artificial intelligence (AI), large language models (LLMs) are revolutionizing the way we interact with technology. From intelligent chatbots to advanced reasoning tools, these models are making it easier than ever to automate tasks, solve complex problems, and build innovative applications. Among the many powerful tools available, DeepSeek-R1 stands out, offering state-of-the-art AI capabilities for local deployment without relying on external APIs.

This week, DeepSeek-R1 has been the talk of the town, and for good reason. Curious to explore its capabilities, I decided to test it out by building a Retrieval-Augmented Generation (RAG) system that enables PDF-based chatbot interactions. I focused on asking math-related questions, ranging from basic arithmetic to more complex problem-solving. The results were impressive. If you’re curious about how I did it, I’ve outlined the steps below and included the full source code at the end of this article.

Why Use DeepSeek-R1 for a RAG System?

DeepSeek-R1 provides powerful LLMs that excel in both conversational AI and complex reasoning. Some key advantages include:

  • High Performance: Optimized for both chat-based applications and mathematical problem-solving.
  • Flexibility: Can be integrated with various frameworks and tools.
  • Local Deployment: Runs fully offline on your own machine, with no external API required.

By leveraging DeepSeek-R1, developers can build AI-powered solutions that prioritize privacy, efficiency, and local deployment. This approach enables the implementation of a RAG system for chatting with PDFs, making it an ideal solution for knowledge retrieval and interactive document analysis.

Setting Up the AI Chatbot Locally with Ollama LLM

To ensure a seamless experience, I used Streamlit to create a user-friendly interface and Ollama LLM to run the AI model locally. Below are the steps to set up the chatbot:

1. Install Required Dependencies

Before running the code, you need the DeepSeek-R1 model and a few Python libraries. First, download DeepSeek-R1 for local use with Ollama:

ollama pull deepseek-r1

This will download the DeepSeek-R1 model onto your local machine for use with Ollama.
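
You can verify the download by listing the models available to your Ollama installation:

ollama list

deepseek-r1 should appear in the output before you proceed.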

Then install the necessary Python libraries:

pip install streamlit langchain_community langchain_text_splitters langchain_core langchain_ollama pdfplumber

2. Understanding the Code

The chatbot application consists of several key components:

  • Uploading PDFs: Allows users to upload documents containing information that the AI can use.
  • Processing Documents: Splits the documents into manageable chunks for efficient searching (a short standalone sketch of this step follows the list).
  • Vector Storage: Uses an in-memory vector store to retrieve relevant document sections.
  • Generating Answers: The AI model generates concise responses based on the context of the retrieved document.
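
To make the chunking step concrete, here is a minimal standalone sketch of how RecursiveCharacterTextSplitter produces overlapping chunks, using the same settings as the app. The sample_text value is a placeholder, not real document content:

from langchain_text_splitters import RecursiveCharacterTextSplitter

# Same settings as the app: 1,000-character chunks, 200 characters of overlap.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

sample_text = "Lorem ipsum dolor sit amet. " * 300  # placeholder: any long text
chunks = splitter.split_text(sample_text)

print(len(chunks))     # number of chunks produced
print(len(chunks[0]))  # each chunk is at most 1,000 characters

The 200-character overlap means neighboring chunks share context, so an answer that straddles a chunk boundary is less likely to be lost at retrieval time.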

3. Full Source Code

import os

import streamlit as st
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_ollama import OllamaEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM

# Prompt template: the retrieved chunks are injected as {context}.
template = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""

pdfs_directory = './pdfs/'
os.makedirs(pdfs_directory, exist_ok=True)  # make sure the upload folder exists

# DeepSeek-R1 (served by Ollama) handles both embeddings and generation.
embeddings = OllamaEmbeddings(model="deepseek-r1")
vector_store = InMemoryVectorStore(embeddings)

model = OllamaLLM(model="deepseek-r1")

def upload_pdf(file):
    # Save the uploaded file to disk so PDFPlumberLoader can open it.
    with open(pdfs_directory + file.name, "wb") as f:
        f.write(file.getbuffer())

def load_pdf(file_path):
    loader = PDFPlumberLoader(file_path)
    documents = loader.load()
    return documents

def split_text(documents):
    # Split the PDF into overlapping 1,000-character chunks for retrieval.
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        add_start_index=True
    )
    return text_splitter.split_documents(documents)

def index_docs(documents):
    vector_store.add_documents(documents)

def retrieve_docs(query):
    # Return the chunks most similar to the query (top 4 by default).
    return vector_store.similarity_search(query)

def answer_question(question, documents):
    # Concatenate the retrieved chunks into the prompt's context slot.
    context = "\n\n".join([doc.page_content for doc in documents])
    prompt = ChatPromptTemplate.from_template(template)
    chain = prompt | model
    return chain.invoke({"question": question, "context": context})

uploaded_file = st.file_uploader(
    "Upload PDF",
    type="pdf",
    accept_multiple_files=False
)

if uploaded_file:
    upload_pdf(uploaded_file)
    documents = load_pdf(pdfs_directory + uploaded_file.name)
    chunked_documents = split_text(documents)
    index_docs(chunked_documents)

    question = st.chat_input()

    if question:
        st.chat_message("user").write(question)
        related_documents = retrieve_docs(question)
        answer = answer_question(question, related_documents)
        st.chat_message("assistant").write(answer)

4. Running the Chatbot

To launch the chatbot application, run the following command:

streamlit run rag_chat.py

This will open a local web interface where users can upload PDFs and ask questions based on the document content.

Conclusion

DeepSeek-R1, running locally through Ollama, provides a powerful way to build AI chatbots that can process documents and answer queries efficiently. By running the model locally, you ensure greater control over data privacy and reduce dependency on cloud services. Whether you’re building an AI assistant for research, education, or business applications, this setup offers a great starting point.

References

https://www.deepseek.com/

https://ollama.com/

Thanks for reading! If you love this post, give some claps.

Connect with me on Facebook, GitHub, LinkedIn, my blog, PyPI, and my YouTube channel, or email me at falahgs07@gmail.com.
