Conversational Retrieval Chain Using LangChain

Bhavik Jikadara

In this blog, we will explore building a Conversational Retrieval Chain (CRC) using LangChain. A CRC leverages context from previous interactions to provide more relevant and accurate responses to user queries. By the end of this guide, you’ll have a fully functioning conversational agent capable of handling complex queries efficiently.

Table of Contents

  1. What is LangChain?
  2. Understanding Conversational Retrieval Chains
  3. Setting Up the Environment
  4. Step-by-Step Implementation
  5. Enhancing the Conversational Retrieval Chain
  6. Conclusion

What is LangChain?

LangChain is a powerful framework designed for building applications that combine the capabilities of language models with external knowledge sources. It provides tools for creating chatbots, conversational agents, and retrieval-augmented generation systems.

Use Cases of LangChain:

  • Chatbots: Enhance customer service with intelligent, context-aware chatbots.
  • Conversational Agents: Develop agents that can handle complex tasks and queries.
  • Information Retrieval: Create systems that retrieve relevant information based on user queries.

Understanding Conversational Retrieval Chains

A Conversational Retrieval Chain (CRC) is a system that combines retrieval-based methods with conversational capabilities. Unlike traditional retrieval systems, a CRC maintains the context of previous interactions, enabling it to provide more relevant and coherent responses.

Benefits of a CRC:

  • Improved Context Handling: Keeps track of the conversation history.
  • Better Relevance: Retrieves documents and generates responses that are more relevant to the user’s query.
  • Enhanced User Experience: Provides a more natural and engaging interaction.

Setting Up the Environment

Before we start building our CRC, we need to set up our development environment. Ensure you have Python installed, and then install the required libraries.

Prerequisites:

  • Python 3.10 or later

Installation:

Install the necessary libraries using pip:

pip install langchain faiss-cpu openai bs4 torch langchain-chroma langchain-community langchain-openai langchain-huggingface python-dotenv

Step-by-Step Implementation

1. Import Necessary Packages

Load all the required packages for the project, including web scraping, language models, and LangChain components.

import os

import bs4
from dotenv import load_dotenv, find_dotenv
from langchain.chains import create_retrieval_chain, create_history_aware_retriever
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_openai import ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load API keys and other sensitive data from a .env file.
load_dotenv(find_dotenv())

2. API and Model Configuration

Configure the language models using API keys from the environment variables.

llm = ChatOpenAI(
    api_key=os.getenv("OPENAI_API_KEY"),
    model=os.getenv("OPENAI_MODEL_NAME"),
)
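
For reference, the .env file read by load_dotenv might look like the sketch below. Both values are placeholders; the model name is just an example, so substitute whichever chat model your account can access.

# .env — placeholder values, substitute your own
OPENAI_API_KEY=sk-your-key-here
OPENAI_MODEL_NAME=gpt-4o-mini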

3. Load, Chunk, and Index the Contents of the Blog

Perform web scraping, text splitting, and indexing.

  • Load Documents: Scrape the blog content.
loader = WebBaseLoader(
    web_path="https://lilianweng.github.io/posts/2023-06-23-agent/",
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()
  • Split Documents into Chunks: Split the scraped content into smaller, manageable chunks so each fits comfortably in the model's context window.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
  • Index the Chunks: Embed the chunks with a Hugging Face sentence-transformer model and store them in a Chroma vector store for retrieval. A quick sanity check follows below.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
)
retriever = vectorstore.as_retriever()
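
Before wiring the retriever into a chain, it is worth confirming that retrieval returns sensible chunks. A minimal check (the query string is just an example):

# Fetch the chunks most similar to a sample query and preview them.
sample_docs = retriever.invoke("What is task decomposition?")
for doc in sample_docs:
    print(doc.page_content[:120], "...")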

4. Incorporate the Retriever into a Question-Answering Chain

Integrate the document retriever into a chain for answering questions.

  • Define the System Prompt: Create a prompt that instructs the model to rewrite a follow-up question into a standalone question, without answering it.
contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)
  • Create Prompt Template: Set up the reformulation prompt template for the chat model.
contextualize_q_prompt = ChatPromptTemplate.from_messages([
    ("system", contextualize_q_system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])
  • Create History-Aware Retriever: Combine the language model and the retriever with the reformulation prompt.
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt
)
  • Create Question-Answer Chain: Build the question-answer chain. Its prompt must contain a {context} placeholder for the retrieved documents, so it needs its own template rather than reusing the reformulation prompt.
qa_prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are an assistant for question-answering tasks. "
     "Use the following retrieved context to answer the question. "
     "If you don't know the answer, say that you don't know.\n\n{context}"),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])
question_answer_chain = create_stuff_documents_chain(llm=llm, prompt=qa_prompt)
  • Create Retrieval-Augmented Generation (RAG) Chain: Form the complete RAG chain for question answering over retrieved documents.
rag_chain = create_retrieval_chain(
    history_aware_retriever,
    question_answer_chain,
)
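
The assembled chain returns a dictionary that carries the retrieved documents under "context" and the generated text under "answer", alongside the inputs. A one-off invocation with an empty history (the question is just an example) shows the shape:

result = rag_chain.invoke({"input": "What is Task Decomposition?", "chat_history": []})
print(result.keys())  # typically: input, chat_history, context, answer
print(result["answer"])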

5. Execute Queries with Chat History

Run queries while maintaining chat history for context.

  • Initial Question: Ask the first question and store the exchange in the chat history.
chat_history = []

question = "What is Task Decomposition?"
response = rag_chain.invoke({
    "input": question,
    "chat_history": chat_history,
})
chat_history.extend([
    HumanMessage(content=question),
    AIMessage(content=response["answer"]),
])
  • Follow-up Question: Ask a follow-up question and get a response based on the updated chat history.
second_question = "What are common ways of doing it?"
response2 = rag_chain.invoke({"input": second_question, "chat_history": chat_history})
print(response2["answer"])
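
Because each response also carries the retrieved documents under the "context" key, you can show which sources an answer was grounded in, which is handy for debugging retrieval. A brief example:

# Inspect the chunks that grounded the follow-up answer.
for doc in response2["context"]:
    print(doc.metadata.get("source"), "->", doc.page_content[:100])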

This step-by-step walkthrough has taken you through setting up a retrieval-augmented generation (RAG) chain in LangChain with an OpenAI chat model and Hugging Face embeddings: loading and processing a document, creating a retriever, and wiring the retriever into a question-answering chain that maintains chat history.

Enhancing the Conversational Retrieval Chain

To make your CRC more robust, consider the following enhancements:

  • Context Management: Improve context management by storing and retrieving conversation history automatically, using memory modules or a database rather than a hand-maintained list (see the sketch after this list).
  • Custom Embeddings: Train custom embeddings specific to your domain to improve the accuracy of document retrieval.
  • User Feedback: Implement a feedback loop where users can rate the relevance of responses. This feedback can be used to fine-tune the system over time.
  • Additional Features: Integrate external data sources, enhance the user interface, or add more sophisticated query processing.
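
As an example of the first point, LangChain's RunnableWithMessageHistory can wrap the RAG chain so history is stored and injected automatically instead of being threaded through by hand. A minimal sketch, assuming an in-memory store keyed by an illustrative session id:

from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# Illustrative in-memory store: maps a session id to its message history.
store = {}

def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)

# History for "demo-session" is now read and updated on every call.
result = conversational_rag_chain.invoke(
    {"input": "What is Task Decomposition?"},
    config={"configurable": {"session_id": "demo-session"}},
)
print(result["answer"])

In production, the dictionary store could be swapped for a database-backed chat history so that context survives restarts.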

Conclusion

In this guide, we explored building a Conversational Retrieval Chain (CRC) using LangChain. By setting up the environment, configuring the API keys and models, scraping and indexing a document, and integrating a question-answering chain with context management, we created a fully functional conversational agent. It handles follow-up questions by leveraging previous interactions, giving improved context handling, better relevance, and a smoother user experience.

To further enhance your CRC, consider advanced context management, custom embeddings, and user feedback loops; these refine the system's accuracy and relevance over time. Integrating external data sources, improving the user interface, and enforcing sound security and privacy practices will make your conversational agent more dependable and user-friendly. By continuously learning from user interactions and scaling your infrastructure, you can build a highly effective Conversational Retrieval Chain and pave the way for more ambitious conversational AI applications.
