LangChain Retriever
Retriever #
A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store: a retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well.
Retrievers accept a string query as input and return a list of Documents as output.
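To make the interface concrete, here is a minimal sketch of a custom retriever that stores nothing and simply returns matching Documents (the ToyRetriever class and its naive keyword match are illustrative assumptions, not LangChain API):

from typing import List

from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

class ToyRetriever(BaseRetriever):
    """Illustrative: no document store, just returns matching Documents."""

    documents: List[Document]
    k: int = 2

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        # Naive keyword match; a real retriever would rank by relevance.
        matches = [d for d in self.documents if query.lower() in d.page_content.lower()]
        return matches[: self.k]

docs = [
    Document(page_content="Dogs are great companions."),
    Document(page_content="Cats are independent pets."),
]
ToyRetriever(documents=docs).invoke("cats")  # returns the cat Document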
Types #
Vectorstore #
When to use: If you are just getting started and looking for something quick and easy.
This is the simplest method and the one that is easiest to get started with. It involves creating embeddings for each piece of text.
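As a sketch (assuming the FAISS integration and an OpenAI embedding model; any vector store works the same way):

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

vectorstore = FAISS.from_texts(
    ["LangChain helps build LLM apps.", "FAISS does similarity search."],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()  # plain similarity search by default
retriever.invoke("What does FAISS do?")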
Parent Document #
When to use: If your pages have lots of smaller pieces of distinct information that are best indexed by themselves, but best retrieved all together.
This involves indexing multiple chunks for each document. Then you find the chunks that are most similar in embedding space, but you retrieve the whole parent document and return that (rather than individual chunks).
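A minimal sketch using ParentDocumentRetriever (the Chroma store and the splitter's chunk size are illustrative choices):

from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

vectorstore = Chroma(collection_name="children", embedding_function=OpenAIEmbeddings())
store = InMemoryStore()  # holds the full parent documents

retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,  # small child chunks are indexed here
    docstore=store,           # whole parents are returned from here
    child_splitter=RecursiveCharacterTextSplitter(chunk_size=400),
)
retriever.add_documents(docs)  # docs: a list of Documents, loaded elsewhere
retriever.invoke("query")      # returns whole parent documents, not chunks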
Multi-Query Retriever #
When to use: If users are asking questions that are complex and require multiple pieces of distinct information to respond.
This uses an LLM to generate multiple queries from the original one. This is useful when the original query needs pieces of information about multiple topics to be properly answered. By generating multiple queries, we can then fetch documents for each of them.
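A sketch with MultiQueryRetriever, assuming an existing vector store and a ChatOpenAI model for generating the query variations:

from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

multi_query = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),  # base retriever, defined elsewhere
    llm=ChatOpenAI(temperature=0),         # rewrites the question from several angles
)
multi_query.invoke("What are the approaches to task decomposition?")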
Ensemble #
When to use: If you have multiple retrieval methods and want to try combining them.
This fetches documents from multiple retrievers and then combines them.
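A sketch combining a keyword retriever with a vector retriever (BM25Retriever needs the rank_bm25 package; the 50/50 weights are an illustrative choice):

from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

bm25 = BM25Retriever.from_texts(texts)  # keyword-based; texts defined elsewhere
vector = vectorstore.as_retriever()     # embedding-based; vectorstore defined elsewhere

ensemble = EnsembleRetriever(retrievers=[bm25, vector], weights=[0.5, 0.5])
ensemble.invoke("query")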
Self-Querying Retriever #
When to use: If your documents have metadata you want to filter on.
A self-querying retriever uses an LLM to translate the natural-language question into a structured query: a semantic query string plus a metadata filter that is applied to the vector store.
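A sketch with SelfQueryRetriever (the paper_title/year metadata fields are illustrative; the LLM infers the filter from this schema):

from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_openai import ChatOpenAI

metadata_field_info = [
    AttributeInfo(name="paper_title", description="Title of the source paper", type="string"),
    AttributeInfo(name="year", description="Publication year", type="integer"),
]

self_query = SelfQueryRetriever.from_llm(
    llm=ChatOpenAI(temperature=0),
    vectorstore=vectorstore,  # defined elsewhere; must support metadata filtering
    document_contents="Excerpts from ML papers",
    metadata_field_info=metadata_field_info,
)
self_query.invoke("papers about alignment from 2023")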
Using Retrievers in LCEL #
Since retrievers are Runnables, we can easily compose them with other Runnable objects:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

template = """Answer the question based only on the following context:

{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI()

def format_docs(docs):
    # Join the retrieved Documents into one context string.
    return "\n\n".join(d.page_content for d in docs)

# `retriever` is any retriever, e.g. vectorstore.as_retriever()
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
chain.invoke("What did the president say about technology?")
Source Code of Retriever #
Search Algorithm #
See the source code in the vectorstores module.
The as_retriever method takes a search_type argument, which can be “similarity” (default), “mmr”, or “similarity_score_threshold”.
similarity #
The default search type: returns the documents whose embeddings are most similar to the query embedding.
mmr #
MMR stands for Maximal Marginal Relevance.
Imagine you’re at a party, and you’re tasked with introducing your friend to some new people. Your friend has shared some information about what kind of people they want to meet (this is the query_embedding).
Here’s how Maximal Marginal Relevance would approach that task (a code sketch follows the steps):
It looks at everyone at the party (the embedding_list).
It finds the person who matches the most with what your friend is looking for, based on their interests and characteristics.
It introduces that person to your friend.
Then it goes back to the party crowd, but this time, instead of just looking for someone who matches what your friend wants, it also wants to find someone who is a bit different from the people they’ve already met.
The lambda_mult parameter is like a balancing act - it decides how much weight we should put on finding someone who matches your friend’s preferences versus finding someone who is different from the people they’ve already met.
The function continues doing this until it has introduced your friend to the desired number of people (the k parameter).
In the end, it returns a list of people your friend has met.
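The same procedure as code, a minimal sketch of the MMR selection loop (numpy-based and illustrative, not LangChain's exact implementation):

import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mmr(query_embedding, embedding_list, k=4, lambda_mult=0.5):
    """Return indices of k embeddings balancing relevance and diversity."""
    selected = []
    candidates = list(range(len(embedding_list)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, float("-inf")
        for i in candidates:
            relevance = cosine(query_embedding, embedding_list[i])
            # Penalize similarity to the documents already picked.
            redundancy = max(
                (cosine(embedding_list[i], embedding_list[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected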
To use MMR through a vector store retriever:
docsearch.as_retriever(
    search_type="mmr",
    search_kwargs={
        'k': 6,
        'lambda_mult': 0.25,
        'fetch_k': 50,
        'filter': {'paper_title': 'GPT-4 Technical Report'},
    },
)
# fetch_k: number of documents to pass to the MMR algorithm (default: 20)
# lambda_mult: diversity of results returned by MMR; 1 for minimum diversity and 0 for maximum (default: 0.5)
# filter: filter by document metadata
similarity_score_threshold #
docsearch.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={
        'score_threshold': 0.8,
    },
)
# score_threshold: optional, a floating point value between 0 and 1 to filter the resulting set of retrieved docs
Passing dynamic values to the retriever with ConfigurableField #
Tested with: langchain==4.0.1, langchain-community==0.0.20
# This worked for me:
from operator import itemgetter
from typing import TypedDict

from langchain_core.runnables import RunnableParallel, ConfigurableField
from langchain_community.vectorstores import OpenSearchVectorSearch
semantic_search_vectorstore = OpenSearchVectorSearch(
    opensearch_url=get_opensearch_url(),  # connection helpers and index config defined elsewhere
    index_name=index_name,
    embedding_function=embeddings_function,
    vector_field=vector_field,
    text_field=text_field,
    http_compress=True,
    use_ssl=False,
    verify_certs=False,  # DON'T USE IN PRODUCTION
    ssl_assert_hostname=False,
    ssl_show_warn=False,
    bulk_size=5000,  # default is 500
)
def get_retriever(
    customized_search_type: str | None = None,  # application-specific kwarg, passed through as-is
    search_type: str = "similarity",  # or "similarity_score_threshold", "mmr"
):
    retriever = semantic_search_vectorstore.as_retriever(
        k="5",
        index_name=INDEX["name"],
        text_field=INDEX["text_field"],
        vector_field=INDEX["vector_field"],
        search_type=search_type,
        customized_search_type=customized_search_type,
    )
    configurable_retriever = retriever.configurable_fields(
        search_kwargs=ConfigurableField(
            id="search_kwargs",
            name="Search Kwargs",
            description="The search kwargs to use. Includes dynamic category adjustment.",
        )
    )
    return configurable_retriever
class SemanticQueryInput(TypedDict):
    question: str
def get_chain():
    retriever_chain = (
        RunnableParallel(
            context=(
                itemgetter("question")
                | get_retriever()
            ),
            question=itemgetter("question"),
        )
        | RunnableParallel(docs=itemgetter("context"), question=itemgetter("question"))
        # | enrich_docs_chain
    ).with_types(input_type=SemanticQueryInput)
    return retriever_chain
chain = get_chain()
# dynamic_filters_dict: search kwargs built at request time, defined elsewhere
chain.invoke(
    {"question": "A question to RAG"},
    config={"configurable": {"search_kwargs": dynamic_filters_dict}},
)