OpenSearch Retriever
OpenSearch Retriever #
opensearch retriever source code
Additional supported search_types are #
approximate_search #
search_type: “approximate_search”; default: “approximate_search”
boolean_filter: a Boolean filter is a post-filter that consists of a Boolean query containing a k-NN query and a filter.
subquery_clause: Query clause on the knn vector field; default: “must”
lucene_filter: the Lucene algorithm decides whether to perform an exact k-NN search with pre-filtering or an approximate search with modified post-filtering. (deprecated, use `efficient_filter`)
efficient_filter: the Lucene Engine or Faiss Engine decides whether to perform an exact k-NN search with pre-filtering or an approximate search with modified post-filtering.
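To make the difference between `boolean_filter` and `efficient_filter` concrete, the following is a minimal sketch of the OpenSearch query bodies each option corresponds to. The helper names, the `vector_field` default, and the `metadata.source` term filter are assumptions chosen for illustration; the body that the retriever actually sends may differ in detail.

```python
from typing import Any, Dict, List


def boolean_filter_query(
    query_vector: List[float],
    boolean_filter: Dict[str, Any],
    k: int = 4,
    vector_field: str = "vector_field",  # assumed field name
    subquery_clause: str = "must",
) -> Dict[str, Any]:
    """Post-filter style: a bool query holding a k-NN clause and a filter."""
    return {
        "size": k,
        "query": {
            "bool": {
                "filter": boolean_filter,
                # subquery_clause decides how the k-NN clause is attached
                subquery_clause: [
                    {"knn": {vector_field: {"vector": query_vector, "k": k}}}
                ],
            }
        },
    }


def efficient_filter_query(
    query_vector: List[float],
    efficient_filter: Dict[str, Any],
    k: int = 4,
    vector_field: str = "vector_field",  # assumed field name
) -> Dict[str, Any]:
    """Filter placed inside the knn clause, so the engine picks the strategy."""
    return {
        "size": k,
        "query": {
            "knn": {
                vector_field: {
                    "vector": query_vector,
                    "k": k,
                    "filter": efficient_filter,
                }
            }
        },
    }


# Hypothetical metadata filter used for both variants
term_filter = {"bool": {"must": [{"term": {"metadata.source": "faq"}}]}}
post = boolean_filter_query([0.1, 0.2, 0.3], term_filter, k=2)
pre = efficient_filter_query([0.1, 0.2, 0.3], term_filter, k=2)
```

With a LangChain `OpenSearchVectorSearch` store, these options are normally passed as keyword arguments to `similarity_search`, or through `search_kwargs` when the store is wrapped as a retriever, rather than built by hand.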
script_scoring #
Optional args for script scoring search:
search_type: “script_scoring”; default: “approximate_search”
space_type: “l2”, “l1”, “linf”, “cosinesimil”, “innerproduct”, “hammingbit”; default: “l2”
pre_filter: script_score query to pre-filter documents before identifying nearest neighbors; default: `{"match_all": {}}`
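As an illustration of how these arguments fit together, here is a minimal sketch of an exact k-NN query using the OpenSearch `knn_score` script. The function name and the `vector_field` default are assumptions; the set of allowed space types is taken from the list above.

```python
from typing import Any, Dict, List, Optional

SCRIPT_SCORING_SPACES = {
    "l2", "l1", "linf", "cosinesimil", "innerproduct", "hammingbit",
}


def script_scoring_query(
    query_vector: List[float],
    k: int = 4,
    space_type: str = "l2",
    pre_filter: Optional[Dict[str, Any]] = None,
    vector_field: str = "vector_field",  # assumed field name
) -> Dict[str, Any]:
    """Build a script_score query that scores pre-filtered docs exactly."""
    if space_type not in SCRIPT_SCORING_SPACES:
        raise ValueError(f"unsupported space_type: {space_type!r}")
    if pre_filter is None:
        pre_filter = {"match_all": {}}  # default: score every document
    return {
        "size": k,
        "query": {
            "script_score": {
                "query": pre_filter,
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": vector_field,
                        "query_value": query_vector,
                        "space_type": space_type,
                    },
                },
            }
        },
    }


body = script_scoring_query([0.1, 0.2], k=3, space_type="cosinesimil")
```

Because the pre-filter runs first and every remaining document is scored, this mode tends to suit small or heavily filtered candidate sets.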
painless_scripting #
Optional args for painless scripting search:
search_type: “painless_scripting”; default: “approximate_search”
space_type: “l2Squared”, “l1Norm”, “cosineSimilarity”; default: “l2Squared”
pre_filter: script_score query to pre-filter documents before identifying nearest neighbors; default: `{"match_all": {}}`
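The painless variant differs mainly in the script source, which calls one of the k-NN painless functions directly. The sketch below is an assumption-laden illustration: the helper names and `vector_field` default are hypothetical, and the score formulas (`1 / (1 + distance)` for distances, `1.0 + cosineSimilarity(...)` for cosine) are one common way to turn the function result into a positive score, not necessarily the exact string the retriever generates.

```python
from typing import Any, Dict, List, Optional

PAINLESS_SPACES = ("l2Squared", "l1Norm", "cosineSimilarity")


def painless_source(space_type: str = "l2Squared") -> str:
    """Return a painless script source for the given space type."""
    if space_type not in PAINLESS_SPACES:
        raise ValueError(f"unsupported space_type: {space_type!r}")
    call = f"{space_type}(params.query_value, doc[params.field])"
    if space_type == "cosineSimilarity":
        # Cosine similarity lies in [-1, 1]; shift it to keep scores >= 0
        return f"1.0 + {call}"
    # Distances: a smaller distance should give a larger score
    return f"1 / (1 + {call})"


def painless_scripting_query(
    query_vector: List[float],
    k: int = 4,
    space_type: str = "l2Squared",
    pre_filter: Optional[Dict[str, Any]] = None,
    vector_field: str = "vector_field",  # assumed field name
) -> Dict[str, Any]:
    """Build a script_score query scored by a painless k-NN function."""
    if pre_filter is None:
        pre_filter = {"match_all": {}}
    return {
        "size": k,
        "query": {
            "script_score": {
                "query": pre_filter,
                "script": {
                    "source": painless_source(space_type),
                    "params": {
                        "field": vector_field,
                        "query_value": query_vector,
                    },
                },
            }
        },
    }


painless_body = painless_scripting_query([0.1, 0.2], space_type="cosineSimilarity")
```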
Overriding the OpenSearch retriever or creating a custom retriever #
from typing import List

from langchain.callbacks.manager import CallbackManagerForRetrieverRun
from langchain.retrievers import EnsembleRetriever
from langchain.schema import BaseRetriever, Document


class OpenSearchRetriever(BaseRetriever):
    """Retriever that uses OpenSearch's vector store for retrieving documents."""

    def _get_relevant_documents(
        self,
        query: str,
        *,
        run_manager: CallbackManagerForRetrieverRun,
    ) -> List[Document]:
        """
        Get the relevant documents for a given query using OpenSearch's vector store.

        Args:
            query: The query to search for.

        Returns:
            A list of relevant documents.
        """
        # Use OpenSearch's vector store to get relevant documents.
        # opensearch_vector_store_search is a placeholder and should be
        # replaced with actual code that queries your vector store.
        documents = opensearch_vector_store_search(query)
        return documents


# Using it in an ensemble. opensearch_retriever and other_retriever are
# retriever instances that must be created beforehand.
ensemble_retriever = EnsembleRetriever(
    retrievers=[opensearch_retriever, other_retriever],
    weights=[0.5, 0.5],
)
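To show what the `weights` in the ensemble do, here is a small standalone sketch of weighted Reciprocal Rank Fusion. It assumes, rather than confirms from this page, that `EnsembleRetriever` fuses results this way with a rank constant of 60. The function name and document ids are hypothetical, and plain string ids stand in for `Document` objects.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_reciprocal_rank(
    ranked_lists: Sequence[Sequence[str]],
    weights: Sequence[float],
    c: int = 60,  # assumed rank constant
) -> List[str]:
    """Fuse ranked id lists: score(d) = sum_i weight_i / (rank_i(d) + c)."""
    if len(ranked_lists) != len(weights):
        raise ValueError("one weight is required per ranked list")
    scores: Dict[str, float] = defaultdict(float)
    for doc_ids, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(doc_ids, start=1):
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


opensearch_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from the vector retriever
keyword_hits = ["doc_b", "doc_d", "doc_a"]     # e.g. from another retriever
fused = weighted_reciprocal_rank([opensearch_hits, keyword_hits], [0.5, 0.5])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents returned by both retrievers accumulate score from each list, which is why `doc_b` and `doc_a` rank above documents seen by only one retriever. Raising one weight shifts the fused order toward that retriever's ranking.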