Analyzer
- tags
- Elasticsearch, OpenSearch, Analysis
Summary #
ref, ref2 How Elasticsearch(and full text search) works?, book, evernote

In a nutshell, an analyzer tells Elasticsearch or OpenSearch how text should be indexed and searched.
An analyzer wraps three stages, applied in this order:
Character filter: Strips unwanted characters from the raw text, or replaces some characters, before tokenization.
Tokenizer: Breaks the text into individual tokens (usually words) according to a rule, such as splitting on whitespace or emitting n-grams.
Token filter: Receives the tokens and transforms them (for example, changing uppercase terms to lowercase).
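
The three stages can be imitated in a few lines of plain Python. This is only a rough sketch of the idea, not how Elasticsearch or OpenSearch implement analysis. The function names are made up for illustration, the character filter only strips HTML-like tags, and the tokenizer only splits on whitespace.

```python
import re
from typing import List


def char_filter(text: str) -> str:
    """Character filter stand-in: remove HTML-like tags from the raw text."""
    return re.sub(r"<[^>]+>", "", text)


def tokenizer(text: str) -> List[str]:
    """Tokenizer stand-in: split the text on whitespace."""
    return text.split()


def token_filter(tokens: List[str]) -> List[str]:
    """Token filter stand-in: lowercase every token."""
    return [token.lower() for token in tokens]


def analyze(text: str) -> List[str]:
    """Run the three stages in order: char filter, tokenizer, token filter."""
    return token_filter(tokenizer(char_filter(text)))


print(analyze("<b>Let's build an Autocomplete!</b>"))
# ["let's", 'build', 'an', 'autocomplete!']
```

Because this toy tokenizer only splits on whitespace, the trailing "!" stays attached to the last token. A word-oriented tokenizer would normally drop it.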

Elasticsearch #
ref, es The analyzer parameter specifies the analyzer used for text analysis when indexing or searching a text field.
Unless overridden with the search_analyzer mapping parameter, this analyzer is used for both index and search analysis. See Specify an analyzer.
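
As a hedged illustration, the request body for creating an index could look roughly like the Python dict below. The analyzer name `my_index_analyzer` and the field name `title` are hypothetical. `html_strip`, `standard`, and `lowercase` are built-in components. The dict is only printed as JSON here and is not sent to a cluster.

```python
import json

# Hypothetical index definition: "my_index_analyzer" and "title" are made-up names.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_index_analyzer": {
                    "type": "custom",
                    "char_filter": ["html_strip"],  # character filter stage
                    "tokenizer": "standard",        # tokenizer stage
                    "filter": ["lowercase"],        # token filter stage
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                # Used at index time, and at search time unless overridden.
                "analyzer": "my_index_analyzer",
                # Overrides the analyzer used for search-time analysis.
                "search_analyzer": "standard",
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

If `search_analyzer` were removed from the `title` field, `my_index_analyzer` would be used for both index and search analysis.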
OCR of Images #
2024-05-03_14-44-27_screenshot.png #

<b>Let's build an autocomplete!</b> -> Character Filters -> Let's build an autocomplete! -> Tokenizer -> Let's, build, an, Autocomplete -> Token Filter -> let's, build, an, autocomplete
2024-05-01_16-47-13_screenshot.png #

Analysis: Input string/text -> Character Filters -> Tokenizers -> Token Filters -> tokens -> To inverted index
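
The last step of this flow, tokens going into the inverted index, can be sketched as follows. This is a simplified illustration and not the actual on-disk structure: the analysis step is reduced to whitespace splitting plus lowercasing, and the function names and sample documents are made up.

```python
from collections import defaultdict
from typing import Dict, List, Set


def analyze(text: str) -> List[str]:
    """Simplified analysis: whitespace tokenizer plus lowercase token filter."""
    return [token.lower() for token in text.split()]


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each token to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


docs = {1: "Build an autocomplete", 2: "Build an index"}
index = build_inverted_index(docs)

print(sorted(index["build"]))         # [1, 2]
print(sorted(index["autocomplete"]))  # [1]
```

A search for a term is analyzed the same way, and the resulting tokens are looked up in this mapping, which is why the index-time and search-time analyzers need to produce compatible tokens.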