Analyzer

May 3, 2024 | seedling, permanent

tags
Elasticsearch, OpenSearch, Analysis

Summary #

ref How Elasticsearch(and full text search) works?, book, evernote

In a nutshell, an analyzer tells Elasticsearch or OpenSearch how text should be indexed and searched.

An analyzer is a wrapper around three components, applied in this order:

  1. Character filter: Works on the raw text before tokenization, mainly to strip unwanted characters or replace some characters with others.

  2. Tokenizer: Breaks the text into individual tokens (or words) according to a chosen rule (whitespace, n-grams, etc.).

  3. Token filter: Receives the tokens and transforms them (for example, changing uppercase terms to lowercase).
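The three stages above can be sketched in plain Python to make the data flow concrete. This is only an illustration of the idea, not how Elasticsearch or OpenSearch implement analysis internally; the function names (`strip_html`, `whitespace_tokenizer`, `lowercase_filter`, `analyze`) are made up for this note.

```python
import re
from typing import Callable, List


def strip_html(text: str) -> str:
    """Character filter (hypothetical): drop anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


def whitespace_tokenizer(text: str) -> List[str]:
    """Tokenizer (hypothetical): split the text on runs of whitespace."""
    return text.split()


def lowercase_filter(tokens: List[str]) -> List[str]:
    """Token filter (hypothetical): lowercase every token."""
    return [token.lower() for token in tokens]


def analyze(
    text: str,
    char_filters: List[Callable[[str], str]],
    tokenizer: Callable[[str], List[str]],
    token_filters: List[Callable[[List[str]], List[str]]],
) -> List[str]:
    """Run character filters, then one tokenizer, then token filters, in that order."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


tokens = analyze(
    "<b>Quick</b> Brown FOX",
    char_filters=[strip_html],
    tokenizer=whitespace_tokenizer,
    token_filters=[lowercase_filter],
)
print(tokens)  # ['quick', 'brown', 'fox']
```

Note that an analyzer has exactly one tokenizer, while character filters and token filters are lists that may be empty.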

Elasticsearch #

ref, es The analyzer parameter specifies the analyzer used for text analysis when indexing or searching a text field.

Unless overridden with the search_analyzer mapping parameter, this analyzer is used for both index and search analysis. See Specify an analyzer.
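As a rough sketch of how these two parameters fit together, the snippet below builds a create-index request body as a Python dict and prints it as JSON. The field name `title` and the analyzer name `my_analyzer` are assumptions made up for this note; `html_strip`, `whitespace`, `lowercase` and `standard` are built-in component names.

```python
import json

# Hypothetical index body: "title" and "my_analyzer" are illustrative names.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "char_filter": ["html_strip"],  # character filter stage
                    "tokenizer": "whitespace",      # tokenizer stage
                    "filter": ["lowercase"],        # token filter stage
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                # Used at index time, and at search time unless overridden.
                "analyzer": "my_analyzer",
                # Overrides the analyzer for search-time analysis only.
                "search_analyzer": "standard",
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

If the `search_analyzer` line were removed, `my_analyzer` would be applied to both the indexed text and the query text.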

OCR of Images #

2024-05-01_16-47-13_screenshot.png #

Analysis: Input string/text → Character Filters → Tokenizers → Token Filters → tokens → To inverted index

