---
title: Ngram
---

The ngram tokenizer splits text into "grams," where each "gram" is of a certain length.

The tokenizer takes two arguments. The first is the minimum character length of a "gram," and the second is the maximum character length. Grams will be generated for all sizes between the minimum and maximum gram size, inclusive. For example, `pdb.ngram(2,5)` will generate tokens of size `2`, `3`, `4`, and `5`.

To generate grams of a single fixed length, set the minimum and maximum gram size equal to each other.

```sql
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.ngram(3,3)))
WITH (key_field='id');
```

To get a feel for this tokenizer, run the following command and replace the text with your own:

```sql
SELECT 'Tokenize me!'::pdb.ngram(3,3)::text[];
```

```ini Expected Response
                      text
-------------------------------------------------
 {tok,oke,ken,eni,niz,ize,"ze ","e m"," me",me!}
(1 row)
```
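
The same cast can be used to inspect a range of gram sizes instead of a single fixed length. The query below is a minimal sketch that reuses the `pdb.ngram` cast shown above with a minimum of `2` and a maximum of `3`; the specific values are chosen only for illustration. Based on the inclusive range described earlier, the result should contain both two-character and three-character grams. The output is not reproduced here because the order in which mixed-length grams are returned has not been verified.

```sql
-- Illustrative only: inspect grams of length 2 and 3 for the same sample text.
-- The min/max values (2,3) are an arbitrary choice for this sketch.
SELECT 'Tokenize me!'::pdb.ngram(2,3)::text[];
```

Comparing this result with the fixed-length `pdb.ngram(3,3)` output is a quick way to see how widening the range increases the number of tokens generated per value.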