---
title: Ngram
---

The ngram tokenizer splits text into "grams," where each "gram" is of a certain length.

The tokenizer takes two arguments. The first is the minimum character length of a "gram," and the second is the maximum character length. Grams will be generated for all sizes between
the minimum and maximum gram size, inclusive. For example, `pdb.ngram(2,5)` will generate tokens of size `2`, `3`, `4`, and `5`.

To generate grams of a single fixed length, set the minimum and maximum gram size equal to each other.

```sql
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.ngram(3,3)))
WITH (key_field='id');
```

To get a feel for this tokenizer, run the following command and replace the text with your own:

```sql
SELECT 'Tokenize me!'::pdb.ngram(3,3)::text[];
```

```ini Expected Response
                      text
-------------------------------------------------
 {tok,oke,ken,eni,niz,ize,"ze ","e m"," me",me!}
(1 row)
```