---
title: Simple
description: Splits on any non-alphanumeric character
canonical: https://docs.paradedb.com/documentation/tokenizers/available-tokenizers/simple
---

The simple tokenizer splits on any non-alphanumeric character (e.g. whitespace, punctuation, symbols). All characters are [lowercased](/documentation/token-filters/lowercase) by default.

```sql
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.simple))
WITH (key_field='id');
```

To get a feel for this tokenizer, run the following command and replace the text with your own:

```sql
SELECT 'Tokenize me!'::pdb.simple::text[];
```

```ini Expected Response
     text
---------------
 {tokenize,me}
(1 row)
```
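
Punctuation inside a word is also treated as a boundary, which matters for values such as email addresses or version strings. The query below is a small sketch of that case; the input string is made up for illustration and does not come from the `mock_items` table. Going by the rule described above, each run of letters and digits should come back as its own lowercased token.

```sql
-- Illustrative input (an assumption, not sample data): mixes '@', '.', '-' and uppercase.
-- Per the rule above, every non-alphanumeric character acts as a token boundary.
SELECT 'Contact Admin@Example.com about v2.0-beta'::pdb.simple::text[];
```

If values like these need to stay intact as single tokens, a different tokenizer is likely a better fit for that column.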