---
title: How Text Search Works
description: Understand how ParadeDB uses token matching to efficiently search large corpuses of text
canonical: https://docs.paradedb.com/documentation/full-text/overview
---

Text search in ParadeDB, like Elasticsearch and most search engines, is centered around the concept of **token matching**.

Token matching consists of two steps. First, at indexing time, text is processed by a tokenizer, which breaks the input into discrete units called **tokens** or
**terms**. For example, the [default](/documentation/indexing/create-index) tokenizer splits and lowercases the text `Sleek running shoes` into the tokens `sleek`, `running`, and `shoes`.

Second, at query time, the query engine looks for token matches based on the specified query and query type. Some common query types include:

- [Match](/documentation/full-text/match): Matches documents containing any or all query tokens
- [Phrase](/documentation/full-text/phrase): Matches documents where all query tokens appear in the same order as in the query
- [Term](/documentation/full-text/term): Matches documents containing an exact token
- ...and many more [advanced](/documentation/query-builder/overview) query types
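
The two steps above can be sketched in a few lines of Python. This is a toy model for illustration only, not ParadeDB's implementation: the `tokenize`, `build_index`, `term_query`, `match_query`, and `phrase_query` helpers are hypothetical names, the tokenizer is a simplified stand-in for a real one, and the phrase query assumes tokens must be strictly adjacent.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Toy tokenizer (an assumption): lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def build_index(docs):
    """Indexing time: map each token to {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenize(text)):
            index[token].setdefault(doc_id, []).append(pos)
    return index

def term_query(index, token):
    """Exact token lookup; the input is used as-is, without tokenization."""
    return set(index.get(token, {}))

def match_query(index, query, require_all=False):
    """Documents containing any (or, with require_all, all) query tokens."""
    sets = [term_query(index, t) for t in tokenize(query)]
    if not sets:
        return set()
    return set.intersection(*sets) if require_all else set.union(*sets)

def phrase_query(index, query):
    """Documents where the query tokens appear adjacent and in order."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    hits = set()
    for doc_id in match_query(index, query, require_all=True):
        for start in index[tokens[0]][doc_id]:
            if all(start + i in index[t][doc_id] for i, t in enumerate(tokens)):
                hits.add(doc_id)
                break
    return hits

docs = {
    1: "Sleek running shoes",
    2: "Running shoes, sleek and light",
    3: "Shoes for trail running",
}
index = build_index(docs)

print(sorted(match_query(index, "sleek shoes")))                    # [1, 2, 3]
print(sorted(match_query(index, "sleek shoes", require_all=True)))  # [1, 2]
print(sorted(phrase_query(index, "running shoes")))                 # [1, 2]
print(sorted(term_query(index, "sleek")))                           # [1, 2]
print(sorted(term_query(index, "Sleek")))                           # []
```

Note that the term query for `Sleek` finds nothing in this sketch: the index only holds lowercased tokens, and a term query looks up its input exactly as given.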

## Not Substring Matching

While ParadeDB supports substring matching via [regex](/documentation/query-builder/term/regex) queries, it's important to note that token matching is **not** the
same as substring matching.

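Token matching is a much more versatile and powerful technique. It enables relevance scoring, language-specific analysis, typo tolerance, and more expressive query types, capabilities that go far beyond simply looking for a sequence of characters. The trade-off is that a query only matches whole tokens: a substring search for `run` finds it inside `running`, but a token query for `run` does not match the token `running` unless the tokenizer is configured to produce `run`, for example through stemming.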
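
To make the difference concrete, here is a minimal Python sketch comparing the two approaches on the same text. The `tokenize`, `substring_match`, and `token_match` helpers are hypothetical, and the tokenizer is a simplified stand-in (lowercase and split on non-alphanumeric characters, no stemming) rather than ParadeDB's actual tokenizer.

```python
import re

def tokenize(text):
    """Toy tokenizer (an assumption): lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def substring_match(query, text):
    """True if the query occurs anywhere in the text as a character sequence."""
    return query.lower() in text.lower()

def token_match(query, text):
    """True only if the query equals one of the text's whole tokens."""
    return query.lower() in tokenize(text)

text = "Sleek running shoes"

print(substring_match("run", text))      # True: "run" is inside "running"
print(token_match("run", text))          # False: "run" is not a token
print(token_match("running", text))      # True: exact token
print(substring_match("ek ru", text))    # True: spans two words
print(token_match("ek ru", text))        # False: not a single token
```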

## Similarity Search

Text search is different from similarity search, also known as vector search. Text search matches documents by their tokens, whereas similarity search
matches by semantic meaning, typically by comparing vector embeddings.

Today, most ParadeDB users install [pgvector](https://github.com/pgvector/pgvector) alongside ParadeDB for vector search and hybrid search.
That remains our recommended setup when you need embeddings in Postgres right now.

We are also actively working on a native vector search experience inside ParadeDB indexes that is intended to improve on the current `pgvector`
workflow, especially for filtered and hybrid search. You can follow that work in our [roadmap](/welcome/roadmap#vector-search-improvements) or
[reach out](mailto:support@paradedb.com) if it is important for your use case.