---
title: Create an Index
description: Index a Postgres table for full text search
canonical: https://docs.paradedb.com/documentation/indexing/create-index
---
Before a table can be searched, it must be indexed. ParadeDB uses a custom index type called the BM25 index.
The following code block creates a BM25 index over several columns in the `mock_items` table.
```sql SQL
CREATE INDEX search_idx ON mock_items
USING bm25 (id, description, category)
WITH (key_field='id');
```
```ts Drizzle
import { indexing } from "@paradedb/drizzle-paradedb";
indexing
.bm25Index("search_idx")
.on(mockItems.id, mockItems.description, mockItems.category);
```
```python Django
from django.db import connection
from paradedb.indexes import BM25Index
with connection.schema_editor() as schema_editor:
schema_editor.add_index(
MockItem,
BM25Index(
fields={
"id": {},
"description": {},
"category": {},
},
key_field="id",
name="search_idx",
),
)
```
```python SQLAlchemy
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing
idx = Index(
"search_idx",
indexing.BM25Field(MockItem.id),
indexing.BM25Field(MockItem.description),
indexing.BM25Field(MockItem.category),
postgresql_using="bm25",
postgresql_with={"key_field": "id"},
)
with engine.begin() as conn:
idx.create(conn)
```
```ruby Rails
ActiveRecord::Base.connection.add_bm25_index(
:mock_items,
fields: {
id: {},
description: {},
category: {}
},
key_field: :id,
name: :search_idx
)
```
```cs EF Core
modelBuilder.Entity()
.HasBm25Index("search_idx", e => e.Id)
.HasField(e => e.Description)
.HasField(e => e.Category);
```
See the [getting started guide](/documentation/getting-started/environment)
for more detail on how to set up your ORM to run index creation commands.
You'll need to drop the existing `search_idx` before you can create a new one:
```sql SQL
DROP INDEX search_idx;
```
```ts Drizzle
import { sql } from "drizzle-orm";
await db.execute(sql`DROP INDEX search_idx`);
```
```python Django
from django.db import connection
with connection.cursor() as cursor:
cursor.execute("DROP INDEX search_idx")
```
```python SQLAlchemy
from sqlalchemy import text
with engine.begin() as conn:
conn.execute(text("DROP INDEX search_idx"))
```
```ruby Rails
ActiveRecord::Base.connection.remove_bm25_index(:mock_items, name: :search_idx)
```
```cs EF Core
await dbContext.Database.ExecuteSqlRawAsync("DROP INDEX search_idx");
```
By default, text columns are tokenized using the [unicode](/documentation/tokenizers/available-tokenizers/unicode) tokenizer, which splits text according to the
Unicode segmentation standard. Because index creation is a time-consuming operation, we recommend experimenting with the [available tokenizers](/documentation/tokenizers/overview)
to find the most suitable one before running `CREATE INDEX`.
For instance, if a column contains multiple languages, the ICU tokenizer may be more appropriate.
```sql SQL
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.icu), category)
WITH (key_field='id');
```
```ts Drizzle
import { indexing, tokenizer } from "@paradedb/drizzle-paradedb";
indexing
.bm25Index("search_idx")
.on(
mockItems.id,
indexing.bm25Field(mockItems.description, tokenizer.icu()),
mockItems.category,
);
```
```python Django
from django.db import connection
from paradedb.indexes import BM25Index
from paradedb.search import Tokenizer
with connection.schema_editor() as schema_editor:
schema_editor.add_index(
MockItem,
BM25Index(
fields={
"id": {},
"description": {"tokenizer": Tokenizer.icu()},
"category": {},
},
key_field="id",
name="search_idx",
),
)
```
```python SQLAlchemy
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing, tokenizer
idx = Index(
"search_idx",
indexing.BM25Field(MockItem.id),
indexing.BM25Field(
MockItem.description,
tokenizer=tokenizer.icu(),
),
indexing.BM25Field(MockItem.category),
postgresql_using="bm25",
postgresql_with={"key_field": "id"},
)
with engine.begin() as conn:
idx.create(conn)
```
```ruby Rails
ActiveRecord::Base.connection.add_bm25_index(
:mock_items,
fields: {
id: {},
description: { tokenizer: Tokenizer.icu() },
category: {}
},
key_field: :id,
name: :search_idx
)
```
```cs EF Core
modelBuilder.Entity()
.HasBm25Index("search_idx", e => e.Id)
.HasField(e => e.Description, Tokenizer.Icu())
.HasField(e => e.Category);
```
Only one BM25 index can exist per table. We recommend indexing all columns in a table that may be present in a search query,
including columns used for sorting, grouping, filtering, and aggregations.
```sql SQL
CREATE INDEX search_idx ON mock_items
USING bm25 (id, description, category, rating, in_stock, created_at, metadata, weight_range)
WITH (key_field='id');
```
```ts Drizzle
import { indexing } from "@paradedb/drizzle-paradedb";
indexing
.bm25Index("search_idx")
.on(
mockItems.id,
mockItems.description,
mockItems.category,
mockItems.rating,
mockItems.inStock,
mockItems.createdAt,
mockItems.metadata,
mockItems.weightRange,
);
```
```python Django
from django.db import connection
from paradedb.indexes import BM25Index
with connection.schema_editor() as schema_editor:
schema_editor.add_index(
MockItem,
BM25Index(
fields={
"id": {},
"description": {},
"category": {},
"rating": {},
"in_stock": {},
"created_at": {},
"metadata": {},
"weight_range": {},
},
key_field="id",
name="search_idx",
),
)
```
```python SQLAlchemy
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing
idx = Index(
"search_idx",
indexing.BM25Field(MockItem.id),
indexing.BM25Field(MockItem.description),
indexing.BM25Field(MockItem.category),
indexing.BM25Field(MockItem.rating),
indexing.BM25Field(MockItem.in_stock),
indexing.BM25Field(MockItem.created_at),
indexing.BM25Field(MockItem.metadata_),
indexing.BM25Field(MockItem.weight_range),
postgresql_using="bm25",
postgresql_with={"key_field": "id"},
)
with engine.begin() as conn:
idx.create(conn)
```
```ruby Rails
ActiveRecord::Base.connection.add_bm25_index(
:mock_items,
fields: {
id: {},
description: {},
category: {},
rating: {},
in_stock: {},
created_at: {},
metadata: {},
weight_range: {}
},
key_field: :id,
name: :search_idx
)
```
```cs EF Core
modelBuilder.Entity()
.HasBm25Index("search_idx", e => e.Id)
.HasField(e => e.Description)
.HasField(e => e.Category)
.HasField(e => e.Rating)
.HasField(e => e.InStock)
.HasField(e => e.CreatedAt)
.HasField(e => e.Metadata)
.HasField(e => e.WeightRange);
```
Most Postgres types, including text, JSON, numeric, timestamp, range, boolean, and arrays, can be indexed.
## Track Create Index Progress
To monitor the progress of a long-running `CREATE INDEX`, open a separate Postgres connection and query `pg_stat_progress_create_index`:
```sql
SELECT pid, phase, blocks_done, blocks_total
FROM pg_stat_progress_create_index;
```
Comparing `blocks_done` to `blocks_total` will provide a good approximation of the progress so far. If `blocks_done` equals
`blocks_total`, that means that all rows have been indexed and the index is being flushed to disk.
## Choosing a Key Field
In the `CREATE INDEX` statement above, note the mandatory `key_field` option.
Every BM25 index needs a `key_field`, which is the name of a column that will function as a row’s unique identifier within the index.
The `key_field` must:
1. Have a `UNIQUE` constraint. Usually this means the table's `PRIMARY KEY`.
2. Be the first column in the column list.
3. Be untokenized, if it is a text field.
## Token Filters
After tokens are created, [token filters](/documentation/token-filters/overview) can be configured to apply further processing like lowercasing, stemming, or unaccenting.
For example, the following code block adds English stemming to `description`:
```sql SQL
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.simple('stemmer=english')), category)
WITH (key_field='id');
```
```ts Drizzle
import { indexing, tokenizer } from "@paradedb/drizzle-paradedb";
indexing
.bm25Index("search_idx")
.on(
mockItems.id,
indexing.bm25Field(
mockItems.description,
tokenizer.simple({ stemmer: "english" }),
),
mockItems.category,
);
```
```python Django
from django.db import connection
from paradedb.indexes import BM25Index
from paradedb.search import Tokenizer
with connection.schema_editor() as schema_editor:
schema_editor.add_index(
MockItem,
BM25Index(
fields={
"id": {},
"description": {
"tokenizer": Tokenizer.simple(
options={"stemmer": "english"}
),
},
"category": {},
},
key_field="id",
name="search_idx",
),
)
```
```python SQLAlchemy
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing, tokenizer
idx = Index(
"search_idx",
indexing.BM25Field(MockItem.id),
indexing.BM25Field(
MockItem.description,
tokenizer=tokenizer.simple(options={"stemmer": "english"}),
),
indexing.BM25Field(MockItem.category),
postgresql_using="bm25",
postgresql_with={"key_field": "id"},
)
with engine.begin() as conn:
idx.create(conn)
```
```ruby Rails
ActiveRecord::Base.connection.add_bm25_index(
:mock_items,
fields: {
id: {},
description: {
tokenizer: Tokenizer.simple(options: { stemmer: "english" })
},
category: {}
},
key_field: :id,
name: :search_idx
)
```
```cs EF Core
modelBuilder.Entity()
.HasBm25Index("search_idx", e => e.Id)
.HasField(e => e.Description, Tokenizer.Simple(new() { ["stemmer"] = "english" }))
.HasField(e => e.Category);
```