---
title: Indexing JSON
description: Add JSON and JSONB types to the index
canonical: https://docs.paradedb.com/documentation/indexing/indexing-json
---

When a `JSON` or `JSONB` column is added to the index, ParadeDB automatically indexes every sub-field of the JSON object and infers the type of each sub-field.
For example, consider the following statement, where `metadata` is a `JSONB` column:

<CodeGroup>
```sql SQL
CREATE INDEX search_idx ON mock_items
USING bm25 (id, metadata)
WITH (key_field='id');
```

```ts Drizzle
import { indexing } from "@paradedb/drizzle-paradedb";

indexing.bm25Index("search_idx").on(mockItems.id, mockItems.metadata);
```

```python Django
from django.db import connection
from paradedb.indexes import BM25Index

with connection.schema_editor() as schema_editor:
    schema_editor.add_index(
        MockItem,
        BM25Index(
            fields={
                "id": {},
                "metadata": {},
            },
            key_field="id",
            name="search_idx",
        ),
    )
```

```python SQLAlchemy
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing

idx = Index(
    "search_idx",
    indexing.BM25Field(MockItem.id),
    indexing.BM25Field(MockItem.metadata_),
    postgresql_using="bm25",
    postgresql_with={"key_field": "id"},
)

with engine.begin() as conn:
    idx.create(conn)
```

```ruby Rails
ActiveRecord::Base.connection.add_bm25_index(
  :mock_items,
  fields: {
    id: {},
    metadata: {}
  },
  key_field: :id,
  name: :search_idx
)
```

```cs EF Core
modelBuilder.Entity<MockItem>()
    .HasBm25Index("search_idx", e => e.Id)
    .HasField(e => e.Metadata);
```

</CodeGroup>

A single `metadata` value may look like:

```json
{ "color": "Silver", "location": "United States" }
```

ParadeDB will automatically index both `metadata.color` and `metadata.location` as text.
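
To make the dotted sub-field names concrete, here is a rough Python sketch that walks a JSON object and lists each leaf path with a guessed type. It is only a mental model, not ParadeDB's implementation: the `flatten` and `infer_type` helpers and the type labels are hypothetical, and arrays and nulls are left out of scope.

```python
import json


def infer_type(value):
    """Guess a type label for a JSON leaf (labels are illustrative only)."""
    # bool must be checked before int because bool is a subclass of int in Python
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "numeric"
    if isinstance(value, str):
        return "text"
    return "unknown"  # arrays and nulls are not modeled in this sketch


def flatten(prefix, value):
    """Return {dotted_path: inferred_type} for every leaf under `value`."""
    if isinstance(value, dict):
        paths = {}
        for key, child in value.items():
            # Nested objects extend the path with another dot-separated key
            paths.update(flatten(f"{prefix}.{key}", child))
        return paths
    return {prefix: infer_type(value)}


metadata = json.loads('{ "color": "Silver", "location": "United States" }')
print(flatten("metadata", metadata))
# → {'metadata.color': 'text', 'metadata.location': 'text'}
```

A nested object would yield longer paths under the same scheme, for example `metadata.dimensions.width`.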

By default, all text sub-fields of a JSON object use the same tokenizer. This tokenizer can be configured the same way as for text fields:

<CodeGroup>
```sql SQL
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (metadata::pdb.ngram(2,3)))
WITH (key_field='id');
```

```ts Drizzle
import { indexing, tokenizer } from "@paradedb/drizzle-paradedb";

indexing
  .bm25Index("search_idx")
  .on(
    mockItems.id,
    indexing.bm25Field(mockItems.metadata, tokenizer.ngram(2, 3)),
  );
```

```python Django
from django.db import connection
from paradedb.indexes import BM25Index
from paradedb.search import Tokenizer

with connection.schema_editor() as schema_editor:
    schema_editor.add_index(
        MockItem,
        BM25Index(
            fields={
                "id": {},
                "metadata": {
                    "tokenizer": Tokenizer.ngram(2, 3),
                },
            },
            key_field="id",
            name="search_idx",
        ),
    )
```

```python SQLAlchemy
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing, tokenizer

idx = Index(
    "search_idx",
    indexing.BM25Field(MockItem.id),
    indexing.BM25Field(
        MockItem.metadata_,
        tokenizer=tokenizer.ngram(2, 3),
    ),
    postgresql_using="bm25",
    postgresql_with={"key_field": "id"},
)

with engine.begin() as conn:
    idx.create(conn)
```

```ruby Rails
ActiveRecord::Base.connection.add_bm25_index(
  :mock_items,
  fields: {
    id: {},
    metadata: { tokenizer: Tokenizer.ngram(2, 3) }
  },
  key_field: :id,
  name: :search_idx
)
```

```cs EF Core
modelBuilder.Entity<MockItem>()
    .HasBm25Index("search_idx", e => e.Id)
    .HasField(e => e.Metadata, Tokenizer.Ngram(2, 3));
```

</CodeGroup>
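
For intuition about what an n-gram tokenizer with a minimum length of 2 and a maximum length of 3 produces, the following Python sketch generates character n-grams for a value such as `"Silver"`. It is a conceptual illustration under stated assumptions, not ParadeDB's tokenizer: the `char_ngrams` helper is hypothetical, lowercasing is assumed, and the real token order and normalization may differ.

```python
def char_ngrams(text, min_gram, max_gram):
    """Return all character n-grams of `text` with lengths min_gram..max_gram."""
    text = text.lower()  # assumption: tokens are lowercased before splitting
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            # Only emit n-grams that fit entirely inside the text
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


print(char_ngrams("Silver", 2, 3))
# → ['si', 'sil', 'il', 'ilv', 'lv', 'lve', 've', 'ver', 'er']
```

Because every short substring becomes a token, a partial term such as `ilv` can match `Silver`, at the cost of a larger index.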

Instead of indexing the entire JSON object, you can index individual sub-fields. This makes it possible to configure a separate tokenizer
for each sub-field within a larger JSON object:

<CodeGroup>
```sql SQL
CREATE INDEX search_idx ON mock_items
USING bm25 (id, ((metadata->>'color')::pdb.ngram(2,3)))
WITH (key_field='id');
```

```ts Drizzle
import { indexing, tokenizer } from "@paradedb/drizzle-paradedb";

indexing
  .bm25Index("search_idx")
  .on(
    mockItems.id,
    indexing.bm25Field(
      indexing.jsonText(mockItems.metadata, "color"),
      tokenizer.ngram(2, 3),
    ),
  );
```

```python Django
from django.db import connection
from paradedb.indexes import BM25Index
from paradedb.search import Tokenizer

with connection.schema_editor() as schema_editor:
    schema_editor.add_index(
        MockItem,
        BM25Index(
            fields={
                "id": {},
                "metadata": {
                    "json_keys": {
                        "color": {
                            "tokenizer": Tokenizer.ngram(2, 3),
                        },
                    },
                },
            },
            key_field="id",
            name="search_idx",
        ),
    )
```

```python SQLAlchemy
from sqlalchemy import Index
from paradedb.sqlalchemy import expr, indexing, tokenizer

idx = Index(
    "search_idx",
    indexing.BM25Field(MockItem.id),
    indexing.BM25Field(
        expr.json_text(MockItem.metadata_, "color"),
        tokenizer=tokenizer.ngram(2, 3),
    ),
    postgresql_using="bm25",
    postgresql_with={"key_field": "id"},
)

with engine.begin() as conn:
    idx.create(conn)
```

```ruby Rails
ActiveRecord::Base.connection.add_bm25_index(
  :mock_items,
  fields: {
    id: {},
    "metadata->>'color'" => { tokenizer: Tokenizer.ngram(2, 3) }
  },
  key_field: :id,
  name: :search_idx
)
```

```cs EF Core
modelBuilder.Entity<MockItem>()
    .HasBm25Index("search_idx", e => e.Id)
    .HasField("metadata->>'color'", Tokenizer.Ngram(2, 3));
```

</CodeGroup>