---
title: Highlighting
---

Highlighting is an expensive process and can slow down query times. We recommend passing a `LIMIT` to any query that calls `pdb.snippet` to restrict the number of snippets that need to be generated.

Highlighting is not supported for queries that use fuzziness, like `paradedb.fuzzy_term`.

Highlighting refers to the practice of visually emphasizing the portions of a document that match a user's search query.

## Basic Usage

`pdb.snippet()` can be added to any query where an `@@@` operator is present. The following query generates highlighted snippets against the `description` field.

```sql
SELECT id, pdb.snippet(description)
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

By default, `<b></b>` encloses each highlighted term in the snippet. This can be configured with `start_tag` and `end_tag`:

```sql
SELECT id, pdb.snippet(description, start_tag => '<i>', end_tag => '</i>')
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

## Fragment Size

For every highlighted term, a fragment of size `max_num_chars` is created containing the term and its surrounding text. A fragment can contain multiple highlighted terms if they are within `max_num_chars` distance of one another. By default, `max_num_chars` is set to `150`.

```sql
SELECT id, pdb.snippet(description, max_num_chars => 100)
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

If multiple fragments are found, `pdb.snippet` uses a two-tiered scoring system to determine which fragment to display:

1. Each highlighted term receives a score based on its inverse document frequency. This means that fragments containing rarer terms will score higher.
2. If there is a tie, the fragment that appears earlier in the source text will be displayed.

## Byte Offsets

`pdb.snippet_positions()` returns the byte offsets in the original text where the snippets would appear.
It returns an array of tuples, where the first element of each tuple is the byte index of the first byte of the highlighted region, and the second element is the byte index after the last byte of the region.

```sql
SELECT id, pdb.snippet(description), pdb.snippet_positions(description)
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

```csv
 id |          snippet           | snippet_positions
----+----------------------------+-------------------
  3 | Sleek running <b>shoes</b> | {"{14,19}"}
  4 | White jogging <b>shoes</b> | {"{14,19}"}
  5 | Generic <b>shoes</b>       | {"{8,13}"}
(3 rows)
```

## Snippet Limit and Offset

Both `pdb.snippet` and `pdb.snippet_positions` accept `limit` and `offset` arguments. `limit` caps the number of highlighted terms, while `offset` skips the first `offset` highlighted terms. This can be useful for paginating through documents that contain large numbers of highlighted terms.

```sql
SELECT id, pdb.snippet(description, "limit" => 1, "offset" => 1)
FROM mock_items
WHERE description @@@ 'shoes'
  AND description @@@ 'sleek'
  AND description @@@ 'running';
```

```sql Expected Response
 id |          snippet
----+----------------------------
  3 | Sleek <b>running</b> shoes
(1 row)
```

The `limit` and `offset` arguments must be wrapped in double quotes because they are reserved keywords in Postgres.

In the output above, notice that `sleek` is not highlighted because an offset of `1` skips the first highlighted term. Similarly, `shoes` is not highlighted because the limit of `1` allows only one highlighted term.
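
The offsets reported by `pdb.snippet_positions` count bytes, not characters, so slicing with character-based functions can drift on text that contains multi-byte characters. As a rough sketch using only standard PostgreSQL functions (`convert_to`, `convert_from`, and the `bytea` form of `substring`), the first query below recovers the region `{14,19}` from the byte-offset sample output earlier on this page. The second query uses a made-up string, `Café shoes`, with a hand-computed offset pair `{6,11}`; that pair is an assumption for illustration, not actual ParadeDB output. Both queries assume a UTF-8 database and client encoding.

```sql
-- Offsets are 0-based and end-exclusive, while substring() is 1-based,
-- so the slice starts at (start + 1) and spans (end - start) bytes.
SELECT convert_from(
         substring(convert_to('Sleek running shoes', 'UTF8') FROM 14 + 1 FOR 19 - 14),
         'UTF8'
       ) AS by_bytes;            -- shoes

-- Hypothetical multi-byte case: 'é' takes two bytes in UTF-8, so 'shoes'
-- begins at byte 6 but at character 5. The {6,11} pair is hand-computed.
SELECT convert_from(
         substring(convert_to('Café shoes', 'UTF8') FROM 6 + 1 FOR 11 - 6),
         'UTF8'
       ) AS by_bytes,            -- shoes
       substring('Café shoes' FROM 6 + 1 FOR 11 - 6) AS by_chars;  -- hoes
```

The same conversion applies in client code: encode the text as UTF-8, slice by the reported byte range, then decode the slice.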