Advanced Search

Gooclaim’s Knowledge Layer search is built for the messy reality of insurance content: long policy PDFs with overlapping clauses, partner-specific overrides, multi-language source material, and constantly evolving rate tables. This page explains the search capabilities you can tune per source.

Search modes

Semantic

Embedding-based similarity. Best for natural-language questions where keywords don’t match exactly — “what does the policy say about pre-existing conditions?” finds clauses worded differently.

Keyword

BM25 lexical match. Best for IDs, codes, and exact phrases — policy numbers, ICD-10 codes, specific clause references.

Hybrid

The default. Runs semantic and keyword search in parallel, then re-ranks the combined results. Recommended for most workflow queries, since it combines the precision of keyword matching with the recall of semantic search.
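As a rough illustration of the guidance above, a client could fall back to keyword mode only when a query looks like an ID, a code, or an exact phrase, and otherwise stay on the hybrid default. This is a minimal sketch, not Gooclaim's own routing logic: the helper names are hypothetical, the heuristic is deliberately crude, and the mode string "keyword" is an assumption (only "hybrid" appears in the request examples on this page).

```python
import string


def looks_like_code(token: str) -> bool:
    """True for tokens that mix letters and digits, such as 'E11.9' or 'ICD-10'."""
    has_digit = any(ch.isdigit() for ch in token)
    has_alpha = any(ch.isalpha() for ch in token)
    return has_digit and has_alpha


def pick_mode(query: str) -> str:
    """Pick a search mode with a crude, illustrative heuristic (hypothetical helper)."""
    stripped = query.strip()
    # A fully quoted query signals an exact-phrase lookup.
    if len(stripped) >= 2 and stripped[0] == '"' and stripped[-1] == '"':
        return "keyword"  # assumed mode name
    # Drop leading/trailing punctuation so "E11.9?" is inspected as "E11.9".
    tokens = [t.strip(string.punctuation) for t in stripped.split()]
    if any(looks_like_code(t) for t in tokens):
        return "keyword"  # assumed mode name
    return "hybrid"  # the documented default
```

Purely numeric tokens (for example "2" in "waiting period of 2 years") are left on the hybrid path here, so all-digit policy numbers would need an extra rule.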

AI query expansion

For ambiguous claimant questions (“kab milega paisa?” / “when will I get paid?”), the search engine expands the raw query into multiple semantic variants before hitting the index. This improves recall without manual synonym dictionaries.

Expansion is on by default but can be disabled per query if you need deterministic results (e.g. an audit-traceable lookup).

POST /v1/knowledge/search
{
  "query": "kab milega paisa",
  "expand": true,
  "mode": "hybrid",
  "top_k": 5
}

The response includes both the expanded variants used and the matched documents, so audits show the full reasoning trail.
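For an audit-traceable lookup, the same request can be built with expansion switched off. The sketch below only constructs the request and does not send it. The base URL is a placeholder and the helper name is hypothetical; authentication headers are omitted because this page does not describe them.

```python
import json
import urllib.request

# Placeholder host (assumption); substitute the base URL for your own account.
BASE_URL = "https://api.example.invalid"


def build_search_request(query, *, expand=True, mode="hybrid", top_k=5):
    """Build (but do not send) a POST /v1/knowledge/search request."""
    payload = {"query": query, "expand": expand, "mode": mode, "top_k": top_k}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/knowledge/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Audit-traceable lookup: disable expansion so the raw query is used as-is.
audit_request = build_search_request("kab milega paisa", expand=False)
audit_body = json.loads(audit_request.data)
```

Keeping the default at `expand=True` mirrors the documented behaviour, so callers opt out explicitly only where determinism matters.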

Metadata filtering

Every document indexed carries metadata — source, tenant, document type, language, tags applied at ingest. Filter results to scope queries:

POST /v1/knowledge/search
{
  "query": "maternity benefit waiting period",
  "filters": {
    "document_type": ["policy", "endorsement"],
    "language": ["en", "hi_en"],
    "tags": ["plan_b"]
  }
}

Use this to:

Restrict to policy documents only (exclude internal SOPs from claimant-facing answers)
Match the claimant’s preferred language
Apply partner-specific overrides when the same question has different answers per plan
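The three uses above can be combined in one small helper that derives the filters from what is known about the claimant. This is a sketch under assumptions: the function name and its parameters are hypothetical, and the filter keys and values are taken from the request example on this page.

```python
def claimant_filters(languages, plan_tag):
    """Build a `filters` object for claimant-facing queries (hypothetical helper)."""
    return {
        # Policy content only, so internal SOPs never reach claimant-facing answers.
        "document_type": ["policy", "endorsement"],
        # The claimant's preferred language(s).
        "language": list(languages),
        # Plan tag applied at ingest, so plan-specific overrides are picked up.
        "tags": [plan_tag],
    }


payload = {
    "query": "maternity benefit waiting period",
    "filters": claimant_filters(["en", "hi_en"], "plan_b"),
}
```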

AI-generated answers

By default, search returns ranked passages with citations. If you want a synthesised answer paragraph (e.g. for a rich-content channel like a web chat widget), enable generate_answer:

POST /v1/knowledge/search
{
  "query": "what is the network hospital limit?",
  "generate_answer": true,
  "answer_template_id": "policy_lookup_v2"
}

The generated answer always:

Cites the source passages it drew from
Passes through the Policy Gate before return
Uses an approved template as the formatting scaffold — never raw LLM output

In pilot (v1.0), generated answers are available for internal Console preview and the public API. End-user messaging surfaces always use the template-based response builder, not free-text generation.

Ranking and re-rank

Hybrid mode runs three stages:

Recall — pull top-50 from semantic + top-50 from keyword (de-duplicated)
Re-rank — cross-encoder model scores each candidate against the query
Trim — return the top-K to the caller

You can tune recall_size (default 100) and disable re-rank for latency-sensitive paths. Re-rank typically adds 50-150 ms but lifts precision by 15-25% on the messy queries that benefit most from it.
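The three stages can be sketched in a few lines of Python. This illustrates the general recall, re-rank, trim pattern, not Gooclaim's implementation. The retrievers and the cross-encoder are passed in as plain callables (hypothetical stand-ins), and splitting recall_size evenly between the two retrievers is an assumption inferred from the 50 + 50 figures above.

```python
from typing import Callable, List, Tuple

Scored = List[Tuple[str, float]]  # (doc_id, score) pairs, best first


def hybrid_search(
    query: str,
    semantic_search: Callable[[str, int], Scored],
    keyword_search: Callable[[str, int], Scored],
    rerank_score: Callable[[str, str], float],
    top_k: int = 5,
    recall_size: int = 100,
    rerank: bool = True,
) -> List[str]:
    per_retriever = recall_size // 2  # assumed even split between retrievers

    # Stage 1, recall: merge both result lists, keeping the first occurrence.
    candidates: List[str] = []
    seen = set()
    hits = semantic_search(query, per_retriever) + keyword_search(query, per_retriever)
    for doc_id, _score in hits:
        if doc_id not in seen:
            seen.add(doc_id)
            candidates.append(doc_id)

    # Stage 2, re-rank: score every candidate against the query (optional).
    if rerank:
        candidates.sort(key=lambda doc_id: rerank_score(query, doc_id), reverse=True)

    # Stage 3, trim: return only the top-K to the caller.
    return candidates[:top_k]


# Toy stand-ins for the two retrievers and the cross-encoder.
toy_semantic = lambda q, n: [("doc_a", 0.9), ("doc_b", 0.8)][:n]
toy_keyword = lambda q, n: [("doc_b", 7.1), ("doc_c", 3.2)][:n]
toy_scores = {"doc_a": 0.2, "doc_b": 0.9, "doc_c": 0.5}

result = hybrid_search(
    "maternity waiting period",
    toy_semantic,
    toy_keyword,
    lambda q, doc_id: toy_scores[doc_id],
    top_k=2,
)
# result == ["doc_b", "doc_c"]
```

With `rerank=False` the same call returns the candidates in merge order, which is the trade-off behind disabling re-rank on latency-sensitive paths.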

Best practices

Start narrow on indexed content

A 1000-document index with high signal-to-noise produces better answers than a 50,000-document index full of stale memos. Curate ruthlessly.

Tag at ingest, not at query time

Tag documents with plan codes, partner IDs, and document types when you set up the source connector. Query-time filters then become trivial and indexes stay fast.

Audit query logs weekly

The Audit Ledger captures every Knowledge Layer query plus the documents that scored highest. Review it weekly to spot drift: questions that used to be answered correctly but no longer are usually point to a source that has gone stale.
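One way to make the weekly review mechanical is to compare the top-scoring documents for the same query across two weeks and flag large changes. The sketch below assumes the logs have already been exported into plain dictionaries mapping each query to its top document IDs. That shape, the function names, and the 0.5 threshold are all hypothetical and are not the Audit Ledger's actual format.

```python
def top_doc_overlap(before, after):
    """Jaccard overlap between two collections of document IDs."""
    a, b = set(before), set(after)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_drift(last_week, this_week, threshold=0.5):
    """Return queries whose top documents overlap less than `threshold`."""
    drifted = []
    for query, old_docs in last_week.items():
        new_docs = this_week.get(query)
        if new_docs is None:
            continue  # query was not asked this week; nothing to compare
        if top_doc_overlap(old_docs, new_docs) < threshold:
            drifted.append(query)
    return sorted(drifted)


# Illustrative data only.
last_week = {
    "maternity benefit waiting period": ["doc_a", "doc_b", "doc_c"],
    "network hospital limit": ["doc_d", "doc_e"],
}
this_week = {
    "maternity benefit waiting period": ["doc_a", "doc_b", "doc_c"],
    "network hospital limit": ["doc_x", "doc_y"],
}
flagged = find_drift(last_week, this_week)
# flagged == ["network hospital limit"]
```

A flagged query is only a prompt for human review: a changed document set can also mean a source was deliberately updated.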

Pair with the Truth Layer for status questions

Don’t ask the Knowledge Layer “where’s my claim?” That is a Truth Layer job (a live CMS query). Knowledge is for the why; Truth is for the what. The Workflow Engine routes correctly when both are configured.

Next steps

Knowledge Sources

Connect your first source if you haven’t yet.

API Reference

Full schema for the search endpoints.