Option A is the correct solution because hybrid search directly addresses the core retrieval failure modes while maintaining low latency and minimal operational overhead. In medical and scientific domains, exact terminology, abbreviations, and acronyms (for example, drug names, procedures, or conditions) are critical. Pure vector similarity search often underweights these exact matches, so it misses relevant results and returns too many documents that are semantically related but irrelevant.
Amazon OpenSearch Service natively supports hybrid search, which combines keyword-based retrieval (such as BM25) with vector similarity search. Keyword search ensures precise matching for exact terms and acronyms, while vector search captures semantic meaning and contextual similarity. By blending these approaches, the retrieval system improves both precision and recall without introducing additional infrastructure.
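As a rough illustration of how the two retrieval modes sit side by side in one request, the sketch below builds a request body using OpenSearch's `hybrid` query type, with one `match` clause (BM25-scored) and one `knn` clause. The field names `content` and `content_embedding`, the helper `build_hybrid_query`, and the sample embedding are assumptions made for this example and do not come from the question.

```python
from typing import Any


def build_hybrid_query(
    text: str,
    embedding: list[float],
    k: int = 10,
    text_field: str = "content",                # hypothetical text field name
    vector_field: str = "content_embedding",    # hypothetical knn_vector field name
) -> dict[str, Any]:
    """Build an OpenSearch `hybrid` query body with one lexical and one k-NN clause."""
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical clause: BM25-scored match for exact terms and acronyms.
                    {"match": {text_field: {"query": text}}},
                    # Vector clause: nearest-neighbour search on the query embedding.
                    {"knn": {vector_field: {"vector": embedding, "k": k}}},
                ]
            }
        },
    }


# Toy 3-dimensional embedding; a real one would come from an embedding model.
body = build_hybrid_query("MI treatment guidelines", [0.12, -0.03, 0.88], k=5)
```

The resulting dictionary would be sent as the body of a `_search` request, with the `search_pipeline` query parameter naming a search pipeline that contains a normalization processor, since a hybrid query relies on that processor to merge the two result sets.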
Hybrid search operates within the same OpenSearch index and query path, which preserves low end-user latency even at large scale. This is especially important as the document collection grows to millions of documents. Because OpenSearch normalizes and combines the keyword and vector scores internally, through a search pipeline configured in the cluster, no external orchestration layer or application-side post-processing step is required.
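To make the internal scoring step concrete, here is a minimal sketch of one common way to blend lexical and vector scores: min-max normalization followed by a weighted arithmetic mean. OpenSearch's normalization processor offers techniques with these names, but the code below only illustrates the idea and is not OpenSearch's implementation. The document IDs, raw scores, and the 0.6 lexical weight are made-up values.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale raw scores to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a full match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def blend(
    bm25: dict[str, float],
    knn: dict[str, float],
    w_lex: float = 0.5,
) -> list[tuple[str, float]]:
    """Combine normalized lexical and vector scores with a weighted mean."""
    n_lex, n_vec = min_max(bm25), min_max(knn)
    docs = set(n_lex) | set(n_vec)
    combined = {
        # A document missing from one result list contributes 0 for that side.
        d: w_lex * n_lex.get(d, 0.0) + (1.0 - w_lex) * n_vec.get(d, 0.0)
        for d in docs
    }
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical raw scores: BM25 strongly favours the exact acronym match,
# while vector similarity slightly prefers a loosely related document.
bm25_scores = {"doc_acronym": 12.0, "doc_related": 2.0}
knn_scores = {"doc_related": 0.90, "doc_loose": 0.80, "doc_acronym": 0.70}

ranked = blend(bm25_scores, knn_scores, w_lex=0.6)
top_doc = ranked[0][0]
```

With these toy numbers the exact-match document ranks first even though it has the lowest vector similarity, which is the behavior the explanation relies on for acronyms and drug names.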
Option B increases computational cost and latency while failing to address exact-term recall. Option C introduces a new service and ingestion pipeline, increasing operational overhead and latency. Option D adds model hosting, re-ranking infrastructure, and complexity that is unnecessary when OpenSearch provides native hybrid retrieval.
Therefore, Option A delivers the best balance of retrieval quality, scalability, latency, and operational simplicity for medical RAG workloads.