The requirement is to identify an AWS database service that supports the storage and querying of embeddings (from a generative AI model) as vectors. Embeddings are typically high-dimensional numerical representations of data (e.g., text, images) used in AI applications like conversational search. The database must support vector storage and efficient vector similarity searches. Let’s evaluate each option:
A. Amazon Athena: Amazon Athena is a serverless query service for analyzing data in Amazon S3 using SQL. It is designed for ad-hoc querying of structured data but does not natively support vector storage or vector similarity searches, making it unsuitable for this use case.
B. Amazon Aurora PostgreSQL: Amazon Aurora PostgreSQL is a fully managed relational database compatible with PostgreSQL. With the pgvector extension (available in PostgreSQL and supported by Aurora PostgreSQL), it can store and query vector embeddings efficiently. The pgvector extension enables vector similarity searches (e.g., using cosine similarity or Euclidean distance), which is critical for conversational search applications using embeddings from generative AI models.
C. Amazon Redshift: Amazon Redshift is a data warehousing service optimized for analytical queries on large datasets. While it supports machine learning features and can store numerical data, it does not have native support for vector embeddings or vector similarity searches as of May 17, 2025, making it less suitable for this use case.
D. Amazon EMR: Amazon EMR is a managed big data platform for processing large-scale data using frameworks like Apache Hadoop and Spark. It is not a database service and is not designed for storing or querying vector embeddings in the context of a conversational search application.
Exact Extract Reference: According to the AWS documentation, “Amazon Aurora PostgreSQL-Compatible Edition supports the pgvector extension, which enables efficient storage and similarity searches for vector embeddings. This makes it suitable for AI/ML workloads such as natural language processing and recommendation systems that rely on vector data.” (Source: AWS Aurora Documentation, Using pgvector with Aurora PostgreSQL, https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/PostgreSQLpgvector.html). Additionally, the pgvector extension supports operations like nearest-neighbor searches, which are essential for querying embeddings in a conversational search system.
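To make the nearest-neighbor idea concrete, here is a minimal sketch in plain Python of what a vector similarity search computes. It is an illustration only, not how pgvector is implemented: the document IDs, the three-dimensional embeddings, and the function names are all made up for the example, and a real deployment would run the search inside the database (pgvector exposes distance operators such as `<->` for Euclidean distance and `<=>` for cosine distance, typically used in an `ORDER BY ... LIMIT k` query).

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance, i.e. 1 minus cosine similarity (assumes non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbors(query, rows, k=2, distance=cosine_distance):
    """Return the IDs of the k rows whose embeddings are closest to the query.

    This is an exact, brute-force scan over every row; it only illustrates
    the ranking step that a vector database performs.
    """
    ranked = sorted(rows, key=lambda row: distance(query, row[1]))
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical toy data: (document id, embedding). Real embeddings from a
# generative AI model usually have hundreds or thousands of dimensions.
documents = [
    ("refund-policy", [0.9, 0.1, 0.0]),
    ("shipping-times", [0.1, 0.9, 0.1]),
    ("return-window", [0.8, 0.2, 0.1]),
]

# Hypothetical embedding of the user's question.
query_embedding = [1.0, 0.0, 0.0]

top = nearest_neighbors(query_embedding, documents, k=2)
print(top)
```

In a conversational search application, the IDs returned this way would identify the stored passages most similar to the user's question, which are then passed to the model as context.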
Amazon Aurora PostgreSQL with the pgvector extension directly meets the requirement for storing and querying embeddings as vectors, making B the correct answer.
References: AWS Aurora Documentation: Using pgvector with Aurora PostgreSQL (https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/PostgreSQLpgvector.html); AWS AI Practitioner Study Guide (focus on data engineering for AI, including vector databases); AWS Blog on Vector Search with Aurora (https://aws.amazon.com/blogs/database/using-vector-search-with-amazon-aurora-postgresql/)