Multi-tenant RAG implementation with Amazon Bedrock and Amazon OpenSearch Service for SaaS using JWT

In recent years, the emergence of large language models (LLMs) has accelerated AI adoption across various industries. However, to further augment LLMs’ capabilities and effectively use up-to-date information and domain-specific knowledge, integration with external data sources is essential. Retrieval Augmented Generation (RAG) has gained attention as an effective approach to address this challenge. RAG is […]
Enhance generative AI solutions using Amazon Q index with Model Context Protocol – Part 1

Today’s enterprises increasingly rely on AI-driven applications to enhance decision-making, streamline workflows, and deliver improved customer experiences. Achieving these outcomes demands secure, timely, and accurate access to authoritative data—especially when such data resides across diverse repositories and applications within strict enterprise security boundaries. Interoperable technologies powered by open standards like the Model Context Protocol (MCP) […]
Beyond accelerators: Lessons from building foundation models on AWS with Japan’s GENIAC program

In 2024, the Ministry of Economy, Trade and Industry (METI) launched the Generative AI Accelerator Challenge (GENIAC)—a Japanese national program to boost generative AI by providing companies with funding, mentorship, and massive compute resources for foundation model (FM) development. AWS was selected as the cloud provider for GENIAC’s second cycle (cycle 2). It provided infrastructure […]
Streamline deep learning environments with Amazon Q Developer and MCP

Data science teams working with artificial intelligence and machine learning (AI/ML) face a growing challenge as models become more complex. While Amazon Deep Learning Containers (DLCs) offer robust baseline environments out of the box, customizing them for specific projects often requires significant time and expertise. In this post, we explore how to use Amazon Q Developer and Model […]
Build an AI-powered automated summarization system with Amazon Bedrock and Amazon Transcribe using Terraform

Extracting meaningful insights from unstructured data presents significant challenges for many organizations. Meeting recordings, customer interactions, and interviews contain invaluable business intelligence that remains largely inaccessible due to the prohibitive time and resource costs of manual review. Organizations frequently struggle to efficiently capture and use key information from these interactions, resulting in not only productivity […]
Kyruus builds a generative AI provider matching solution on AWS

This post was written with Zach Heath of Kyruus Health. When health plan members need care, they shouldn’t need a dictionary. Yet millions face this exact challenge—describing symptoms in everyday language while healthcare references clinical terminology and complex specialty classifications. This disconnect forces members to become amateur medical translators, attempting to convert phrases like “my […]
Use generative AI in Amazon Bedrock for enhanced recommendation generation in equipment maintenance

In the manufacturing world, valuable insights from service reports often remain underutilized in document storage systems. This post explores how Amazon Web Services (AWS) customers can build a solution that automates the digitization and extraction of crucial information from many reports using generative AI. The solution uses Amazon Nova Pro on Amazon Bedrock and Amazon […]
Build real-time travel recommendations using AI agents on Amazon Bedrock

Generative AI is transforming how businesses deliver personalized experiences across industries, including travel and hospitality. Travel agents are enhancing their services by offering personalized holiday packages, carefully curated for customers’ unique preferences, including accessibility needs, dietary restrictions, and activity interests. Meeting these expectations requires a solution that combines comprehensive travel knowledge with real-time pricing and […]
Deploy a full stack voice AI agent with Amazon Nova Sonic

AI-powered speech solutions are transforming contact centers by enabling natural conversations between customers and AI agents, shortening wait times, and dramatically reducing operational costs—all without sacrificing the human-like interaction customers expect. With the recent launch of Amazon Nova Sonic in Amazon Bedrock, you can now build sophisticated conversational AI agents that communicate naturally through voice, […]
Manage multi-tenant Amazon Bedrock costs using application inference profiles

Successful generative AI software as a service (SaaS) systems require a balance between service scalability and cost management. This becomes critical when building a multi-tenant generative AI service designed to serve a large, diverse customer base while maintaining rigorous cost controls and comprehensive usage monitoring. Traditional cost management approaches for such systems often reveal limitations. […]