Accelerating Mixtral MoE fine-tuning on Amazon SageMaker with QLoRA

Companies across various scales and industries are using large language models (LLMs) to develop generative AI applications that provide innovative experiences for customers and employees. However, building or fine-tuning these pre-trained LLMs on extensive datasets demands substantial computational resources and engineering effort. With the increase in sizes of these pre-trained LLMs, the model customization process […]
Amazon SageMaker Inference now supports G6e instances

As the demand for generative AI continues to grow, developers and enterprises seek more flexible, cost-effective, and powerful accelerators to meet their needs. Today, we are thrilled to announce the availability of G6e instances powered by NVIDIA’s L40S Tensor Core GPUs on Amazon SageMaker. You will have the option to provision nodes with 1, 4, and […]
Orchestrate generative AI workflows with Amazon Bedrock and AWS Step Functions

Companies across all industries are harnessing the power of generative AI to address various use cases. Cloud providers have recognized the need to offer model inference through an API call, significantly streamlining the implementation of AI within applications. Although a single API call can address simple use cases, more complex ones may necessitate the use […]
Build generative AI applications on Amazon Bedrock with the AWS SDK for Python (Boto3)

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI. With […]
Improve factual consistency with LLM Debates

In this post, we demonstrate the potential of large language model (LLM) debates using a supervised dataset with ground truth. In this LLM debate, we have two debater LLMs, each one taking one side of an argument and defending it based on the previous arguments for N(=3) rounds. The arguments are saved for a judge […]
Governing the ML lifecycle at scale, Part 3: Setting up data governance at scale

This post is part of an ongoing series about governing the machine learning (ML) lifecycle at scale. To view this series from the beginning, start with Part 1. This post dives deep into how to set up data governance at scale using Amazon DataZone for the data mesh. The data mesh is a modern approach […]
Amazon Bedrock Flows is now generally available with enhanced safety and traceability

Today, we are excited to announce the general availability of Amazon Bedrock Flows (previously known as Prompt Flows). With Bedrock Flows, you can quickly build and execute complex generative AI workflows without writing code. Key benefits include: Simplified generative AI workflow development with an intuitive visual interface. Seamless integration of latest foundation models (FMs), Prompts, […]
Implement secure API access to your Amazon Q Business applications with IAM federation user access management

Amazon Q Business is a conversational assistant powered by generative AI that enhances workforce productivity by answering questions and completing tasks based on information in your enterprise systems, which each user is authorized to access. AWS recommends using AWS IAM Identity Center when you have a large number of users in order to achieve a […]