Fine-tune VLMs for multipage document-to-JSON with SageMaker AI and SWIFT

Extracting structured data from documents like invoices, receipts, and forms is a persistent business challenge. Variations in format, layout, language, and vendor make standardization difficult, and manual data entry is slow, error-prone, and unscalable. Traditional optical character recognition (OCR) and rule-based systems often fall short in handling this complexity. For instance, a regional bank might […]

How Clario automates clinical research analysis using generative AI on AWS

Clinical outcome assessment (COA) interviews are important instruments in clinical trials for evaluating the efficacy and safety of treatments. In studies of psychosis, anxiety, and mood disorders, these assessments often determine the success or failure of the trial, highlighting the importance of data quality and reliability. The traditional approach to evaluating the quality of these […]

Connect Amazon Bedrock agents to cross-account knowledge bases

Organizations need seamless access to their structured data repositories to power intelligent AI agents. However, when these resources span multiple AWS accounts, integration challenges can arise. This post explores a practical solution for connecting Amazon Bedrock agents to knowledge bases in Amazon Redshift clusters residing in different AWS accounts. The challenge Organizations that build AI […]

Introducing structured output for Custom Model Import in Amazon Bedrock

With Amazon Bedrock Custom Model Import, you can deploy and scale fine-tuned or proprietary foundation models in a fully managed, serverless environment. You can bring your own models into Amazon Bedrock, scale them securely without managing infrastructure, and integrate them with other Amazon Bedrock capabilities. Today, we are excited to announce the addition of structured […]

RL without TD learning

In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer. Unlike traditional methods, this algorithm is not based on temporal difference (TD) learning (which has scalability challenges), and scales well to long-horizon tasks. We can do Reinforcement Learning (RL) based on divide and conquer, instead of temporal […]

What exactly does word2vec learn?

What exactly does word2vec learn, and how? Answering this question amounts to understanding representation learning in a minimal yet interesting language modeling task. Despite the fact that word2vec is a well-known precursor to modern language models, for many years, researchers lacked a quantitative and predictive theory describing its learning process. In our new paper, we […]

Whole-Body Conditioned Egocentric Video Prediction

Predicting Ego-centric Video from human Actions (PEVA). Given past video frames and an action specifying a desired change in 3D pose, PEVA predicts the next video frame. Our results show that, given the first frame and a sequence of actions, our model can generate videos of atomic actions (a), simulate counterfactuals (b), and support […]

Transform your MCP architecture: Unite MCP servers through AgentCore Gateway

As AI agents are adopted at scale, developer teams can create dozens to hundreds of specialized Model Context Protocol (MCP) servers, each tailored to specific agent use cases and domains, organizational functions, or teams. Organizations also need to integrate their own existing MCP servers or open source MCP servers into their AI workflows. There is a […]