In this post, we demonstrate how to build an end-to-end solution for text classification using the Amazon Bedrock batch inference capability with Anthropic’s Claude Haiku. Amazon Bedrock batch inference offers a 50% discount compared to the on-demand price, which is an important factor when dealing with a large number of requests. We walk through classifying travel agency call center conversations into categories, showcasing how to generate synthetic training data, process large volumes of text data, and automate the entire workflow using AWS services.

Challenges with high-volume text classification

Organizations across various sectors face a common challenge: the need to efficiently handle high-volume classification tasks. From travel agency call centers categorizing customer inquiries to sales teams analyzing lost opportunities and finance departments classifying invoices, these manual processes are a daily necessity. But these tasks come with significant challenges.

The manual approach to analyzing and categorizing these classification requests is not only time-intensive but also prone to inconsistencies. As teams process the high volume of data, the potential for errors and inefficiencies grows. By implementing automated systems to classify these interactions, multiple departments stand to gain substantial benefits. They can uncover hidden trends in their data, significantly enhance the quality of their customer service, and streamline their operations for greater efficiency.

However, the path to effective automated classification has its own challenges. Organizations must grapple with the complexities of efficiently processing vast amounts of textual information while maintaining consistent accuracy in their classification results. In this post, we demonstrate how to create a fully automated workflow while keeping operational costs under control.

Data

For this solution, we used synthetic call center conversation data. For realistic training data that maintains user privacy, we generated synthetic conversations using Anthropic’s Claude 3.7 Sonnet. We used the following prompt to generate the synthetic data:

Task: Generate <N> synthetic conversations from customer calls to an imaginary travel
company. Come up with 10 most probable categories that calls of this nature can come 
from and treat them as classification categories for these calls. For each generated 
call create a column that indicates the category for that call. 
Conversations should follow the following format:
"User: ...
Agent: ...
User: ...
Agent: ...
...

Class: One of the 10 following categories that is most relevant to the conversation."
Ten acceptable classes:
1. Booking Inquiry - Customer asking about making new reservations
2. Reservation Change - Customer wanting to modify existing bookings
3. Cancellation Request - Customer seeking to cancel their travel plans
4. Refund Issues - Customer inquiring about getting money back
5. Travel Information - Customer seeking details about destinations, documentation, etc.
6. Complaint - Customer expressing dissatisfaction with service
7. Payment Problem - Customer having issues with billing or payments
8. Loyalty Program - Customer asking about rewards points or membership status
9. Special Accommodation - Customer requesting special arrangements
10. Technical Support - Customer having issues with website, app or booking systems

Instructions:
- Keep conversations concise
- Use John Doe for male names and Jane Doe for female names
- Use [email protected] for male email address, [email protected] for female email 
address and [email protected] for corporate email address, whenever you need to 
generate emails
- Use " or ' instead of " whenever there is a quote within the conversation
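
If you want to reproduce this step programmatically rather than in a chat interface, one option is to call Claude 3.7 Sonnet through the Amazon Bedrock Converse API. The following minimal sketch illustrates the idea; the model ID, AWS Region, and inference parameters are assumptions and may need to be adjusted for your account:

# Minimal sketch: generate synthetic conversations with Anthropic's Claude 3.7 Sonnet
# through the Amazon Bedrock Converse API. The model ID, Region, and inference
# parameters below are assumptions; adjust them to your account setup.
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Paste the full synthetic data prompt shown above into this variable
SYNTHETIC_DATA_PROMPT = "Task: Generate 50 synthetic conversations from customer calls ..."

response = bedrock_runtime.converse(
    modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # assumed inference profile ID
    messages=[{"role": "user", "content": [{"text": SYNTHETIC_DATA_PROMPT}]}],
    inferenceConfig={"maxTokens": 4096, "temperature": 0.7},
)

# The assistant's reply contains the generated conversations and their classes
print(response["output"]["message"]["content"][0]["text"])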

The synthetic dataset includes the following information:

Solution overview

The solution architecture uses a serverless, event-driven, scalable design to classify large volumes of requests efficiently. Built on AWS, it automatically starts working when new classification request data arrives in an Amazon Simple Storage Service (Amazon S3) bucket. The system then uses Amazon Bedrock batch processing to analyze and categorize the content at scale, minimizing the need for constant manual oversight.

The following diagram illustrates the solution architecture.

Architecture Diagram

The architecture follows a well-structured flow that facilitates reliable processing of classification requests:

We use AWS best practices in this solution, including event-driven and batch processing for optimal resource utilization, batch operations for cost-effectiveness, decoupled components for independent scaling, and least privilege access patterns. We implemented the system using the AWS Cloud Development Kit (AWS CDK) with TypeScript for infrastructure as code (IaC) and Python for application logic. This combination provides seamless automation, dynamic scaling, and efficient processing of classification requests, positioning the solution to address both current requirements and future demands.

Prerequisites

To implement the solution, you must have the following prerequisites:

Deploy the solution

The solution is accessible in the GitHub repository.

Complete the following steps to set up and deploy the solution:

  1. Clone the Repository: Run the following command: git clone git@github.com:aws-samples/sample-genai-bedrock-batch-classifier.git
  2. Set Up AWS Credentials: Create an AWS Identity and Access Management (IAM) user with appropriate permissions, generate credentials for AWS Command Line Interface (AWS CLI) access, and create a profile. For instructions, see Authenticating using IAM user credentials for the AWS CLI. You can use the Admin Role for testing purposes, although it violates the principle of least privilege and should be avoided in production environments in favor of custom roles with minimal required permissions.
  3. Bootstrap the Application: In the CDK folder, run the command npm install && cdk bootstrap --profile {your_profile_name}, replacing {your_profile_name} with your AWS profile name.
  4. Deploy the Solution: Run the command cdk deploy --all --profile {your_profile_name}, replacing {your_profile_name} with your AWS profile name.

After you complete the deployment process, you will see a total of six stacks created in your AWS account, as illustrated in the following screenshot.

List of stacks

SharedStack acts as a central hub for resources that multiple parts of the system need to access. Within this stack, there are two S3 buckets: one handles internal operations behind the scenes, and the other serves as a bridge between the system and customers, so they can both submit their classification requests and retrieve their results.

DataPreparationStack serves as a data transformation engine. It’s designed to handle incoming files in three specific formats: XLSX, CSV, and JSON, which at the time of writing are the only supported input formats. This stack’s primary role is to convert these inputs into the specialized JSONL format required by Amazon Bedrock. The data processing script is available in the GitHub repo. This transformation makes sure that incoming data, regardless of its original format, is properly structured before being processed by Amazon Bedrock. The format is as follows:

{
  "recordId": ${unique_id},
  "modelInput": {
      "anthropic_version": "bedrock-2023-05-31",
      "max_tokens": 1024,
      "messages": [
          {
              "role": "user",
              "content": [{"type": "text", "text": ${initial_text}}]
          }
      ],
      "system": ${prompt}
  }
}

where:
initial_text - the text that you want to classify
prompt       - instructions that tell the model how to classify the text
unique_id    - ID coming from the upstream service; if not provided, it is
               automatically generated by the code
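
To make this transformation concrete, the following minimal sketch shows how a CSV input could be converted into the JSONL format above. The column names, file paths, and system prompt are illustrative assumptions; the actual implementation lives in the data processing script in the GitHub repo.

# Minimal sketch: convert a CSV of conversations into the Bedrock batch JSONL format.
# The column names, file paths, and system prompt are illustrative assumptions.
import csv
import json
import uuid

SYSTEM_PROMPT = "Classify the conversation into exactly one of the ten categories ..."

with open("conversations.csv", newline="", encoding="utf-8") as src, \
     open("batch_input.jsonl", "w", encoding="utf-8") as dst:
    for row in csv.DictReader(src):
        record = {
            # Use the upstream ID if present, otherwise generate one
            "recordId": row.get("id") or str(uuid.uuid4()),
            "modelInput": {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": 1024,
                "messages": [
                    {
                        "role": "user",
                        "content": [{"type": "text", "text": row["conversation"]}],
                    }
                ],
                "system": SYSTEM_PROMPT,
            },
        }
        dst.write(json.dumps(record) + "\n")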

BatchClassifierStack handles the classification operations. Although currently powered by Anthropic’s Claude Haiku, the system maintains flexibility by allowing straightforward switches to alternative models as needed. This adaptability is made possible through a comprehensive constants file that serves as the system’s control center. The following configurations are available:
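
Under the hood, a batch run comes down to submitting an Amazon Bedrock model invocation job against the prepared JSONL file. The following minimal sketch shows the core boto3 call; the job name, bucket paths, role ARN, and model ID are placeholders rather than values from the deployed stack:

# Minimal sketch: submit an Amazon Bedrock batch inference (model invocation) job.
# The job name, bucket paths, role ARN, and model ID are placeholders.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

response = bedrock.create_model_invocation_job(
    jobName="travel-call-classification-batch",
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumed Claude Haiku model ID
    roleArn="arn:aws:iam::123456789012:role/bedrock-batch-inference-role",
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://example-internal-bucket/batch_input/batch_input.jsonl"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://example-internal-bucket/batch_output/"}
    },
)
print(response["jobArn"])  # track this ARN to monitor job status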

BatchResultsProcessingStack functions as the data postprocessing stage, transforming the Amazon Bedrock JSONL output into user-friendly formats. At the time of writing, the system supports CSV, JSON, and XLSX. These processed files are then stored in a designated output folder in the S3 bucket, organized by date for quick retrieval and management. The conversion scripts are available in the GitHub repo. The output files have the following schema:

Excel File Sample
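
For reference, the following minimal sketch shows one way to flatten the Bedrock batch output JSONL into a CSV of record IDs and predicted classes. The field names follow the Anthropic messages response shape and should be treated as assumptions; the actual conversion scripts are in the GitHub repo:

# Minimal sketch: flatten Bedrock batch output JSONL into a CSV of
# (record_id, predicted_class). Field names are based on the Anthropic
# messages response shape and should be treated as assumptions.
import csv
import json

with open("batch_output.jsonl", encoding="utf-8") as src, \
     open("classifications.csv", "w", newline="", encoding="utf-8") as dst:
    writer = csv.writer(dst)
    writer.writerow(["record_id", "predicted_class"])
    for line in src:
        record = json.loads(line)
        model_output = record.get("modelOutput", {})
        # The assistant's text is expected in the first content block
        text = model_output.get("content", [{}])[0].get("text", "")
        writer.writerow([record.get("recordId", ""), text.strip()])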

AnalyticsStack provides a business intelligence (BI) dashboard that displays a list of classifications and allows filtering based on the categories defined in the prompt. It offers the following key configuration options:

Now that you’ve successfully deployed the system, you can prepare your data file: either real customer data or the synthetic dataset we provided for testing. When your file is ready, go to the S3 bucket named {prefix}-{account_id}-customer-requests-bucket-{region} and upload your file to the input_data folder. After the batch inference job is complete, you can view the classification results on the dashboard, which you can find under the name {prefix}-{account_id}-classifications-dashboard-{region}. The following screenshot shows a preview of what you can expect.

BI Dashboard

The dashboard will not display data until Amazon Bedrock finishes processing the batch inference jobs and the AWS Glue crawler creates the Amazon Athena table. Until these steps are complete, the dashboard can’t connect to the table because it doesn’t exist yet. Additionally, you must update the Amazon QuickSight role permissions that were set up during pre-deployment. To update permissions, complete the following steps:

  1. On the QuickSight console, choose the user icon in the top navigation bar and choose Manage QuickSight.
  2. In the navigation pane, choose Security & Permissions.
  3. Verify that the role has been granted proper access to the S3 bucket with the following path format: {prefix}-{account_id}-internal-classifications-{region}.

Results

To test the solution’s performance and reliability, we processed 1,190 synthetically generated travel agency conversations from a single Excel file across multiple runs. The results were remarkably consistent across 10 consecutive runs, with processing times ranging from 11 to 12 minutes per batch (200 classifications in a single batch). Our solution achieved the following:

Challenges

For certain cases, the generated class didn’t exactly match the class name given in the prompt. For instance, in multiple cases, the model output “Hotel/Flight Booking Inquiry” instead of “Booking Inquiry,” which was defined as the class in the prompt. We addressed this through prompt engineering, instructing the model to verify that the final class output exactly matches one of the provided classes.
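
In addition to prompt engineering, a lightweight post-processing check can catch any remaining mismatches. The following minimal sketch is an illustrative addition rather than part of the deployed solution; it maps a raw model answer onto the closest allowed class name:

# Minimal sketch: normalize a raw model answer to one of the allowed classes.
# The matching strategy and cutoff are illustrative assumptions.
from difflib import get_close_matches

ALLOWED_CLASSES = [
    "Booking Inquiry", "Reservation Change", "Cancellation Request", "Refund Issues",
    "Travel Information", "Complaint", "Payment Problem", "Loyalty Program",
    "Special Accommodation", "Technical Support",
]

def normalize_class(raw_answer: str) -> str:
    answer = raw_answer.strip()
    # Exact or substring match first, e.g. "Hotel/Flight Booking Inquiry" -> "Booking Inquiry"
    for cls in ALLOWED_CLASSES:
        if cls.lower() == answer.lower() or cls.lower() in answer.lower():
            return cls
    # Fall back to fuzzy matching; flag anything that still does not match
    matches = get_close_matches(answer, ALLOWED_CLASSES, n=1, cutoff=0.6)
    return matches[0] if matches else "UNMATCHED"

print(normalize_class("Hotel/Flight Booking Inquiry"))  # -> Booking Inquiry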

Error handling

For troubleshooting purposes, the solution includes an Amazon DynamoDB table that tracks batch processing status, along with Amazon CloudWatch Logs. Error tracking is not automated and requires manual monitoring and validation.
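
For a quick manual check, you can inspect the tracking table directly. The following minimal sketch lists the items in the status table; the table name uses the naming pattern from this post with placeholder values, and the item attributes depend on the deployed application code:

# Minimal sketch: inspect batch processing status in the DynamoDB tracking table.
# The table name below uses the post's naming pattern with placeholder values;
# the attributes of each item depend on the deployed application code.
import boto3

dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("myprefix-123456789012-batch-processing-status-us-east-1")

# A scan is acceptable here because the table stays small (one item per batch job)
for item in table.scan()["Items"]:
    print(item)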

Key takeaways

Although our testing focused on travel agency scenarios, the solution’s architecture is flexible and can be adapted to various classification needs across different industries and use cases.

Known limitations

The following are key limitations of the classification solution and should be considered when planning its use:

Clean up

To avoid additional charges, clean up your AWS resources when they’re no longer needed by running the command cdk destroy --all --profile {your_profile_name}, replacing {your_profile_name} with your AWS profile name.

To remove resources associated with this project, complete the following steps:

  1. Delete the S3 buckets:
    1. On the Amazon S3 console, choose Buckets in the navigation pane.
    2. Locate your buckets by searching for your {prefix}.
    3. Delete these buckets to facilitate proper cleanup.
  2. Clean up the DynamoDB resources:
    1. On the DynamoDB console, choose Tables in the navigation pane.
    2. Delete the table {prefix}-{account_id}-batch-processing-status-{region}.

This cleanup helps make sure that no residual resources from this project remain in your AWS account.

Conclusion

In this post, we explored how Amazon Bedrock batch inference can transform your large-scale text classification workflows. You can now automate time-consuming tasks your teams handle daily, such as analyzing lost sales opportunities, categorizing travel requests, and processing insurance claims. This solution frees your teams to focus on growing and improving your business.

Furthermore, this solution can be extended into a system that provides real-time classifications, seamlessly integrates with your communication channels, offers enhanced monitoring capabilities, and supports multiple languages for global operations.

This solution was developed for internal use in test and non-production environments only. It is the responsibility of the customer to perform their due diligence to verify the solution aligns with their compliance obligations.

We’re excited to see how you will adapt this solution to your unique challenges. Share your experience or questions in the comments—we’re here to help you get started on your automation journey.


About the authors

Nika Mishurina is a Senior Solutions Architect with Amazon Web Services. She is passionate about delighting customers through building end-to-end production-ready solutions for Amazon. Outside of work, she loves traveling, working out, and exploring new things.

Farshad Harirchi is a Principal Data Scientist at AWS Professional Services. He helps customers across industries, from retail to industrial and financial services, with the design and development of generative AI and machine learning solutions. Farshad brings extensive experience in the entire machine learning and MLOps stack. Outside of work, he enjoys traveling, playing outdoor sports, and exploring board games.