Ep168: Scaling Agentic Workloads: Why Reliable Infrastructure is Non-Negotiable for Enterprise AI by Anyscale
** AWS re:Invent 2025 Dec 1-5, Las Vegas - Register Here! **
Learn how Anyscale's Ray platform enables companies like Instacart to supercharge their model training while Amazon saves heavily by shifting to Ray's multimodal capabilities.
Topics Include:
Ray originated at UC Berkeley when PhD students spent more time building clusters than ML models
Anyscale now launches 1 million clusters monthly with contributions from OpenAI, Uber, Google, Coinbase
Instacart achieved 10-100x increase in model training data using Ray's scaling capabilities
ML evolved from single-node Pandas/NumPy to distributed Spark, now Ray for multimodal data
Ray Core transforms simple Python functions into distributed tasks across massive compute clusters
Higher-level Ray libraries simplify data processing, model training, hyperparameter tuning, and model serving
Anyscale platform adds production features: auto-restart, logging, observability, and zone-aware scheduling
Unlike Spark's CPU-only approach, Ray handles both CPUs and GPUs for multimodal workloads
Ray enables LLM post-training and fine-tuning using reinforcement learning on enterprise data
Multi-agent systems can scale automatically with Ray Serve handling thousands of requests per second
Anyscale leverages AWS infrastructure while keeping customer data within their own VPCs
Ray supports EC2, EKS, and HyperPod with features like fractional GPU usage and auto-scaling
Participants:
Sharath Cholleti – Member of Technical Staff, Anyscale
See how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon.com/isv/
--------
24:57
Ep167: Leveraging Amazon Bedrock and Agents for Accelerating Innovation and Engineering with Trellix
Trellix's Director of Strategy Zak Krider reveals how they automated tedious security tasks like event parsing and threat detection using Amazon Bedrock's multi-model approach, achieving 100% accuracy while eliminating bottlenecks in their development lifecycle.
Topics Include:
Trellix merged FireEye and McAfee Enterprise, combining two decades of cybersecurity AI expertise
Processing thousands of daily security events revealed traditional ML's weakness: overwhelming false positives
Two years ago, they integrated generative AI to automate threat investigation workflows
Amazon Bedrock's multi-model access enabled rapid testing and "fail fast, learn fast" methodology
Built custom cybersecurity testing framework since public benchmarks don't reflect domain-specific needs
Agentic AI now autonomously investigates threats across dark web, CVEs, and telemetry data
Amazon Nova builds investigation plans while Claude executes detailed threat research analysis
Launched "Sidekick" internal tool with agents mimicking human developer onboarding processes
Chose prompt engineering over fine-tuning for flexibility, cost-effectiveness, and faster iteration
Automated security rule generation across multiple languages that typically require unicorn developers
Achieved 100% accuracy in automated event parsing, eliminating tedious manual SOC work
Key lesson: don't default to one model; test and mix for optimal results
Participants:
Zak Krider - Director of Strategy & AI, Trellix
--------
15:49
Ep166: It’s the end of observability as we know it with Honeycomb
Honeycomb's VP of Marketing Shabih Syed reveals why traditional observability is dead and how AI-powered tools are transforming the way engineers debug production systems, with real examples.
Topics Include:
Observability is how you understand and troubleshoot your production systems in real-time
Shabih's 18-year journey: developer to product manager to marketing VP shares unique perspective
AI coding assistants are fundamentally changing how fast engineers ship code to production
Customer patience is gone - one checkout failure means losing them forever
Over 90% of engineers now "vibe code" with AI, creating new complexity
Observability costs are spiraling - engineers forced to limit logging, creating debugging dead-ends
Honeycomb reimagines observability: meeting expectations, reducing complexity, breaking the cost curve
Major customers like Booking.com and Intercom already transforming with AI-native observability
MCP server brings production data directly into your IDE for real-time AI assistance
Canvas enables plain English investigations to find "unknown unknowns" before they become problems
Anomaly detection helps junior engineers spot issues they wouldn't know to look for
Static dashboards are dead - AI-powered workflows are the future of system observation
Participants:
Shabih Syed - VP Product Marketing, Honeycomb.io
--------
22:28
Ep165: Siteimprove + Bedrock Agents - Powering Accessibility at Scale
Discover how Siteimprove partnered with AWS to build an AI system processing 100 million accessibility checks monthly, making the web usable for 1.3 billion people with disabilities worldwide.
Topics Include:
AWS and Siteimprove partnered to solve digital accessibility at massive scale using AI.
Digital accessibility ensures 1.3 billion people with disabilities can use web content effectively.
Deep semantic understanding is needed to verify if content truly matches its descriptions.
Siteimprove processes 75 million webpages across government, healthcare, and education sectors daily.
The challenge required AWS infrastructure beyond just AI models for cost-effective scaling.
Their platform unifies accessibility checks with SEO, analytics, and content performance tools.
Business requirements included enterprise security, multi-region support, and flexible pricing models.
They built three processing patterns: interactive conversations, overnight batch, and high-priority async.
The AI Accelerator framework separates business logic from model adapters for easy expansion.
Intelligent routing sends simple checks to Nova Micro, complex ones to Nova Pro.
Production system now processes over 100 million accessibility checks monthly using Bedrock Batch.
Key lessons: cross-region inference reduces latency, prompt optimization crucial, special characters increase hallucination.
Participants:
Hamed Shahir - Director of AI, Siteimprove
David Kaleko - Senior Applied Scientist, Amazon Web Services
--------
33:02
Ep164: From Regulatory Burden to Business Advantage: How Archer is conquering regulatory change and compliance with Amazon Bedrock
Archer's Global Head of Engineering reveals how they're using Amazon Bedrock to help enterprises avoid billions in regulatory fines by transforming complex compliance laws into actionable AI-powered workflows.
Topics Include:
James Griffith, VP Engineering at Archer, leads development for risk and compliance solutions
Archer helps enterprises navigate the complex world of regulatory compliance beyond outdated spreadsheets
Since 2009, banks alone have been fined $342 billion by regulators worldwide
Even "deregulated" Texas added 1,100 new laws in just one legislative session
Regulatory data exists online but is overwhelming: too much for humans to process
Archer built an AI pipeline: ingesting regulations, extracting obligations, and generating compliance controls
Amazon Bedrock eliminated the need to build ML infrastructure or hire specialized teams
Model interchangeability let them switch between Claude and Llama with just clicks
Built-in guardrails prevented users from misusing AI without custom security development
From initial vision to working product took just six months using Bedrock
Different AI models deploy globally, adapting to each country's unique regulatory stance
Engineers experiment safely with AI using Bedrock, preparing the team for the future
Participants:
James Griffith – Global Head of Engineering, Archer
Stay ahead of the rapidly evolving cloud and AI landscape with the AWS for Software Companies podcast. Hear from renowned software leaders, respected industry analysts, and experienced consultants alongside AWS experts as they explore the technologies shaping the future, from generative AI and agentic systems to intelligent cloud architectures and modern data management. Learn how AI agents are transforming enterprise workflows, how leading companies are modernizing their cloud strategies with security best practices at the core, and what's driving the next wave of SaaS innovation. New episodes drop regularly to keep you informed on the trends that matter most to your business.