
How Kafka improves agentic AI

June 16, 2025
Maarten Vandeperre
Related topics:
Artificial intelligence, Data integration, Developer Productivity, Event-Driven, Integration, Kafka, Platform engineering, Serverless
Related products:
Red Hat AI, Red Hat build of Apache Camel, Red Hat build of Debezium, Red Hat build of Quarkus, Red Hat Fuse, Red Hat OpenShift Serverless, Streams for Apache Kafka


    If service mesh is the unsung hero of more observable AI inference with enhanced security (as demonstrated in my previous article), then Apache Kafka is the invisible infrastructure backbone of everything that happens in between. As AI systems evolve from simple API calls to multi-step, autonomous agents, one thing is clear: event-driven coordination is foundational.

    Whether you’re chaining together large language model (LLM) prompts, monitoring user behavior, or triggering workflows based on dynamic conditions, your agent is essentially just another actor in a distributed, event-driven system. In that system, Apache Kafka is more than a tool; it’s the foundation.

    After a brief explanation of event-driven architecture and Kafka, let’s dive into how Kafka can improve agentic AI.

    Event-driven architecture

    Event-driven architecture is the foundation of modern systems. Traditional applications are often designed like call-and-response interactions (i.e., synchronous) where you click a button and get a result. But modern systems are more like conversations. They react to events, not just commands.

    In an event-driven architecture, components respond to triggers, often asynchronously. Instead of chaining synchronous calls, services communicate via events (e.g., a payment service emits a "payment.completed" event). Downstream systems (e.g., invoice generator, notification sender, and shipping service) pick it up when and if needed.

    Consider the following use case:

    1. A user submits a form for a car loan.
    2. The system emits a loan.requested event.
    3. Multiple services respond independently:
      • One checks credit scores.
      • One notifies the sales team.
      • One generates an application document.
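The fan-out above can be sketched with an in-memory stand-in for the broker. This is purely illustrative (the handler names are hypothetical); in a real system the `emit` call would publish the event to Kafka, and each handler would be an independent consumer:

```python
# A minimal in-memory sketch of the loan use case: one emitted event,
# several independently subscribed services reacting to it.
from collections import defaultdict

subscribers = defaultdict(list)          # event name -> list of handlers

def subscribe(event_name, handler):
    subscribers[event_name].append(handler)

def emit(event_name, payload):
    # Every subscribed service reacts independently to the same event.
    return [handler(payload) for handler in subscribers[event_name]]

subscribe("loan.requested", lambda e: f"credit check for {e['user']}")
subscribe("loan.requested", lambda e: f"sales notified about {e['user']}")
subscribe("loan.requested", lambda e: f"application document for {e['user']}")

results = emit("loan.requested", {"user": "alice", "amount": 25_000})
```

Adding a fourth reaction is just another `subscribe` call; the form-submission side never changes, which is exactly the loose coupling the event gives you.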

    Figure 1 depicts the process that allows for loose coupling, scalability, and resilience. 

    Created by Maarten Vandeperre, License under Apache 2.0.
    Figure 1: In an event-driven architecture, components respond to triggers, often asynchronously and disconnected from other consuming components.

    Kafka 101

    Apache Kafka is an open source distributed event-streaming platform. It acts as a durable, high-throughput, low-latency message broker, optimized for handling real-time data feeds.

    The core concepts in Apache Kafka include:

    • Producer: An application that writes events (messages) to Kafka topics.
    • Consumer: An application that reads from those topics.
    • Topic: A named stream of messages. You can think of it as a log.
    • Consumer group: A group of consumers that share the load of processing messages from a topic. Each message is delivered to one consumer within the group. Consumer groups operate independently of each other, meaning that the same message can be processed in different ways by different groups. This allows multiple parallel applications (e.g., analytics, logging, or transformation services) to consume the same event stream without interfering with one another.
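The consumer-group semantics can be illustrated with a small in-memory simulation (round-robin here is a crude stand-in for Kafka's partition assignment): within one group each message goes to exactly one consumer, while independent groups each see the full stream.

```python
# Simulate Kafka consumer-group delivery semantics (illustrative only).
from itertools import cycle

def deliver(messages, groups):
    """groups: dict of group name -> list of consumer names."""
    seen = {g: {c: [] for c in consumers} for g, consumers in groups.items()}
    for g, consumers in groups.items():
        rr = cycle(consumers)   # stand-in for partition assignment
        for msg in messages:
            # Within a group, each message is handled by exactly one consumer.
            seen[g][next(rr)].append(msg)
    return seen

out = deliver(["m1", "m2", "m3", "m4"],
              {"analytics": ["a1", "a2"], "logging": ["l1"]})
```

The `analytics` group splits the stream across its two consumers, while the single-consumer `logging` group independently receives every message.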

    Figure 2 depicts these concepts.

    Created by Maarten Vandeperre, License under Apache 2.0.
    Figure 2: A visualization of a producer, consumer, consumer group and a (partitioned) Kafka topic in between.

    Kafka decouples senders and receivers, allowing each component to evolve independently. It also retains data for a configurable retention period, allowing consumers to reread history.

    This architecture marks a shift from traditional synchronous models, where a user sends a request and waits for a response, to an asynchronous, distributed design. In an agentic AI context, this often means that the result of a process can't be returned immediately. It's produced by multiple services working in sequence. As such, the final output must be pushed to the user asynchronously by using server-sent events (SSE), web sockets, or polling endpoints. This aligns better with the event-driven nature of agentic workflows and improves scalability and fault isolation.

    Agentic AI: Event-driven architecture by nature

    At its core, an agent isn’t a single model answering a prompt. It’s a composition of steps, decisions, memory, observations, and actions. You can treat each of those steps as an event in a larger pipeline.

    Imagine a customer support agent:

    • It receives a user request (event).
    • It queries a knowledge base or LLM (event).
    • It might escalate to a human if confidence is low (event).
    • It then logs the interaction to a CRM (event).

    These aren’t synchronous API hops. They’re a chain of intentions, often spread across services, clouds, or runtimes. Kafka excels here by decoupling producers (the models) from consumers (tools that log, route, throttle, or transform output, or even other models or agents). The result? A loosely coupled yet highly controllable system.

    Add guardrails without touching the code

    One of the superpowers of Kafka is injecting new behavior at runtime (e.g., adding new consumer groups on topics or topics in between), without needing to redeploy or modify existing services (see Figure 4). 

    When you're working with powerful LLMs, you want to ensure they stay on-brand and operate within compliance and safety guidelines. As described, with Kafka, you can add these guardrails without rewriting your model or application logic. This can make compliance and safety guardrails modular, not hardcoded, which is crucial when your AI evolves faster than your audit committee.

    Input guardrails

    Imagine a user visits your car dealership website and uses the chatbot feature. They try to prompt it with something unrelated or inappropriate, such as:

    • “Write a romantic poem.”
    • “Tell me how to hack a Tesla.”

    These inputs aren’t relevant to your business goals and might violate policies. Instead of feeding these directly to the model:

    • The chatbot publishes the raw user message to a Kafka topic: "user.messages.raw".
    • An input validation service (Kafka consumer) reads from this topic, analyzes the message intent, and checks for unsafe or off-topic content.
    • If the message is safe and relevant (e.g., "What’s the latest model of Mercedes?"), it’s republished to "user.messages.cleaned".
    • The LLM only consumes from this cleaned topic.
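The validation step can be sketched as a pure routing function. The keyword lists are illustrative stand-ins for a real intent classifier, and the `user.messages.rejected` topic name is an assumption (the article only names the raw and cleaned topics); a real service would wrap this logic in a Kafka consumer/producer pair.

```python
# Hypothetical input-guardrail logic for the "user.messages.raw" topic.
BLOCKED = ("hack", "poem")                              # unsafe or off-topic
ON_TOPIC = ("mercedes", "car", "model", "price", "test drive")

def route_input(message: str) -> str:
    """Decide which topic the raw user message should be republished to."""
    text = message.lower()
    if any(word in text for word in BLOCKED):
        return "user.messages.rejected"
    if any(word in text for word in ON_TOPIC):
        return "user.messages.cleaned"   # only this topic feeds the LLM
    return "user.messages.rejected"
```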

    Output guardrails

    Now imagine the LLM replying:

    • "You might like a Porsche instead."
    • "I would never buy a European car."

    Even if it's technically true, this isn't something a Mercedes chatbot should be suggesting. Instead, this should be the process:

    • The model's response is written to "agent.responses.raw".
    • A brand compliance service subscribes to this topic and checks whether the message aligns with brand and policy rules.
    • If it violates a rule (e.g., suggesting a competitor), it routes the message to "agent.responses.flagged" and replaces it with a fallback message: "Let me tell you more about the features of our latest Mercedes models."
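The brand-compliance check is, again, just a routing function sitting between topics. The competitor list is an illustrative stand-in for real policy rules; the fallback text and topic names follow the article.

```python
# Hypothetical output-guardrail logic for the "agent.responses.raw" topic.
COMPETITORS = ("porsche", "bmw", "audi")
FALLBACK = ("Let me tell you more about the features of our latest "
            "Mercedes models.")

def check_response(text: str):
    """Return (next topic, payload) for the model's raw response."""
    if any(brand in text.lower() for brand in COMPETITORS):
        # Park the original for review and send the fallback to the user.
        return "agent.responses.flagged", FALLBACK
    return "agent.responses.final", text
```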

    Kafka enables this entire workflow without touching the model code by inserting new consumers or consumer groups that act as intelligent filters or policy enforcers.

    Throttle models at the right moment

    Large models don’t just cost money; they consume attention and compute like wildfire.

     You may want to:

    • Rate-limit calls to LLMs (especially shared ones).
    • Queue less urgent requests (e.g., background summaries).
    • Trigger fallbacks for overloaded GPU nodes.

    Kafka enables you to buffer and batch intelligently. To avoid over-engineering, you can use:

    • Kafka + KServe queue policies.
    • Kafka Streams to implement smart scheduling.
    • Custom quotas based on topic partitions or message headers.

    This lets you throttle around your models, not inside them, preserving flexibility and avoiding noisy code-level logic.
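One common way to implement this "throttle around the model" idea is a token bucket in front of the LLM consumer. This is a minimal sketch with assumed rates; in practice the rejected branch would park the request on a buffer topic rather than drop it.

```python
# Token-bucket rate limiter for LLM calls (illustrative capacity/rate).
class TokenBucket:
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.rate = refill_per_sec
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, then try to spend one token.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # caller can requeue on a buffer topic instead

bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
decisions = [bucket.allow(t) for t in (0.0, 0.1, 0.2, 1.5)]
```

The burst of three calls exhausts the bucket (the third is refused), but the fourth call, arriving later, is allowed again once tokens have refilled.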

    Prioritize with Kafka

    Suppose you run an agentic software-as-a-service (SaaS) with multiple subscription tiers as follows:

    • Free users go through a shared inference service.
    • Pro users have access to dedicated GPU lanes.
    • Enterprise customers get guaranteed latency SLAs.

    Kafka can make this trivial by using different topics:

    • agent.requests.free
    • agent.requests.pro
    • agent.requests.enterprise

    Serving can be based on license grade or priority queueing, each routed to separate backends or consumer groups. You can even implement SLA-aware prioritization, where consumer groups prioritize enterprise topics before others, effectively creating a multi-class queue for AI traffic.

    This architecture is nearly impossible to maintain with traditional REST polling, but it’s native to event streaming.
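The tiered routing reduces to two small functions: one picks the per-tier topic on the producing side (topic names from the article), and one implements the assumed SLA-aware policy on the consuming side by draining higher-priority topics first.

```python
# Tier-based topic routing and an SLA-aware dispatch policy (the priority
# ordering is an assumed policy, not a Kafka built-in).
TIER_TOPICS = {
    "free": "agent.requests.free",
    "pro": "agent.requests.pro",
    "enterprise": "agent.requests.enterprise",
}
PRIORITY = ("agent.requests.enterprise", "agent.requests.pro",
            "agent.requests.free")

def topic_for(tier: str) -> str:
    """Producers publish each request to its tier's topic."""
    return TIER_TOPICS.get(tier, TIER_TOPICS["free"])

def next_topic(backlog: dict) -> str:
    """A dispatcher drains higher-priority topics before lower ones."""
    for topic in PRIORITY:
        if backlog.get(topic):
            return topic
    return ""
```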

    Auditing and traceability

    Kafka’s append-only log is a goldmine for observability.

    Let’s say you want to trace what happened with a specific user session or conversation:

    • Each message is tagged with a traceId or sessionId.
    • Spin up a (temporary) consumer group that filters messages with that traceId.
    • Voilà: you have a replayable audit log of that session.

    This is invaluable for debugging, security audits, or even post-mortem explainability in agentic decisions. Kafka isn’t just a bus; it’s your forensic record of agent behavior.
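The filtering step itself is trivial once messages carry a traceId; the message shape below is hypothetical, and in practice the list would come from replaying the topic's retained history.

```python
# Rebuild a session's audit trail from a topic's retained messages.
def audit_trail(messages, trace_id):
    """Return the replayable sequence of events for one session, in order."""
    return [m for m in messages if m.get("traceId") == trace_id]

log = [
    {"traceId": "t1", "event": "user.message"},
    {"traceId": "t2", "event": "user.message"},
    {"traceId": "t1", "event": "llm.response"},
]
trail = audit_trail(log, "t1")
```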

    Kafka plays nice with everything

    Kafka doesn’t live in a vacuum. It shines even brighter thanks to its ecosystem, which is massive and battle-tested:

    • Need low-code data routing? Use Apache Camel (with Quarkus for speed and startup performance).
    • Want to scale your AI agent actions on Kubernetes or add an abstraction layer around Kafka? Use Knative Eventing with Kafka as your backend.
    • Need to react on input from Slack, S3, Elasticsearch, or Redis? Kafka Connect has you covered.
    • Need to push enriched output to Slack, S3, Elasticsearch, or Redis? Again, Kafka Connect has you covered.

    Kafka isn’t just about events. It’s about gluing your entire AI fabric together in a flexible and robust manner.

    Apache Camel (with Quarkus)

    Use Camel’s low-code DSL to integrate Kafka topics with databases, REST APIs, CSV files, and third-party services. With Quarkus, you get blazing-fast startup and memory efficiency for running Camel (on Kubernetes). The combination enables rapid prototyping without sacrificing maintainability, aligning well with AI's fast-moving workflows (Figure 3).

    Created by Maarten Vandeperre, License under Apache 2.0.
    Figure 3: Event-driven architecture with integration layer that abstracts away Kafka from and to external or internal microservices.

    Knative Eventing

    Use Kafka as your backbone in Knative. Refer back to the Camel event-driven architecture diagram (Figure 3): swap the Camel integrations for Knative Eventing, and the rest of the picture stays the same.

    If you follow the approach described in this article, you can design your system around standardized interaction patterns. For instance, you can create a REST-based API proxy for each type of interaction (e.g., model, database, file system, third-party APIs, MCP, etc.). Each proxy handles the main function and calls a predefined notification endpoint upon success or failure.

    Knative Eventing can then map incoming Kafka messages to these REST (proxy) endpoints, effectively orchestrating the distributed flow. On the output side, notification endpoints triggered by proxies can also be defined and managed in Knative. This means adding functionality (i.e., validation, enrichment, or throttling) or rerouting flows becomes a matter of editing Knative YAML definitions, not the application code. This separation of concerns allows platform or SRE teams to manage, extend, and adapt AI workflows without impacting the development lifecycle.
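Under these assumptions, the channel and subscription definitions might look roughly like the following sketch (resource names and the subscriber URI are illustrative; the channel is backed by a Kafka topic, and the subscription forwards each message to a REST proxy endpoint):

```yaml
apiVersion: messaging.knative.dev/v1beta1
kind: KafkaChannel
metadata:
  name: agent-requests
spec:
  numPartitions: 3
  replicationFactor: 1
---
apiVersion: messaging.knative.dev/v1
kind: Subscription
metadata:
  name: model-proxy-subscription
spec:
  channel:
    apiVersion: messaging.knative.dev/v1beta1
    kind: KafkaChannel
    name: agent-requests
  subscriber:
    uri: http://model-proxy.default.svc.cluster.local/infer
```

Rerouting a flow or inserting a new validation step then means editing definitions like these, not application code.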

    You can:

    • Auto-scale model functions.
    • Plug into observability stacks (Kiali, Prometheus).
    • Connect with Tekton pipelines for retraining.
    • Abstract away Kafka with Knative definitions.
    Example of a Knative channel definition, which results in a REST endpoint that can be called from other components/services
    Created by Maarten Vandeperre, License under Apache 2.0.
    Example of a Knative subscription definition, which consumes a message from the given channel (i.e., Kafka topic) and posts it to the defined REST endpoint
    Created by Maarten Vandeperre, License under Apache 2.0.

    Kafka Connect and operator support

    Kafka Connect offers hundreds of connectors to databases, Elasticsearch, MongoDB, and more. To sync customer chats with a CRM or Salesforce, just plug and stream.

    On Red Hat OpenShift, you can deploy Kafka via the AMQ Streams operator and manage clusters declaratively. Dev and Ops teams speak the same GitOps language.

    Beyond tooling: Standardization meets innovation

    Kafka gives your organization a shared vocabulary around events and responsibilities. This aligns with broader architectural goals: separate domains, enable self-service pipelines, and move fast with control. It creates an infrastructure that is opinionated where it matters (observability, scalability) and flexible where it counts (tooling, language, flow). Read the article, Standardization and innovation are no longer enemies for more information.

    Emergency stops: Humans in the loop

    Sometimes, guardrails aren’t enough. You need a kill switch, a way to pause or reroute AI behavior in real time without relying on code changes (e.g., when you detect severe hallucinations).

    Soft stop

    A soft stop could be useful in the following scenario:

    • Imagine your monitoring system detects a spike in hallucinations or unsafe behavior.
    • You don’t want to delete anything or push hotfixes—you just need breathing room.
    • With Kafka, you can scale down or pause consumers of a topic like "agent.execution".
    • That single change stops the agent from taking action without disrupting upstream services.
    • Later, once the issue is investigated or mitigated, consumers can resume from where they left off: no data lost.

    Hard stop

    Alternatively, a hard stop could be useful as follows:

    • For high-risk or sensitive use cases, you might want human approval before the agent response reaches the user.
    • Instead of publishing to "agent.responses.final", the final output is rerouted to "agent.responses.pending_approval".
    • A human dashboard application (i.e., another Kafka consumer) shows these messages to moderators.
    • Employees can approve, reject, or edit the message, which then gets sent to the user through "agent.responses.approved".
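The hard-stop flow can be sketched as two routing functions. The `risk` field and the `agent.responses.rejected` topic are assumptions for illustration; the other topic names follow the article.

```python
# Hypothetical hard-stop gate: high-risk responses are parked on a
# pending-approval topic until a moderator decides.
def route_final(response: dict) -> str:
    """Divert risky output away from the user-facing topic."""
    if response.get("risk") == "high":
        return "agent.responses.pending_approval"
    return "agent.responses.final"

def moderate(response: dict, verdict: str, edited_text=None) -> dict:
    """A moderator approves, rejects, or edits a pending message."""
    if verdict == "approve":
        return {"topic": "agent.responses.approved",
                "text": response["text"]}
    if verdict == "edit":
        return {"topic": "agent.responses.approved", "text": edited_text}
    return {"topic": "agent.responses.rejected", "text": None}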

    Human escalation loops

    With human escalation loops, you can:

    • Add triggers for manual review based on output classification. For instance, if the toxicity score > 0.7, bypass automated delivery.
    • Add routing logic when revenue is at stake: When a discount (maybe above a certain threshold) is granted by the LLM, you route that request to a moderation team (e.g., "discount.queries.pending_approval").
    • Add routing logic, so different topics go to different moderation teams: "finance.queries.pending_approval", "legal.queries.pending_approval", etc.
    • Track how often you needed manual intervention, offering valuable metrics on agent reliability.

    Kafka enables this escalation logic to be implemented as just another stream processing step, not as a redesign of your AI.
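The escalation rules above collapse into one routing function that such a stream processing step would apply per message. The thresholds, field names, and the toxicity topic are illustrative; the domain-based topics follow the article's naming.

```python
# Escalation routing as a single stream-processing decision (assumed
# thresholds and message fields).
def escalation_topic(msg: dict):
    """Return the moderation topic for a message, or None to auto-deliver."""
    if msg.get("toxicity", 0.0) > 0.7:
        return "moderation.toxicity.pending_approval"
    if msg.get("discount", 0.0) > 0.15:
        return "discount.queries.pending_approval"
    domain = msg.get("domain")
    if domain in ("finance", "legal"):
        return f"{domain}.queries.pending_approval"
    return None   # no manual review needed
```

Counting how often this function returns a topic (rather than None) gives you the manual-intervention metric the last bullet describes.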

    The result is that you can implement "humans in the loop" dynamically with full observability and without rewriting any business logic or risking user trust.

    Event-driven AI marketing assistant

    Let’s build a hypothetical event-driven marketing assistant.

    1. Event: The customer submits a form ⇒ "lead.new".
    2. Input validator checks tone and content ⇒ "lead.cleaned".
    3. LLM generates sales pitch ⇒ "pitch.generated".
    4. Brand filter runs guardrails ⇒ "pitch.compliant".
    5. KPI analyzer adds metadata ⇒ "pitch.analyzed".
    6. External enrichment service calls an API for product availability ⇒ "availability.checked".
    7. Response combiner merges enriched data into one final message ⇒ "message.ready".
    8. Emergency stop layer checks for risky recommendations ⇒ "message.pending_approval".
    9. Moderator reviews and approves or edits ⇒ "message.approved".
    10. The message router sends:
      1. Slack to sales ⇒ "notifications.sales"
      2. Email to customer ⇒ "email.send"
      3. CRM update ⇒ "crm.sync"

    Kafka powers all of this. Each step is handled by a decoupled service that listens for and emits events. The emergency approval step ensures that no message reaches the customer without passing through brand and safety checks, providing full traceability and control.

    You can add new validations, models, or business logic by simply plugging in new consumers with no redeployments and no rewiring, just evolving your system at your pace (Figure 4).
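The whole assistant can be described declaratively as a chain of topic hops. The step functions below are stubs (each would be an independent Kafka consumer in production, and the hops would be asynchronous); the topic names follow the numbered list above.

```python
# The marketing-assistant pipeline as a declarative topic chain.
PIPELINE = [
    ("lead.new", "lead.cleaned", lambda m: m),                  # input validator
    ("lead.cleaned", "pitch.generated", lambda m: m),           # LLM
    ("pitch.generated", "pitch.compliant", lambda m: m),        # brand filter
    ("pitch.compliant", "pitch.analyzed", lambda m: m),         # KPI analyzer
    ("pitch.analyzed", "availability.checked", lambda m: m),    # enrichment
    ("availability.checked", "message.ready", lambda m: m),     # combiner
    ("message.ready", "message.pending_approval", lambda m: m), # emergency stop
    ("message.pending_approval", "message.approved", lambda m: m),  # moderator
]

def run(message):
    """Walk the chain synchronously; in production each hop is async."""
    trail = []
    for _source, target, step in PIPELINE:
        message = step(message)
        trail.append(target)
    return message, trail
```

Adding a new validation is inserting one more `(source, target, fn)` tuple; no existing step changes, mirroring how a new consumer group plugs into the live system.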

     

    Created by Maarten Vandeperre, License under Apache 2.0.
    Figure 4: An (agentic) event-driven workflow with the possibility of introducing an emergency button when one or multiple LLMs start to hallucinate.

    Kafka is the backbone of agentic AI

    Agentic AI isn’t monolithic, and Kafka was built for exactly this kind of modular, asynchronous, ever-evolving system.

    Kafka offers:

    • Loose coupling between components.
    • Dynamic throttling and routing.
    • Built-in (or built-around) observability and auditability.
    • The freedom to grow and rewire without fear.

    A service mesh gives you observable, secure connections between services, and Kafka gives you the intelligent, auditable glue between events. That’s exactly what the next generation of AI systems needs. So the next time your AI agent starts making plans, make sure those plans go through Kafka.

    Check out my previous article, How to use service mesh to improve AI model security.

     
