The Sched app allows you to build your schedule, but is not a substitute for your event registration. You must be registered for Observability Summit North America 2026.
Please note: This schedule is automatically displayed in Central Daylight Time (UTC-5).
The schedule is subject to change.
AI is accelerating software development at an exponential pace, but we have no idea what our AI systems are actually doing. Agents operate across distributed frameworks. One request spawns dozens of hops with zero visibility. The OpenSearch Observability Stack closes that gap—built for open source contributors, with a growing focus on developers and operators using these systems every day. Open source. Linux Foundation-governed. One pipeline. Every framework. Every model. Every hop visible. The agentic era deserves open infrastructure, and we’ll share how this is a step towards building it together.
Anirudha is a Senior Manager, Software Development at Amazon Web Services (AWS), leading development of insight engines and visualization platforms for the OpenSearch Project. He specializes in distributed systems, data analytics, and search technologies, including architecting one...
Today, many organizations are pushing beyond existing limits for telemetry volume. Systems are ever-more distributed and generative AI workloads produce enormous amounts of data. As telemetry volumes grow, observability pipelines must become more efficient.
At scale, telemetry egress directly impacts observability spend. Cloud providers charge per gigabyte of data transferred across regions or providers, and those bytes add up quickly. The protocol used to encode telemetry determines how much data is sent over the network, so even modest improvements in its encoding efficiency can translate into significant cost savings. However, the OpenTelemetry Protocol (OTLP) was not initially optimized for performance. Instead, it prioritized interoperability and easy adoption.
Today, the OpenTelemetry community is exploring OTAP (OpenTelemetry Arrow Protocol), a new stateful protocol for transmitting OpenTelemetry data based on Apache Arrow. By using columnar encoding and maintaining state throughout a stream, OTAP avoids repeatedly sending the same metadata, reducing payload size and network transfer. However, because OTAP relies on long-lived stateful streams rather than independent requests, implementing it adds architectural and operational complexity. There are further challenges to broader adoption by the community; for example, Apache Arrow support varies significantly across languages.
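The payload-size argument above can be illustrated with a toy comparison. The following is a minimal sketch, not the OTLP or OTAP wire format: it uses JSON from the Python standard library in place of Protobuf or Apache Arrow, and the resource attributes, record shape, and batch size are all hypothetical values chosen for illustration. It only shows the general idea that a stateless sender repeats shared metadata on every request, while a stateful, columnar sender can define it once per stream and reference it by id.

```python
import json

# Hypothetical resource metadata shared by every record in the stream.
RESOURCE = {"service.name": "checkout", "host.name": "node-17", "region": "us-east-1"}


def make_records(n):
    """Build n toy metric points (timestamp plus value)."""
    return [{"ts": 1_700_000_000 + i, "value": i % 7} for i in range(n)]


def stateless_bytes(records):
    """Each request is independent, so each one carries the full metadata."""
    total = 0
    for rec in records:
        payload = {"resource": RESOURCE, **rec}
        total += len(json.dumps(payload).encode("utf-8"))
    return total


def stateful_bytes(records, batch_size=100):
    """Metadata is sent once per stream; batches reference it by id and
    carry their values as columns rather than as per-record objects."""
    definition = {"define": {"id": 0, "resource": RESOURCE}}
    total = len(json.dumps(definition).encode("utf-8"))
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        columns = {
            "resource_id": 0,
            "ts": [r["ts"] for r in batch],
            "value": [r["value"] for r in batch],
        }
        total += len(json.dumps(columns).encode("utf-8"))
    return total


records = make_records(1000)
stateless = stateless_bytes(records)
stateful = stateful_bytes(records)
print(f"stateless: {stateless} bytes, stateful: {stateful} bytes")
```

The sketch also hints at the trade-off the abstract describes: the stateful receiver must remember the definition for id 0 for as long as the stream lives, which is the kind of state that independent requests never require.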
Protocol design today is critical to efficiently scaling your systems. In this talk we will explore how protocol design affects telemetry egress and overall observability cost. We will go over some strategies for improving encoding efficiency, compare stateless and stateful approaches, and discuss the potential benefits and drawbacks of adopting a protocol like OTAP. Join us to learn more about how your protocol decisions can influence your costs over time.
Production AI agents make thousands of tool-calling decisions daily, yet observability stops at the model boundary. OpenTelemetry's GenAI semantic conventions capture token counts and latencies—what the LLM processed—but not why an agent selected a specific tool. Research (McKenzie et al., 2023) demonstrates inverse scaling: more capable models exhibit unpredictable tool selection patterns. This gap leaves engineers guessing during critical production failures.
We present gen-ai-otel, an open-source OpenTelemetry extension introducing decision-level telemetry for MCP agents. A new attribute namespace (gen_ai.agent.*) captures tool selection confidence, session context, permission scope validation, and baseline deviations. The zero-sidecar architecture routes telemetry through standard Collector pipelines to existing backends—Jaeger, Prometheus, or graph databases—with low overhead and cardinality-aware attributes.
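To make "decision-level telemetry" and "cardinality-aware attributes" concrete, here is a minimal sketch using only the Python standard library. It is not the gen-ai-otel API: the attribute keys below are hypothetical names placed under the gen_ai.agent.* namespace mentioned in the abstract, and the bucket thresholds are arbitrary. The point is that a raw confidence score is an unbounded set of values, whereas a small fixed set of bucket labels keeps attribute cardinality low.

```python
def confidence_bucket(score):
    """Map a raw confidence score in [0, 1] to one of three labels.

    The 0.5 and 0.8 thresholds are illustrative assumptions.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < 0.5:
        return "low"
    if score < 0.8:
        return "medium"
    return "high"


def decision_attributes(tool_name, score, scope_ok):
    """Build span attributes describing one tool-selection decision.

    All keys are hypothetical examples, not names defined by any library.
    """
    return {
        "gen_ai.agent.tool.name": tool_name,
        "gen_ai.agent.tool.confidence_bucket": confidence_bucket(score),
        "gen_ai.agent.permission.scope_valid": scope_ok,
    }


attrs = decision_attributes("search_orders", 0.93, True)
print(attrs)
```

A dictionary like this could be attached to a span through whatever tracing API is in use; the exact score, if needed, is better recorded as a metric value than as an attribute.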
A live demo reconstructs an agent's decision chain, revealing anomalies invisible to token metrics—reducing decision-debugging time. Attendees leave with: 1) Collector configs, 2) Grafana dashboards for confidence tracking, 3) demo code and repo—all Apache 2.0 licensed.
Senior Chief Researcher, TÜBİTAK (The Scientific and Technological Research Council of Türkiye)
Mustafa Dayıoğlu (PhD, ITU) is a security architect with 25 years of experience in cybersecurity at TÜBİTAK, designing large-scale security systems for regulated environments that serve 80 million citizens. He specializes in threat modeling and protocol development for AI agent systems...
R&D Architect with 25+ years building distributed systems and leading open research collaborations. Principal collaborator on SFAMDF and GraphSentinel, open initiatives exploring proactive, federated security patterns for MCP-based agentic AI systems. Research interests include...