May 21-22, 2026
Learn more and Register to Attend

The Sched app allows you to build your schedule, but is not a substitute for your event registration. You must be registered for Observability Summit North America 2026.

Please note: This schedule is displayed in Central Daylight Time (UTC -5).

The schedule is subject to change.
Thursday, May 21
 

10:20am CDT

The Invisible Tax: How Data Format Conversions Drive up Telemetry Pipeline Costs - Cijo Thomas & Joshua MacDonald, Microsoft
Thursday May 21, 2026 10:20am - 10:45am CDT
Telemetry signals traverse long pipelines before reaching observability backends. While enrichment, filtering, and redaction provide clear value, significant compute cost often comes from repeated conversion through different data formats.
Telemetry commonly flows through SDK formats, wire protocols, collector‑internal formats, and backend ingestion schemas. Each boundary introduces marshaling, unmarshaling, and copying. These transformations add no new information, yet consume CPU and memory and scale linearly with volume, creating a hidden "transform tax" that compounds dramatically at terabyte scale.
This talk will share results from measuring instrumented OpenTelemetry SDK and Collector pipelines. We quantify compute spent on pure format conversion versus value‑generating processing and show how these costs grow with scale.
Attendees will learn about conversion costs and strategies to reduce waste: eliminating unnecessary translations, aligning pipeline representations, leveraging zero‑copy techniques, and minimizing transformation hops between pipeline stages. We also examine Apache Arrow‑based representations as one approach to reducing this overhead.
Speakers

Cijo Thomas

Principal Software Engineer, Microsoft
Cijo is a Software Engineer at Microsoft specializing in Observability. He has been deeply involved with the OpenTelemetry project since its inception and is a core maintainer for the OpenTelemetry .NET and OpenTelemetry Rust implementations. His expertise extends beyond OpenTelemetry...

Joshua MacDonald

Principal Software Engineer, Microsoft
Joshua MacDonald is an OpenTelemetry contributor working in the observability industry. On the side, he writes open-source telemetry software and operates a community water system.
Level One | Ballroom A
  CNCF Observability Projects

12:05pm CDT

⚡ Lightning Talk: From Collector To Terminal: A Better Way To See Your OpenTelemetry Logs - Jon Reeve, ControlTheory
Thursday May 21, 2026 12:05pm - 12:15pm CDT
The OpenTelemetry Collector is powerful, but the "debug exporter" only shows raw output. What if you could see your OpenTelemetry logs - with structure, filters, and context - right in your terminal?

This talk introduces Gonzo, an open-source, OTLP-native terminal UI that visualizes logs from the Collector or any OTLP-capable source in real time. Learn how to validate both source instrumentation and Collector pipelines, including components like filelog, k8sattributes, and transform, without a backend.

Whether debugging, testing configs, or teaching OTel, Gonzo offers a faster, clearer way to understand your telemetry as it flows.

Key Takeaways:
- Validate source instrumentation and Collector pipelines end-to-end
- See enriched OTel logs with structure and context in the terminal
- Debug and iterate on OTel configs faster - no backend required
Speakers

Jon Reeve

CPO and Co-founder, ControlTheory
Jonathan Reeve is a co-founder of ControlTheory, where he helps teams take control of their observability data with smarter, more efficient telemetry pipelines. A passionate advocate for OpenTelemetry and open standards, Jonathan focuses on making observability more scalable, cost-effective...
Level One | Ballroom B

2:55pm CDT

When the Cloud Fails: Debugging the "Undocumented" - Dhruv Jain, Gojek (GoTo Group) Indonesia
Thursday May 21, 2026 2:55pm - 3:20pm CDT
What happens when a system degrades under high load while all internal metrics remain “green”? At hyperscale, supporting on-demand services across Southeast Asia’s most populous countries, a team observed up to a 7% drop in message delivery. The root cause was not application code, messaging brokers, or load balancers, but a hidden limitation deep within a cloud provider’s firewall.

This war-story session presents a forensic investigation into a managed cloud load balancer and its interaction with connection-tracking tables. The talk walks through the production cutover that triggered the issue and the targeted load testing that ultimately isolated the failure to cloud infrastructure behavior invisible to standard monitoring.

Beyond root cause analysis, the session focuses on outcomes: how sustained, evidence-based debugging led the cloud provider to acknowledge the issue—initially labeled a “limitation”—and introduce a new observability metric, firewall/connections_tracked. Attendees will leave with a practical framework for debugging black-box cloud failures and identifying the node-level metrics needed to detect silent network drops before they impact users.
Speakers

Dhruv Jain

Lead Software Engineer, Gojek (GoTo Group)
Dhruv Jain is a Lead Software Engineer at Gojek, where he focuses on building and scaling MQTT infrastructure that handles millions of concurrent connections across Southeast Asia. Beyond his work at Gojek, he is an active contributor to the open-source community and Google Summer...
Level One | Ballroom A
  End-User Case Studies
 
Friday, May 22
 

10:20am CDT

Exploring Observability with MCP Servers - Tiffany Jernigan, Grafana Labs
Friday May 22, 2026 10:20am - 10:45am CDT
You may have heard of the pillars of observability: metrics, logs, traces, and, depending on who you ask, profiles. As systems grow in complexity, the need to both individually understand and correlate these signals becomes paramount for rapid incident detection, root cause analysis, and performance optimization. Yet, even with advances like OpenTelemetry, making sense of your own data often requires learning specialized query languages and navigating complex toolchains, which is a barrier for many users.

While AI tools like ChatGPT can offer general advice, they lack access to your specific observability data. This is where Model Context Protocol (MCP) servers come in. MCP servers provide a standardized way for AI assistants and other tools to securely connect to your observability data, making it easier to investigate and diagnose issues faster using natural language.

In this talk, we’ll cover MCP and demonstrate how to explore your observability data using Grafana MCP, while also touching on how the same approach can work with other MCP-compatible tools or custom MCP servers.
Speakers

Tiffany Jernigan

Senior Developer Advocate, Grafana Labs
Tiffany is a senior developer advocate at Grafana Labs and a CNCF Ambassador. She also formerly worked as a software developer and developer advocate at VMware, Amazon, Docker, and Intel. Prior to that, she graduated from Georgia Tech with a degree in electrical engineering. In her...
Level One | Ballroom A
  AI and MCP in Observability

10:50am CDT

Don't Let Users Find Your Outages: Synthetic Monitoring for Kubernetes Platforms - Kate Agnew, Marriott & David Norton, Platformers
Friday May 22, 2026 10:50am - 11:15am CDT
No platform owner wants to be told their platform is down by a user. A core responsibility of the platform operating model is ensuring a reliable platform for the organization. In practice, it isn't always easy to detect when things are broken, especially when it falls outside of the traditional metrics coverage.

In our work, we adopted synthetic monitoring using Kuberhealthy, a CNCF project, to gain better visibility into whether the Kubernetes platform is operating as a user would expect. Synthetic monitoring allows us to replicate application developer workflows to validate end-to-end functionality of the platform.

Come and learn about implementing synthetics, how to not break things, and broadly how to improve stability with Kubernetes using synthetic monitoring.
Speakers

Kate Agnew

Sr. Director of Platform Engineering, Marriott
Kate Agnew is a Sr Director of Platform Engineering at Marriott, where she manages the enterprise Kubernetes and Service Mesh platform. Prior to Marriott, she held a similar platform leadership role at Optum, and has had multiple other leadership and technology positions at smaller...

David Norton

President and Principal Consultant, Platformers
David Norton is a founder and principal consultant at Platformers. He has been working in cloud platform engineering since 2016. Prior to that, he worked as an application developer.

David lives in St. Louis Park, MN, and usually enjoys spending time with his family, playing pickleball, reading, and fishing...
Level One | Ballroom B

11:50am CDT

⚡ Lightning Talk: Show Me the Money: Metrics Edition - Brian Davis, Red Canary
Friday May 22, 2026 11:50am - 12:00pm CDT
Existing cloud and Kubernetes cost management tools struggle to track expenses at a granular level, leaving engineers unable to answer critical questions like: How much is one specific customer costing us in DynamoDB usage? Or, which system component is consuming the most of our Kafka cluster?

This lightning talk demonstrates how to leverage existing observability frameworks to gain detailed, low-level cost insights. Attendees will learn basic techniques to instrument standard metrics with custom labels, such as component name, customer ID, and team, for fine-grained cost allocation.

This session includes a practical case study from Red Canary, which has used this exact methodology for over five years to transform its tactical decision-making and better manage cloud spend. By treating cost allocation as an observability problem, engineers can provide the finance team with the deep data required for effective resource management.

Attendees will leave with an actionable plan for implementing a metrics-based cost tracking system (likely with the tooling you already have), independent of high-level cloud billing tools, to drive significant operational efficiency.
Speakers

Brian Davis

Principal Software Architect, Red Canary
Principal Software Architect at Red Canary, a Zscaler Company, Brian Davis has been building and monitoring complex systems for over two decades, ranging from signal-processing algorithms to complex data-processing applications, deploying these on Solaris servers, on-prem virtual...
Level One | Ballroom A
  End-User Case Studies

1:25pm CDT

eBPF Application Instrumentation for Java: Challenges, Design, and Real-World Examples - Endre Sara, Causely, Inc & Stephen Lang, Grafana Labs
Friday May 22, 2026 1:25pm - 1:50pm CDT
Java is one of the most widely used languages for enterprise applications. Frameworks such as Spring Boot and Quarkus make observability straightforward when the OpenTelemetry Java agent can be injected.

In many production environments, however, modifying application code or JVM startup parameters is not possible. In these cases, eBPF-based instrumentation enables observability without code changes, but applying eBPF to Java is challenging. JVM abstraction layers, differences across JDK versions, and the diversity of frameworks and libraries complicate generic instrumentation. The problem becomes even harder when applications rely on TLS-encrypted communication such as HTTPS, gRPC, databases, and messaging systems, where payloads are opaque.

This talk explains how the OpenTelemetry eBPF Instrumentation (OBI) project addresses these challenges, covering key design decisions, trade-offs, and current limitations. The discussion is grounded in real-world examples, including Spring Boot services using HTTPS and gRPC, and a Quarkus application with TLS-encrypted PostgreSQL and Kafka, showing what is possible today with agentless Java observability using eBPF.
Speakers

Stephen Lang

Staff Software Engineer, Grafana Labs
Stephen is a Staff Software Engineer on Grafana's Beyla team and an approver for the OpenTelemetry eBPF Instrumentation (OBI) project.

Endre Sara

Co-Founder, Causely, Inc
Endre is a Co-Founder of Causely, where he’s building the IT industry’s first causal reasoning. Previously, Endre was VP of Advanced Engineering at Turbonomic. Prior to Turbonomic, Endre was a VP at Goldman Sachs. Endre holds an M.E. in Electrical Engineering from the Technical...
Level One | Ballroom B
  CNCF Observability Projects

4:10pm CDT

[CANCELLATION] The Missing Layer in eBPF Observability: Storage - Kritik Sachdeva, IBM
Friday May 22, 2026 4:10pm - 4:35pm CDT
Modern observability has embraced eBPF for profiling CPU usage and tracing network paths in production systems. Yet one critical layer remains largely under-instrumented: storage. Despite being a frequent source of performance issues, storage I/O is still treated as a black box, especially in cloud native environments.

This talk will walk through the basic storage I/O path in Linux and Kubernetes, highlight where traditional metrics fall short, and discuss the kinds of storage latency and wait signals that eBPF can surface at runtime without requiring kernel modifications or specialized debugging setups.

Using simple examples, the session will show how hidden storage latency and queuing effects surface in real workloads, and why these blind spots become more visible with data-intensive and AI workloads where applications or GPUs often wait on storage without clear indicators.

By the end of this talk, attendees will gain a practical understanding of where storage observability breaks down today, what eBPF can realistically help uncover at a foundational level, and how to reason about storage-related performance issues alongside CPU and networking metrics.
Speakers

Kritik Sachdeva

Technical Support Professional, IBM
I’m Kritik Sachdeva, currently working as a Support Professional at IBM. I’ve been working with Ceph & OpenShift for the past 5 years, and since college I have had a great interest in technologies like K8s, containers, and Ceph.

Since then, I’ve enjoyed exploring how different...
Level One | Ballroom B
 