May 21-22, 2026
Learn more and Register to Attend

The Sched app allows you to build your schedule, but is not a substitute for your event registration. You must be registered for Observability Summit North America 2026.

Please note: This schedule is automatically displayed in Central Daylight Time (UTC -5). To see the schedule in your preferred timezone, select from the drop-down menu located at the bottom of the menu to the right.

The schedule is subject to change.
Type: End-User Case Studies
Thursday, May 21
 

10:20am CDT

[CANCELLATION] Scaling a Proprietary-to-OpenTelemetry Migration With AI-Assisted, Spec-Driven Workflows - Ying Mo & Paras Kampasi, IBM
Thursday May 21, 2026 10:20am - 10:45am CDT
This talk presents a practical methodology for migrating a large proprietary observability platform to an OpenTelemetry-native architecture, using a GenAI-assisted workflow paired with a robust spec-driven strategy. Faced with hundreds of custom Java-based sensors, the engineering team designed a spec-driven conversion process that leverages GenAI to extract specifications, generate unit tests, and assist in implementing Go-based OpenTelemetry receivers. Each stage incorporates human review and test feedback loops to address the reliability limitations of GenAI and ensure functional correctness.

Additionally, a data-driven feasibility evaluation was conducted prior to large-scale conversion, where defined task types were benchmarked with and without GenAI to quantify effort savings and highlight where GenAI provides the greatest value.

Attendees will learn a reproducible workflow for large-scale migrations from proprietary to OpenTelemetry, how to pair GenAI with automated testing to manage risk, and insights on where GenAI accelerates real-world engineering tasks without compromising quality.
Speakers
Ying Mo

Senior Software Engineer, IBM
Ying Mo is a Senior Software Engineer at IBM. He has recently been working on IBM Instana, an observability platform, where he leads the engineering team transforming the product to be OpenTelemetry-native. He is always enthusiastic about bringing innovative ideas into the product by leveraging open source technology... Read More →
Paras Kampasi

Technical Product Manager, IBM
I work at the intersection of OpenTelemetry, observability, and modern cloud-native practices, helping teams make complex systems understandable and reliable. I speak and write about practical ways to apply open standards, close feedback loops between SREs and product teams, and turn... Read More →
Level One | Ballroom B
  End-User Case Studies

1:25pm CDT

Unified End-to-End Observability: How Comcast Generates SpanMetrics at Enterprise Scale - Raghu Vamshi Challa, Comcast
Thursday May 21, 2026 1:25pm - 1:50pm CDT
Enterprises often struggle with the "black box" nature of proprietary APM tools and the high cost of distributed tracing at scale. In this session, we will demonstrate how Comcast tackled this challenge by migrating 350 critical applications from AppDynamics to a cloud-native OpenTelemetry (OTel) stack, achieving a truly unified end-to-end observability experience.

We will pull back the curtain on the architecture that powers this migration. Specifically, we will show how we leveraged the OpenTelemetry Collector to generate Request, Error, and Duration (R.E.D.) metrics from trace data using the SpanMetrics connector. A key highlight will be our unique deployment of Conduit, which serves as a resilient transport layer to ensure data integrity and effective load balancing in a high-volume environment.
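As a rough illustration of what deriving R.E.D. metrics from trace data means, the sketch below aggregates spans into request counts, error counts, and summed durations per service and operation. It is plain Python, not the SpanMetrics connector and not Comcast's configuration; the `Span` fields and the `red_metrics` helper are hypothetical names chosen for this example, and the real connector additionally produces duration histograms inside the Collector pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical, simplified span shape for illustration only.
    service: str
    operation: str
    duration_ms: float
    is_error: bool


def red_metrics(spans):
    """Aggregate spans into request, error, and duration totals per (service, operation)."""
    totals = defaultdict(lambda: {"requests": 0, "errors": 0, "duration_ms_sum": 0.0})
    for span in spans:
        key = (span.service, span.operation)
        totals[key]["requests"] += 1
        totals[key]["errors"] += int(span.is_error)
        totals[key]["duration_ms_sum"] += span.duration_ms
    return dict(totals)


# Made-up sample spans.
spans = [
    Span("checkout", "POST /pay", 120.0, False),
    Span("checkout", "POST /pay", 480.0, True),
    Span("catalog", "GET /items", 35.0, False),
]
metrics = red_metrics(spans)
```

Dividing `errors` by `requests` gives an error rate, and `duration_ms_sum` by `requests` gives a mean latency for each key.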

Attendees will leave with a blueprint for breaking free from APM vendor lock-in. To help the community fast-track this transition, we will also be sharing and walking through our reusable, battle-tested Grafana dashboards that can be leveraged by any enterprise.
Speakers
Raghu Challa

Comcast Engineer 6, Software Development & Engineering - Backend Engineering, Comcast
Raghu is an Observability Lead at Comcast, driving the enterprise-wide migration from legacy APM tools to OpenTelemetry. He specializes in designing high-scale telemetry pipelines that process massive volumes of trace data. Raghu is passionate about democratizing observability and... Read More →
Level One | Ballroom B
  End-User Case Studies

1:55pm CDT

Taming Tenancy, Cost and Architecture at Collibra Through OpenTelemetry and Our Telemetry Backbone - Alex Van Boxel, Collibra
Thursday May 21, 2026 1:55pm - 2:20pm CDT
Operating a SaaS platform presents the same observability problems as any other enterprise, but scale and tenancy act as a huge multiplier on observability signals, affecting both cost and effectiveness.

This session dives into the techniques Collibra used to tame these problems and how to maintain clarity when infrastructure spans virtual machines, modern Kubernetes clusters, and a complex mix of single- and multi-tenant architectures. Without the right context, telemetry data becomes a noisy, indistinguishable flood.

We will dive into the architectural decision to leverage the C4 system model, ensuring every piece of telemetry carries the vital context of what it belongs to and where it sits in the hierarchy. This context enables both signal attribution and virtual chargebacks. The presentation details the implementation of a pipeline using custom-built OpenTelemetry collectors designed to handle the data and enrich it before sending it to the appropriate backends.
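To make the enrichment idea concrete, here is a minimal sketch of attaching C4-style context to a telemetry record before it is forwarded. It is an assumption-laden illustration in plain Python, not Collibra's collector code: the `C4_CATALOG` lookup, the `c4.*` and `tenant.id` attribute keys, and the `enrich` function are all hypothetical.

```python
# Hypothetical catalog mapping a service name to its place in the C4 hierarchy.
C4_CATALOG = {
    "ingest-api": {
        "c4.system": "data-platform",
        "c4.container": "ingest",
        "c4.component": "api",
    },
}


def enrich(record, catalog=C4_CATALOG):
    """Return a copy of a telemetry record with C4 context attributes added."""
    # Unknown services are tagged explicitly so they stay visible downstream.
    context = catalog.get(record.get("service.name"), {"c4.system": "unknown"})
    attributes = {**record.get("attributes", {}), **context}
    return {**record, "attributes": attributes}


record = {"service.name": "ingest-api", "attributes": {"tenant.id": "t-42"}}
enriched = enrich(record)
```

Because the function returns a new record rather than mutating its input, the original stays available if several backends need differently enriched copies.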

This session will give you practical insights on the challenges SaaS platforms have, but the techniques that are used to tame them can be applied everywhere.
Speakers
Alex Van Boxel

Principal System Architect, Collibra
Alex Van Boxel is a Principal System Architect at Collibra. With an engineering background in R&D at Alcatel-Lucent, Progress Software, and Veepee, he loves to focus on the fundamental building blocks of the software industry. That means reading, understanding, and contributing to... Read More →
Level One | Ballroom A
  End-User Case Studies
  • Content Experience Level Any

2:55pm CDT

When the Cloud Fails: Debugging the "Undocumented" - Dhruv Jain, Gojek (GoTo Group) Indonesia
Thursday May 21, 2026 2:55pm - 3:20pm CDT
What happens when a system degrades under high load while all internal metrics remain “green”? At hyperscale, supporting on-demand services across Southeast Asia’s most populous countries, a team observed up to a 7% drop in message delivery. The root cause was not application code, messaging brokers, or load balancers, but a hidden limitation deep within a cloud provider’s firewall.

This war-story session presents a forensic investigation into a managed cloud load balancer and its interaction with connection-tracking tables. The talk walks through the production cutover that triggered the issue and the targeted load testing that ultimately isolated the failure to cloud infrastructure behavior invisible to standard monitoring.

Beyond root cause analysis, the session focuses on outcomes: how sustained, evidence-based debugging led the cloud provider to acknowledge the issue—initially labeled a “limitation”—and introduce a new observability metric, firewall/connections_tracked. Attendees will leave with a practical framework for debugging black-box cloud failures and identifying the node-level metrics needed to detect silent network drops before they impact users.
Speakers
Dhruv Jain

Lead Software Engineer, Gojek (GoTo Group)
Dhruv Jain is a Lead Software Engineer at Gojek, where he focuses on building and scaling MQTT infrastructure that handles millions of concurrent connections across Southeast Asia. Beyond his work at Gojek, he is an active contributor to the open-source community and Google Summer... Read More →
Level One | Ballroom A
  End-User Case Studies
 
Friday, May 22
 

11:50am CDT

⚡ Lightning Talk: Show Me the Money: Metrics Edition - Brian Davis, Red Canary
Friday May 22, 2026 11:50am - 12:00pm CDT
Existing cloud and Kubernetes cost management tools struggle to track expenses at a granular level, leaving engineers unable to answer critical questions like: How much is one specific customer costing us in DynamoDB usage? Or, which system component is consuming the most of our Kafka cluster?


This lightning talk demonstrates how to leverage existing observability frameworks to gain detailed, low-level cost insights. Attendees will learn basic techniques to instrument standard metrics with custom labels, such as component name, customer ID, and team, for fine-grained cost allocation.
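One way to picture label-based cost allocation is to split a shared bill in proportion to labeled usage. The sketch below assumes proportional allocation, which is a choice made for this example and not necessarily Red Canary's method; the `allocate_cost` function, the label names, and the sample numbers are hypothetical.

```python
from collections import defaultdict


def allocate_cost(samples, total_cost, label):
    """Split total_cost across values of one label, proportionally to usage."""
    usage = defaultdict(float)
    for labels, value in samples:
        # Samples missing the label are grouped so no usage is silently dropped.
        usage[labels.get(label, "unlabeled")] += value
    total_usage = sum(usage.values())
    if total_usage == 0:
        return {}
    return {key: total_cost * value / total_usage for key, value in usage.items()}


# Made-up usage samples: (labels, consumed units of some shared resource).
samples = [
    ({"component": "ingest", "customer_id": "acme"}, 600.0),
    ({"component": "search", "customer_id": "acme"}, 150.0),
    ({"component": "ingest", "customer_id": "globex"}, 250.0),
]
by_customer = allocate_cost(samples, total_cost=1000.0, label="customer_id")
by_component = allocate_cost(samples, total_cost=1000.0, label="component")
```

The same samples answer both kinds of question from the abstract: passing `customer_id` breaks the bill down per customer, and passing `component` breaks it down per system component.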


This session includes a practical case study from Red Canary, which has used this exact methodology for over five years to transform its tactical decision-making and better manage cloud spend. By treating cost allocation as an observability problem, engineers can provide the finance team with the deep data required for effective resource management.


Attendees will leave with an actionable plan for implementing a metrics-based cost tracking system (likely with the tooling you already have), independent of high-level cloud billing tools, to drive significant operational efficiency.
Speakers
Brian Davis

Principal Software Architect, Red Canary
Principal Software Architect at Red Canary, a Zscaler Company, Brian Davis has been building and monitoring complex systems for over two decades, ranging from signal-processing algorithms to complex data-processing applications, deploying these on Solaris servers, on-prem virtual... Read More →
Level One | Ballroom A
  End-User Case Studies

2:25pm CDT

Implementation of Unified Observability at Scale From Scratch - Ahmed J., Emaar
Friday May 22, 2026 2:25pm - 2:50pm CDT
Unified observability has lately been regarded as the holy grail by some. One platform, universal observability, for everything. Usually, this would be the default, but when you are at a 30-year-old non-technical enterprise, dealing with a mixture of legacy and modern systems, it's a whole different story.

In some cases, legacy decisions leave different teams within a company on separate observability platforms, adding overhead, cost, noise, and audit complexity. This was the case at Emaar, a property developer based in Dubai, until the PE team took on the exciting project of unifying all observability into one platform. This included applications, infrastructure, network, and security. The complexity arises not just from the different data sources, but also from the number and nature of the deployment sites. The deployment covered sites across 10 countries, including data centers, hotels, malls, and shops.

This talk will outline the experience of implementing a unified observability platform consisting of thousands of network devices, machines, and application workloads using open-source technologies that resulted in 6 figures of cost savings.
Speakers
Ahmed J.

Platform Engineer, Emaar
Ahmed is a platform engineer with a background in artificial intelligence research and development. He excels at building scalable infrastructure to deploy and manage production-grade applications and models. He co-led the orchestration of modern infrastructure and observability at... Read More →
Level One | Ballroom A
  End-User Case Studies
 