May 21-22, 2026

The Sched app allows you to build your schedule, but is not a substitute for your event registration. You must be registered for Observability Summit North America 2026.

Please note: This schedule is automatically displayed in Central Daylight Time (UTC -5). To see the schedule in your preferred timezone, select from the drop-down menu located at the bottom of the menu to the right.

The schedule is subject to change.
Thursday May 21, 2026 11:50am - 12:00pm CDT
Operating a metrics system beyond billions of data points introduces failure modes that don't exist at smaller deployments. This lightning talk shares battle-tested lessons from running Thanos, Prometheus and OpenTelemetry in production across distributed Kubernetes environments, focusing on three critical challenges: implementing multi-tenancy without noisy neighbor problems, building rate limiting that prevents a single tenant from destabilizing the cluster, and isolating query workloads so expensive queries don't starve metric ingestion.

The talk walks through real incidents where these challenges caused production impact, including 5xx errors on Thanos Receivers from unbounded queries, Prometheus remote write lag, and partial query results from overwhelmed Store Gateways. For each problem, the talk presents the custom solutions developed in response, including tenant-aware rate limiting middleware and workload isolation patterns, and shares concrete configuration approaches that attendees can apply to their own deployments.
Attendees will leave with actionable techniques for scaling their observability infrastructure to trillion-scale while maintaining reliability under load.
Speakers
Narendra Sanikommu

Senior Software Engineer, Nvidia
Experienced software engineer who is passionate about solving complex software engineering challenges. Brings around 14 years of experience in software engineering and a strong foundation in building and optimizing high-performance systems, particularly in Observability, Big Data...
Level One | Ballroom A
  Scalability Challenges and Solutions
  • Content Experience Level: Any
