Open Source Summit + Embedded Linux Conference North America...
May 18-20, 2026
Minneapolis, MN
View More Details & Registration
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for Open Source Summit North America 2026 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Central Daylight Time (UTC-5). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date."

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.


Tuesday, May 19

4:20pm CDT

The Hidden Cost of Sleep: How Scheduler Wakeup Latency Impacts High-Throughput AI Inference - Shubhang Kaushik, Ampere Computing
Tuesday May 19, 2026 4:20pm - 5:00pm CDT
As a Linux Kernel Developer at Ampere Computing, I focus on optimizing the scheduler for high-density ARM64 systems. My work culminated in a patch merged for the Linux 7.0 release that refines avg_idle tracking, a critical metric the scheduler uses to decide how long to search for an idle CPU before giving up. In my session "The Hidden Cost of Sleep," I will break down the try_to_wake_up() path to show how even minor inaccuracies in idle-time accounting lead to poor CPU selection and increased cache misses. I'll explain how my Linux 7.0 optimizations [commit 36ae1c45b2cede] specifically reduce the "search cost" during wakeups, directly improving the responsiveness of AI inference workloads. By sharing raw performance data and trace analysis, I'll demonstrate why getting the wakeup path right is the only way to achieve the deterministic performance needed for autonomous AI agents and scalable trust infrastructure.
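The avg_idle idea described in the abstract can be sketched as follows. This is a hypothetical Python illustration of the heuristic, not actual kernel code; the function names, the EWMA weight, and the per-CPU scan cost are invented for clarity.

```python
# Toy model of the kernel's avg_idle heuristic: track how long CPUs tend
# to stay idle, and use that estimate to bound how much time the wakeup
# path may spend scanning for an idle CPU (the "search cost").

def update_avg_idle(avg_idle_ns, last_idle_ns, weight=8):
    """Exponentially weighted moving average of observed idle durations
    (the kernel uses a similar decaying average for avg_idle)."""
    return avg_idle_ns + (last_idle_ns - avg_idle_ns) // weight

def select_idle_cpu(cpus, avg_idle_ns, scan_cost_ns=500):
    """Scan CPUs for an idle one, but stop once the accumulated scan time
    would exceed the expected idle headroom. If avg_idle is too small,
    searching costs more than it saves, so give up early."""
    budget = avg_idle_ns
    for cpu in cpus:
        if budget < scan_cost_ns:
            return None  # search cost exceeds expected benefit
        budget -= scan_cost_ns
        if cpu["idle"]:
            return cpu["id"]
    return None
```

If avg_idle is over-estimated, the scheduler scans too many busy CPUs and wastes cycles in the wakeup path; if it is under-estimated, it gives up early and stacks tasks on busy CPUs, which is the mis-selection and cache-miss cost the talk describes.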
Speakers
Shubhang Kaushik

Software Engineer, Ampere Computing
Linux Kernel Developer
205C+D (Level Two)
  Linux

4:20pm CDT

Headroom: A Context Optimization Layer for LLM Applications - Tejas Chopra, Netflix, Inc.
Tuesday May 19, 2026 4:20pm - 5:00pm CDT
LLM tokens are expensive. With context windows expanding to 200K+ tokens, a single API call can cost several dollars, and in production systems handling thousands of requests, these costs compound quickly.
Most optimization efforts focus on model selection or prompt engineering, but the context itself often contains massive redundancy.

Headroom is an open-source Python library (https://github.com/chopratejas/headroom) that sits between your application and your LLM provider, transparently optimizing context before it reaches the model.
The core insight is simple: LLM contexts, especially in agentic workflows, are filled with repetitive tool outputs, verbose JSON arrays, and boilerplate that consumes tokens without adding proportional value.

Headroom introduces novel concepts such as reversible compression, cache aligners, compression routers, and even persistent memory.

Real-world results:
- 50-90% token reduction on typical agentic workloads
- Drop-in integrations for LangChain, OpenAI, Anthropic, and any OpenAI-compatible provider
- Zero code changes required when using the proxy server
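The core idea, stripping repetition from agentic contexts before they reach the model, can be illustrated with a toy reversible deduplicator. This is an assumption-laden sketch of the technique, not Headroom's actual API; see the linked repository for the real interface.

```python
import hashlib

def compress_tool_outputs(messages):
    """Toy context compressor: replace repeated tool outputs with a short
    reference to their first occurrence. Returns the compressed message
    list plus a lookup table, so the compression is reversible."""
    seen = {}   # digest -> original content, kept so we can restore later
    out = []
    for msg in messages:
        if msg.get("role") == "tool":
            digest = hashlib.sha256(msg["content"].encode()).hexdigest()[:8]
            if digest in seen:
                # Duplicate tool output: emit a cheap placeholder instead.
                out.append({"role": "tool",
                            "content": f"[duplicate of tool output {digest}]"})
                continue
            seen[digest] = msg["content"]
        out.append(msg)
    return out, seen
```

In an agent loop that polls the same API repeatedly, identical tool results appear many times; deduplicating them is one simple way to get the kind of token reduction the abstract reports, while the lookup table preserves reversibility.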
Speakers
Tejas Chopra

Sr. Engineer, Netflix, Inc.
Tejas Chopra is a senior ML and AI infrastructure Engineer at Netflix, where he builds large-scale systems for production AI and data platforms. He is the creator of Headroom, an open-source context optimization engine for LLMs, and a frequent speaker at global conferences on ML systems...
211A+B (Level Two)
  Open AI & Data