Open Source Summit + Embedded Linux Conference North America...
May 18-20, 2026
Minneapolis, MN
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for Open Source Summit North America 2026 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Central Daylight Time (UTC-5). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date."

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.


Venue: 211A+B (Level Two)
Monday, May 18
 

11:20am CDT

MOT: A Tool To Fight Open-washing in AI - Arnaud Le Hors, IBM
Monday May 18, 2026 11:20am - 12:00pm CDT
Many models referred to as "open source" are distributed under restrictive licenses and fail to include the necessary information to actually qualify as open source. Just because a model is on HuggingFace does not mean it is open source.

Several attempts have been made to provide a definition of what "open source AI" ought to be, but now we have a tool that can help: the Model Openness Tool (MOT).

The MOT was developed by the Generative AI Commons as an implementation of the Model Openness Framework (MOF) to provide model producers and consumers with a practical way to assess how open a model really is. This session will introduce attendees to the MOT and include a demo showing how it can be used along with Hugging Face and GitHub to provide greater understanding of which models are really open.
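The assessment the MOT automates can be pictured as a checklist over a model's released artifacts. Below is a minimal sketch of that idea; the component names and license list are illustrative assumptions, not the actual Model Openness Framework classes.

```python
# Toy sketch of an MOF-style openness check. The components checked and the
# license allow-list are assumptions for illustration, not the real MOF.
OPEN_LICENSES = {"apache-2.0", "mit", "cc-by-4.0"}

def openness_report(artifacts: dict) -> dict:
    """artifacts maps component name -> license id (or None if absent)."""
    report = {}
    for component in ("model_weights", "training_code", "datasets", "model_card"):
        license_id = artifacts.get(component)
        if license_id is None:
            report[component] = "missing"
        elif license_id.lower() in OPEN_LICENSES:
            report[component] = "open"
        else:
            report[component] = "restricted"
    return report

report = openness_report({
    "model_weights": "custom-community-license",  # restrictive terms
    "training_code": "apache-2.0",
    "model_card": "cc-by-4.0",
})
# "Available on Hugging Face" != open: weights restricted, datasets missing.
```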
Speakers

Arnaud Le Hors

Senior Technical Staff Member, IBM
Arnaud Le Hors is Senior Technical Staff Member of Open Technologies at IBM. He has been working on standards and open source for over 30 years. Arnaud was editor of several key web specifications including HTML and DOM and was a pioneer of open source with the release of libXpm in... Read More →
211A+B (Level Two)
  Open AI & Data

1:30pm CDT

Crawl, Walk, Run With Your MCP Servers - Lin Sun, solo.io
Monday May 18, 2026 1:30pm - 2:10pm CDT
You have built your first MCP server and tested it with the MCP inspector, but it only uses stdio or streamable HTTP without HTTPS. Do you rewrite your server to add authentication and authorization, or is there a smarter way? What if you have multiple MCP servers? Can you unify them under a single virtual server without touching any of the originals? How do you deploy all of this to Kubernetes securely and reliably?

In this demo-driven session, Lin starts by building a simple MCP server and securing it the hard way. Then she offloads authentication, authorization, and tool multiplexing to an MCP gateway. She will show how to deploy a virtual MCP server in Kubernetes and program an AI agent to call its tools, making complex setups feel effortless. By the end, you will have practical techniques to run, secure, and scale your MCP servers with confidence.
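To make the starting point concrete, here is a minimal sketch of the kind of bare, unauthenticated stdio server the session begins with. The JSON-RPC shape loosely mirrors MCP's tools/call method but is simplified and is not the official SDK.

```python
import json

# Minimal stdio-style dispatcher for MCP-like "tools/call" requests,
# with no authentication at all. Tool names here are invented.
TOOLS = {"get_time": lambda args: "2026-05-18T13:30:00-05:00"}

def handle(raw: str) -> str:
    req = json.loads(raw)
    if req.get("method") == "tools/call":
        name = req["params"]["name"]
        result = TOOLS[name](req["params"].get("arguments", {}))
        return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})
    return json.dumps({"jsonrpc": "2.0", "id": req.get("id"),
                       "error": {"code": -32601, "message": "method not found"}})

# Anyone who can write to stdin can call any tool -- this is the gap that
# the gateway's authentication and authorization layer is meant to close.
resp = json.loads(handle(json.dumps(
    {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
     "params": {"name": "get_time"}})))
```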
Speakers

Lin Sun

Head of Open Source, Solo.io
Lin is the Head of Open Source at Solo.io, contributing full-time to the open-source community. She serves on the CNCF Technical Oversight Committee (TOC), is a CNCF Ambassador, and is a maintainer for Istio, kgateway, and kagent. An international speaker at tech conferences, Lin... Read More →
211A+B (Level Two)
  Open AI & Data

2:25pm CDT

From Image To Itinerary: Multimodal Agentic Travel Planning With MCP, A2A, and BeeAI - Ezequiel Lanza, Intel
Monday May 18, 2026 2:25pm - 3:05pm CDT
Planning a trip is a deceptively complex problem for AI, especially when the journey starts from visual context rather than text. In this session, we present a multimodal-first, local-first agentic architecture where a user uploads an image (e.g. “where is this place?”), and the system builds a travel plan from that visual input using Model Context Protocol (MCP), A2A (Agent-to-Agent), and BeeAI — all running fully locally without cloud dependencies.

The system employs a router and specialist agent pattern, where dedicated agents handle image understanding, hotel search, and flight search, each backed by MCP servers. A multimodal model extracts meaning from the image, after which the router decomposes the task and delegates work through A2A to the appropriate specialists.

We will walk through how BeeAI manages agent lifecycles, how A2A enables explicit agent collaboration, and how MCP acts as a stable contract layer between reasoning and real-world capabilities. The focus is on practical architecture, configuration, and lessons learned, showing how to build MCP-centric, multimodal systems that remain extensible, reproducible, and maintainable as new agents and tools are added.
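The router-and-specialist pattern above can be sketched in a few lines; the agent functions and task schema here are invented for illustration and are not BeeAI or A2A APIs.

```python
# Router-and-specialist sketch: image understanding produces structured
# context, then the router delegates subtasks to specialist agents.
def image_agent(task): return {"place": "Eiffel Tower", "city": "Paris"}
def hotel_agent(task): return [f"hotel near {task['city']}"]
def flight_agent(task): return [f"flight to {task['city']}"]

SPECIALISTS = {"understand_image": image_agent,
               "find_hotels": hotel_agent,
               "find_flights": flight_agent}

def router(image_bytes: bytes) -> dict:
    # Step 1: multimodal understanding turns pixels into structured context.
    context = SPECIALISTS["understand_image"]({"image": image_bytes})
    # Step 2: decompose the task and delegate agent-to-agent.
    plan = {"destination": context["place"]}
    for subtask in ("find_hotels", "find_flights"):
        plan[subtask] = SPECIALISTS[subtask](context)
    return plan

plan = router(b"...jpeg bytes...")
```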
Speakers

Ezequiel Lanza

AI Software Evangelist, Intel
Passionate about helping people discover the exciting world of artificial intelligence, Ezequiel is a frequent AI conference presenter and the creator of use cases, tutorials, and guides that help developers adopt open source AI tools.
211A+B (Level Two)
  Open AI & Data

3:35pm CDT

Who You Gonna Call? Taming OpenClaw's Rogue AI Agents With OpenTelemetry and Tetragon - Henrik Rexed, Dynatrace
Monday May 18, 2026 3:35pm - 4:15pm CDT
There's something strange in your infrastructure. Who you gonna call?
OpenClaw, the open source AI agent formerly known as Clawdbot and then Moltbot, exploded past 150,000 GitHub stars in weeks. It connects LLMs to your messaging platforms, terminal, and file system, giving AI full autonomous control. But like a Ghostbusters ghost, it wreaks havoc: $20 in tokens burned overnight to check the time, a one-click RCE (CVE-2026-25253), 21,000 exposed instances, and 341 malicious skills in the marketplace.
I will strap on my proton pack to bust these ghosts with open source tools. First, the OpenClaw Observability Plugin: an OpenTelemetry-based plugin capturing full agent lifecycle traces (request → agent turn → tool calls) with per-tool timing, token breakdowns, and error tracking. Your PKE meter for rogue AI.
Then, Tetragon: eBPF-powered kernel-level policies restricting file access, network connections, and process execution. The containment unit no prompt injection can escape.
A live demo ties it all together: OpenClaw + observability plugin + Tetragon, with traces and security events flowing into one dashboard.
We came, we saw, we traced it.
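The lifecycle traces the plugin captures can be modeled as a small span tree; the field names below are illustrative assumptions, not the plugin's actual OpenTelemetry attributes.

```python
# Toy trace tree in the shape described: request -> agent turn -> tool
# calls, with per-tool timing and token counts aggregated for the dashboard.
trace = {
    "span": "request", "children": [
        {"span": "agent_turn", "children": [
            {"span": "tool:get_time", "ms": 1200, "tokens": 18000},
            {"span": "tool:read_file", "ms": 40, "tokens": 350},
        ]},
    ],
}

def tool_totals(span, totals=None):
    """Walk the span tree and sum timing/token usage across tool calls."""
    totals = totals if totals is not None else {"ms": 0, "tokens": 0}
    if span["span"].startswith("tool:"):
        totals["ms"] += span["ms"]
        totals["tokens"] += span["tokens"]
    for child in span.get("children", []):
        tool_totals(child, totals)
    return totals

totals = tool_totals(trace)
# 18k tokens to check the time: the kind of waste per-tool traces expose.
```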
Speakers

Henrik Rexed

Cloud Native advocate & CNCF Ambassador, Dynatrace
Henrik is a Cloud Native Advocate at Dynatrace and a CNCF Ambassador. Prior to Dynatrace, Henrik worked for more than 15 years as a performance engineer. He is also one of the organizers of the WOPR and KCD Austria conferences and the owner of the YouTube channel Isit... Read More →
211A+B (Level Two)
  Open AI & Data

4:30pm CDT

Scaling LLM Inference With Tiered Caching: Extending LMCache With Amazon SageMaker HyperPod - Yihua Cheng, Tensormesh, Inc. & Ziwen Ning
Monday May 18, 2026 4:30pm - 5:10pm CDT
LMCache supports tiered KV caching with CPU memory offloading, extending inference beyond GPU memory limits. But what happens when local CPU memory isn't enough? This session introduces the next tier: offloading KV cache to Amazon SageMaker HyperPod managed storage, expanding cache capacity for large-scale LLM inference.

We'll cover the technical design of the SageMaker HyperPod connector contribution to LMCache. Hot entries stay in GPU memory, warm entries spill to CPU memory, and cold entries persist to HyperPod's managed storage. This three-tier architecture lets organizations cache far more context than local resources allow, reducing redundant computation for repeated prompts and long-context scenarios.
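The three-tier spill logic can be sketched with two LRU maps and a backing store; the capacities and eviction policy here are assumptions for illustration, not LMCache's actual implementation.

```python
from collections import OrderedDict

# Simplified three-tier KV cache: hot blocks in "GPU", warm blocks spill to
# "CPU", cold blocks persist to remote storage. Tiny capacities for demo.
class TieredKVCache:
    def __init__(self, gpu_slots=2, cpu_slots=2):
        self.tiers = {"gpu": OrderedDict(), "cpu": OrderedDict(),
                      "storage": OrderedDict()}
        self.caps = {"gpu": gpu_slots, "cpu": cpu_slots}

    def put(self, key, block):
        self.tiers["gpu"][key] = block
        self._spill("gpu", "cpu")
        self._spill("cpu", "storage")

    def _spill(self, src, dst):
        # Evict the least recently inserted block into the colder tier.
        while len(self.tiers[src]) > self.caps[src]:
            k, v = self.tiers[src].popitem(last=False)
            self.tiers[dst][k] = v

    def locate(self, key):
        return next((t for t in ("gpu", "cpu", "storage")
                     if key in self.tiers[t]), None)

cache = TieredKVCache()
for i in range(5):
    cache.put(f"prefix-{i}", b"kv-block")
# Newest blocks stay hot; the oldest has trickled down to storage.
```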

The session demonstrates the integration in action, showing cache hit rates, latency across tiers, and how the connector handles transitions between local and remote storage. We'll discuss key engineering decisions, including async prefetching and failure handling.

Attendees will leave with practical knowledge of how managed cloud storage can extend open source caching frameworks for LLM inference infrastructure.
Speakers

Yihua Cheng

CTO, Tensormesh, Inc.
Yihua Cheng is co-founder and CTO of Tensormesh. He has a deep background in large language models, high-performance computing, and open-source development.
Yihua created LMCache and the vLLM production stack, open-source projects that have collectively earned over 9,000 GitHub... Read More →

Ziwen Ning

Open Source Contributor
Ziwen Ning is an open-source contributor to LMCache. He was previously a Senior Software Development Engineer at AWS, working on Amazon SageMaker HyperPod with a focus on building scalable ML infrastructure. Before that at Annapurna Labs, he enhanced the AI/ML experience through the... Read More →
211A+B (Level Two)
  Open AI & Data

5:25pm CDT

Beyond Vector Search: Building Knowledge Graphs for Autonomous Infrastructure - Torsten Boettjer, Rescile
Monday May 18, 2026 5:25pm - 6:05pm CDT
Modern platform engineering has a 'context' problem. As infrastructure scales across Kubernetes, hybrid clouds, and internal developer platforms (IDPs) like Backstage, traditional RAG systems struggle to answer multi-hop queries like 'Which services depend on this failing database?' or 'What is the blast radius of this IAM change?'
In this session, we explore how GraphRAG—a combination of Knowledge Graphs and LLMs—solves the reasoning gap that vector-only search leaves behind. We will demonstrate how to index infrastructure as a graph of entities and relationships, allowing AI agents to perform complex root-cause analysis and automate documentation. Attendees will leave with a blueprint for building an open-source GraphRAG pipeline to turn platform data into actionable intelligence.
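The multi-hop query in the abstract reduces to a transitive walk once infrastructure is indexed as a graph. The sketch below uses invented service names and plain dictionaries rather than a real graph database.

```python
# Infrastructure as a graph of "depends_on" edges, then a transitive walk
# to answer "what is the blast radius of this failing database?"
DEPENDS_ON = {
    "checkout-svc": ["payments-db"],
    "payments-svc": ["payments-db"],
    "web-frontend": ["checkout-svc"],
}

def blast_radius(node: str) -> set:
    """All services that transitively depend on `node`."""
    impacted = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for svc, deps in DEPENDS_ON.items():
            if current in deps and svc not in impacted:
                impacted.add(svc)
                frontier.append(svc)
    return impacted

impacted = blast_radius("payments-db")
# A vector search over docs cannot compose these hops; the graph walk can.
```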
Speakers

Torsten Boettjer

Co-Founder, Rescile
Co-Founder at Rescile, 20 years experience in platform engineering, former CCIO at Avaloq, CTO at Cisco, Head of Innovation at Swisscom, Product Management at Oracle Cloud Infrastructure
211A+B (Level Two)
  Open AI & Data
 
Tuesday, May 19
 

11:00am CDT

Connecting the Dots With Context Graphs - Stephen Chin, Neo4j
Tuesday May 19, 2026 11:00am - 11:40am CDT
AI systems need more than intelligence; they need context that persists. Without it, even strong models can misinterpret information, lose decision rationale, or repeat the same mistakes. Context Graphs have emerged as a practical pattern for agentic AI: a living graph that captures not only what was retrieved or known, but how context led to actions through tool calls, constraints, policies, and outcomes, stitched across entities and time so precedent becomes searchable.

This talk explores context engineering as the discipline of designing that context layer, and shows how context graphs complement retrieval by enabling multi-hop, structured context assembly (building on GraphRAG-style hierarchical summaries) while improving explainability and evaluation. Attendees will leave with a practical understanding of how to build context pipelines that combine contextual retrieval with persistent memory and provenance, and why context graphs are becoming central to trustworthy, enterprise-ready AI systems.
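A minimal sketch of the idea, assuming a simple record schema that is not any particular product's API: each action is stored with the context that led to it, so precedent becomes searchable.

```python
# Toy context graph: record not just what happened but the context behind
# each action, then query prior outcomes as precedent for new decisions.
context_graph = []

def record(entity, context, action, outcome):
    context_graph.append({"entity": entity, "context": context,
                          "action": action, "outcome": outcome})

def precedent(entity, context):
    """Find prior outcomes for the same entity under the same context."""
    return [e for e in context_graph
            if e["entity"] == entity and e["context"] == context]

record("orders-db", "disk > 90%", "expand_volume", "resolved")
record("orders-db", "disk > 90%", "drop_old_partitions", "data_loss")
matches = precedent("orders-db", "disk > 90%")
# The agent can now see one past action resolved the issue and another
# caused data loss -- precedent informs the next decision.
```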
Speakers

Stephen Chin

VP of Developer Relations at Neo4j, Open AI & Data Program Chair

211A+B (Level Two)
  Open AI & Data

11:55am CDT

From Tools To Platforms: MCP Patterns for Building Open Agent Ecosystems - Guangya Liu, JPMC
Tuesday May 19, 2026 11:55am - 12:35pm CDT
Model Context Protocol (MCP) is quickly becoming a foundational interface for agent–tool interaction, but most implementations today stop at simple, single-server tool exposure. This session explores practical MCP design patterns that move beyond “one server, one agent” toward scalable, interoperable, and ecosystem-friendly architectures.

Based on real-world experimentation and open-source implementations, we will walk through a set of MCP patterns, including:
1. Single MCP Server patterns for tool and data exposure
2. Multi-Server composition and routing patterns
3. MCP Host / Gateway patterns for aggregation and policy control
4. Plugin-style extension patterns that allow third-party MCP servers to integrate without code changes
5. Read vs. write MCP patterns for observability, automation, and feedback loops

The talk focuses on when and why to apply each pattern, common pitfalls, and architectural trade-offs. Attendees will leave with a mental model for designing MCP-based systems that scale from local experiments to ecosystem-level platforms, enabling agents, tools, and platforms to evolve independently while remaining interoperable.
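Pattern 3 above, the host/gateway, can be sketched as namespaced aggregation plus a policy check; the server names, tools, and policy format below are invented for illustration.

```python
# Gateway aggregating tools from several MCP-style servers under namespaced
# names, with a simple allow-list standing in for policy control.
SERVERS = {
    "github": {"list_prs": lambda: ["pr-1", "pr-2"]},
    "jira":   {"list_issues": lambda: ["JIRA-7"]},
}
POLICY = {"github.list_prs"}  # tools this particular agent may call

def gateway_tools():
    """Advertise every backend tool under a server-qualified name."""
    return [f"{srv}.{tool}" for srv, tools in SERVERS.items() for tool in tools]

def gateway_call(qualified: str):
    if qualified not in POLICY:
        raise PermissionError(f"policy denies {qualified}")
    srv, tool = qualified.split(".", 1)
    return SERVERS[srv][tool]()

prs = gateway_call("github.list_prs")
```

Backends can be added or swapped without touching the agent, which only ever sees the gateway's aggregated tool list.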
Speakers

Guangya Liu

Executive Director, JPMC

211A+B (Level Two)
  Open AI & Data
  • Audience Experience Level Any

2:10pm CDT

Beyond Static RAG: Building Self-Correcting Agentic Pipelines With Open Source Databases - Ben Grieser, MariaDB
Tuesday May 19, 2026 2:10pm - 2:50pm CDT
As LLMs shift from chatbots to autonomous agents, the limits of "single-shot" RAG are surfacing. Static retrieval often introduces irrelevant context that misleads models. To solve this, developers are adopting Corrective (CRAG) and Adaptive RAG, requiring databases to act as active reasoning runtimes rather than simple stores.

This session explores building self-correcting AI agents using an open-source relational stack. We will demonstrate how to bridge the gap between semantic search and structured data using the Model Context Protocol (MCP) and native MariaDB vector indexing.

Technical topics include:

The Critic Loop: Implementing self-correcting architectures that validate retrieved documents before LLM synthesis.

Hybrid Querying: Combining vector indexing with relational SQL in single ACID transactions to reduce agentic loop latency.

Standardizing Communication: Using the MariaDB MCP Server for secure, tool-based access to live data.

Scaling State: Managing concurrent agent sessions without sacrificing data integrity.

Attendees will leave with a blueprint for building reliable, autonomous systems using open-source database patterns that move beyond basic vector search.
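The critic loop above can be sketched as grade-then-retry; the keyword-overlap grader below is a toy stand-in for the LLM-based validation the session describes.

```python
# Corrective-RAG sketch: grade retrieved documents against the query before
# synthesis; if nothing passes, trigger a corrective re-retrieval round.
def grade(query: str, doc: str) -> float:
    terms = set(query.lower().split())
    return len(terms & set(doc.lower().split())) / len(terms)

def critic_loop(query, retrieve, threshold=0.5, max_rounds=2):
    for round_ in range(max_rounds):
        docs = retrieve(query, round_)
        kept = [d for d in docs if grade(query, d) >= threshold]
        if kept:
            return kept  # only validated context reaches the LLM
    return []  # correction failed; synthesize without retrieved context

def retrieve(query, round_):
    # Round 0 returns only noise; round 1 simulates a corrected query.
    if round_ == 0:
        return ["cafeteria menu for may", "parking lot closure notice"]
    return ["revenue grew 12% in the quarterly report"]

kept = critic_loop("quarterly revenue growth", retrieve)
```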
Speakers

Ben Grieser

Sr Solutions Engineer, MariaDB
As a technologist and database expert, Ben Grieser works at the intersection of open source innovation and product engineering. In his role at MariaDB, he regularly talks with teams using open source technology to bring complex data products to life. Ben is passionate about making... Read More →
211A+B (Level Two)
  Open AI & Data

3:05pm CDT

Identity Management for AI Agents - Abdel Fane, OpenA2A
Tuesday May 19, 2026 3:05pm - 3:45pm CDT
Every enterprise has identity management for humans—SSO, MFA, RBAC, audit logs. But AI agents? They run with API keys, no verified identity, no behavioral tracking, no audit trail.

This talk bridges the gap between traditional IAM and the emerging world of autonomous AI agents:

What we learned from human IAM:

- Why identity must be cryptographic, not just credentials

- How least-privilege access control prevents lateral movement

- Why audit trails matter for compliance and incident response

Applying it to AI agents:

- Agent identity: Ed25519 keypairs vs API keys

- Capability-based access: what tools can this agent call?

- Behavioral trust scoring: detecting compromised agents

- MCP server attestation: verifying the tools agents connect to

We'll examine real attack scenarios—agent impersonation, tool injection, privilege escalation—and show how identity-first security prevents them.

Live demo using AIM (Agent Identity Management), an Apache-2.0 open-source platform. All patterns are framework-agnostic and applicable to LangChain, CrewAI, AutoGen, or raw MCP implementations.

Attendees leave with actionable security patterns for their AI agent deployments.
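A minimal sketch of identity-first checks, under stated assumptions: the talk proposes Ed25519 keypairs, but since those require a third-party library, this toy uses an HMAC as a stand-in for proof of key possession, paired with a capability table for least-privilege tool access.

```python
import hashlib
import hmac

# Keys issued at agent enrollment and a capability table answering
# "what tools can this agent call?". Names here are invented.
AGENT_KEYS = {"billing-agent": b"secret-key-issued-at-enrollment"}
CAPABILITIES = {"billing-agent": {"read_invoices"}}

def sign(agent_id: str, request: bytes) -> str:
    return hmac.new(AGENT_KEYS[agent_id], request, hashlib.sha256).hexdigest()

def authorize(agent_id: str, request: bytes, signature: str, tool: str) -> bool:
    known = agent_id in AGENT_KEYS
    proven = known and hmac.compare_digest(sign(agent_id, request), signature)
    allowed = tool in CAPABILITIES.get(agent_id, set())
    return known and proven and allowed  # identity, proof, least privilege

req = b'{"tool": "read_invoices"}'
ok = authorize("billing-agent", req, sign("billing-agent", req),
               "read_invoices")
# Privilege escalation: a valid identity still cannot call an
# out-of-capability tool.
escalation = authorize("billing-agent", req, sign("billing-agent", req),
                       "delete_invoices")
```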
Speakers

Abdel Fane

CEO & Founder, OpenA2A
Abdel is a cybersecurity architect with 17+ years of experience securing enterprise environments across healthcare, finance, and government sectors. He has led security initiatives at Grail, Booz Allen Hamilton, Protiviti, and Allstate, specializing in cloud security & DevSecOps.
... Read More →
211A+B (Level Two)
  Open AI & Data

4:20pm CDT

Headroom: A Context Optimization Layer for LLM Applications - Tejas Chopra, Netflix, Inc.
Tuesday May 19, 2026 4:20pm - 5:00pm CDT
LLM tokens are expensive. With context windows expanding to 200K+ tokens, a single API call can cost several dollars, and in production systems handling thousands of requests, these costs compound quickly.
Most optimization efforts focus on model selection or prompt engineering, but the context itself often contains massive redundancy.

Headroom is an open-source Python library (https://github.com/chopratejas/headroom) that sits between your application and your LLM provider, transparently optimizing context before it reaches the model.
The core insight is simple: LLM contexts—especially in agentic workflows—are filled with repetitive tool outputs, verbose JSON arrays, and boilerplate that consumes tokens without adding proportional value.

Headroom introduces novel concepts such as reversible compression, cache aligners, compression routers, and even persistent memory.

Real-world results:
- 50-90% token reduction on typical agentic workloads
- Drop-in integrations for LangChain, OpenAI, Anthropic, and any OpenAI-compatible provider
- Zero code changes required when using the proxy server
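The reversible-compression idea can be illustrated with simple deduplication by reference; this mimics the concept only and is not Headroom's actual wire format.

```python
import json

# Reversible context compression sketch: repeated messages are replaced by
# short references plus a lookup table, so the original context can be
# reconstructed exactly when needed.
def compress(messages):
    table, out, seen = {}, [], {}
    for msg in messages:
        if msg not in seen:
            ref = f"blob{len(table)}"
            seen[msg] = ref
            table[ref] = msg
        out.append({"ref": seen[msg]})
    return out, table

def decompress(out, table):
    return [table[item["ref"]] for item in out]

# Agentic loops often replay the same large tool output several times.
big_json = json.dumps([{"id": i, "status": "ok"} for i in range(50)])
messages = [big_json, "summarize the results", big_json, big_json]
out, table = compress(messages)
restored = decompress(out, table)
# Three copies of the big tool output collapse to one stored blob.
```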
Speakers

Tejas Chopra

Sr. Engineer, Netflix, Inc.
Tejas Chopra is a senior ML and AI infrastructure Engineer at Netflix, where he builds large-scale systems for production AI and data platforms. He is the creator of Headroom, an open-source context optimization engine for LLMs, and a frequent speaker at global conferences on ML systems... Read More →
211A+B (Level Two)
  Open AI & Data
 
Wednesday, May 20
 

11:00am CDT

KV-Cache Centric Inference: Building an Open Source LLM Serving Platform Around State - Martin Hickey, IBM Research
Wednesday May 20, 2026 11:00am - 11:40am CDT
We optimize LLM inference around compute—faster kernels, better batching, smarter parallelism. But in production, the real bottleneck is state. The KV‑cache holds precomputed attention data that turns a multi‑second prefill into a sub‑second cache hit. Lose it to eviction, isolate it on one node, or route away from it, and you pay the full compute cost again for work you already did.

llm-d is an open-source distributed inference platform, co-founded by Google, IBM Research, Red Hat, NVIDIA, and CoreWeave, that treats the KV‑cache as the core of the system rather than a byproduct. That enables tiered memory management—offloading KV blocks from GPU to CPU to shared storage—cross‑replica reuse so cached state computed anywhere is usable everywhere, and cache‑aware scheduling that routes requests to the replica most likely to hold their prefix.

This session walks through how llm-d and vLLM implement each layer of this stack, how they combine into a production system, and what the open‑source community can build on top. We’ll share benchmarks, Kubernetes deployment patterns, and practical guidance for operators running LLM workloads at scale.
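Cache-aware scheduling reduces to scoring replicas by the length of the request prefix each one already holds. The sketch below invents replica names and ignores load balancing, which a real scheduler such as llm-d's must also weigh.

```python
# Cache-aware routing sketch: score each replica by the longest cached
# prefix that matches the incoming prompt and route to the best match.
REPLICA_CACHE = {
    "replica-a": ["You are a helpful assistant."],
    "replica-b": ["You are a helpful assistant. Summarize the document:"],
}

def best_replica(prompt: str) -> str:
    def score(replica):
        # Longest cached prefix of this prompt held by the replica.
        return max((len(p) for p in REPLICA_CACHE[replica]
                    if prompt.startswith(p)), default=0)
    return max(REPLICA_CACHE, key=score)

target = best_replica(
    "You are a helpful assistant. Summarize the document: Q3 report...")
# replica-b holds the longer matching prefix, so its prefill is cheapest.
```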
Speakers

Martin Hickey

Senior Technical Staff Member, IBM Research
Martin Hickey is a STSM at IBM Research, focused on Open Source, Cloud Native Computing, and AI. Martin has notable contributions to open source projects like vLLM, LMCache, Kubernetes, Helm, OpenTelemetry and OpenStack. Martin is a core maintainer for LMCache and an emeritus core... Read More →
211A+B (Level Two)
  Open AI & Data
  • Audience Experience Level Any
 