By Yipeng Ouyang · March 26, 2026
In early March 2026, while working with various LLM agent frameworks — Claude Code, OpenAI Codex, Google Gemini CLI — I noticed something peculiar. The same SKILL.md file, when deployed to different frameworks, produced dramatically different results. A skill that worked flawlessly on Claude Code would fail on Gemini CLI. Another that performed well on Codex would produce mediocre results on Kimi.
This wasn't a content problem; the instructions were identical. It was a format problem. Each framework, trained on a different data distribution, had its own "format preferences": Claude responded best to XML-structured prompts, GPT models struggled with JSON input (the infamous "format tax"), and Gemini parsed nested YAML data with 51.9% accuracy versus 43.1% for JSON.
Key Insight: Format sensitivity is not a bug — it's a fundamental property of how LLMs process structured information. The same semantic content, presented in different syntactic forms, yields different cognitive outcomes.
This immediately reminded me of the classical compiler problem. In the 1950s, programmers faced a similar challenge: writing code for different machine architectures required manual per-platform adaptation. The solution was the intermediate representation (IR) — a unified abstraction that decoupled source languages from target machines.
What if we could apply the same principle to agent skills? A unified IR that captures what a skill means, independent of how it's formatted for any particular framework. Then, platform-specific "emitters" could render that IR into each framework's preferred format.
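The idea can be sketched in a few lines of Rust. Every name below (`SkillIr`, `Emitter`, `XmlEmitter`, `MarkdownEmitter`) is a hypothetical illustration of the pattern, not the project's actual API:

```rust
// A minimal, hypothetical skill IR: what a skill *means*,
// independent of any framework's preferred syntax.
struct SkillIr {
    name: String,
    instructions: Vec<String>,
}

// One emitter per target framework renders the IR
// into that framework's preferred format.
trait Emitter {
    fn emit(&self, skill: &SkillIr) -> String;
}

// e.g. an XML-flavored emitter for frameworks that parse XML well.
struct XmlEmitter;

impl Emitter for XmlEmitter {
    fn emit(&self, skill: &SkillIr) -> String {
        let body: String = skill
            .instructions
            .iter()
            .map(|step| format!("  <step>{}</step>\n", step))
            .collect();
        format!("<skill name=\"{}\">\n{}</skill>", skill.name, body)
    }
}

// A Markdown emitter for frameworks that prefer plain SKILL.md-style text.
struct MarkdownEmitter;

impl Emitter for MarkdownEmitter {
    fn emit(&self, skill: &SkillIr) -> String {
        let body: String = skill
            .instructions
            .iter()
            .map(|step| format!("- {}\n", step))
            .collect();
        format!("# {}\n\n{}", skill.name, body)
    }
}

fn main() {
    let skill = SkillIr {
        name: "summarize".to_string(),
        instructions: vec![
            "Read the input".to_string(),
            "Write a summary".to_string(),
        ],
    };
    // Adding a new framework means adding one emitter,
    // not rewriting every skill.
    let targets: Vec<Box<dyn Emitter>> = vec![Box::new(XmlEmitter), Box::new(MarkdownEmitter)];
    for emitter in &targets {
        println!("{}\n---", emitter.emit(&skill));
    }
}
```

The key design point is that skills never know about frameworks and frameworks never know about skills; the IR is the only thing both sides agree on.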
On March 26, I sketched the initial concept on a whiteboard.
The name "Skill Compiler" was born that day. The core insight became the project's mathematical foundation: instead of hand-adapting each of m skills to each of n frameworks, O(m×n) work in total, you write m skill definitions and n emitters, reducing the effort to O(m+n).
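The arithmetic behind that claim, with illustrative (hypothetical) ecosystem sizes:

```rust
fn main() {
    // Hypothetical sizes, chosen only to illustrate the scaling.
    let m = 1_000; // community skills
    let n = 5;     // agent frameworks

    let hand_adapted = m * n; // one hand-tuned variant per (skill, framework) pair
    let compiled = m + n;     // one IR per skill, plus one emitter per framework

    println!("hand-adapted variants: {hand_adapted}"); // 5000
    println!("IRs + emitters:        {compiled}");     // 1005
}
```

Adding a sixth framework under the hand-adapted model costs another 1,000 files; under the compiler model it costs one emitter.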
The agent skill ecosystem was growing explosively. Thousands of community-contributed skills were being shared on GitHub. But they were all format-blind: a single Markdown file deployed identically everywhere, regardless of each framework's format preferences. It was like shipping the same binary to every CPU architecture in 2026 and hoping it ran well on each.
Meanwhile, Snyk's security audit had just revealed that 37% of community skills contained vulnerabilities. No systematic mechanism existed to enforce security properties before skills reached an agent's context window.
A compiler could solve both problems simultaneously: portability through IR-driven format adaptation, and security through compile-time analysis and constraint injection.
With the concept validated on paper, the next step was clear: build a working prototype in Rust. The language's type system, performance characteristics, and ecosystem (serde, pulldown-cmark, askama) made it the natural choice for a compiler project.
The journey from whiteboard sketch to working compiler would begin in April.