Teaching Claude Code How to Use marimo

I was building evaluation notebooks for an entity classification pipeline. The notebooks were in marimo — reactive Python notebooks stored as plain .py files. I was editing them through Claude Code, my AI coding agent, while viewing them in the marimo browser editor.

Every time Claude edited the file, marimo would overwrite the changes. I’d fix the SQL query, marimo would revert it. Fix it again, reverted again. Three rounds of this before I figured out what was happening.

The problem with external editing

marimo loads your notebook into memory when you open it. After that, it works entirely from its in-memory state. By default, it doesn’t watch the filesystem for changes. So when an external tool (Claude Code, vim, VS Code, whatever) edits the .py file on disk, marimo doesn’t notice. Then, when you save any cell in the marimo UI, it writes its in-memory version back to disk, silently overwriting your external edits.
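
The failure mode is easier to see in a toy model. The sketch below is not marimo's code: `InMemoryEditor` is a hypothetical stand-in for any editor that reads a file once and later saves its own buffer without checking the disk.

```python
import tempfile
from pathlib import Path


class InMemoryEditor:
    """Hypothetical editor: reads the file once, then trusts only its buffer."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.buffer = path.read_text()  # loaded once, at open time

    def save(self) -> None:
        # Writes the buffer back without checking whether the file changed.
        self.path.write_text(self.buffer)


def simulate_overwrite() -> tuple:
    """Return file contents after the external edit and after the UI save."""
    with tempfile.TemporaryDirectory() as tmp:
        notebook = Path(tmp) / "notebook.py"
        notebook.write_text("query = 'SELECT * FROM old_table'\n")

        editor = InMemoryEditor(notebook)  # the browser editor opens the file

        # An external tool (an agent, vim, ...) fixes the query on disk.
        notebook.write_text("query = 'SELECT * FROM new_table'\n")
        after_external_edit = notebook.read_text()

        editor.save()  # a save in the UI writes the stale buffer back
        after_ui_save = notebook.read_text()

    return after_external_edit, after_ui_save


after_edit, after_save = simulate_overwrite()
print("after external edit:", after_edit.strip())
print("after UI save:      ", after_save.strip())
```

The external fix survives only until the next save from the editor, which matches the revert loop described above.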

The fix is simple: marimo edit --watch notebook.py. The --watch flag monitors the file and streams changes to the browser editor. But I didn’t know this, Claude Code didn’t know this, and we wasted time going in circles.
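
Conceptually, a watch mode means the editor checks the file and refreshes its buffer before it can clobber anything. The following is a minimal sketch of that idea only. `WatchingBuffer` and `poll` are hypothetical names, and marimo's actual watcher may work quite differently.

```python
import tempfile
from pathlib import Path


class WatchingBuffer:
    """Hypothetical buffer that reloads when the file on disk changes."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.buffer = path.read_text()

    def poll(self) -> bool:
        """Reload from disk if the content differs; True if a reload happened."""
        on_disk = self.path.read_text()
        if on_disk != self.buffer:
            self.buffer = on_disk
            return True
        return False

    def save(self) -> None:
        self.path.write_text(self.buffer)


with tempfile.TemporaryDirectory() as tmp:
    notebook = Path(tmp) / "notebook.py"
    notebook.write_text("query = 'SELECT * FROM old_table'\n")
    editor = WatchingBuffer(notebook)

    notebook.write_text("query = 'SELECT * FROM new_table'\n")  # external edit
    reloaded = editor.poll()  # the watcher picks up the change
    editor.save()             # saving now keeps the external edit
    final_text = notebook.read_text()

print(reloaded, final_text.strip())
```

Comparing file contents rather than modification times keeps the sketch independent of filesystem timestamp resolution.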

This is the kind of thing that’s easy to miss in docs but painful to hit in practice. And it’s exactly the kind of knowledge an AI coding agent needs to have.

Claude Code skills

Claude Code supports skills — structured reference documents that get loaded dynamically when relevant tasks come up. Think of them as specialized knowledge packs. The dagster-expert skill is a good example: it has a SKILL.md that defines when to activate and an index of reference files covering CLI commands, asset patterns, automation, and integrations.

Skills are packaged as plugins with this structure:

.claude-plugin/
  plugin.json          # name, description, version
skills/
  my-skill/
    SKILL.md           # trigger description + reference index
    references/
      TOPIC_A.md       # detailed reference doc
      TOPIC_B.md       # another reference doc
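
If you would rather generate that layout than create it by hand, a small script can do it. This is a sketch under assumptions: `scaffold_plugin` is a hypothetical helper, the file contents are placeholders, and the only `plugin.json` fields it writes are the three named in the tree above.

```python
import json
import tempfile
from pathlib import Path


def scaffold_plugin(root: Path, skill: str, topics: list) -> list:
    """Create the plugin layout under root; return the relative file paths."""
    manifest_dir = root / ".claude-plugin"
    skill_dir = root / "skills" / skill
    refs_dir = skill_dir / "references"
    manifest_dir.mkdir(parents=True, exist_ok=True)
    refs_dir.mkdir(parents=True, exist_ok=True)

    # plugin.json: name, description, version (placeholder values).
    manifest = {"name": skill, "description": "TODO", "version": "0.1.0"}
    (manifest_dir / "plugin.json").write_text(json.dumps(manifest, indent=2) + "\n")

    # SKILL.md: trigger description plus reference index (placeholder body).
    (skill_dir / "SKILL.md").write_text(
        f"---\nname: {skill}\ndescription: TODO\n---\n"
    )

    # One markdown file per reference topic.
    for topic in topics:
        (refs_dir / f"{topic}.md").write_text(f"# {topic}\n")

    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_plugin(Path(tmp), "my-skill", ["TOPIC_A", "TOPIC_B"])

print("\n".join(created))
```

The demo writes into a temporary directory. Point `root` at a real plugin folder once the layout looks right.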

The SKILL.md frontmatter tells Claude when to use the skill:

---
name: marimo-expert
description:
  Expert guidance for working with marimo reactive notebooks. ALWAYS use
  before doing any task that involves marimo notebooks...
---

When Claude detects it’s working with marimo, it reads the relevant reference files before answering. No more guessing at APIs or missing critical gotchas like the --watch flag.
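
Since the frontmatter drives activation, it is worth checking that `name` and `description` are actually present before installing a skill. Here is a minimal sketch of such a check. `parse_frontmatter` is a hypothetical helper that handles only simple `key: value` lines with indented continuations, not full YAML, and it says nothing about how Claude Code itself reads the file.

```python
SKILL_MD = """\
---
name: marimo-expert
description:
  Expert guidance for working with marimo reactive notebooks. ALWAYS use
  before doing any task that involves marimo notebooks...
---

# marimo-expert
"""


def parse_frontmatter(text: str) -> dict:
    """Parse a simple '---' delimited frontmatter block into a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")

    fields = {}
    key = None
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        if line[:1].isspace() and key is not None:
            # Indented continuation line: fold it into the current value.
            fields[key] = (fields[key] + " " + line.strip()).strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            fields[key] = value.strip()
    raise ValueError("missing closing '---'")


frontmatter = parse_frontmatter(SKILL_MD)
missing = [k for k in ("name", "description") if not frontmatter.get(k)]
print(frontmatter["name"], "| missing fields:", missing)
```

For anything beyond this simple shape, a real YAML parser would be the safer choice.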

Building the skill

I created a marimo-expert skill covering the topics that kept coming up during my notebook work.

Each topic is a separate markdown file in references/, following the same pattern as the dagster-expert skill. The main SKILL.md has an index pointing to each reference file so Claude can find what it needs quickly.

Registration

For a local plugin, you register it in two files:

~/.claude/plugins/installed_plugins.json:

{
  "marimo-expert@local": [{
    "scope": "user",
    "installPath": "~/.claude/plugins/cache/local/marimo-expert/0.1.0",
    "version": "0.1.0"
  }]
}

~/.claude/settings.json:

{
  "enabledPlugins": {
    "marimo-expert@local": true
  }
}

Restart Claude Code and the skill is active.
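
The two edits above can also be scripted so that existing entries in either file are preserved. This is a sketch, not an official installer: `register_local_plugin` is a hypothetical helper, it writes an absolute `installPath` instead of the `~` form shown above, and the demo runs against a temporary directory rather than the real `~/.claude`.

```python
import json
import tempfile
from pathlib import Path


def load_json(path: Path) -> dict:
    """Read a JSON object from path, or return an empty dict if it is absent."""
    return json.loads(path.read_text()) if path.exists() else {}


def register_local_plugin(claude_dir: Path, name: str, version: str) -> None:
    """Add a local plugin to installed_plugins.json and enable it in settings."""
    plugin_id = f"{name}@local"
    install_path = claude_dir / "plugins" / "cache" / "local" / name / version

    installed_file = claude_dir / "plugins" / "installed_plugins.json"
    installed = load_json(installed_file)
    installed[plugin_id] = [
        {"scope": "user", "installPath": str(install_path), "version": version}
    ]
    installed_file.parent.mkdir(parents=True, exist_ok=True)
    installed_file.write_text(json.dumps(installed, indent=2) + "\n")

    settings_file = claude_dir / "settings.json"
    settings = load_json(settings_file)
    # Merge into any existing settings instead of replacing the file.
    settings.setdefault("enabledPlugins", {})[plugin_id] = True
    settings_file.write_text(json.dumps(settings, indent=2) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    claude_dir = Path(tmp) / ".claude"  # stand-in for ~/.claude
    claude_dir.mkdir()
    (claude_dir / "settings.json").write_text('{"theme": "dark"}\n')

    register_local_plugin(claude_dir, "marimo-expert", "0.1.0")

    installed = json.loads(
        (claude_dir / "plugins" / "installed_plugins.json").read_text()
    )
    settings = json.loads((claude_dir / "settings.json").read_text())

print(sorted(installed), settings)
```

The `theme` key in the demo is a made-up setting, included only to show that unrelated entries in `settings.json` are kept.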

Discovering the official marimo skills

After building my skill, I found that marimo already has an official skills repo. It has 10 skills including marimo-notebook — which covers file format, cell structure, script mode, pytest integration, and PEP 723 dependencies.

You install them with:

npx skills add marimo-team/skills --agent claude-code

The official marimo-notebook skill and my marimo-expert skill turned out to be complementary rather than overlapping. The official one focuses on writing correct marimo code — cell signatures, return tuples, output rendering rules, marimo check. Mine covers the operational side — watch mode, caching strategies, configuration, and the reactivity model in depth.

Contributing upstream

Rather than maintaining a separate skill, I identified the reference docs that would actually add value to the official skill and opened a PR adding four new reference files:

File              What it covers
WATCHING.md       --watch flag, external editing gotcha, watcher_on_save, module autoreloading
EXPENSIVE.md      mo.stop, mo.cache, mo.persistent_cache, mo.lazy, memory management
CONFIGURATION.md  pyproject.toml settings, marimo.toml, config priority chain
REACTIVITY.md     DAG model, variable uniqueness, mutation limitations, cell-local variables

The skills repo is lightweight — no CLA, no issue-first requirement, just “issues and pull requests are welcome.” The existing PRs are small and focused, things like “add pytest reference” or “add pathlib security guidance.”

What I’d do differently

The skill I built locally has 12 reference files. For the upstream PR, I trimmed it to 4 — the ones covering topics the official skill genuinely doesn’t address. The SQL, state, UI, and export docs I’d written were redundant with files already in the official skill (just with different formatting).

If I were starting over, I’d install the official skills first, use them for a few sessions, and only then write reference docs for the gaps. Building the full skill from scratch was useful for understanding the plugin system, but half the content was duplicated effort.

Takeaways

