MIMO Claude Code Traces 1K

A Claude Code-style coding-agent trace dataset with 1,017 complete trajectories generated by MiMo-V2.5-Pro, covering tool use, reasoning fields, debugging, refactoring, shell workflows, and software-engineering tasks.

June 22, 2026

Tags: Dataset, Code Agent, Agent Traces, Tool Use, Software Engineering, SFT, Hugging Face

MIMO Claude Code Traces 1K: A Coding-Agent Trajectory Dataset

MIMO Claude Code Traces 1K is an open-source dataset of Claude Code-style coding-agent trajectories. It contains 1,017 complete traces generated with MiMo-V2.5-Pro, covering user coding tasks, multi-turn messages, tool schemas, assistant reasoning fields, tool calls, tool outputs, and metadata such as model name, category, duration, cost, token usage, and whether tools were used.

The dataset is designed for research around code agents: tool-use imitation, code-agent distillation, supervised fine-tuning, trajectory modeling, reasoning/tool-call alignment, and evaluation of software-engineering behavior.

[Figure: MiMo-V2.5-Pro benchmark results]

Why MIMO Claude Code Traces Exists

Modern coding agents are not just code generators. They inspect files, run shell commands, edit code, recover from tool errors, reason over long contexts, and gradually refine a solution across many turns. Training and evaluating this behavior requires more than isolated prompt-response pairs.

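MIMO Claude Code Traces narrows this problem into complete agent trajectories collected in a Claude Code-style environment. This setting is useful because software-engineering agents naturally involve the behaviors listed above: file inspection, shell execution, code edits, recovery from tool errors, and multi-turn refinement.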

By preserving full event streams instead of only final answers, the dataset supports research on how code agents actually behave while solving tasks.

Model and Generation Setup

The traces were generated with mimo-v2.5-pro, MiMo’s most capable model at the time of release. MiMo-V2.5-Pro is a 1.02T-parameter Mixture-of-Experts model with 42B active parameters, a hybrid-attention architecture, and a 1M-token context window.

It is designed for agentic workloads, complex software engineering, and long-horizon tasks, with improved instruction following and coherence across ultra-long contexts. In tool-using harnesses, it can sustain complex trajectories spanning hundreds to more than a thousand tool calls.

[Figure: MiMo-V2.5-Pro token efficiency]

The dataset was produced in an agentic coding setup with tools such as Bash, Read, Write, Edit, Glob, Grep, TodoWrite, and planning utilities. The dataset construction process used approximately 400M tokens in total, while the released trace metadata records approximately 127.2M logged usage tokens across input, cache-read, and output token fields.

Dataset Overview

Each .jsonl file contains one complete Claude Code-style event stream. The release is organized by task category under session/.

Key Statistics

| Statistic | Value |
| --- | --- |
| Total traces | 1,017 |
| Total JSONL files | 1,017 |
| Model | mimo-v2.5-pro |
| Generation budget | ~400M tokens |
| Logged usage tokens | 127,236,485 |
| Claude Code-style event rows | 15,046 |
| Conversation messages | 11,995 |
| Assistant tool calls | 5,271 |
| Tool result messages | 5,271 |
| Traces with tool calls | 859 |
| Traces with reasoning fields | 1,017 |
| Recorded turns | 4,932 |
| Recorded duration | ~20.5 hours |
| Recorded API cost field total | $163.89 |
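A few per-trace averages follow directly from these totals. The sketch below only does arithmetic on the published figures; the variable names are illustrative and are not fields in the dataset.

```python
# Headline figures copied from the Key Statistics table.
stats = {
    "traces": 1_017,
    "traces_with_tool_calls": 859,
    "tool_calls": 5_271,
    "turns": 4_932,
    "event_rows": 15_046,
    "cost_usd": 163.89,
}

# Share of traces that contain at least one tool call.
tool_trace_share = stats["traces_with_tool_calls"] / stats["traces"]

# Average number of tool calls among traces that use tools at all.
calls_per_tool_trace = stats["tool_calls"] / stats["traces_with_tool_calls"]

# Simple per-trace averages over the whole release.
turns_per_trace = stats["turns"] / stats["traces"]
events_per_trace = stats["event_rows"] / stats["traces"]
cost_per_trace = stats["cost_usd"] / stats["traces"]

print(f"tool-using traces:       {tool_trace_share:.1%}")
print(f"tool calls / tool trace: {calls_per_tool_trace:.2f}")
print(f"turns / trace:           {turns_per_trace:.2f}")
print(f"event rows / trace:      {events_per_trace:.2f}")
print(f"recorded cost / trace:   ${cost_per_trace:.3f}")
```

These are means over very different task categories, so they describe the release as a whole rather than a typical trace.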

Logged Usage Tokens

| Token field | Count |
| --- | --- |
| input_tokens | 8,033,778 |
| cache_read_input_tokens | 117,286,784 |
| cache_creation_input_tokens | 0 |
| output_tokens | 1,915,923 |
| Total logged usage tokens | 127,236,485 |
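The total is the plain sum of the four usage fields, and most of it is cache reads. A minimal check of that arithmetic, using only the counts listed above:

```python
# Token counts copied from the Logged Usage Tokens table.
usage = {
    "input_tokens": 8_033_778,
    "cache_read_input_tokens": 117_286_784,
    "cache_creation_input_tokens": 0,
    "output_tokens": 1_915_923,
}

total = sum(usage.values())
shares = {field: count / total for field, count in usage.items()}

print(f"total logged usage tokens: {total:,}")
for field, share in shares.items():
    print(f"{field:28s} {share:6.2%}")

# Fraction of the ~400M-token generation budget that appears in logged usage.
budget_fraction = total / 400_000_000
print(f"share of ~400M budget: {budget_fraction:.1%}")
```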

Dataset Structure

The dataset is organized as a top-level README plus category folders. Each category folder contains JSONL traces for one type of coding-agent task.

mimo-claude-code-traces-1k/
├── README.md
└── session/
    ├── algorithms/
    │   └── *.jsonl
    ├── api_integration/
    │   └── *.jsonl
    ├── code_generation/
    │   └── *.jsonl
    ├── data_processing/
    │   └── *.jsonl
    ├── debugging/
    │   └── *.jsonl
    ├── hf_trace/
    │   └── *.jsonl
    ├── math_problems/
    │   └── *.jsonl
    ├── refactoring/
    │   └── *.jsonl
    ├── shell_devops/
    │   └── *.jsonl
    └── supplement/
        └── *.jsonl
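Because each category is a folder of JSONL files, a local copy can be checked against the published per-category counts by counting files. This is a sketch that assumes the layout shown above; the helper name `count_traces_by_category` and the checkout path are illustrative.

```python
from collections import Counter
from pathlib import Path


def count_traces_by_category(root):
    """Count *.jsonl files under each session/<category>/ folder."""
    session_dir = Path(root) / "session"
    counts = Counter()
    if not session_dir.is_dir():
        return counts  # nothing downloaded at this path
    for path in session_dir.glob("*/*.jsonl"):
        counts[path.parent.name] += 1
    return counts


# Assumed local checkout location; adjust to wherever the dataset lives.
counts = count_traces_by_category("mimo-claude-code-traces-1k")
for category, n in sorted(counts.items()):
    print(f"{category:16s} {n}")
print("total", sum(counts.values()))
```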

Category Distribution

The dataset covers algorithmic tasks, code generation, debugging, refactoring, shell/devops, Hugging Face traces, data processing, and reasoning-heavy coding prompts.

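| Category | Traces | Messages | Tool calls | Traces with tools | Turns |
| --- | --- | --- | --- | --- | --- |
| algorithms | 157 | 1,853 | 722 | 148 | 854 |
| api_integration | 23 | 1,300 | 984 | 23 | 214 |
| code_generation | 213 | 3,246 | 1,429 | 213 | 1,448 |
| data_processing | 58 | 885 | 442 | 58 | 345 |
| debugging | 162 | 941 | 252 | 96 | 380 |
| hf_trace | 57 | 637 | 316 | 36 | 225 |
| math_problems | 76 | 745 | 260 | 76 | 332 |
| refactoring | 126 | 796 | 216 | 64 | 339 |
| shell_devops | 70 | 901 | 416 | 70 | 486 |
| supplement | 75 | 691 | 234 | 75 | 309 |
| Total | 1,017 | 11,995 | 5,271 | 859 | 4,932 |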

Token and Cost by Category

| Category | Logged tokens | Output tokens | Duration (ms) | Cost (USD) |
| --- | --- | --- | --- | --- |
| algorithms | 23,302,782 | 335,330 | 12,891,693 | 29.179555 |
| api_integration | 5,273,760 | 52,907 | 3,989,450 | 14.793162 |
| code_generation | 41,079,654 | 771,091 | 24,428,456 | 51.162873 |
| data_processing | 8,638,826 | 112,405 | 4,548,507 | 10.566932 |
| debugging | 9,390,091 | 133,225 | 6,306,098 | 11.999537 |
| hf_trace | 4,365,211 | 82,627 | 4,164,129 | 10.435188 |
| math_problems | 8,867,295 | 128,300 | 5,003,409 | 9.130514 |
| refactoring | 8,571,921 | 101,568 | 5,166,704 | 10.012703 |
| shell_devops | 9,940,490 | 126,358 | 4,455,335 | 10.254762 |
| supplement | 7,806,455 | 72,112 | 2,985,143 | 6.351987 |
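The per-category rows can be reconciled with the headline totals and turned into a rough cost-per-token comparison. The sketch below uses only the figures in the table; "cost per 1M logged tokens" is a derived metric introduced here for illustration, not a field in the dataset.

```python
# (logged_tokens, duration_ms, cost_usd) copied from the table above.
by_category = {
    "algorithms":      (23_302_782, 12_891_693, 29.179555),
    "api_integration": (5_273_760, 3_989_450, 14.793162),
    "code_generation": (41_079_654, 24_428_456, 51.162873),
    "data_processing": (8_638_826, 4_548_507, 10.566932),
    "debugging":       (9_390_091, 6_306_098, 11.999537),
    "hf_trace":        (4_365_211, 4_164_129, 10.435188),
    "math_problems":   (8_867_295, 5_003_409, 9.130514),
    "refactoring":     (8_571_921, 5_166_704, 10.012703),
    "shell_devops":    (9_940_490, 4_455_335, 10.254762),
    "supplement":      (7_806_455, 2_985_143, 6.351987),
}

# Totals should agree with the Key Statistics table.
total_tokens = sum(tokens for tokens, _, _ in by_category.values())
total_hours = sum(ms for _, ms, _ in by_category.values()) / 3_600_000
total_cost = sum(cost for _, _, cost in by_category.values())
print(f"{total_tokens:,} tokens, {total_hours:.2f} h, ${total_cost:.2f}")

# Derived metric: recorded cost per one million logged tokens.
cost_per_million = {
    name: cost / tokens * 1_000_000
    for name, (tokens, _, cost) in by_category.items()
}
for name, value in sorted(cost_per_million.items(), key=lambda kv: -kv[1]):
    print(f"{name:16s} ${value:.2f} per 1M logged tokens")
```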

Tool Use

MIMO Claude Code Traces captures explicit tool-call behavior, including successful calls and tool error messages. This makes it useful for learning when to call tools, how to recover from failed calls, and how to combine shell/file operations with natural-language reasoning.

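| Tool | Calls |
| --- | --- |
| Bash | 1,805 |
| Read | 1,480 |
| Write | 919 |
| Glob | 381 |
| Edit | 339 |
| Grep | 163 |
| Agent | 53 |
| EnterPlanMode | 38 |
| ExitPlanMode | 36 |
| AskUserQuestion | 28 |
| TodoWrite | 25 |
| TaskOutput | 2 |
| WebFetch | 1 |
| TaskStop | 1 |
| Total | 5,271 |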
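One way to study recovery from failed calls is to record which tool the agent reaches for immediately after a tool error. The sketch below assumes the block layout described in the Event Stream Schema section (`tool_use` blocks with `id` and `name`, `tool_result` blocks with `tool_use_id` and `is_error`). The helper names and the demo events are hypothetical and are not taken from the dataset.

```python
def iter_blocks(event, block_type):
    """Yield content blocks of one type from a single event."""
    content = (event.get("message") or {}).get("content")
    if isinstance(content, list):  # plain user prompts store content as a string
        for block in content:
            if isinstance(block, dict) and block.get("type") == block_type:
                yield block


def error_recoveries(events):
    """Return (failed_tool, next_tool) pairs for the events of one trace.

    next_tool is None when no tool call follows the failed result.
    """
    names = {}    # tool_use id -> tool name
    pending = []  # failed tool names still waiting for a follow-up call
    pairs = []
    for event in events:
        for call in iter_blocks(event, "tool_use"):
            names[call.get("id")] = call.get("name")
            pairs.extend((failed, call.get("name")) for failed in pending)
            pending.clear()
        for result in iter_blocks(event, "tool_result"):
            if result.get("is_error"):
                pending.append(names.get(result.get("tool_use_id")))
    pairs.extend((failed, None) for failed in pending)
    return pairs


# Synthetic four-event trace: a failing Bash call followed by a Read.
demo = [
    {"type": "assistant", "message": {"content": [
        {"type": "tool_use", "id": "call_1", "name": "Bash", "input": {"command": "pytest"}}]}},
    {"type": "user", "message": {"content": [
        {"type": "tool_result", "tool_use_id": "call_1", "content": "exit 1", "is_error": True}]}},
    {"type": "assistant", "message": {"content": [
        {"type": "tool_use", "id": "call_2", "name": "Read", "input": {"file_path": "test_app.py"}}]}},
    {"type": "user", "message": {"content": [
        {"type": "tool_result", "tool_use_id": "call_2", "content": "...", "is_error": False}]}},
]
print(error_recoveries(demo))  # [('Bash', 'Read')]
```

The function expects the events of a single trace in file order; running it over events pooled from several files would pair errors with calls from unrelated sessions.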

Available tool schemas are included in every trace. The common Claude Code-style tool inventory includes:

Task, AskUserQuestion, Bash, CronCreate, CronDelete, CronList, Edit,
EnterPlanMode, EnterWorktree, ExitPlanMode, ExitWorktree, Glob, Grep,
NotebookEdit, Read, ScheduleWakeup, Skill, TaskOutput, TaskStop,
TodoWrite, WebFetch, WebSearch, Write

Event Stream Schema

Each .jsonl file in session/<category>/ is one Claude Code-style event stream. Each line is one event.

Common top-level fields include:

| Field | Type | Description |
| --- | --- | --- |
| type | string | Event type, such as mode, permission-mode, user, assistant, last-prompt, or system |
| sessionId | string | Session identifier |
| uuid | string | Event UUID. Missing original UUIDs are deterministically generated during conversion |
| parentUuid | string/null | Parent event UUID for message-chain reconstruction |
| timestamp | string | Event timestamp |
| cwd | string | Working directory recorded or normalized for the trace |
| version | string | Dataset conversion/version marker |
| message | object | User, assistant, or tool-result message payload |
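The `uuid` and `parentUuid` fields allow the message chain to be rebuilt even if events are read out of order. A minimal sketch, assuming the root message has `parentUuid` set to null and that events without a `uuid` (for example `mode` rows) are not part of the chain; the function name `message_chain` is illustrative.

```python
def message_chain(events):
    """Order events by following parentUuid links from the root, depth-first."""
    children = {}
    for event in events:
        if "uuid" in event:
            children.setdefault(event.get("parentUuid"), []).append(event)

    chain, seen = [], set()
    frontier = list(children.get(None, []))  # roots have parentUuid == None
    while frontier:
        event = frontier.pop(0)
        if event["uuid"] in seen:  # guard against duplicated or cyclic links
            continue
        seen.add(event["uuid"])
        chain.append(event)
        # Visit this event's children before any remaining siblings.
        frontier = children.get(event["uuid"], []) + frontier
    return chain


# Synthetic events, deliberately out of order.
demo = [
    {"type": "mode", "mode": "normal"},
    {"type": "user", "uuid": "c", "parentUuid": "b"},
    {"type": "user", "uuid": "a", "parentUuid": None},
    {"type": "assistant", "uuid": "b", "parentUuid": "a"},
]
print([event["uuid"] for event in message_chain(demo)])  # ['a', 'b', 'c']
```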

Assistant events store Claude-style message content blocks:

| Content block | Description |
| --- | --- |
| text | Assistant natural-language response |
| thinking | Reasoning content from the original trace |
| tool_use | Tool call with id, name, and input |

Tool outputs are represented as user events whose message.content contains tool_result blocks linked by tool_use_id.
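The `tool_use_id` link makes it straightforward to join each call with its output. The following sketch assumes the block shapes described here; `pair_tool_calls` and the demo events are illustrative and not part of the dataset.

```python
def pair_tool_calls(events):
    """Match each assistant tool_use block with its tool_result via tool_use_id."""
    calls, results = {}, {}
    for event in events:
        content = (event.get("message") or {}).get("content")
        if not isinstance(content, list):
            continue  # plain-string user prompts carry no blocks
        for block in content:
            if not isinstance(block, dict):
                continue
            kind = block.get("type")
            if event.get("type") == "assistant" and kind == "tool_use":
                calls[block["id"]] = block
            elif event.get("type") == "user" and kind == "tool_result":
                results[block["tool_use_id"]] = block
    # Calls keep their original order; result is None if no output was logged.
    return [(call, results.get(call_id)) for call_id, call in calls.items()]


# Synthetic two-event exchange modeled on the documented block shapes.
demo = [
    {"type": "assistant", "message": {"role": "assistant", "content": [
        {"type": "text", "text": "Let me explore the codebase."},
        {"type": "tool_use", "id": "call_1", "name": "Bash", "input": {"command": "ls"}},
    ]}},
    {"type": "user", "message": {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "call_1", "content": "README.md", "is_error": False},
    ]}},
]
for call, result in pair_tool_calls(demo):
    if result is None:
        status = "no result"
    else:
        status = "error" if result.get("is_error") else "ok"
    print(call["name"], call["input"], status)
```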

Event-Type Counts

| Event type | Count |
| --- | --- |
| mode | 1,017 |
| permission-mode | 1,017 |
| user | 6,288 |
| assistant | 4,690 |
| last-prompt | 1,017 |
| system | 1,017 |
| Total | 15,046 |

Example Event Lines

{"type":"mode","mode":"normal","sessionId":"7469deea-7e45-4732-8f06-9666d52052d4"}
{"type":"user","message":{"role":"user","content":"Implement Kahn's algorithm for topological sort..."},"sessionId":"7469deea-7e45-4732-8f06-9666d52052d4","uuid":"...","parentUuid":null}
{"type":"assistant","message":{"model":"mimo-v2.5-pro","role":"assistant","content":[{"type":"thinking","thinking":"..."},{"type":"text","text":"Let me explore the codebase."},{"type":"tool_use","id":"call_...","name":"Bash","input":{"command":"ls /data/agent/choucisan"}}]},"sessionId":"7469deea-7e45-4732-8f06-9666d52052d4","uuid":"...","parentUuid":"..."}
{"type":"user","message":{"role":"user","content":[{"type":"tool_result","tool_use_id":"call_...","content":"...","is_error":false}]},"sessionId":"7469deea-7e45-4732-8f06-9666d52052d4","uuid":"...","parentUuid":"..."}

Some fields are normalized because the original collected data did not include the full Claude Code runtime envelope. In particular, uuid, parentUuid, requestId, cwd, version, and gitBranch are deterministic conversion fields rather than raw Claude Code runtime fields.

Highlights

Quick Start

Load the dataset directly from Hugging Face:

from datasets import load_dataset

repo_id = "choucsan/mimo-claude-code-traces-1k"
dataset = load_dataset(repo_id, data_files="session/**/*.jsonl")

print(dataset["train"][0])

Read local JSONL files:

import json
from pathlib import Path

root = Path("mimo-claude-code-traces-1k")
files = sorted((root / "session").glob("*/*.jsonl"))

events = []
for path in files:
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                events.append(json.loads(line))

print(len(events))
print(events[0])

Count tool calls:

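from collections import Counter

# Reuses the `events` list built by the local-file snippet above.
tool_counts = Counter()

for event in events:
    message = event.get("message") or {}
    # content is a list of blocks, or a plain string for simple user prompts.
    for block in message.get("content", []) or []:
        # The dict check skips the characters yielded when content is a string.
        if isinstance(block, dict) and block.get("type") == "tool_use":
            tool_counts[block.get("name")] += 1

# Tools sorted from most to least frequently called.
print(tool_counts.most_common())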

Applications

MIMO Claude Code Traces can be used in several research and development settings.

| Application | How MIMO Claude Code Traces Helps |
| --- | --- |
| Code-agent distillation | Distills mimo-v2.5-pro agent behavior into smaller code models |
| Supervised fine-tuning | Trains coding assistants on task-to-trajectory data |
| Tool-call prediction | Learns when to call shell, read, write, edit, grep, or planning tools |
| Reasoning/tool alignment | Connects assistant reasoning fields with subsequent tool-use decisions |
| Offline RL | Provides complete trajectories for tool-using code-agent policy learning |
| Reward modeling | Supports trace-level and step-level preference modeling for tool choice, edit quality, or task completion |
| Debugging research | Preserves failed commands, shell diagnostics, tool errors, and recovery attempts |
| Refactoring and cleanup | Captures multi-step codebase edits and iterative improvements |
| Shell and devops workflows | Includes command-line operations, file inspection, and execution feedback |
| Cost-aware agents | Uses token, duration, and cost metadata for efficiency-aware modeling |
| Evaluation harnesses | Benchmarks parsers, function-calling policies, and Claude Code-style trace consumers |
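As one illustration of the reasoning/tool alignment use case, reasoning text can be paired with the tool calls issued in the same assistant event. This sketch assumes `thinking` and `tool_use` blocks appear together in one assistant message, as in the example event lines; the output record keys (`reasoning`, `tool`, `input`) are hypothetical and not a format defined by the dataset.

```python
def reasoning_tool_pairs(events):
    """Collect one record per tool call, with the reasoning from the same event."""
    pairs = []
    for event in events:
        if event.get("type") != "assistant":
            continue
        content = (event.get("message") or {}).get("content")
        if not isinstance(content, list):
            continue
        blocks = [b for b in content if isinstance(b, dict)]
        # Join all thinking blocks of this event into one reasoning string.
        reasoning = "\n".join(
            b.get("thinking", "") for b in blocks if b.get("type") == "thinking"
        )
        for block in blocks:
            if block.get("type") == "tool_use":
                pairs.append({
                    "reasoning": reasoning,
                    "tool": block.get("name"),
                    "input": block.get("input"),
                })
    return pairs


# Synthetic assistant event with one thinking block and one tool call.
demo = [
    {"type": "assistant", "message": {"role": "assistant", "content": [
        {"type": "thinking", "thinking": "I should list the files first."},
        {"type": "text", "text": "Let me explore the codebase."},
        {"type": "tool_use", "id": "call_1", "name": "Bash", "input": {"command": "ls"}},
    ]}},
]
for record in reasoning_tool_pairs(demo):
    print(record["tool"], "<-", record["reasoning"])
```

Assistant events that call no tool produce no records, so the output covers only the decision points where reasoning led directly to an action.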

Data Quality Notes

Citation and Contact

If MIMO Claude Code Traces helps your work, please consider linking back to the dataset page. For questions, corrections, or collaboration, contact choucisan@gmail.com.