🧩 Chat Bricks
Correct, verifiable chat-template rendering and per-token loss masks for LLM/VLM training — with any HuggingFace model.
Chat Bricks gives you the things apply_chat_template doesn't: per-token labels and action_mask for multi-turn SFT and RL, swappable tool-call formats for the same base model, and a first-class skills block. Rendering is verified byte-identical against the model's official template, so you can trust what hits your loss function.
The problem
When you train on multi-turn or tool-using conversations, you need a per-token mask that says "compute loss on these assistant tokens, ignore everything else." HuggingFace's apply_chat_template doesn't produce this — return_assistant_tokens_mask only works on templates that ship with explicit {% generation %} markers, which most don't. Hand-rolling a mask from string offsets silently breaks on multi-turn, tool-call turns, or non-append-only templates. A wrong mask doesn't crash — it quietly degrades your model and you blame the data.
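To make the role of the mask concrete, here is a minimal pure-Python sketch of how a per-token mask typically feeds a loss. It is illustrative only: the `-100` ignore value is the PyTorch cross-entropy default and is assumed here rather than taken from Chat Bricks, the helper names are hypothetical, and the usual one-position shift for next-token prediction is left out for brevity.

```python
IGNORE_INDEX = -100  # assumed ignore value (PyTorch cross-entropy default)

def labels_from_mask(input_ids, action_mask):
    """Keep the token id where the mask is 1, otherwise IGNORE_INDEX."""
    return [tok if m == 1 else IGNORE_INDEX
            for tok, m in zip(input_ids, action_mask)]

def masked_mean_loss(per_token_loss, labels):
    """Average per-token losses over positions whose label is not ignored."""
    kept = [loss for loss, lab in zip(per_token_loss, labels)
            if lab != IGNORE_INDEX]
    return sum(kept) / len(kept) if kept else 0.0

# Toy sequence: 3 prompt tokens followed by 2 assistant tokens.
input_ids = [11, 12, 13, 21, 22]
action_mask = [0, 0, 0, 1, 1]
labels = labels_from_mask(input_ids, action_mask)

# Only the last two positions contribute to the average.
loss = masked_mean_loss([5.0, 5.0, 5.0, 1.0, 3.0], labels)
```

If the mask were shifted by even one position, the code would still run and still return a number, which is why this class of bug tends to go unnoticed.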
Chat Bricks reconstructs the mask by aligning incremental renders to token spans, with model-specific overrides for templates that aren't append-only. Rendering is checked byte-for-byte against each model's official chat template in CI.
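The idea of aligning incremental renders to token spans can be sketched with a toy template and a toy tokenizer. This is not the library's implementation: `render`, `toy_tokenize`, and `mask_by_incremental_render` are hypothetical names, the template is a made-up append-only format, and the one-token-per-character tokenizer sidesteps the token-merging issues a real tokenizer would introduce at span boundaries.

```python
def render(messages):
    """Toy append-only template: one '<role>content\n' chunk per message."""
    return "".join(f"<{m['role']}>{m['content']}\n" for m in messages)

def toy_tokenize(text):
    """Stand-in tokenizer: one token per character."""
    return list(text)

def mask_by_incremental_render(messages):
    """Render prefixes of the conversation and mask the newly added tokens."""
    tokens, mask = [], []
    for i, msg in enumerate(messages):
        full = toy_tokenize(render(messages[: i + 1]))
        # Diffing is only valid if the previous render is an exact prefix.
        if full[: len(tokens)] != tokens:
            raise ValueError(f"template is not append-only at message {i}")
        new = full[len(tokens):]
        # The role header is template text, so it is never trained on.
        header = len(toy_tokenize(f"<{msg['role']}>"))
        flag = 1 if msg["role"] == "assistant" else 0
        mask.extend([0] * header + [flag] * (len(new) - header))
        tokens = full
    return tokens, mask

tokens, mask = mask_by_incremental_render([
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "yo"},
])
trained_text = "".join(t for t, m in zip(tokens, mask) if m)
```

The prefix check is the part that fails on templates that rewrite earlier turns, which is the situation the model-specific overrides are described as handling.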
What you get
- Loss masking that works. Per-token labels and action_mask across multi-turn, tool-call, and skill turns. Byte-identical rendering verified against the official template.
- Tool-call variant control. Swap tool format on the same base model via ToolPolicy + ToolFormatter, with no Jinja rewrites. See Tools and tool-call variants.
- Skills as a first-class block. Advertise (name, description) pairs in the system prompt via skills_template. See Skills.
- Any HuggingFace model, out of the box. Chat(template="org/model", ...) falls back to the tokenizer's chat template with masking reconstructed by diffing. See Use any HuggingFace model.
- Verified correctness. compare_hf_template(...) and CI parity tests for every built-in template. See Verification & correctness.
- VLM support. Vision-language templates and a registerable vision processor. See Vision Templates.
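A byte-for-byte parity check of the kind mentioned above can be sketched in a few lines. The helper below is a generic illustration, not `compare_hf_template` itself (whose signature is not shown on this page); `first_mismatch` is a hypothetical name, and the two strings stand in for a custom render and the official template's render.

```python
def first_mismatch(ours: str, reference: str):
    """Return None if the UTF-8 bytes match, else (offset, ours, reference).

    The two returned byte slices start at the first differing offset and
    are capped at 20 bytes so the report stays readable.
    """
    a, b = ours.encode("utf-8"), reference.encode("utf-8")
    if a == b:
        return None
    limit = min(len(a), len(b))
    # If one string is a prefix of the other, the mismatch is at its end.
    offset = next((i for i in range(limit) if a[i] != b[i]), limit)
    return offset, a[offset:offset + 20], b[offset:offset + 20]

same = first_mismatch("<user>hi\n", "<user>hi\n")
trailing = first_mismatch("<user>hi\n", "<user>hi")
```

Comparing bytes rather than stripped or normalized strings matters here, because a stray trailing newline or space changes the token sequence even when the text looks the same.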
60-second SFT example
from transformers import AutoTokenizer
from chat_bricks import Chat

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")

chat = Chat(template="Qwen/Qwen2.5-3B-Instruct", messages=[
    {"role": "user", "content": "What is 3 times 5?"},
    {"role": "assistant", "content": "", "tool_calls": [
        {"type": "function", "function": {"name": "multiply",
                                          "arguments": {"x": 3, "y": 5}}}]},
    {"role": "tool", "content": "15"},
    {"role": "assistant", "content": "It's 15."},
])

inputs = chat.tokenize(tokenizer)
# inputs["input_ids"], inputs["labels"], inputs["action_mask"], inputs["attention_mask"]
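Before training on the result, it can help to look at which token spans the mask actually selects. The helper below is a small sketch for that purpose; `masked_spans` is a hypothetical name, and it assumes the mask has been converted to a flat Python list of 0/1 values (if the returned fields carry a batch dimension, index the first row and call `.tolist()` first).

```python
def masked_spans(action_mask):
    """Return (start, end) index pairs, end exclusive, for each run of 1s."""
    spans, start = [], None
    for i, m in enumerate(action_mask):
        if m and start is None:
            start = i                   # a run of trainable tokens begins
        elif not m and start is not None:
            spans.append((start, i))    # the run ended just before i
            start = None
    if start is not None:               # a run reaches the end of the sequence
        spans.append((start, len(action_mask)))
    return spans

# Toy mask with two trainable runs.
toy_mask = [0, 0, 1, 1, 0, 0, 0, 1]
spans = masked_spans(toy_mask)
```

For the conversation above, one would expect one span per assistant turn; decoding each with `tokenizer.decode(input_ids[start:end])` is a quick way to confirm that only assistant text is being trained on.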
Continue with the Quick Start or jump to any of the how-to pages above.
| WeChat | Discord |
|---|---|
| Scan to join the WeChat group | Join our Discord channel |