1 · Train a tokenizer
Build a Byte Pair Encoding vocabulary for your language. Drop text files,
pick a vocab size, watch merges happen on your GPU. Save the
.json; the Pre-tokenize step uses it to encode your text into the .bin
that feeds the transformer in step 3.
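If you're curious what those merges actually do, here is a minimal CPU sketch of the BPE merge loop (illustrative only, not the app's GPU implementation; the train_bpe name and signature are made up for this example):

```python
from collections import Counter

def train_bpe(texts, vocab_size):
    # Start from raw bytes so every input string is representable.
    vocab = {bytes([b]): b for b in range(256)}
    # Each text becomes a list of single-byte tokens.
    corpus = [[bytes([b]) for b in t.encode("utf-8")] for t in texts]
    merges = []
    while len(vocab) < vocab_size:
        # Count adjacent token pairs across the whole corpus.
        pairs = Counter()
        for toks in corpus:
            pairs.update(zip(toks, toks[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merged = a + b
        vocab[merged] = len(vocab)
        merges.append((a, b))
        # Replace every occurrence of the winning pair with the new token.
        for toks in corpus:
            i = 0
            while i < len(toks) - 1:
                if toks[i] == a and toks[i + 1] == b:
                    toks[i:i + 2] = [merged]
                else:
                    i += 1
    return vocab, merges
```

Each merge promotes the most frequent adjacent pair into a new token, so the vocabulary grows one entry per step until it reaches the size you picked.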
3 · Train a transformer
Two flows in one: pretrain a fresh foundation model on your
.bin, or load a .llm checkpoint and
fine-tune it on a smaller, task-specific corpus. Forward, backward,
and AdamW all run on your GPU — no server, no cloud.
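Under the hood this is standard next-token prediction. Here is a minimal PyTorch sketch of one forward/backward/AdamW step, as a stand-in for the app's own GPU pipeline (the toy model, batch shapes, and hyperparameters are placeholders):

```python
import torch
import torch.nn.functional as F

# Tiny stand-in "model": embedding straight into a vocab projection.
vocab_size, d_model, batch, seq_len = 4096, 128, 8, 64
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, d_model),
    torch.nn.Linear(d_model, vocab_size),
)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)

# One batch of pre-tokenized ids (random here; slices of the .bin in practice).
tokens = torch.randint(0, vocab_size, (batch, seq_len + 1))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

logits = model(inputs)                                            # forward
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()                                                   # backward
opt.step()                                                        # AdamW update
opt.zero_grad()
```

Fine-tuning runs the same loop; it just starts from the checkpoint's weights and iterates over the smaller, task-specific corpus.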
Generate from a model
Drop a .llm checkpoint, type a prompt, watch tokens stream.
No .bin needed — vocab + weights ride with the checkpoint.
Useful for sanity-checking a run, comparing two checkpoints, or just
playing with sampling knobs.
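The sampling knobs generally come down to temperature and a top-k cutoff applied to the model's next-token logits. A small sketch of how they combine (illustrative; the function name, parameter names, and defaults are placeholders, not the app's settings):

```python
import torch

def sample_next(logits, temperature=0.8, top_k=40):
    """Pick one token id from a 1-D tensor of next-token logits."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    logits = logits / max(temperature, 1e-6)
    # Top-k: drop everything outside the k most likely tokens.
    if top_k is not None and top_k < logits.numel():
        kth = torch.topk(logits, top_k).values[-1]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()
```

Lower temperature and a smaller top-k make the stream more deterministic; raising either makes it more varied.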