TBCX is a C extension for the Tcl 9.1 family that serializes compiled Tcl bytecode (plus enough metadata to reconstruct procs, TclOO methods, and lambda constructs) into a compact .tbcx file — and later loads that file into another interpreter for fast startup without re‑parsing or re‑compiling the original source. There's also a disassembler for human‑readable inspection.
Status: production release (v1.1). We optimize for simplicity (no backward compatibility guarantees yet) and strict Tcl 9.1 compliance throughout. Artifacts require an exact Tcl major/minor match at load time; 9.2+ artifacts are not accepted by a 9.1 loader and vice versa.
For versions prior to Tcl 9.1, please check
```tcl
package require tbcx
# Save: source text → .tbcx
# - in: Tcl value (script text) | open readable channel | path to a readable .tcl file
# - out: open writable binary channel | path to a new .tbcx file
tbcx::save ./hello.tcl ./hello.tbcx
# Load: .tbcx → installs procs/methods/lambdas and executes top level
# - in: open readable binary channel | path to a .tbcx file
tbcx::load ./hello.tbcx
# Dump: pretty disassembly
puts [tbcx::dump ./hello.tbcx]
```

- Save: Compile a script and write a `.tbcx` artifact containing:
  - Top-level bytecode (with literals, AuxData, exception ranges, local names)
  - All discovered `proc`s, each as precompiled bytecode
  - TclOO classes (advisory catalog of discovered class names for dump/introspection) and methods/constructors/destructors as precompiled bytecode. Class creation and superclass structure are reconstructed by executing the rewritten top-level script during `tbcx::load`, not from a standalone serialized class graph.
  - Self methods (`self method` inside `oo::define`) serialized with kind `TBCX_METH_SELF`; loaded via `oo::define { self method ... }` to preserve metaclass inheritance for subclasses
  - Lambda literals (`apply {args body ?ns?}` forms) compiled and serialized as lambda-bytecode literals
  - Bodies found in `namespace eval $ns { ... }` and other script-body contexts (compiled and bound to the correct namespace)
  - Instruction-level body detection (Phase 1): analyzes `invokeStk` patterns to identify and precompile bodies for `eval`, `uplevel`, `try`/`on`/`trap`/`finally`, `catch`, `foreach`, `while`, `for`, `if`/`elseif`/`else`, `time`, `timerate`, `dict for`/`map`/`update`/`with` (including FQN `::tcl::dict::*` forms), and `self method`
  - Unpushed literal detection (Phase 2): identifies dead-reference body literals from Tcl 9.1's inline-compiled `foreach`/`lmap` loops (compiled to `foreach_start` opcodes) and precompiles them
  - O(1) opcode dispatch: instruction scanner uses a 256-entry `opMap[]` lookup table covering all 120 Tcl 9.1 instruction types, replacing per-instruction string comparisons
  - Bytearray detection: strings with bytes ≥ 0x80 are probed and emitted as `TBCX_LIT_BYTEARR` to avoid UTF-8 encoding corruption on round-trip
  - `startCommand` stripping: at save time, `startCommand` debugging instructions (~15% of bytecode) are replaced with `nop` bytes, reducing execution overhead while preserving jump offsets and exception ranges
  - Cross-interpreter support: body literals are emitted as `TBCX_LIT_BYTESRC` (bytecode + preserved source text); loaded with `setPrecompiled=0` so Tcl can gracefully recompile from source in child interpreters or after epoch bumps
- Load: Read a `.tbcx`, reconstruct precompiled procs, method bodies, and literal lambdas, then execute the top-level block with `TCL_EVAL_GLOBAL`. Class creation, namespace setup, and other top-level effects happen naturally when the rewritten script runs, with source-equivalent semantics.
- Dump: Pretty-print / disassemble `.tbcx` contents (header, literals, AuxData summaries, exception ranges, full instruction streams).
- Safe interp support: In safe interpreters, `tbcx_SafeInit` provides the package and type infrastructure but does not register any `tbcx::*` commands. A parent interpreter may selectively grant access with `interp alias` or `interp expose`.
- Tcl 9.1 aware: Uses Tcl 9.1 internal bytecode structures, literal encodings, and AuxData types; exposes them via a stable binary format header (`TBCX_FORMAT = 91`).
`tbcx::save in out`: compile and serialize to `.tbcx`.
- `in` is resolved in this order:
  - open channel name: if the value names an existing open channel, it is read as text (encoding is left as-is; the caller controls it);
  - readable file path: if the value is a path to a readable file, it is opened in text mode (default UTF-8 encoding), read, and closed by TBCX;
  - literal script text: otherwise the value is treated as inline Tcl script text. Consequently, a value that looks like a path but is not currently readable is compiled as script text, not reported as a file-open error.
- `out` may be:
  - an open writable channel: binary mode (`-translation binary -eofchar {}`) is enforced; the channel is not closed. Note: the caller's channel settings are mutated and not restored.
  - a path: TBCX writes a temporary file in the target directory and renames it into place only after serialization succeeds, so a failed save never leaves a truncated artifact at the final path.
- Result: returns the output channel handle or normalized output path.
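The resolution order for `in` can be mimicked in plain Tcl. The sketch below is illustrative only: `classifyInput` is a hypothetical helper, not a TBCX command, and the `file isfile` check is an assumption about what counts as a readable file.

```tcl
# Hypothetical helper (not part of TBCX): classify a value the way the
# documented order for tbcx::save's "in" argument describes.
proc classifyInput {value} {
    # 1. An existing open channel name takes priority.
    if {$value in [chan names]} {
        return channel
    }
    # 2. Otherwise, a path to a readable file.
    #    (Restricting to regular files is an assumption of this sketch.)
    if {[file isfile $value] && [file readable $value]} {
        return file
    }
    # 3. Anything else is treated as inline script text.
    return script
}

puts [classifyInput stdout]
puts [classifyInput {puts "hello"}]
puts [classifyInput ./no-such-file.tcl]
```

The last call shows the documented consequence: a path that is not readable is classified as script text rather than raising a file-open error, so checking `file readable` before calling `tbcx::save` is a reasonable precaution.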
What gets saved:
- The top‑level compiled block of the input script (code, literal pool, AuxData, exception ranges, local names/temps).
- All discovered `proc` bodies, precompiled with correct namespace bindings. Conflicting definitions across `if`/`else` branches are handled via indexed proc markers.
- TclOO methods/constructors/destructors, precompiled.
- Lambda literals appearing in the script (e.g. `apply {args body ?ns?}` forms) are compiled and serialized as lambda-bytecode literals so they do not recompile on first use after load.
- Namespace eval bodies and other script-body literals (try, foreach, while, for, catch, if/elseif/else bodies) are detected and pre-compiled to bytecode when safe to do so.
`tbcx::load in`: load a `.tbcx` artifact, materialize procs and OO methods, rehydrate lambda bytecode literals, and execute the top-level block with `TCL_EVAL_GLOBAL`.
- `in` may be an open readable binary channel or a path to a `.tbcx` file.
- Result: the top-level executes (like `source`); procs, OO methods, and embedded lambda literals become available without re-compilation.
`tbcx::dump filename`: produce a human-readable string describing the artifact, including a disassembly of each compiled block and any lambda literals.
- `filename` must be a path to a readable `.tbcx` file.
- Output: header, summaries, literal listings, AuxData and exception info, plus disassembly of the top-level/proc/method/lambda bytecode.
`tbcx::gc`: explicitly purge stale entries from the per-interpreter lambda shimmer-recovery registry (the ApplyShim). This is normally not needed, since stale entries are purged lazily on each `tbcx::load` call, but it can be useful in long-running interpreters that load many `.tbcx` files and want to reclaim memory sooner.
- Takes no arguments.
tbcx::save compiles the given script and captures definitions in a single pass:
- Capture and rewrite: `CaptureAndRewriteScript` walks the script's token tree once, extracting `proc`, `namespace eval`, `oo::class create`, `oo::define` (method/constructor/destructor/self method), and `oo::objdefine` forms. It simultaneously produces a rewritten script where captured method/constructor/destructor bodies are replaced with indexed stubs, ensuring the top-level bytecode doesn't redundantly contain their full source.
- Namespace body scanning: The rewritten script and captured definition bodies are scanned (`ScanScriptBodiesRec`) for nested script-body patterns (`namespace eval`, `try`/`on`/`trap`/`finally`, `foreach`, `while`, `for`, `catch`, `if`/`elseif`/`else`, `uplevel`, `eval`, `dict for`/`map`/`update`/`with`, `lsort -command`), building a mapping from body text to namespace FQN.
- Pre-compilation: Matched namespace eval body literals in the top-level literal pool are compiled into a side table (never modifying the pool itself) so they serialize as bytecode rather than source text.
  - Strips `startCommand` debugging instructions from all compiled blocks (replaced with `nop` bytes for ~15% leaner execution).
- Instruction scanning (`InstrScanBodyLiterals`): Two-phase bytecode analysis runs on each compiled block:
  - Phase 1 (invokeStk analysis): Models the operand stack using a 256-entry `opMap[]` dispatch table (built once per call from instruction names, O(1) per instruction). Tracks literal indices through `push`/`loadStk`/`storeStk`/`swap` etc. to identify which literal is the command argument for each `invokeStk` call. Marks body literals for `eval`, `try`/`on`/`trap`/`finally`, `catch`, `foreach`, `while`, `for`, `if`, `uplevel`, `time`, `timerate`, `dict for`/`map`/`update`/`with` (including FQN `::tcl::dict::*`), and `self method`.
  - Phase 2 (unpushed literal detection): For blocks containing `foreach_start` opcodes, identifies literal pool entries that are never referenced by any `push` instruction; these are dead body-text references kept by Tcl's compiler for error reporting. Marks them for precompilation via `LooksLikeScriptBody()` filtering.
- Bytearray detection: `WriteLit_Untyped` probes string literals for bytes ≥ 0x80; if all code points ≤ 255, emits as `TBCX_LIT_BYTEARR` to prevent UTF-8 encoding corruption on round-trip.
- Serialize: Emit:
  - A header (magic, format version, Tcl version, code length, exception/literal/AuxData/local counts, max stack depth).
  - The top-level compiled block (code bytes, literals, AuxData, exception ranges, locals epilogue). Captured proc bodies are stripped during this phase. Body literals are emitted as `TBCX_LIT_BYTESRC` (source text + compiled bytecode).
  - A table of procs: name FQN, namespace, argument spec, then the separately compiled body block.
  - Classes (advisory): discovered class names for dump/introspection (currently `nSupers = 0`; actual class structure is reconstructed by the top-level script at load time), then methods (class FQN, kind 0–4, name, argument spec, compiled body block). Self methods use kind 4 (`TBCX_METH_SELF`).
  - A final flush of any buffered output.
Runaway protection: The serializer tracks total literal calls, block calls, recursion depth, and output bytes. If any limit is exceeded (2M literals, 256K blocks, depth 64, or 256 MB output), serialization aborts with a diagnostic error.
Supported literal kinds: bignum, boolean, bytearray, dict (insertion order preserved), double, list, string, wideint, wideuint, lambda‑bytecode, bytesrc (bytecode + source text for cross-interp recompilation).
Supported AuxData: jump tables (string and numeric), dict-update, NewForeachInfo.
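The bytearray detection rule from the save pipeline can be sketched in plain Tcl. This is a rough illustration, not the logic of `WriteLit_Untyped` itself: `looksLikeByteArray` is a hypothetical name, and the sketch reads "bytes ≥ 0x80" as "code points ≥ 0x80".

```tcl
# Hypothetical sketch: a string qualifies when at least one code point is
# >= 0x80 and no code point exceeds 255.
proc looksLikeByteArray {str} {
    set hasHigh 0
    foreach ch [split $str ""] {
        set cp [scan $ch %c]
        if {$cp > 255} {
            return 0          ;# wide character: must stay a string
        }
        if {$cp >= 0x80} {
            set hasHigh 1     ;# would change size if stored as UTF-8
        }
    }
    return $hasHigh
}

# Why it matters: code point 0x89 becomes two bytes when encoded as UTF-8.
binary scan [encoding convertto utf-8 \x89] H* hex
puts $hex

puts [looksLikeByteArray "hello"]
puts [looksLikeByteArray [binary format H* 89504e47]]
puts [looksLikeByteArray "\u20ac"]
```

Pure ASCII strings and strings containing characters above 255 stay ordinary string literals under this rule; only the middle case is a bytearray candidate.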
tbcx::load reads the header, validates magic/format/Tcl‑version compatibility, then deserializes sections:
- Top-level block: Deserialized and marked `TCL_BYTECODE_PRECOMPILED` so Tcl skips compile-epoch checks and executes the bytecode directly. `TBCX_LIT_BYTESRC` literals within the block are loaded with `setPrecompiled=0` and their source text restored as string rep, allowing Tcl to recompile from source when needed (e.g. cross-interpreter evaluation or epoch mismatch).
- Procs: A temporary ProcShim intercepts the `proc` command (both `objProc2` and `nreProc2` dispatch paths). When the top-level block evaluates a `proc` call matching a saved definition (by FQN + argument signature, or by indexed marker for conflicting definitions), the shim substitutes the precompiled body. Unmatched `proc` calls pass through to Tcl's original handler.
- Classes and methods: An OOShim temporarily renames `oo::define` (and `oo::objdefine` when available) to intercept method/constructor/destructor installations. Matching definitions receive precompiled bodies; constructors and destructors use a create-then-swap pattern (placeholder body `";"` → TclOO builds dispatch → bytecode swap) to preserve `next` routing through the constructor chain. Self methods (kind 4) are installed via `oo::define CLASS { self method NAME ARGS BODY }`; this uses the renamed original `oo::define` command, which properly sets up the metaclass inheritance chain so subclass class-objects inherit the method.
- Lambda recovery: An ApplyShim is installed as persistent per-interpreter `AssocData`. When a precompiled lambda's `lambdaExpr` internal rep gets evicted by shimmer, the shim detects the missing rep on the next `[apply]` call and re-installs the precompiled `Proc*` from its registry before forwarding to Tcl's real `[apply]`.
- Top-level execution: The precompiled top-level block is evaluated via `Tcl_EvalObjEx` with `TCL_EVAL_GLOBAL`. Compiled locals for the top-level frame are set up by linking named variables to existing globals (via `TopLocals_Begin`/`TopLocals_End`). `TCL_RETURN` is handled the same way `source` does, converting it to `TCL_OK` with the return value as the result.
- Cleanup: The ProcShim and OOShim are removed (original command handlers restored). The ApplyShim persists for the interpreter's lifetime to support lambda shimmer recovery.
Endianness is detected and handled so that hosts read/write a consistent little-endian format on disk.
Header (compact, binary, little‑endian):
| Field | Type | Description |
|---|---|---|
| `magic` | u32 | `0x58434254` ("TBCX") |
| `format` | u32 | 91 (Tcl 9.1) |
| `tcl_version` | u32 | `maj<<24 \| min<<16 \| patch<<8 \| type` |
| `codeLenTop` | u64 | Code byte count for top-level block |
| `numExceptTop` | u32 | Exception range count |
| `numLitsTop` | u32 | Literal count |
| `numAuxTop` | u32 | AuxData count |
| `numLocalsTop` | u32 | Local variable count |
| `maxStackTop` | u32 | Maximum stack depth |
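A header laid out this way can be decoded with `binary scan`. The sketch below assumes the fields are stored back to back in table order with no padding (40 bytes in total); `parseTbcxHeader` is a hypothetical helper, and the sample header is synthetic rather than taken from a real artifact.

```tcl
# Hypothetical sketch: decode the fixed-size header into a dict.
# Assumes little-endian fields packed in table order without padding.
proc parseTbcxHeader {bytes} {
    set fields {magic format tclVersion codeLenTop numExceptTop
                numLitsTop numAuxTop numLocalsTop maxStackTop}
    # iu = unsigned 32-bit LE, wu = unsigned 64-bit LE
    set n [binary scan $bytes "iu iu iu wu iu iu iu iu iu" {*}$fields]
    if {$n != 9} {
        error "truncated header"
    }
    if {$magic != 0x58434254} {
        error "bad magic"
    }
    set hdr [dict create]
    foreach f $fields {
        dict set hdr $f [set $f]
    }
    return $hdr
}

# Synthetic header for illustration (version type byte left at 0).
set ver [expr {(9 << 24) | (1 << 16)}]
set sample [binary format "i i i w i i i i i" 0x58434254 91 $ver 128 0 3 0 0 4]
set hdr [parseTbcxHeader $sample]
puts "format=[dict get $hdr format] codeLenTop=[dict get $hdr codeLenTop]"
```

Because the magic is written little-endian, the first four bytes of the sample read as the ASCII text `TBCX`.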
Sections (in order):
- Top‑level block — code bytes, literal array, AuxData array, exception ranges, epilogue (maxStack, reserved, numLocals, local names).
- Procs — u32 count, then repeated tuples: name FQN (LPString), namespace (LPString), argument spec (LPString), compiled block.
- Classes (advisory) — u32 count, then class FQN; currently records discovered class names for dump/introspection only. Class creation and superclass structure are reconstructed by the rewritten top-level script at load time.
- Methods — u32 count, then repeated tuples: class FQN, kind (u8: 0=inst, 1=class, 2=ctor, 3=dtor, 4=self), name, argument spec, compiled block.
Literal tags (u32):
| Tag | Kind | Payload |
|---|---|---|
| 0 | BIGNUM | u8 sign, u32 magLen, LE magnitude bytes |
| 1 | BOOLEAN | u8 (0/1) |
| 2 | BYTEARR | u32 length + raw bytes |
| 3 | DICT | u32 pair count, then key/value literal pairs (insertion order) |
| 4 | DOUBLE | 64-bit IEEE-754 as u64 |
| 5 | LIST | u32 count, then nested literals |
| 6 | STRING | LPString (u32 length + bytes) |
| 7 | WIDEINT | signed 64-bit as u64 |
| 8 | WIDEUINT | unsigned 64-bit as u64 |
| 9 | LAMBDA_BC | ns FQN, args, compiled block, body source text |
| 11 | BYTESRC | source text (LPString) + ns FQN + compiled block (enables cross-interp recompilation) |
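To make the scalar rows of the table concrete, here is a minimal decoding sketch. It handles only tags 1, 4, 7 and 8, assumes the u32 tag is immediately followed by its payload in little-endian order, and uses a hypothetical helper name, `decodeScalarLiteral`.

```tcl
# Hypothetical sketch: decode one scalar literal (tag + payload).
proc decodeScalarLiteral {bytes} {
    if {[binary scan $bytes iu tag] != 1} {
        error "truncated literal"
    }
    set payload [string range $bytes 4 end]
    switch -- $tag {
        1 { set kind BOOLEAN;  set n [binary scan $payload cu v] }
        4 { set kind DOUBLE;   set n [binary scan $payload q v] }
        7 { set kind WIDEINT;  set n [binary scan $payload w v] }
        8 { set kind WIDEUINT; set n [binary scan $payload wu v] }
        default { error "tag $tag is not handled by this sketch" }
    }
    if {$n != 1} {
        error "truncated $kind payload"
    }
    return [list $kind $v]
}

puts [decodeScalarLiteral [binary format iq 4 1.5]]
puts [decodeScalarLiteral [binary format iw 7 -2]]
puts [decodeScalarLiteral [binary format iw 8 -1]]
```

The last two calls show the difference between the signed and unsigned 64-bit kinds: the payload bytes may be identical, and only the tag decides how they are interpreted.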
AuxData tags (u32):
| Tag | Kind | Payload |
|---|---|---|
| 0 | JT_STR | u32 count; key LPString + u32 offset per entry |
| 1 | JT_NUM | u32 count; u64 key + u32 offset per entry |
| 2 | DICTUPD | u32 length; local indices |
| 3 | NEWFORE | u32 numLists, u32 loopCtTemp, u32 firstValueTemp, u32 numLists (dup), then per-list var indices |
Method kinds (u8):
| Kind | Name | Description |
|---|---|---|
| 0 | INST | Instance method |
| 1 | CLASS | Class method (classmethod) |
| 2 | CTOR | Constructor |
| 3 | DTOR | Destructor |
| 4 | SELF | Self method (installed via oo::define { self method } for metaclass inheritance) |
LPString: a u32 byte-length followed by that many raw bytes (no NUL terminator on disk).
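An LPString round-trip is straightforward in Tcl. In this sketch, `lpEncode` and `lpDecode` are hypothetical helpers, the length prefix is assumed to be little-endian like the rest of the stream, and treating the payload as UTF-8 text is an assumption (the format itself only promises raw bytes).

```tcl
# Hypothetical sketch: u32 little-endian byte length, then the raw bytes.
proc lpEncode {str} {
    set raw [encoding convertto utf-8 $str]   ;# UTF-8 payload is an assumption
    return [binary format i [string length $raw]]$raw
}

# Returns a two-element list: the decoded string and the offset of the
# first byte after this LPString.
proc lpDecode {bytes {offset 0}} {
    if {[binary scan $bytes @${offset}iu len] != 1} {
        error "truncated LPString length"
    }
    set start [expr {$offset + 4}]
    set end   [expr {$start + $len - 1}]
    if {$end >= [string length $bytes]} {
        error "truncated LPString payload"
    }
    set raw [string range $bytes $start $end]
    return [list [encoding convertfrom utf-8 $raw] [expr {$end + 1}]]
}

set blob [lpEncode "h\u00e9llo"][lpEncode "ns"]
lassign [lpDecode $blob] first next
lassign [lpDecode $blob $next] second
puts "$first / $second"
```

Returning the next offset lets consecutive fields, such as the name, namespace and argument spec of a proc entry, be read one after another.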
- Functional equivalence: `tbcx::load` aims to be functionally identical to `source` of the original script. Differences are limited to avoiding re-parse/re-compile time.
- Namespaces: The top-level block is executed with `TCL_EVAL_GLOBAL`. Saved blocks carry namespace metadata to bind compiled code correctly. Lambda literals that include a namespace element keep that association.
- Version check: The loader requires an exact major.minor Tcl version match (e.g. 9.1). Bytecode instruction semantics can change between minor versions.
- Sanity limits: Code ≤ 64 MiB; literal/AuxData/exception pools ≤ 1M entries; LPString ≤ 4 MiB; output ≤ 256 MB; recursion depth ≤ 64.
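The version check can be illustrated with the packed `tcl_version` header field (`maj<<24 | min<<16 | patch<<8 | type`). The helpers below are hypothetical and only sketch the comparison; the loader's actual code may differ.

```tcl
# Hypothetical sketch: unpack the tcl_version field and compare major.minor.
proc unpackTclVersion {v} {
    return [dict create \
        major [expr {($v >> 24) & 0xff}] \
        minor [expr {($v >> 16) & 0xff}] \
        patch [expr {($v >> 8) & 0xff}] \
        type  [expr {$v & 0xff}]]
}

# True when both packed versions share the same major and minor numbers;
# patch level and release type are ignored.
proc sameMajorMinor {a b} {
    return [expr {(($a >> 16) & 0xffff) == (($b >> 16) & 0xffff)}]
}

set v910 [expr {(9 << 24) | (1 << 16) | (0 << 8)}]
set v913 [expr {(9 << 24) | (1 << 16) | (3 << 8)}]
set v920 [expr {(9 << 24) | (2 << 16) | (0 << 8)}]

puts [sameMajorMinor $v910 $v913]
puts [sameMajorMinor $v910 $v920]
```

Under this comparison, a patch-level difference is accepted while a minor-version difference is rejected, which matches the exact major.minor rule.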
This package uses Tcl 9.1 APIs and selected internals (e.g., tclInt.h, tclCompile.h).
You'll need Tcl 9.1 headers/libs on your include path and to build as a standard loadable extension.
The entry point tbcx_Init registers commands and provides tbcx 1.1.
The safe entry point tbcx_SafeInit provides the package and type infrastructure but registers no commands; use interp alias or interp expose from a parent interpreter to grant selective access.
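As a sketch of selective access, a parent interpreter can expose a restricted wrapper around `tbcx::dump` to a safe child. Note that an alias target runs in the parent interpreter, so this pattern suits commands that return data. The wrapper name `safeDump`, the bare-file-name policy, and the directory `/opt/app/artifacts` are all assumptions made for illustration.

```tcl
# Parent-side sketch: create a safe child and give it a restricted dump command.
set child [interp create -safe]

# Make sure the target namespace exists in the child before aliasing into it.
interp eval $child {namespace eval ::tbcx {}}

# Hypothetical wrapper: only bare file names inside one trusted directory.
proc safeDump {dir name} {
    if {[file tail $name] ne $name} {
        error "only bare file names are allowed"
    }
    package require tbcx   ;# loaded lazily, in the parent interpreter
    return [tbcx::dump [file join $dir $name]]
}

# In the child, "tbcx::dump NAME" now calls "safeDump /opt/app/artifacts NAME"
# in the parent.
interp alias $child ::tbcx::dump {} safeDump /opt/app/artifacts
```

The child never sees a real path or a channel; it can only name a file, and the parent decides where that name is resolved.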
Example (TEA Linux/macOS):
```sh
./configure
make install
make test
```

- Security: Loading a `.tbcx` executes code (top-level) and installs commands/classes. Only load artifacts you trust.
- Compatibility: `TBCX_FORMAT` is `91` (Tcl 9.1). Different formats are rejected during load. An exact major.minor Tcl version match is required.
- AuxData coverage: The saver asserts that all AuxData items in a block are of known kinds (jump tables, dict-update, NewForeachInfo). Unknown kinds cause the save to abort.
- OO coverage: Supports `oo::class create`, `oo::define` (method/classmethod/constructor/destructor/self method plus declarative keywords like variable/superclass/mixin/filter/forward), and `oo::objdefine`. Builder-form class bodies are expanded into multi-word stubs for correct load-time reconstruction. Self methods (`self method` inside `oo::define`) are serialized with kind 4 (`TBCX_METH_SELF`) and loaded via `oo::define { self method ... }` to preserve metaclass inheritance for subclasses.
- Lambda shimmer recovery: Precompiled lambdas are registered in a persistent per-interpreter ApplyShim. If the `lambdaExpr` internal rep is evicted by shimmer, the shim transparently re-installs it on the next `[apply]` call.
- Precompilation boundary: TBCX precompiles bodies and lambdas only when they are present in statically identifiable literal positions. Strings assembled at runtime (e.g. with `format`, interpolation, or `list` construction) still round-trip correctly, but they remain ordinary data and compile at execution time when Tcl evaluates them.
- OO coverage: TBCX preserves normal TclOO class/object construction semantics by executing the rewritten top-level script, while substituting precompiled bodies for recognized `oo::define`/`oo::objdefine` method forms. Tested scenarios include class methods, self methods, per-object methods, private methods, inheritance (including diamond), mixins, filters, forwards, abstract/singleton metaclasses, method rename/delete/export changes, metaclasses with `self method`, and `next`-based constructor chaining. Declarative TclOO builder commands (`variable`, `superclass`, `mixin`, `filter`, `forward`) are preserved in the rewritten top-level.
- Multi-interpreter and threads: TBCX follows Tcl's standard threading model: only the thread that created an interpreter may call `tbcx::save`, `tbcx::load`, `tbcx::dump`, or `tbcx::gc` on that interpreter. Multi-thread support means multiple independent interpreters (each used by its owning thread), not sharing one interpreter across threads. Calling a TBCX command from a non-owning thread returns `TCL_ERROR` with a diagnostic message. Artifacts are designed to load into interpreters other than the originating one. Interpreter-specific state such as the ApplyShim lambda registry, load depth, and OO shim IDs remains per-interpreter.
- `tbcx::gc`: Safe to call before any load (no-op) and safe to call repeatedly. Does not interfere with subsequent save/load operations.
- Load reentrancy: Nested or reentrant `tbcx::load` calls are capped at depth 8 per interpreter.
- Conflicting proc definitions: When multiple branches define a proc with the same name (e.g. `if {$cond} {proc p ...} else {proc p ...}`), the saver emits indexed markers so the loader matches by position rather than by FQN alone.
- Endianness: Host endianness is detected at runtime; streams are always little-endian on disk.
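The precompilation boundary noted above is easiest to see side by side. The snippet below is plain Tcl and runs the same with or without TBCX; the comments describe how the two bodies would be treated at save time according to the notes, not something the snippet itself verifies.

```tcl
set total 0

# The loop body is a literal in a statically identifiable position, so it
# is a candidate for precompilation at save time.
foreach x {1 2 3} {
    incr total $x
}

# This body is assembled at runtime. To the saver it is ordinary string
# data; Tcl compiles it when eval runs.
set body [format {incr total %d} 4]
eval $body

puts $total
```

Both forms produce the same result after a load; the difference is only in when the second body gets compiled.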
- `tbcx.h`: shared definitions (header layout, tags, limits, buffered I/O types)
- `tbcx.c`: package init, byte-order detection, type discovery, command registration, safe init
- `tbcxsave.c`: capture, rewrite, compile, and serialize
- `tbcxload.c`: deserialize, shim, materialize, and execute
- `tbcxdump.c`: disassembler/dumper
Issues and PRs are welcome. Given the reliance on Tcl internals, please include Tcl 9.1 details (commit/tag, platform, compiler) with any bug reports.
MIT License. Copyright © 2025–2026 Miguel Bañón.
Built on top of Tcl 9.1's bytecode engine, object types, AuxData, and TclOO.
Thanks to the Tcl/Tk community.