# Loom — Golang AI Engine, Portable & Zero CGO (v0.79)

> Loom is a pure Golang AI engine: neural networks in Go with zero CGO on Windows, Linux, macOS, Android, iOS, and WASM. v0.79 Bedrock Validation: CPU train/save/reload, MHA decode, seven-layer Lucy suite, C-ABI 461/461. BitNet, WebGPU, 21 dtypes, welvet bindings.

Canonical: https://openfluke.com/loom

---

Open Source · github.com/openfluke/loom
The Universal AI Engine
M-POLY-VTD — a ground-up neural engine in Go: 3D volumetric grids, 21 numeric types,
and polyglot bindings (welvet) for Python, TypeScript, Dart, and WASM.
Train once, run with bit-identical results on CPU, WebGPU, and every major OS.
v0.79.0 — Bedrock
Seven-layer CPU suite
21 DTypes · DNVM
v0.79.0
Bedrock validation · 111/142 checklist
v0.79 — trustworthy CPU train → save → reload → infer
Lucy [7] — 10 layer types × 21 dtypes × 1³/2³/3³ grids · SC/MC · train · native save/reload
MHA layout + KV — [B,S,D] training · autoregressive decode · Poly Talk fixed
Native persistence — BitNet ternary + signed low-bit round-trip · LoomSyncInferenceWeights
Welvet C-ABI — 461/461 export parity (rebuild libwelvet after upgrade)
Still from v0.78: Dense asm forward · BitNet CPU · WebGPU · Donate Compute · TANHI
AI Deep Research
Independent AI Analysis of Loom
Comparative research on M-POLY-VTD vs PyTorch and JAX — plus the full engine reference on this site,
synced from loom/docs.
Architecture & research
3D grids, target propagation, DNVM
Start with the overview: volumetric dispatch, WeightStore morphing, step mesh, transformers, and v0.79 bedrock validation.
Why Loom?
All Loom docs
Research write-up
🧊
AI that thinks in 3D
Most AI frameworks process data in a straight line, like an assembly line. Loom uses a
three-dimensional grid — more like how your brain's neurons actually connect, jumping across regions rather than always going layer by layer.
💾
Fits AI on a USB stick
Loom can compress AI models by up to 98.4%. A model that normally takes gigabytes of storage
can shrink to under 2% of its original size, small enough to run on a phone or an old laptop with no internet required.
🧬
Learns like biology, not math
Traditional AI learning requires freezing everything to calculate one massive equation.
Loom's Target Propagation lets each part of the network learn independently — more like
how neurons fire and strengthen in a real brain.
Read the Full Technical Breakdown
For Non-Technical People
What is Loom, exactly?
"Think of Loom like SQLite — but for AI."
SQLite is a tiny database that runs inside your app with no server needed.
Loom is the same idea for neural networks: a self-contained engine you can
drop into any project, on any device, with no cloud account, no GPU server,
no complicated setup.
🧠
Train it like a brain
A neural network learns by seeing examples — like showing a child thousands of
pictures of cats until they know what a cat is. Loom provides all the tools
to build and teach these networks.
📦
Pack it anywhere
Once trained, your model is a tiny file. Drop it into your Python script,
your phone app, your website, or a game engine. Loom runs it everywhere
with the exact same output.
🔒
No cloud needed
Unlike ChatGPT or other AI services, Loom runs 100% locally on your device.
Your data never leaves your machine. Perfect for privacy-sensitive apps
or offline use.
⚡
WebGPU acceleration
On supported devices, Loom uses your GPU through WebGPU — achieving
17× to 65× faster training than CPU. Works in browsers too.
🌍
Every language
Python developer? pip install welvet. JavaScript? npm install @openfluke/welvet.
Go, C, C#, Rust? There are bindings for all of them. One model, every language.
🎯
Deterministic on CPU & GPU
Loom's Deterministic Neural Virtual Machine (DNVM) delivers bit-identical behaviour across
Apple Silicon, x86, WebGPU, and language bindings. Lucy and SoulGlitch depend on this for reproducible local inference.
🧬
Evolution built in
Loom includes a full NEAT evolution engine — models can mutate and breed
like living organisms. This powers SoulGlitch's creature evolution system.
Lucy & SoulGlitch
Supported Hugging Face models
Approved checkpoints share the same list in loom/lucy and SoulGlitch—download once, run offline via welvet.
SmolLM2
135M · 360M · 1.7B Instruct — mobile to server brains
Qwen3
0.6B · 1.7B · 4B — GPU-friendly chat models
BitNet b1.58
microsoft/bitnet-b1.58-2B-4T — packed ternary CPU path (v0.78+, native save/reload in v0.79)
Plus custom Loom/poly networks (training, NEAT, DNA) with no HF download.
Get Started
Install in 30 seconds
Pick your language and paste the command. No account required.
Python
$ pip install welvet
Ships with precompiled native libraries for Windows, Linux, macOS, iOS, and Android.
Zero Python dependencies.
PyPI page →
Node.js
$ npm install @openfluke/welvet
Works in Node.js and browsers via WebAssembly.
npm page →
Go
$ go get github.com/openfluke/loom/poly
Pure Go module. No CGO. Works with standard go build.
Quick reference → · Source →
WebAssembly
Download main.wasm from the releases page
6.9 MB WASM bundle. Drop into any web page and run Loom in the browser.
All releases →
Platform Support
Runs everywhere
Prebuilt native libraries for every major platform — just download and go.
Windows: x86-64, ARM64
Linux: x86-64, ARM64, ARM v7, x86
macOS: x86-64, ARM64 (M-series), Universal
Android: ARM64, x86-64
iOS: ARM64, Simulator, XCFramework
WebAssembly: Browser + Node.js
WebGPU: Forward + Backward pass, 17×–65× speedup
PyPI: welvet, zero dependencies
For Developers
What's under the hood
Loom isn't just a wrapper around PyTorch. It's a ground-up engine built for portability and precision.
All major layer types
Dense, MHA, SwiGLU, RMSNorm, LayerNorm, CNN 1D/2D/3D, Transposed Conv, RNN, LSTM, Embedding, KMeans, Softmax, Parallel, Sequential, Residual.
21 numeric types
float64 all the way down to binary (1-bit), including fp8, fp4, int4, and ternary. Choose precision vs. model size at runtime.
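To make the precision-versus-size trade-off concrete, here is a minimal Go sketch of symmetric 4-bit quantization with a single per-tensor scale. The helper names `quantizeInt4` and `dequantize` are hypothetical and are not part of Loom's API; Loom's own dtype handling may work differently.

```go
package main

import (
	"fmt"
	"math"
)

// quantizeInt4 maps weights to signed 4-bit levels in [-7, 7] using one
// per-tensor scale. Hypothetical helper; not Loom's dtype implementation.
func quantizeInt4(w []float32) ([]int8, float32) {
	var maxAbs float64
	for _, v := range w {
		maxAbs = math.Max(maxAbs, math.Abs(float64(v)))
	}
	q := make([]int8, len(w))
	if maxAbs == 0 {
		return q, 1 // all zeros: any scale reproduces the input
	}
	scale := float32(maxAbs / 7)
	for i, v := range w {
		q[i] = int8(math.Round(float64(v) / float64(scale)))
	}
	return q, scale
}

// dequantize reconstructs approximate float32 weights from the levels.
func dequantize(q []int8, scale float32) []float32 {
	out := make([]float32, len(q))
	for i, v := range q {
		out[i] = float32(v) * scale
	}
	return out
}

func main() {
	w := []float32{0.82, -0.31, 0.05, -0.77}
	q, scale := quantizeInt4(w)
	fmt.Println(q, scale, dequantize(q, scale))
}
```

Each reconstructed weight differs from the original by at most half a quantization step (`scale / 2`), which is the price paid for storing 4 bits instead of 32 or 64.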
NEAT evolution + DNA
A full neuroevolution engine with mutation, crossover, and fitness selection. Models have a "DNA" signature for reproducible evolution.
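The mutation, crossover, and fitness-selection loop can be sketched in a few lines of Go. Note that this is a plain genetic algorithm over fixed-length weight vectors, not NEAT itself (NEAT also evolves network topology), and every name here (`genome`, `evolve`, the toy fitness function) is an assumption made for illustration. The fixed seed stands in for the idea of reproducible evolution.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

type genome []float64

// fitness is higher the closer the genome is to the all-ones vector (toy objective).
func fitness(g genome) float64 {
	var s float64
	for _, v := range g {
		s -= (v - 1) * (v - 1)
	}
	return s
}

// crossover builds a child by picking each gene from one of two parents.
func crossover(a, b genome, rng *rand.Rand) genome {
	child := make(genome, len(a))
	for i := range child {
		if rng.Intn(2) == 0 {
			child[i] = a[i]
		} else {
			child[i] = b[i]
		}
	}
	return child
}

// mutate perturbs one randomly chosen gene with Gaussian noise.
func mutate(g genome, rng *rand.Rand) {
	g[rng.Intn(len(g))] += rng.NormFloat64() * 0.1
}

// evolve returns the best fitness seen at the start of each generation.
// The same seed always yields the same trajectory.
func evolve(seed int64, popSize, genes, generations int) []float64 {
	rng := rand.New(rand.NewSource(seed))
	pop := make([]genome, popSize)
	for i := range pop {
		pop[i] = make(genome, genes)
		for j := range pop[i] {
			pop[i][j] = rng.Float64()*2 - 1
		}
	}
	best := make([]float64, 0, generations)
	half := popSize / 2
	for gen := 0; gen < generations; gen++ {
		sort.Slice(pop, func(i, j int) bool { return fitness(pop[i]) > fitness(pop[j]) })
		best = append(best, fitness(pop[0]))
		// Selection: the top half survives unchanged; the bottom half is
		// replaced by mutated children of surviving parents.
		for i := half; i < popSize; i++ {
			child := crossover(pop[rng.Intn(half)], pop[rng.Intn(half)], rng)
			mutate(child, rng)
			pop[i] = child
		}
	}
	return best
}

func main() {
	best := evolve(42, 20, 4, 30)
	fmt.Printf("first generation: %.4f, last generation: %.4f\n", best[0], best[len(best)-1])
}
```

Because the top half of the population is carried over untouched, the best fitness never decreases from one generation to the next.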
98.4% compression
Native bit-packed serialization shrinks model files by 98.4% compared to raw float storage. Plus SafeTensors support for HuggingFace compatibility.
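One reading that is consistent with the 98.4% figure, offered here as an assumption rather than a documented fact, is storing one bit per weight in place of a 64-bit float: 1 − 1/64 = 98.4375%. The Go sketch below packs weight signs into bits and computes that ratio. `packSigns`, `unpackSigns`, and `reduction` are hypothetical helpers and do not describe Loom's actual file format.

```go
package main

import "fmt"

// packSigns stores one bit per weight: 1 for non-negative, 0 for negative.
// Hypothetical helper for illustration; not Loom's on-disk format.
func packSigns(weights []float64) []byte {
	packed := make([]byte, (len(weights)+7)/8)
	for i, w := range weights {
		if w >= 0 {
			packed[i/8] |= 1 << (i % 8)
		}
	}
	return packed
}

// unpackSigns expands the packed bits back to +1 / -1 values.
func unpackSigns(packed []byte, n int) []float64 {
	out := make([]float64, n)
	for i := range out {
		if packed[i/8]&(1<<(i%8)) != 0 {
			out[i] = 1
		} else {
			out[i] = -1
		}
	}
	return out
}

// reduction returns the fraction of bytes saved relative to raw float64 storage.
func reduction(n int) float64 {
	raw := n * 8          // 8 bytes per float64
	packed := (n + 7) / 8 // 1 bit per weight, rounded up to whole bytes
	return 1 - float64(packed)/float64(raw)
}

func main() {
	weights := []float64{0.7, -1.2, 0.1, -0.4, 2.0, -3.3, 0.0, -0.9}
	packed := packSigns(weights)
	fmt.Println(unpackSigns(packed, len(weights)))
	fmt.Printf("%.4f%% smaller\n", 100*reduction(1024))
}
```

For 1,024 weights this is 8,192 bytes raw versus 128 bytes packed. Only the sign survives, so a scheme like this trades accuracy for size and is not lossless.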
Target propagation
An alternative to backpropagation where each layer is given a direct target. More biologically plausible and works for non-differentiable layers.
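The general idea can be sketched on a two-layer scalar linear network, y = w2 · (w1 · x). In this toy version the output target is nudged toward the label, the hidden target is obtained by exactly inverting the second layer, and each layer then updates locally toward its own target. Practical target propagation usually relies on learned approximate inverses, and nothing here reflects Loom's implementation; all names are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// trainStep performs one target-propagation update and returns the new weights.
// It assumes w2 is non-zero so that layer 2 can be inverted.
func trainStep(w1, w2, x, t, lr float64) (float64, float64) {
	h := w1 * x // layer 1 output
	y := w2 * h // layer 2 output

	yTarget := y + 0.5*(t-y) // output target: move halfway toward the label
	hTarget := yTarget / w2  // hidden target: exact inverse of layer 2

	// Each layer reduces the gap to its own target using only local values.
	w2 += lr * (yTarget - y) * h
	w1 += lr * (hTarget - h) * x
	return w1, w2
}

// train runs a number of steps and returns the final network output.
func train(steps int, x, t float64) float64 {
	w1, w2 := 0.5, 0.5
	for i := 0; i < steps; i++ {
		w1, w2 = trainStep(w1, w2, x, t, 0.1)
	}
	return w2 * w1 * x
}

func main() {
	x, t := 1.0, 2.0
	fmt.Printf("error before: %.4f\n", math.Abs(t-train(0, x, t)))
	fmt.Printf("error after:  %.6f\n", math.Abs(t-train(200, x, t)))
}
```

No gradient is passed backward through the network as a whole: the first layer only ever sees its own input, output, and target.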
Step mesh engine
Clock-cycle 3D grid with double-buffered layers, spatial remote links, BPTT, and neural target propagation — online learning without a rigid layer stack.
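Double buffering is what lets every cell in a grid advance on the same clock tick without seeing half-updated neighbours. The Go sketch below shows the mechanism on a small 3D grid of scalar states; the `Grid` type and its update rule (each cell copies its +x neighbour) are hypothetical and far simpler than a real step mesh.

```go
package main

import "fmt"

// Grid is a hypothetical double-buffered 3D grid of scalar cell states.
type Grid struct {
	X, Y, Z   int
	cur, next []float64
}

func NewGrid(x, y, z int) *Grid {
	n := x * y * z
	return &Grid{X: x, Y: y, Z: z, cur: make([]float64, n), next: make([]float64, n)}
}

// index flattens (x, y, z) into the backing slice.
func (g *Grid) index(x, y, z int) int { return (z*g.Y+y)*g.X + x }

func (g *Grid) Get(x, y, z int) float64    { return g.cur[g.index(x, y, z)] }
func (g *Grid) Set(x, y, z int, v float64) { g.cur[g.index(x, y, z)] = v }

// Step advances one clock cycle: every cell reads its +x neighbour (wrapping)
// from the current buffer and writes into the next buffer, then the buffers swap.
func (g *Grid) Step() {
	for z := 0; z < g.Z; z++ {
		for y := 0; y < g.Y; y++ {
			for x := 0; x < g.X; x++ {
				g.next[g.index(x, y, z)] = g.cur[g.index((x+1)%g.X, y, z)]
			}
		}
	}
	g.cur, g.next = g.next, g.cur
}

func main() {
	g := NewGrid(2, 2, 2)
	g.Set(0, 0, 0, 1)
	g.Set(1, 0, 0, 2)
	g.Step()
	fmt.Println(g.Get(0, 0, 0), g.Get(1, 0, 0)) // prints "2 1"
}
```

With a single buffer, the second cell would read the value the first cell had just overwritten, and the two cells would end up equal instead of swapped.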
BitNet & low-bit CPU
BitNet b1.58–style checkpoints with packed ternary linear layers. Lucy pulls from Hugging Face; welvet C-ABI exposes CPU inference paths.
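As a rough illustration of what a packed ternary linear layer involves, the Go sketch below stores weights from {-1, 0, +1} at two bits each (four per byte) and computes a dot product using only additions and subtractions. The bit encoding and function names are assumptions chosen for clarity; they are not the layout used by BitNet checkpoints or by Loom.

```go
package main

import "fmt"

// packTernary stores weights from {-1, 0, +1} at 2 bits each, four per byte.
// Encoding (an illustrative choice): 0 -> 00, +1 -> 01, -1 -> 10.
func packTernary(w []int8) []byte {
	out := make([]byte, (len(w)+3)/4)
	for i, v := range w {
		var code byte
		switch v {
		case 1:
			code = 1
		case -1:
			code = 2
		}
		out[i/4] |= code << (2 * (i % 4))
	}
	return out
}

// unpackTernary recovers the first n ternary weights.
func unpackTernary(p []byte, n int) []int8 {
	out := make([]int8, n)
	for i := range out {
		switch (p[i/4] >> (2 * (i % 4))) & 3 {
		case 1:
			out[i] = 1
		case 2:
			out[i] = -1
		}
	}
	return out
}

// dotTernary computes sum(w[i] * x[i]) without any multiplications.
func dotTernary(p []byte, x []float32) float32 {
	var sum float32
	for i, xi := range x {
		switch (p[i/4] >> (2 * (i % 4))) & 3 {
		case 1:
			sum += xi
		case 2:
			sum -= xi
		}
	}
	return sum
}

func main() {
	w := []int8{1, 0, -1, 1, -1}
	x := []float32{0.5, 2, 1.5, -1, 3}
	p := packTernary(w)
	fmt.Println(len(p), unpackTernary(p, len(w)), dotTernary(p, x))
}
```

Five weights fit in two bytes here, and the inner loop never multiplies, which is the property that makes ternary layers attractive on CPUs.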
Operation mesh
Donate Compute (LAN TCP model sharing), TANHI UDP layer telemetry for SoulGlitch HUD, tiled forward/backward, and Qwen3-family HF ingest.
Full documentation
Deployment guide
BitNet CPU
Watch It Work
See Loom In Action
Real demos — Loom models running in real time, live TANHI telemetry to SoulGlitch on your phone, benchmarks, and 3D visualization.
Loom × SoulGlitch · live
TANHI × Regional Mix — models on your PC, view on your phone
Watch Loom AI models run in real time on a regional_mix harness (Dense, MHA, SwiGLU, RNN, LSTM with remote links across 3D topologies).
Execution streams over UDP as TANHI telemetry into SoulGlitch on your local phone — a spatial, time-scrubbable trace instead of numbers in a terminal.
TANHI docs → · YouTube →
Performance Benchmark
Forget Llama.cpp: WebGPU Inference in Pure Go
SmolLM2-135M benchmarks: 68 tok/s on GTX 1650 Super, 143 tok/s on Linux i5, 229 tok/s on Mac M4.
Zero CGO. FlashPoly Tiling. Bit-level deterministic across OS boundaries.
Visualization
Loom: Visualizing 3D Neural Networks in Real-Time
Watch the AI "think" in real-time. Stepping mode, 3D grid topology, Zig-Zag and Starburst routing patterns — the black box, opened.
Android · Airplane Mode
Offline LLM Inference on Android via Loom AI
Loom v0.0.8 running 100% locally on Android — device locked in Airplane Mode throughout. Zero cloud dependency. Pure on-device compute from first principles in Go.
Open Source Tool
NeuralWave: 3D Neural Network Visualization & Weight Analysis
Real-time model discovery from HuggingFace, interactive 3D layer inspection, attention head visualization.
Built on Loom + Go backend + Three.js.
Star Loom on GitHub
Loom is free, open-source, and built in the open. Stars help others find it and fuel continued development.
Star openfluke/loom
Report an Issue
