The dominant theory of how to build large language models assumes their weights need to be expressed in floating-point precision. We don't think that's true.
Neural networks appear to compute through structures that are fundamentally discrete: combinations of signs, gates, and routing decisions that current training methods happen to encode in continuous representations. A decade of work on extreme compression has been chipping away at the precision assumption from multiple directions, and the gap between compressed and full-precision capability has been closing faster than the field acknowledges. The pieces are scattered across pruning, low-bit quantization, and recent work on training natively in compressed weight spaces. No single team has put them together at the scale where the question actually gets answered.
We're a research team training ternary-weight language models (each weight takes one of just three values) from scratch at scale. The goal is to show that frontier-grade capability does not require frontier-grade precision. If we're right, the implications reach across training, deployment, and the long-term shape of AI hardware. If we're wrong, the way the approach fails tells the field where the precision-capability frontier actually sits, which is also a useful answer.
AI capability is increasingly bottlenecked not by ideas but by access to compute and energy, and those constraints fall unevenly across the world. The current trajectory concentrates frontier capability in a small number of actors with the budget to operate enormous fleets of high-precision accelerators. Removing the precision tax on neural network weights is one of the few levers that changes that equation at the algorithmic layer, without waiting on a generation of new silicon. The work compounds with hardware progress without depending on it.
værn is a small team in Europe. We work across training methods, model architecture, and inference systems, and we're hiring across all three.
Want to be part of it? Write to us.