PrismML Bonsai Image 4B Runs Image Gen on iPhone


PrismML squeezed a 4B-class diffusion model down to under a gigabyte, and it now runs on a phone. The Pasadena startup announced Bonsai Image 4B on Tuesday, a compressed image generator built on Black Forest Labs' FLUX.2 Klein 4B. It comes in two flavors, 1-bit and ternary.

The 1-bit variant cuts the diffusion transformer to 0.93 GB, which PrismML pegs at an 8.3x reduction from full precision. The ternary version lands at 1.21 GB, a 6.4x reduction. The company says the compressed models retain up to 95% of the original model's quality, though that figure is self-reported with no independent benchmark behind it yet.
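A quick sanity check on those numbers: the sketch below multiplies each company-reported size by its company-reported reduction ratio to see what full-precision size the two figures imply. The helper names are made up for illustration, and the 16-bit baseline used for the bits-per-weight estimate is an assumption, not something PrismML states here.

```python
# Back-of-the-envelope check on PrismML's reported sizes and ratios.
# Inputs are the company's own figures; nothing here is measured.

def implied_full_size_gb(compressed_gb: float, reduction: float) -> float:
    """Full-precision size implied by a compressed size and its reduction ratio."""
    return compressed_gb * reduction


def implied_bits_per_weight(reduction: float, baseline_bits: float = 16.0) -> float:
    """Average bits per weight, ASSUMING a 16-bit full-precision baseline."""
    return baseline_bits / reduction


one_bit = implied_full_size_gb(0.93, 8.3)   # 1-bit variant
ternary = implied_full_size_gb(1.21, 6.4)   # ternary variant

print(f"1-bit implies   {one_bit:.2f} GB at full precision")
print(f"ternary implies {ternary:.2f} GB at full precision")
print(f"1-bit average:   {implied_bits_per_weight(8.3):.2f} bits/weight (assumed 16-bit baseline)")
print(f"ternary average: {implied_bits_per_weight(6.4):.2f} bits/weight (assumed 16-bit baseline)")
```

Both variants point to a full-precision transformer of roughly 7.7 GB, so the two reported ratios are at least consistent with each other. Under the assumed 16-bit baseline, the averages come out at about 1.9 and 2.5 bits per weight, which is a rough estimate only and says nothing about how the bits are actually distributed across the model.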

"Local image generation is the next major milestone for creative AI," per CEO Babak Hassibi, a Caltech professor. Standard launch framing. The harder numbers: a 512x512 image takes about 9.4 seconds on an iPhone 17 Pro Max, and roughly 6 seconds on an M4 Pro Mac, where PrismML claims generation runs up to 5.6x faster than the uncompressed pipeline.

Weights and code ship under Apache 2.0. There's a WebGPU browser demo, the full model collection on Hugging Face, and an iOS app called Bonsai Studio. PrismML didn't spell out how small the whole stack gets once the text encoder is bundled in, which matters because that piece compresses far less than the transformer.

Bottom Line

PrismML's 1-bit Bonsai Image 4B compresses the diffusion transformer to 0.93 GB and generates a 512x512 image in about 9.4 seconds on an iPhone 17 Pro Max, per the company.

Quick Facts

1-bit diffusion transformer: 0.93 GB, an 8.3x reduction (company-reported)
Ternary variant: 1.21 GB, a 6.4x reduction (company-reported)
512x512 image in about 9.4 seconds on iPhone 17 Pro Max
About 6 seconds on Mac M4 Pro; up to 5.6x faster than full-precision pipeline (company-reported)
Retains up to 95% of full-precision quality (company-reported)
Announced May 26, 2026; open weights under Apache 2.0

Tags: PrismML, image generation, model quantization, on-device AI, FLUX.2 Klein, open weights, diffusion models

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.
