
PrismML Shrinks a 4B Image Model to Run on iPhones

Quantized FLUX.2 Klein transformer drops to 0.93 GB and runs locally on phones and browsers.

Andrés Martínez
AI Content Writer
May 27, 2026

PrismML squeezed a 4B-class diffusion model down to under a gigabyte, and it now runs on a phone. The Pasadena startup announced Bonsai Image 4B on Tuesday, a compressed image generator built on Black Forest Labs' FLUX.2 Klein 4B. It comes in two flavors, 1-bit and ternary.

The 1-bit variant cuts the diffusion transformer to 0.93 GB, which PrismML pegs at an 8.3x reduction from full precision. The ternary version lands at 1.21 GB, a 6.4x reduction. The company says the compressed models retain up to 95% of the original quality, though that figure is self-reported, with no independent benchmark behind it yet.
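
The reported sizes can be sanity-checked with a little arithmetic. The sketch below is not PrismML's method, just a back-of-the-envelope check using the 0.93 GB and 1.21 GB figures, the 8.3x ratio, and the 6.4x ternary ratio listed in the Quick Facts. It assumes decimal gigabytes and roughly 4 billion weights in the transformer, neither of which the company has confirmed; the function names are illustrative.

```python
# Back-of-the-envelope check of the company-reported transformer sizes.
# Assumptions (not confirmed by PrismML): sizes are decimal gigabytes
# (1 GB = 1e9 bytes) and the transformer has about 4e9 weights.

GB = 1e9
ASSUMED_PARAMS = 4e9


def implied_full_precision_gb(compressed_gb: float, reduction: float) -> float:
    """Uncompressed size that the reported reduction ratio implies."""
    return compressed_gb * reduction


def effective_bits_per_weight(compressed_gb: float,
                              params: float = ASSUMED_PARAMS) -> float:
    """Average stored bits per weight, overhead included."""
    return compressed_gb * GB * 8 / params


one_bit_full = implied_full_precision_gb(0.93, 8.3)
ternary_full = implied_full_precision_gb(1.21, 6.4)
one_bit_bits = effective_bits_per_weight(0.93)
ternary_bits = effective_bits_per_weight(1.21)

print(f"Implied full-precision size (from 1-bit):   {one_bit_full:.2f} GB")
print(f"Implied full-precision size (from ternary): {ternary_full:.2f} GB")
print(f"Effective bits per weight, 1-bit variant:   {one_bit_bits:.2f}")
print(f"Effective bits per weight, ternary variant: {ternary_bits:.2f}")
```

Both ratios point to a full-precision transformer of roughly 7.7 GB, so the two reported figures are at least consistent with each other. Under the 4-billion-weight assumption, the variants work out to about 1.86 and 2.42 stored bits per weight, above the nominal 1 bit and the roughly 1.58 bits of an ideal ternary code. That gap could reflect scaling factors, layers left at higher precision, or a parameter count that differs from the assumed one.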

"Local image generation is the next major milestone for creative AI," said CEO Babak Hassibi, a Caltech professor. Standard launch framing. The harder numbers: a 512x512 image takes about 9.4 seconds on an iPhone 17 Pro Max and roughly 6 seconds on an M4 Pro Mac, where PrismML claims generation runs up to 5.6x faster than the uncompressed pipeline.

Weights and code ship under Apache 2.0. There's a WebGPU browser demo, the full model collection on Hugging Face, and an iOS app called Bonsai Studio. PrismML didn't spell out how small the whole stack gets once the text encoder is bundled in, a gap that matters because that piece compresses far less than the transformer.


Bottom Line

PrismML's 1-bit Bonsai Image 4B compresses the diffusion transformer to 0.93 GB and generates a 512x512 image in about 9.4 seconds on an iPhone 17 Pro Max, per the company.

Quick Facts

  • 1-bit diffusion transformer: 0.93 GB, an 8.3x reduction (company-reported)
  • Ternary variant: 1.21 GB, a 6.4x reduction (company-reported)
  • 512x512 image in about 9.4 seconds on iPhone 17 Pro Max (company-reported)
  • About 6 seconds on M4 Pro Mac; up to 5.6x faster than the full-precision pipeline (company-reported)
  • Retains up to 95% of full-precision quality (company-reported)
  • Announced May 26, 2026; open weights under Apache 2.0

Tags: PrismML, image generation, model quantization, on-device AI, FLUX.2 Klein, open weights, diffusion models