How We Shipped a 13 MB Image Converter That Runs Entirely in Your Browser

A technical deep-dive into building SciZone: compiling ImageMagick, libwebp, libheif, and exiv2 to 13 MB of WebAssembly, adaptive PSNR/SSIM quality search, and a multi-worker pipeline that keeps the UI at 60 fps.

Early on, we asked ourselves a simple question: does this converter actually need a server?

The server would receive images, run a C++ library against them, and send results back. The hard part — the actual codec work — happens in a native library that runs just as well in WebAssembly as it does in a server process. So we removed the server. Everything runs in the user’s browser tab. No upload, no queue, no round-trip.

This post is the story of how we got there — what we compiled, the quality problem we had to solve, and the surprising number of ways a 1000-image batch can break.

If you want to see the result before reading about it, open scizone.dev and drop a folder of photos on the page. Come back when you’re curious how it works.

Why we didn’t go server-side

Server-side image conversion is the obvious path. sharp, ImageMagick, libvips — mature, well-tested, trivial to deploy. Write a small service, expose a POST endpoint, done.

The problems surface when people actually use it:

Bandwidth is real money. 1 GB in + 500 MB out, multiplied by daily traffic, gets expensive fast. This is why every free converter caps batch sizes — not a technical limitation, a cost containment measure.

Upload time dominates the experience. On a home connection, uploading 100 photos takes longer than converting them. Your users spend most of their time watching an upload progress bar before anything useful happens.

Privacy is a promise you can’t prove. Even a well-intentioned service can’t guarantee that files didn’t end up in a crash log or a backup snapshot. “We don’t store your files” is a policy, not something verifiable from the outside.

You become the bottleneck. Scale is your problem. One slow batch degrades the experience for everyone.

Running client-side fixes all four: zero bandwidth cost, no upload wait, privacy that’s verifiable in DevTools, and the user’s own CPU doing the work.

What’s in the 13 MB binary

Getting a real image converter into the browser meant compiling the native codec stack with Emscripten. Here’s what ended up in the binary:

  • ImageMagick 7.1.2 — decodes basically every image format ever made: JPEG, PNG, TIFF, GIF, BMP, PSD, HEIF, and more
  • libwebp 1.6.0 — the WebP encoder, with SIMD paths enabled
  • libheif 1.21.2 + libde265 1.0.18 — HEIC/HEIF decoding; libde265 is the same HEVC decoder Apple uses in macOS Preview
  • libjpeg-turbo 3.1.4 — faster JPEG decoding than ImageMagick’s built-in path
  • libpng, giflib, libtiff — format-specific decoders for quality-critical paths
  • exiv2 0.28.8 — reads and writes EXIF, IPTC, and XMP metadata
  • libaom 3.13.3 — pulled in transitively via libheif; also available for future AVIF output

Plus supporting libraries: libzstd, libdeflate, brotli, libxml2, and a few others.

The total compresses to about 5 MB over the wire, cached aggressively by a Service Worker so it loads once per device. After that, it’s there even offline.

The quality problem with fixed-quality converters

Most converters give you a quality slider. You pick 80, that gets applied to everything.

This is a poor default because image complexity varies enormously. Quality 80 on a flat logo wastes bytes — you could cut the file in half with no visible change. Quality 80 on a high-detail portrait introduces artifacts — you need 87 to stay above the “looks the same” threshold.

Our approach is to search for the right quality per image automatically. We target two perceptual metrics:

  • PSNR ≥ 44.5 dB: above this, differences from the original are imperceptible on natural photographs
  • SSIM ≥ 0.95: above this, edges, textures, and gradients are preserved intact
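PSNR is cheap to compute directly. A minimal sketch for 8-bit samples (our own illustration, not the shipped code — the real pipeline measures it on the encoded block):

```typescript
// PSNR between two same-length 8-bit buffers; Infinity for identical inputs.
// MAX² is 255² for 8-bit samples.
function psnr(a: Uint8Array, b: Uint8Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d; // accumulate squared error
  }
  const mse = sum / a.length;
  return mse === 0 ? Infinity : 10 * Math.log10((255 * 255) / mse);
}
```

An MSE of 1 — every sample off by one level — works out to about 48.1 dB, comfortably above the 44.5 dB threshold. SSIM is considerably more involved (windowed means, variances, and covariances) and is best taken from an existing implementation.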

The algorithm:

  1. Find the hardest part of the image. Locate the highest-entropy region — the block with the most complex detail. This is the worst case for compression quality.
  2. Binary search the quality setting. Encode that block at various quality levels, measuring PSNR and SSIM, until we find the lowest setting that passes both thresholds.
  3. Full encode at the found quality. Run a final libwebp encode of the whole image, then use exiv2 to copy EXIF and ICC color data onto the output.

The overhead is about 1.2–1.5× a single encode. The payoff: every output file is at the optimal size for its content — no wasted bytes on simple images, no artifacts on complex ones.
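Step 2 is a plain binary search over the quality scale. A sketch, assuming the metrics are roughly monotonic in quality; the `measure` callback is a stand-in for encoding the high-entropy block and comparing it against the original:

```typescript
// Thresholds from the text.
const PSNR_MIN = 44.5;
const SSIM_MIN = 0.95;

type Metric = (q: number) => { psnr: number; ssim: number };

// Lowest quality in [1, 100] that passes both thresholds.
// passed: false means even q=100 missed the target (surfaced as a warning).
function findLowestPassingQuality(measure: Metric): { quality: number; passed: boolean } {
  let lo = 1, hi = 100, best = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const { psnr, ssim } = measure(mid);
    if (psnr >= PSNR_MIN && ssim >= SSIM_MIN) {
      best = mid;   // passes — try to go lower
      hi = mid - 1;
    } else {
      lo = mid + 1; // fails — need more quality
    }
  }
  return best === -1 ? { quality: 100, passed: false } : { quality: best, passed: true };
}
```

The `passed: false` branch is what drives the quality warnings described later: some inputs can’t hit the thresholds at any setting.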

The worker pool

A single WebAssembly instance can only use one CPU core, and running it on the main thread freezes the UI. Neither is acceptable.

The fix is Web Workers. SciZone spawns one worker per logical CPU core (navigator.hardwareConcurrency), and each worker runs its own full copy of the WASM converter. Images are distributed across workers as they come in; results come back via postMessage.

A few things that bit us along the way:

Memory grows over time. Emscripten’s heap can grow on demand but never shrinks, so fragmentation on a long-running worker behaves like a slow leak. Solution: recycle each worker after 32 jobs. Cold start is ~100 ms with the WASM module cached, cheap enough to do freely.
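The recycle policy is simple bookkeeping. A synchronous sketch in which `Handle` and `spawn` are hypothetical stand-ins for the real Worker-plus-WASM plumbing:

```typescript
const JOBS_PER_WORKER = 32; // recycle threshold from the text

interface Handle {
  jobsDone: number;
  dispose(): void; // terminate() in the real code
}

// One slot in the pool: runs jobs on a handle, replaces it every N jobs.
class RecyclingSlot<H extends Handle> {
  private handle: H;
  constructor(private spawn: () => H) {
    this.handle = spawn();
  }
  run<T>(job: (h: H) => T): T {
    const result = job(this.handle);
    if (++this.handle.jobsDone >= JOBS_PER_WORKER) {
      this.handle.dispose();      // drop the leaky heap
      this.handle = this.spawn(); // fresh worker, ~100 ms cold start
    }
    return result;
  }
}
```

The real pool holds one slot per `navigator.hardwareConcurrency` and the jobs are asynchronous, but the counting logic is the same.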

Transferable buffers are essential. Passing an ArrayBuffer with the transfer list moves it to the worker without copying. For a 50 MB TIFF, the difference between copy and transfer is the difference between a smooth UI and a stalled one.
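The difference is observable from the sender’s side: a transferred buffer is detached, not copied. The same snippet runs in browsers and in modern Node:

```typescript
// Moving an ArrayBuffer via the transfer list detaches it from the sender.
const { port1, port2 } = new MessageChannel();
const buf = new ArrayBuffer(50 * 1024 * 1024); // pretend this is a 50 MB TIFF
port1.postMessage(buf, [buf]);                 // second argument: transfer, don't copy
console.log(buf.byteLength);                   // 0 — the sender's view is now empty
port1.close();
port2.close();
```

The same transfer-list argument works on `Worker.prototype.postMessage`, which is what the pipeline actually uses.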

IndexedDB as a safety net. Conversion results go into IndexedDB before the ZIP builder. If the tab crashes mid-batch, everything that finished is recoverable.

OffscreenCanvas for previews. Thumbnail generation happens in the worker, so the main thread never touches raw pixel buffers — it just receives a rendered thumbnail.

What breaks at 1000+ images

Running large batches surfaced problems that don’t appear in demos:

Memory at the extremes. Very large inputs — 200+ MB TIFF scans — can push a worker’s heap toward the 4 GB per-instance limit. We added a pre-check: if the estimated decoded size is above a threshold, the file routes to a fresh “big file” worker.
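The pre-check itself is just arithmetic on the header dimensions. A sketch with an assumed threshold — not the shipped value:

```typescript
// Decoded RGBA is 4 bytes per pixel; route anything that would decode
// past the threshold to a dedicated "big file" worker.
const BIG_FILE_BYTES = 1 << 30; // 1 GiB decoded (illustrative threshold)

function needsBigFileWorker(width: number, height: number): boolean {
  return width * height * 4 > BIG_FILE_BYTES;
}
```

Reading width and height from the file header is cheap for every format involved, so the check costs nothing compared to a decode.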

ZIP streaming. Building a 10 GB ZIP in memory works until it doesn’t. We stream ZIP entries out as each file completes, so the browser’s save dialog opens long before the last image finishes.
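Streaming is possible because of how ZIP is laid out: each entry is a local header plus data, positioned only by what was already written, and the central directory — the one structure that needs every offset — goes at the very end. A bookkeeping sketch (the 30-byte fixed local-header size is from the ZIP spec; compression, CRC, and extra fields are omitted):

```typescript
interface Entry {
  name: string;
  offset: number; // where this entry's local header starts
  size: number;
}

// Tracks where each streamed entry lands, so the central directory
// can be written last without buffering any file data.
class ZipOffsets {
  private written = 0;
  readonly entries: Entry[] = [];

  // Record one finished file as it is streamed out.
  add(name: string, dataSize: number, headerSize = 30 + name.length): Entry {
    const e = { name, offset: this.written, size: dataSize };
    this.written += headerSize + dataSize;
    this.entries.push(e);
    return e;
  }

  get centralDirectoryOffset(): number {
    return this.written;
  }
}
```

Each completed conversion appends one entry and flushes its bytes immediately; only this small offset table stays in memory until the end.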

Cancellation. Cancelling a batch has to work without corrupting a worker that’s mid-encode. Our approach: terminate and recreate the worker. Simple, and the restart is fast enough to be unnoticeable.

Quality warnings. Some images can’t hit the PSNR/SSIM thresholds even at maximum quality — typically very high-ISO noise or heavily pre-compressed files. We surface a visible warning rather than silently shipping a file that missed the target.

Building it all

The Emscripten build pipeline isn’t trivial, but it’s more manageable than it sounds.

build_deps.sh compiles each third-party library via Emscripten’s emconfigure/emmake wrappers and installs them into a shared sysroot. Most libraries just work; a few need configure-flag adjustments. build.sh compiles our own C++ against the sysroot, links everything statically, and emits imgproc.js + imgproc.wasm.
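A representative per-dependency step looks roughly like this — paths and flags are illustrative, not our actual scripts:

```shell
# Sketch of one build_deps.sh step. emconfigure/emmake wrap the usual
# configure/make so they pick up the Emscripten cross toolchain.
SYSROOT="$PWD/sysroot"

cd libwebp-1.6.0
emconfigure ./configure --prefix="$SYSROOT" \
  --disable-shared --enable-static
emmake make -j"$(nproc)"
emmake make install   # headers and .a archives land in the shared sysroot
```

Our own C++ then links against the sysroot’s static archives in build.sh, which is what produces imgproc.js and imgproc.wasm.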

For incremental updates, scripts/rebuild_dep.sh rebuilds a single dependency in a few minutes rather than the full ~15-minute clean build. A CI check flags any size regression above 300 KB so we notice when an upstream bump starts pulling in bloat.

Two optimization wins we came back to: re-enabling libwebp’s SIMD paths cut WebP encode time by 14.5%. Enabling libde265’s SSE4.1 backend cut HEIC decode time by 18%. Both had been disabled early for simplicity.

What’s coming next

A few things on the roadmap we’re genuinely excited about:

AVIF output. libaom is already in the binary — we just haven’t wired it into the user-facing pipeline yet. We’re waiting until the encoding time is practical for batch use (either hardware acceleration or a faster encoder).

Animated WebP from GIF and APNG inputs. The codec supports it; the pipeline doesn’t yet.

A native CLI and MCP server sharing the same C++ core — the same conversion quality available from a terminal or an LLM tool chain, not just a browser tab.

RAW support via libraw. The most-requested missing format.

The takeaway

Browsers in 2026 can run the full native image processing stack. A well-tuned WebAssembly binary isn’t meaningfully slower than a native process for most image workloads — the only real cost is the one-time ~5 MB download, which the Service Worker caches permanently after that.

If you’re building anything image-adjacent and your first instinct is “I’ll spin up a processing server,” ask whether that server actually needs to exist. Often it doesn’t.

The result is live at scizone.dev. Drop a folder of photos and watch your browser do the work.