Riverbreak: Project Case Study

A browser-first system for finding typographic rivers in justified text, then using those detections to choose better line breaks.
Typographic rivers are the vertical or diagonal whitespace channels that appear when justified text stretches gaps into accidental paths. They are easy for a reader to notice and hard for a layout engine to reason about because they emerge across lines, not inside a single word or sentence.
I built Riverbreak as a complete research-to-demo loop: a Knuth-Plass-style browser line breaker, a real-time geometric detector, synthetic weak-label data generation, a small U-Net segmentation model, hand-labeled validation and test sets, ONNX browser inference, and a detector-guided reranker that compares baseline and optimized paragraph layouts.
Justified text layout is usually optimized around line badness: avoid awkward spacing, avoid extreme stretch, and choose a visually balanced set of line breaks. That misses a cross-line failure mode. A paragraph can look acceptable line by line while repeated whitespace gaps align into a visible river.
Riverbreak asks a practical question: can a detector make this failure mode inspectable enough to influence layout decisions? The project is not only a model. It is a toolchain for rendering paragraphs, detecting rivers, comparing detectors, and feeding the signal back into line-breaking choices.
Product goals

Give a user an interactive way to see where rivers form, adjust layout settings, and compare baseline text against river-aware alternatives.

Connect classic layout logic, computer vision, annotated evaluation, and browser inference without hiding which parts are proven and which parts are still prototype-grade.
I split the project into four surfaces so each part could be tested and explained independently: the browser layout demo, the heuristic detector, the neural detector, and the reranking prototype.
The demo implements a Knuth-Plass-style paragraph layout simulator with beam search. It renders justified lines with explicit per-line spacing, which makes gap geometry available for inspection instead of leaving it implicit in browser text layout.
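To make the mechanics concrete, here is a minimal sketch of badness-driven beam search over measured word widths. It illustrates the algorithm family, not the demo's actual browser code; names like badness and beam_width are assumptions, and unlike full Knuth-Plass this version also penalizes looseness on the final line.

```python
import heapq

# Minimal sketch of beam-search line breaking over measured word widths.
# Names are illustrative; a real Knuth-Plass breaker treats the final
# line specially and tracks richer per-line spacing state.

def badness(word_widths, space_width, line_width):
    """Squared leftover space for a justified line; inf if overfull."""
    natural = sum(word_widths) + space_width * (len(word_widths) - 1)
    slack = line_width - natural
    return float("inf") if slack < 0 else slack * slack

def break_lines(word_widths, space_width, line_width, beam_width=8):
    # A beam state is (total_demerits, next_word_index, break_positions).
    beams = [(0.0, 0, ())]
    done = []
    while beams:
        candidates = []
        for demerits, i, breaks in beams:
            for j in range(i + 1, len(word_widths) + 1):
                b = badness(word_widths[i:j], space_width, line_width)
                if b == float("inf"):
                    break  # adding more words only makes the line fuller
                state = (demerits + b, j, breaks + (j,))
                (done if j == len(word_widths) else candidates).append(state)
        beams = heapq.nsmallest(beam_width, candidates)  # prune to the beam
    return min(done)[2]  # word indices where each line ends

print(break_lines([3, 4, 2, 5, 3, 4], space_width=1, line_width=10))  # (2, 4, 6)
```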
The first detector traces inter-line whitespace corridors using gap-center alignment, diagonal drift tolerance, chain length, and confidence scoring. It is fast enough to update interactively while users change width, tolerance, and river-weight settings.
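A hedged sketch of the chaining idea: gaps on adjacent lines are linked when their centers align within a drift tolerance, and chains that grow long enough become river candidates with a simple confidence score. The Gap type and the drift_px, min_len, and confidence details are illustrative, not the demo's real data model.

```python
from dataclasses import dataclass

# Illustrative gap-chaining sketch; field names are assumptions.

@dataclass
class Gap:
    line: int      # line index, top to bottom
    center: float  # horizontal center of the inter-word gap
    width: float   # gap width in the same units

def chain_rivers(gaps_by_line, drift_px=6.0, min_len=3):
    """Greedily link gaps on adjacent lines whose centers drift <= drift_px."""
    finished, open_chains = [], []
    for line_gaps in gaps_by_line:
        next_open = []
        for chain in open_chains:
            tail = chain[-1]
            near = [g for g in line_gaps if abs(g.center - tail.center) <= drift_px]
            if near:
                best = min(near, key=lambda g: abs(g.center - tail.center))
                next_open.append(chain + [best])
            elif len(chain) >= min_len:
                finished.append(chain)  # chain ended; keep it if long enough
        next_open.extend([g] for g in line_gaps)  # any gap may start a river
        open_chains = next_open
    finished.extend(c for c in open_chains if c and len(c) >= min_len)
    return finished

def confidence(chain):
    # Longer, wider chains look more river-like; the scale here is arbitrary.
    mean_width = sum(g.width for g in chain) / len(chain)
    return min(1.0, len(chain) / 6.0) * mean_width
```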
The ML path uses a small U-Net trained as a binary segmentation model. It takes a 256 x 256 grayscale paragraph image and returns a per-pixel probability map for river-like regions.
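For orientation, here is a tiny PyTorch U-Net in the same spirit. The depth and channel counts are assumptions, so treat this as a shape-compatible sketch rather than the trained SmallUNet architecture.

```python
import torch
import torch.nn as nn

# Tiny two-level U-Net sketch: (B, 1, 256, 256) in, probability map out.

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = block(1, 16)
        self.enc2 = block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.mid = block(32, 64)
        self.up2 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec2 = block(64, 32)
        self.up1 = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = block(32, 16)
        self.head = nn.Conv2d(16, 1, 1)  # per-pixel river logit

    def forward(self, x):
        e1 = self.enc1(x)                 # 16 x 256 x 256
        e2 = self.enc2(self.pool(e1))     # 32 x 128 x 128
        m = self.mid(self.pool(e2))       # 64 x 64 x 64
        d2 = self.dec2(torch.cat([self.up2(m), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.sigmoid(self.head(d1))  # probabilities, same size as input

print(TinyUNet()(torch.rand(1, 1, 256, 256)).shape)  # torch.Size([1, 1, 256, 256])
```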
The reranker generates bounded candidate layouts, scores each one with a combined classical and river-severity objective, then shows baseline and optimized paragraphs side by side.
I started with geometry because the browser demo needed instant feedback. A model that takes a second to run is useful for analysis, but it is the wrong default for dragging a slider. The heuristic detector gave the project a fast baseline and a way to generate weak labels for early model training.
The neural detector is intentionally small and reproducible. I trained a focal-loss SmallUNet on 128 synthetic weak-label paragraphs, then evaluated it against separate hand-labeled validation and test images. The training signal is imperfect by design, so the annotated split is the evidence that matters.
Each sample is normalized to a 256 x 256 grayscale paragraph render. The model outputs a probability map, allowing overlay visualization and pixel-level comparison against masks.
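The normalization step is small enough to show in full. This sketch assumes a Pillow/NumPy pipeline and a hypothetical preprocess helper producing the (1, 1, 256, 256) float input the model expects.

```python
import numpy as np
from PIL import Image

# Hypothetical preprocess helper: any paragraph render becomes a float32
# grayscale array in [0, 1] with shape (1, 1, 256, 256), i.e. NCHW.

def preprocess(path):
    img = Image.open(path).convert("L").resize((256, 256), Image.BILINEAR)
    x = np.asarray(img, dtype=np.float32) / 255.0
    return x[None, None, :, :]
```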
Focal loss fits the task because positive river pixels are sparse. The model needs pressure on the difficult minority class instead of optimizing mostly for background.
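Concretely, binary focal loss down-weights easy background pixels so the sparse positives dominate the gradient. This is the standard formulation (Lin et al., 2017); the alpha and gamma values shown are common defaults, not necessarily this project's settings.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # Per-pixel BCE, then down-weight pixels the model already gets right.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)        # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```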
Synthetic weak labels are used for training. Hand-labeled validation and test sets are kept separate so evaluation is not just measuring agreement with the heuristic label generator.
The repo tracks annotations, the selected benchmark report, browser samples, and the ONNX model. Large PyTorch checkpoints and generated synthetic data are left untracked and regenerated locally.
The current best checkpoint is a focal-loss SmallUNet trained on synthetic weak labels. Threshold selection is done on validation data and then applied unchanged to the test split, so the test numbers are not tuned after the fact.
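The protocol is easy to express in code: sweep a threshold grid on the validation split, keep the best mean Dice, and reuse that threshold unchanged on test. Array names and the grid are illustrative.

```python
import numpy as np

def dice(pred, mask, eps=1e-7):
    # pred and mask are boolean pixel arrays of the same shape.
    inter = np.logical_and(pred, mask).sum()
    return (2 * inter + eps) / (pred.sum() + mask.sum() + eps)

def pick_threshold(val_probs, val_masks, grid=np.linspace(0.05, 0.95, 19)):
    # Mean validation Dice at each binarization threshold; keep the argmax.
    scores = [np.mean([dice(p > t, m) for p, m in zip(val_probs, val_masks)])
              for t in grid]
    return grid[int(np.argmax(scores))]

# threshold = pick_threshold(val_probs, val_masks)       # validation only
# test_dice = np.mean([dice(p > threshold, m)            # applied unchanged
#                      for p, m in zip(test_probs, test_masks)])
```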
Best test result: neural focal model Dice 0.5018 and IoU 0.3349, compared with heuristic baseline Dice 0.4182 and IoU 0.2644. On the same test split, neural precision is 0.4177 and recall is 0.6282.
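For reference, all four reported numbers come from the same pixel counts. A sketch assuming boolean prediction and mask arrays (and nonzero denominators):

```python
import numpy as np

# Dice = 2TP / (2TP + FP + FN); IoU = TP / (TP + FP + FN).

def pixel_metrics(pred, mask):
    tp = np.logical_and(pred, mask).sum()
    fp = np.logical_and(pred, ~mask).sum()
    fn = np.logical_and(~pred, mask).sum()
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }
```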
I exported the focal checkpoint to a single ONNX model and wired it into the browser as an optional live neural overlay. The goal was not to replace the heuristic toggle. It was to make the model inspectable on the exact paragraph the user is viewing.
The demo lazily loads the web/models/river_detector.onnx model and runs CPU WASM inference against a rasterized 256 x 256 paragraph canvas.
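A hedged sketch of the export step plus a local parity check with onnxruntime's Python bindings; the demo itself runs onnxruntime-web in the browser. It reuses the TinyUNet stand-in from the earlier sketch, and the input/output names and file path are assumptions.

```python
import numpy as np
import onnxruntime as ort
import torch

# Export the stand-in model, then sanity-check the ONNX file locally.
model = TinyUNet().eval()              # stand-in for the focal checkpoint
dummy = torch.rand(1, 1, 256, 256)
torch.onnx.export(model, dummy, "river_detector.onnx",
                  input_names=["input"], output_names=["prob_map"])

sess = ort.InferenceSession("river_detector.onnx",
                            providers=["CPUExecutionProvider"])
(onnx_out,) = sess.run(None, {"input": dummy.numpy()})
with torch.no_grad():
    torch_out = model(dummy).numpy()
assert np.allclose(onnx_out, torch_out, atol=1e-4)  # ONNX matches PyTorch
```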
The most interesting product step was turning detection into action. The reranker does not rewrite the line breaker. It generates a small set of bounded candidate layouts around the current settings, scores them, and picks the best candidate for side-by-side comparison.
Scoring objective: classical_demerits + alpha * river_severity.

Classical demerits come from the line breaker. River severity currently comes from the geometry detector, not from a live U-Net call.
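A minimal sketch of that objective in action, with illustrative stand-in types: a candidate with slightly worse classical demerits wins when it avoids river chains. Candidate and river_severity are assumptions, not the demo's real data model.

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    classical_demerits: float                          # from the line breaker
    river_chains: list = field(default_factory=list)   # geometry-detector output

def river_severity(c):
    # e.g., total chain length; a confidence-weighted sum is equally plausible
    return sum(len(chain) for chain in c.river_chains)

def rerank(candidates, alpha=1.0):
    return min(candidates,
               key=lambda c: c.classical_demerits + alpha * river_severity(c))

best = rerank([Candidate(12.0, [[1, 2, 3]]), Candidate(14.0, [])])
print(best.classical_demerits)  # 14.0: slightly worse demerits, but no rivers
```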
Riverbreak is the project on my portfolio that best shows how I connect product demos, algorithms, and ML evaluation without treating any one layer as the whole system.
The line-breaking simulator, beam search, layout demerits, and candidate reranking show comfort working with algorithmic product behavior, not only UI presentation.
The model is scoped to a concrete inspection task, measured against hand-labeled data, and documented with caveats instead of oversized claims.
ONNX export, lazy runtime loading, canvas rasterization, and local overlays turn the model into something a user can actually try in a browser.
The repo separates tracked evidence from regenerable artifacts, documents benchmark limits, and keeps lightweight verification checks for published files.
The project is intentionally transparent about what is done and what still needs work. That matters because typography quality is subjective, and the current benchmark is small.