Back in 2016 I wrote a small Java app to parse EXIF data from my photos so I could figure out which focal lengths I actually shoot at. Ten years later, I have a lot more cameras, a lot more lenses, and a photo library sitting at 1-2TB. Time to revisit the problem, this time in Rust.
The tool is called extractor-rust. Same idea as the Java version: crawl a photo library, pull focal length and lens data out of every file, and surface patterns in how I actually shoot. The difference is scale. The old library was a few thousand files from a single trip to Tokyo and Seoul. This one is a decade of shooting across multiple camera systems. I’m heading back to Tokyo and Seoul soon, and I wanted something that could show whether any of my shooting habits have changed.
Why Rust
I wanted to learn it. Python would have been faster to write and the EXIF ecosystem there is more mature. But I was curious whether Rust was practical for this kind of utility work, and a file-processing tool felt like a good way to find out. The short answer is yes, with caveats. The borrow checker will find every gap in your mental model and it will not be polite about it. I leaned on Claude Code during the trickier concurrent sections. It’s good at explaining why Rust wants you to do something a certain way rather than just handing you a fix, which made it genuinely useful for learning the language rather than just getting past errors.
Making It Fast
My first version processed files sequentially, and at TB scale that’s a real problem: 50,000+ files one at a time adds up. The stack I landed on has three libraries doing most of the work.
walkdir handles directory traversal. The API is simple, it’s memory efficient because it doesn’t load the whole tree upfront, and it handles symlinks and permission errors cleanly.
kamadak-exif does the EXIF extraction. It’s pure Rust with no C bindings, so it compiles everywhere without native dependencies. Coverage of standard tags is solid: focal length, lens model, camera body, and 35mm equivalent all come through cleanly. The rough edges show up with video files and with manufacturer MakerNote tags, which are opaque. You also encounter enough malformed EXIF in the wild that error handling ends up more verbose than you’d like.
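To give a feel for the traversal step without pulling in any crates, here is a minimal std-only sketch of the kind of work a walker like walkdir does for you. It is not the code from extractor-rust: `collect_photos`, `is_photo`, and the extension list are all made up for illustration, and it gathers every path into a `Vec` instead of yielding entries one at a time. The point it shows is the error-handling shape: an unreadable directory or entry gets counted and skipped rather than aborting the whole scan.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Extensions treated as photos. This list is an assumption for the sketch.
const PHOTO_EXTENSIONS: [&str; 4] = ["jpg", "jpeg", "cr2", "rw2"];

/// Returns true when the path has one of the extensions above (case-insensitive).
fn is_photo(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| PHOTO_EXTENSIONS.iter().any(|known| ext.eq_ignore_ascii_case(known)))
        .unwrap_or(false)
}

/// Walks `root` with an explicit stack and returns the photo paths found,
/// plus the number of directories or entries skipped because of I/O errors.
fn collect_photos(root: &Path) -> (Vec<PathBuf>, usize) {
    let mut photos = Vec::new();
    let mut skipped = 0;
    let mut pending = vec![root.to_path_buf()];

    while let Some(dir) = pending.pop() {
        // A directory we cannot read (permissions, missing) is counted and skipped.
        let entries = match fs::read_dir(&dir) {
            Ok(entries) => entries,
            Err(_) => {
                skipped += 1;
                continue;
            }
        };
        for entry in entries {
            let entry = match entry {
                Ok(entry) => entry,
                Err(_) => {
                    skipped += 1;
                    continue;
                }
            };
            let path = entry.path();
            // `file_type` does not follow symlinks, so links fall into the ignore arm.
            match entry.file_type() {
                Ok(kind) if kind.is_dir() => pending.push(path),
                Ok(kind) if kind.is_file() && is_photo(&path) => photos.push(path),
                Ok(_) => {} // symlinks and non-photo files are ignored
                Err(_) => skipped += 1,
            }
        }
    }
    (photos, skipped)
}
```

Calling `collect_photos(Path::new("/path/to/library"))` returns the list of candidate files and a skip count you can print at the end of a run. A real walker adds options this sketch leaves out, such as following symlinks and limiting depth.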
The interesting one was rayon. It’s a data parallelism library that spreads iterator work across all available CPU cores by swapping .iter() for .par_iter(). That’s basically it – no thread pool configuration, no manual work distribution. Since EXIF extraction is CPU-bound and each file is completely independent, it’s a near-perfect fit. Scan times dropped dramatically. The one caveat is that it doesn’t help when the bottleneck is the disk rather than the CPU, which can happen on slower storage.
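To make it concrete what that one-method swap saves you from, here is a rough std-only sketch of the manual version: split the work into one chunk per core, run each chunk on its own scoped thread, and stitch the results back together in order. `parallel_map` is a hypothetical helper written for this post's explanation, not part of rayon or of extractor-rust. With rayon, the whole function collapses to something like `items.par_iter().map(work).collect()` after `use rayon::prelude::*;`, and rayon balances the load with work stealing instead of the fixed chunks used here.

```rust
use std::thread;

/// Applies `work` to every item, splitting the slice into roughly one chunk
/// per available core. Results come back in the original order.
fn parallel_map<T, R, F>(items: &[T], work: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    if items.is_empty() {
        return Vec::new();
    }
    // Fall back to a single worker if the core count cannot be determined.
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    // Ceiling division so every item lands in some chunk.
    let chunk_size = (items.len() + workers - 1) / workers;
    let work = &work;

    thread::scope(|scope| {
        // Spawn every worker first; collecting here forces all threads to start
        // before any of them is joined.
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(work).collect::<Vec<R>>()))
            .collect();

        // Join in spawn order, which preserves the input order of the results.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Fixed chunks are the weak point of this approach: if one chunk happens to hold all the large RAW files, the other threads finish early and sit idle. That uneven-load case is where a work-stealing scheduler earns its keep.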
What the Data Said
Same conclusion as 2016: I love shooting wide. Running it against the full library, 24mm comes in at 17,516 shots – nearly double 35mm in second place. The grey bars below are phone focal lengths – 7.9mm and 5mm combined add up to over 12,000 shots, which was more than I expected. The long end of my zoom barely registers.

The lens data confirms it. The EF 24-70mm f/2.8L II USM accounts for 26,794 shots, more than three times the second-place LUMIX 20mm f/1.7. A few lenses I own barely show up at all, which is useful information on its own.

The camera breakdown was the most interesting output. The Canon EOS 5D Mark III dominates at 36,292 shots, which makes sense: it’s been my primary body for years. But seeing the Sony Cybershot, Sony DSC-P200, and DSC-P10 all in the top five was a reminder of how much history is in this library. Those are early-2000s point-and-shoots. The data goes back further than I was consciously thinking about.
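All three breakdowns (focal length, lens, camera) are the same aggregation underneath: count shots per key, then sort by count. A minimal sketch of that step, assuming a simple `Shot` record. The struct, its field names, `tally`, and `focal_tenths` are hypothetical and not taken from extractor-rust. Focal lengths are bucketed to tenths of a millimetre so that phone values like 7.9mm stay distinct instead of rounding up to 8.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// One photo's extracted metadata. Field names are hypothetical.
struct Shot {
    focal_length_mm: f64,
    lens: String,
    camera: String,
}

/// Counts shots per key and returns (key, count) pairs, most frequent first.
/// Ties are broken by key so the ordering is deterministic.
fn tally<K, F>(shots: &[Shot], key_of: F) -> Vec<(K, usize)>
where
    K: Hash + Eq + Ord,
    F: Fn(&Shot) -> K,
{
    let mut counts: HashMap<K, usize> = HashMap::new();
    for shot in shots {
        *counts.entry(key_of(shot)).or_insert(0) += 1;
    }
    let mut ranked: Vec<(K, usize)> = counts.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}

/// Focal length in tenths of a millimetre (24.0mm becomes 240, 7.9mm becomes 79).
/// Floats cannot be HashMap keys, so the value is rounded to an integer bucket.
fn focal_tenths(shot: &Shot) -> u32 {
    (shot.focal_length_mm * 10.0).round() as u32
}
```

With this in place, `tally(&shots, focal_tenths)` gives the focal length ranking, while `tally(&shots, |s| s.lens.clone())` and `tally(&shots, |s| s.camera.clone())` give the lens and camera rankings.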

Ten years ago I ran this analysis on a few thousand photos from one trip. Running it on a decade of shooting across every camera I’ve owned tells a more complete story: not just what I’m reaching for now, but where I started and how my shooting has evolved. The old Sony point-and-shoots showing up in the camera list weren’t just a curiosity. They were a reminder that this library spans a lot of life. The 5D Mark III pulling 36,000 shots tells you everything about which camera actually stuck. And 24mm dominating across all of it is the one thing that hasn’t changed.
Code is up on GitHub if you want to run it against your own library.




