Documentation
File Hunter is a self-hosted, web-based file cataloging and deduplication tool. It catalogs files across transient storage locations — USB drives, network shares, DVD-ROMs, local directories — so they can be searched, compared, and deduplicated even when the media is disconnected.
Getting started
If you’re new to File Hunter, start with the installation guide to get up and running, then read the usage guide for a tour of features.
Key concepts
Locations
A location is any folder registered with File Hunter — a USB drive mount point, a network share, a local directory. Each location gets its own entry in the navigation tree and can be scanned independently. When the underlying media is disconnected, the location goes offline but its catalog remains fully browsable and searchable.
Scanning
Scanning walks a location’s directory tree, recording every file and computing hashes for duplicate detection. File Hunter uses a three-gate hashing strategy: file size as a pre-filter, an xxHash64 partial hash (first 64 KB) for speed, and a SHA-256 full hash for certainty. Only files that pass each gate advance to the next, so a file is read in full only when its size and partial hash both match another file's, minimising disk I/O.
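The three gates can be illustrated with a short Python sketch. This is not File Hunter's actual code: the function names (`partial_hash`, `full_hash`, `find_duplicates`) are hypothetical, and the sketch assumes the third-party `xxhash` package for xxHash64, falling back to a clearly labelled stand-in hash when it is not installed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

try:
    import xxhash  # third-party package, assumed here as the xxHash64 provider

    def partial_digest(data: bytes) -> str:
        return xxhash.xxh64(data).hexdigest()
except ImportError:
    # Stand-in so the sketch runs without the dependency; this is NOT xxHash64.
    def partial_digest(data: bytes) -> str:
        return hashlib.blake2b(data, digest_size=8).hexdigest()

PARTIAL_BYTES = 64 * 1024  # the "first 64 KB" described above


def partial_hash(path: Path) -> str:
    """Gate 2: hash only the first 64 KB of the file."""
    with path.open("rb") as f:
        return partial_digest(f.read(PARTIAL_BYTES))


def full_hash(path: Path) -> str:
    """Gate 3: SHA-256 over the whole file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def _regroup(groups, key):
    """Split each group by key(path); keep only subgroups with 2+ files."""
    out = []
    for group in groups:
        buckets = defaultdict(list)
        for p in group:
            buckets[key(p)].append(p)
        out.extend(g for g in buckets.values() if len(g) > 1)
    return out


def find_duplicates(paths):
    """Return groups of paths that survive all three gates."""
    groups = [list(paths)]
    # Gate 1 is the file size; each later gate only sees the survivors.
    for gate in (lambda p: p.stat().st_size, partial_hash, full_hash):
        groups = _regroup(groups, gate)
    return groups
```

Because each gate only receives files that collided at the previous one, a file with a unique size is never opened, and a file whose first 64 KB is unique is never read to the end.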
Consolidation
Once duplicates are identified, consolidation keeps one canonical copy and replaces every duplicate with a lightweight .moved stub file. A .sources file is created alongside the canonical copy, recording every original path. Nothing is silently deleted: there is always a full audit trail.
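The consolidation flow can be sketched roughly as follows. The stub and `.sources` file formats are not specified here, so the naming scheme (`<name>.moved`, `<name>.sources`), the file contents, and the `consolidate` function itself are all assumptions made for illustration.

```python
from pathlib import Path


def consolidate(canonical: Path, duplicates: list[Path]) -> None:
    """Keep `canonical`; replace each duplicate with a `.moved` stub.

    Hypothetical sketch: file names and contents are illustrative only.
    """
    # Record every original path next to the canonical copy. This is written
    # first, so no duplicate is removed before the audit trail exists.
    sources_file = canonical.with_name(canonical.name + ".sources")
    originals = [str(canonical)] + [str(d) for d in duplicates]
    sources_file.write_text("\n".join(originals) + "\n", encoding="utf-8")

    for dup in duplicates:
        # Leave a small stub where the duplicate used to be, pointing at
        # the canonical copy, then remove the duplicate itself.
        stub = dup.with_name(dup.name + ".moved")
        stub.write_text(f"moved to: {canonical}\n", encoding="utf-8")
        dup.unlink()
```

Writing the `.sources` record before touching any duplicate is one way to make sure an interrupted run never removes a file without leaving a trace.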
The catalog
Everything is stored in a single SQLite database (file_hunter.db) in WAL mode. The catalog persists file metadata, hashes, tags, descriptions, and folder structure. It is designed to handle millions of files efficiently.
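For readers unfamiliar with WAL mode, the sketch below shows how a SQLite database is opened in write-ahead-logging mode from Python's standard `sqlite3` module. The `files` table and its columns are a hypothetical schema invented for this example, not File Hunter's real one.

```python
import sqlite3


def open_catalog(db_path: str = "file_hunter.db") -> sqlite3.Connection:
    """Open (or create) a catalog database in WAL mode."""
    conn = sqlite3.connect(db_path)
    # WAL lets readers keep querying while a writer (e.g. a scan) commits.
    conn.execute("PRAGMA journal_mode=WAL")
    # Hypothetical schema: one row per cataloged file.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS files (
            id           INTEGER PRIMARY KEY,
            location     TEXT NOT NULL,
            path         TEXT NOT NULL,
            size         INTEGER NOT NULL,
            partial_hash TEXT,
            full_hash    TEXT
        )
        """
    )
    # An index on size keeps the size pre-filter cheap on large catalogs.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_files_size ON files(size)")
    conn.commit()
    return conn
```

The journal mode is stored in the database file, so once it has been set to WAL, later connections to the same file use it as well.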
Architecture
File Hunter is a Python ASGI application built with Starlette and uvicorn. The frontend is vanilla JavaScript with no framework dependencies and no build step. Real-time updates are delivered over WebSocket. The entire application runs on a single machine with no external services required.
File Hunter Pro
File Hunter Pro adds remote agent support — install a lightweight agent on any machine and its storage appears in your catalog alongside local drives. Content streaming, remote scanning, and cross-machine deduplication are all included. The free version stays free forever; Pro builds on top of it.