The Brain Collector (TBC) is a CLI tool designed to search for machine learning model weight files on an Android device via ADB. It scans common directories for known ML model file extensions and signatures and can also extract APK files to analyze their contents for embedded ML models.
- Scan Android devices connected via ADB for ML model files
- Support various model file formats: .tflite, .onnx, .pt, .pth, .pb, .h5, .hdf5, .caffemodel, .weights, .mlmodel, .gguf, .safetensors
- Detect files based on extensions and magic byte signatures
- Extract APK files and search for embedded ML models
- Provide device information (serial number, model, manufacturer)
- Display file sizes in human-readable format
- Support local directory scanning
- Export summary to CSV
- Cleanup temporary files
- iOS device scanning
- Remote device scanning (non-ADB)
- Model analysis or decompilation
- Auto-download of models
- GUI interface
tbc [--file FILE] [--local-dir LOCAL_DIR] [--export-csv EXPORT_CSV] [--cleanup]
| Option | Type | Description |
|---|---|---|
| `--file` | str | Specify a single file to scan |
| `--local-dir` | str | Specify a local directory to scan |
| `--export-csv` | str | Filename to export summary to CSV (default: model_report.csv) |
| `--cleanup` | flag | Clean up tmp/ directory after execution |
Returns device serial number, model, and manufacturer.
Searches connected Android device for ML model files. Returns set of found files.
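As a rough illustration of the device search, the sketch below builds an `adb shell find` invocation and turns its output into a set of paths. The function names (`build_find_command`, `parse_find_output`), the reduced extension list, and the choice of `-iname` are assumptions made for this example, not TBC's actual implementation.

```python
import shlex

# Assumed subset of the extensions listed in this document.
MODEL_EXTENSIONS = (".tflite", ".onnx", ".gguf")


def build_find_command(root, extensions=MODEL_EXTENSIONS, serial=None):
    """Return an argv list that runs `find` on the device through adb (hypothetical helper)."""
    # Produces: -iname '*.tflite' -o -iname '*.onnx' ...
    name_tests = " -o ".join(f"-iname {shlex.quote('*' + ext)}" for ext in extensions)
    # The remote command is passed as one string; the device shell parses it.
    remote = f"find {shlex.quote(root)} -type f \\( {name_tests} \\) 2>/dev/null"
    argv = ["adb"]
    if serial:
        # -s selects a specific device when several are connected.
        argv += ["-s", serial]
    return argv + ["shell", remote]


def parse_find_output(text):
    """Collapse find output into a set of paths, dropping blank lines and duplicates."""
    return {line.strip() for line in text.splitlines() if line.strip()}
```

The argv list can be handed to `subprocess.run(..., capture_output=True, text=True)` and the captured stdout passed to `parse_find_output`. Redirecting stderr on the device side hides "Permission denied" noise so the scan can continue past unreadable directories.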
Scans a list of files for ML model or APK extensions.
Extracts APK and scans contents for ML models.
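Because an APK is a ZIP archive, the embedded-model check can be sketched with the standard `zipfile` module. The helper name `list_models_in_apk` is hypothetical, and this version only lists matching members by extension; it does not extract them to disk or inspect magic bytes.

```python
import zipfile
from pathlib import PurePosixPath

# Extensions taken from the supported-formats list in this document.
MODEL_EXTENSIONS = {
    ".tflite", ".onnx", ".pt", ".pth", ".pb", ".h5", ".hdf5",
    ".caffemodel", ".weights", ".mlmodel", ".gguf", ".safetensors",
}


def list_models_in_apk(apk):
    """Return sorted member names inside the APK that end in a known model extension.

    `apk` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(apk) as archive:
        return sorted(
            name
            for name in archive.namelist()
            # ZIP member names always use forward slashes, hence PurePosixPath.
            if PurePosixPath(name).suffix.lower() in MODEL_EXTENSIONS
        )
```

A corrupt or truncated APK raises `zipfile.BadZipFile`, which a caller could catch to record the file as failed and keep scanning.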
Scans a local directory for APKs and ML models.
Converts bytes to human-readable format (e.g., "1.23 MB").
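A minimal sketch of such a conversion is shown below. The function name `human_size`, the 1024-based units, and the two-decimal formatting are assumptions chosen to match the "1.23 MB" example; the real tool may round or label units differently.

```python
def human_size(num_bytes):
    """Format a byte count with two decimals and a 1024-based unit (hypothetical helper)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit available.
        if size < 1024 or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024
```

For example, `human_size(1536)` returns `"1.50 KB"`.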
Calculates MD5 hash of a file.
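The hashing step could look roughly like the following, using the 4 KB chunk size and the return-None-on-missing-file behavior described elsewhere in this document. The function name `md5_of_file` is an assumption for this example.

```python
import hashlib


def md5_of_file(path, chunk_size=4096):
    """Return the hex MD5 digest of a file, or None if the file does not exist."""
    digest = hashlib.md5()
    try:
        with open(path, "rb") as handle:
            # Read fixed-size chunks so large model files never sit fully in memory.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
    except FileNotFoundError:
        return None
    return digest.hexdigest()
```

MD5 is used here only as a fingerprint for deduplication and for comparing local and remote copies; it is not suitable as a security check.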
Returns file type based on magic bytes.
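To illustrate signature-based detection, the sketch below checks a few well-known magic values: `GGUF` at offset 0, the FlatBuffers identifier `TFL3` at offset 4 for TFLite, the 8-byte HDF5 signature, and the ZIP local-file header used by APKs and zip-based PyTorch checkpoints. The table is deliberately incomplete, and the names `SIGNATURES` and `detect_file_type` are assumptions rather than the tool's own. Formats without a fixed signature, such as ONNX, still need the extension check.

```python
# (offset, magic bytes, label) for a handful of formats; not an exhaustive table.
SIGNATURES = [
    (0, b"GGUF", "GGUF"),
    (4, b"TFL3", "TFLite"),
    (0, b"\x89HDF\r\n\x1a\n", "HDF5"),
    (0, b"PK\x03\x04", "ZIP container"),
]


def detect_file_type(path):
    """Return a label for the first matching signature, or None if nothing matches."""
    with open(path, "rb") as handle:
        # 16 bytes cover every offset + magic length in SIGNATURES.
        header = handle.read(16)
    for offset, magic, label in SIGNATURES:
        if header[offset:offset + len(magic)] == magic:
            return label
    return None
```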
Exports found models summary to CSV file.
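One way the export could be written with the standard `csv` module is sketched here. The column names and the default filename come from this document; the function name `export_csv` and the list-of-dicts input shape are assumptions for illustration.

```python
import csv

# Column order as described for the CSV output in this document.
COLUMNS = ["MD5", "Size", "File Path", "Dst File Path"]


def export_csv(rows, filename="model_report.csv"):
    """Write one CSV row per found model; `rows` is an iterable of dicts keyed by COLUMNS."""
    # newline="" lets the csv module control line endings on every platform.
    with open(filename, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    return filename
```

`csv.DictWriter` raises `ValueError` if a row contains a key outside `COLUMNS`, which helps catch schema drift early.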
- Files from Android device via ADB
- Local files from filesystem
- Console output with colored findings
- CSV file with columns: MD5, Size, File Path, Dst File Path
- TFLite (multiple versions)
- ONNX
- PyTorch
- TensorFlow .pb
- HDF5 (Keras)
- Apple CoreML
- GGUF
- SafeTensors
- No ADB device connected - Should handle gracefully with informative message
- Permission denied on device - Skip file, continue scanning
- APK extraction fails - Log to FAILED set, continue
- Duplicate files - Use FOUND set to avoid duplicates
- Large files - Stream MD5 calculation in chunks
- Empty device - No results, no crash
- Non-existent local directory - Handle with error message
- File not found during MD5 - Return None
- MD5 mismatch between local/remote APK - Re-pull the file
- Filesystem type detection - Handle non-Linux systems gracefully
- ADB commands are I/O bound - accepted as is, with no further optimization planned
- MD5 calculation uses 4KB chunks for memory efficiency
- File discovery uses the `find` command for efficiency
- Target: Python 3.11+
- No heavy third-party dependencies (only colorama, click)