Metadata-Version: 2.4
Name: media-indexer
Version: 0.2.0
Summary: Local-first image and video archive search with SQLite, FAISS, FastAPI, and Typer.
Author: OpenAI Codex
Project-URL: Homepage, https://github.com/sachin1705s/media-indexer
Project-URL: Repository, https://github.com/sachin1705s/media-indexer
Project-URL: Issues, https://github.com/sachin1705s/media-indexer/issues
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: fastapi<1.0.0,>=0.115.0
Requires-Dist: faiss-cpu>=1.8.0
Requires-Dist: httpx<1.0.0,>=0.27.0
Requires-Dist: numpy<3.0.0,>=1.26.0
Requires-Dist: pillow<12.0.0,>=10.4.0
Requires-Dist: PyYAML<7.0.0,>=6.0.2
Requires-Dist: python-multipart<1.0.0,>=0.0.9
Requires-Dist: torch<3.0.0,>=2.4.0
Requires-Dist: tqdm<5.0.0,>=4.66.0
Requires-Dist: transformers<5.0.0,>=4.46.0
Requires-Dist: typer<1.0.0,>=0.12.5
Requires-Dist: uvicorn[standard]<1.0.0,>=0.30.0
Provides-Extra: dev
Requires-Dist: pytest<9.0.0,>=8.3.0; extra == "dev"

# Media Indexer

A local-first image and video search tool for your own folders.

Point it at a folder of media, let it build a local index, then search your archive in plain English from a localhost web UI.

Examples:

```text
dragon
warm editorial portrait
sunset over water
video with a person walking on a beach
```

No cloud upload. No external APIs. Your media stays on your machine.

## What It Does

- Indexes local image and video folders recursively.
- Supports `.jpg`, `.jpeg`, `.png`, `.mp4`, `.mov`, `.m4v`, `.avi`, `.mkv`, and `.webm`.
- Creates local thumbnails for fast browsing.
- Generates local CLIP embeddings for text search and similarity search.
- Adds simple local auto-labels such as content type, style tags, and object tags.
- Uses sparse frame selection for videos so search stays useful without indexing every frame.
- Lets you search from a browser at `http://127.0.0.1:8000`.
- Lets you upload an image to find visually similar media.
- Skips unchanged files when you index the same folder again.
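
As a rough illustration of how embedding-based search can rank results, the sketch below scores stored vectors against a query vector with cosine similarity. It is not the project's implementation: the function names and the three-dimensional toy vectors are assumptions, and it uses plain Python lists rather than real CLIP embeddings or a FAISS index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def rank_by_similarity(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (path, score) pairs, best match first."""
    q = normalize(query)
    scored = []
    for path, emb in index.items():
        e = normalize(emb)
        scored.append((path, sum(a * b for a, b in zip(q, e))))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical data).
toy_index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "dragon.png": [0.0, 0.2, 0.9],
    "sunset.mp4": [0.7, 0.6, 0.1],
}
results = rank_by_similarity([1.0, 0.2, 0.0], toy_index, top_k=2)
```

The same ranking applies whether the query vector comes from a text prompt or from an uploaded image, since both are compared in one embedding space.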

## How To Use

Install locally from source:

```bash
git clone https://github.com/sachin1705s/media-indexer.git
cd media-indexer
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -e ".[dev]"
```

Video indexing requires `ffmpeg`. On macOS with Homebrew:

```bash
brew install ffmpeg
```

Run the guided workflow:

```bash
media-indexer run
```

Or run directly from source without an editable install:

```bash
PYTHONPATH=backend python -m image_archive.cli run
```

When the server starts, open:

```text
http://127.0.0.1:8000
```

## Folder Picker

Inside the terminal picker:

```text
↑ / ↓     move
Space     select or unselect a folder
Enter     open folder
→         open folder
←         go up
d         done, start indexing
q         cancel
```

## Other Commands

```bash
# Guided setup, folder selection, indexing, and optional server start
media-indexer run

# Index one folder directly
media-indexer index ~/Pictures

# Start the local web app for already-indexed media
media-indexer serve

# Show archive status
media-indexer status

# Recheck configured folders and index only changed or missing files
media-indexer reindex

# Delete local index data, thumbnails, vectors, and cached video frames
# This does not delete your original files.
media-indexer reset
```
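
To make the idea behind `reindex` concrete, here is a minimal sketch of change detection that compares a size-and-mtime fingerprint between two scans. The fingerprint choice and the names `scan` and `files_to_index` are assumptions made for illustration. The tool itself may rely on file hashes or other signals.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple

# Extensions listed under "What It Does".
SUPPORTED = {".jpg", ".jpeg", ".png", ".mp4", ".mov", ".m4v", ".avi", ".mkv", ".webm"}

# Hypothetical fingerprint: (size in bytes, modification time in nanoseconds).
Fingerprint = Tuple[int, int]


def scan(folder: Path) -> Dict[str, Fingerprint]:
    """Walk a folder recursively and fingerprint every supported media file."""
    found = {}
    for path in folder.rglob("*"):
        if path.is_file() and path.suffix.lower() in SUPPORTED:
            stat = path.stat()
            found[str(path)] = (stat.st_size, stat.st_mtime_ns)
    return found


def files_to_index(
    previous: Dict[str, Fingerprint], current: Dict[str, Fingerprint]
) -> List[str]:
    """Return paths that are new or whose fingerprint changed since the last scan."""
    return sorted(p for p, fp in current.items() if previous.get(p) != fp)


# Small demonstration in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.jpg").write_bytes(b"one")
    (root / "notes.txt").write_bytes(b"ignored")  # unsupported extension
    first = scan(root)
    (root / "sub").mkdir()
    (root / "sub" / "b.PNG").write_bytes(b"two")   # new file in a subfolder
    (root / "a.jpg").write_bytes(b"one-changed")   # size changes
    second = scan(root)
    pending = [Path(p).name for p in files_to_index(first, second)]
```

A size-and-mtime check is cheap but can miss edits that preserve both values. Hashing file contents is slower and more reliable.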

## What Gets Stored

For each indexed asset, the app stores local metadata such as:

- file path
- source path for video-backed frame hits
- file hash or stable video-frame signature
- width and height
- created and modified timestamps when available
- thumbnail path
- local search embedding
- local labels and tags
- matched timestamp for video frame assets

The original media files are not modified.
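
The fields above could be modeled roughly as follows. This is a sketch under assumptions: the `AssetRecord` class, its field names, and the choice of SHA-256 for the file hash are all hypothetical and do not describe the actual SQLite schema.

```python
import hashlib
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class AssetRecord:
    """Hypothetical record mirroring the metadata listed above."""
    file_path: str
    content_hash: str
    width: int
    height: int
    source_path: Optional[str] = None          # set for video-backed frame hits
    created_at: Optional[float] = None         # timestamps when available
    modified_at: Optional[float] = None
    thumbnail_path: Optional[str] = None
    embedding: List[float] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    matched_timestamp: Optional[float] = None  # seconds into the source video


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos are never loaded fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration with a throwaway file. The file is opened read-only for hashing.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.png"
    sample.write_bytes(b"hello")
    record = AssetRecord(
        file_path=str(sample),
        content_hash=sha256_of_file(sample),
        width=640,
        height=480,
        tags=["illustration"],
    )
```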

By default, app data is stored in your user app-data folder:

```text
macOS:   ~/Library/Application Support/media-indexer/
Linux:   ~/.local/share/media-indexer/
Windows: %LOCALAPPDATA%/media-indexer/
```
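
A minimal sketch of resolving these per-platform locations in Python is shown below. The function name `default_data_dir` and the Windows fallback used when `LOCALAPPDATA` is unset are assumptions. The package may resolve its data folder differently.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional, Union

APP_NAME = "media-indexer"


def default_data_dir(
    platform: str = sys.platform,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Union[str, Path]] = None,
) -> Path:
    """Return the app-data folder for the given platform string."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if platform == "darwin":
        return home / "Library" / "Application Support" / APP_NAME
    if platform.startswith("win"):
        # Assumption: fall back to ~/AppData/Local when LOCALAPPDATA is unset.
        base = env.get("LOCALAPPDATA") or str(home / "AppData" / "Local")
        return Path(base) / APP_NAME
    # Linux and other platforms, following the path listed above.
    return home / ".local" / "share" / APP_NAME


data_dir = default_data_dir()
```

Passing `platform`, `env`, and `home` explicitly makes the function easy to exercise in tests without touching the real environment.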

## Why Local-First?

This tool is meant for personal and creative archives where privacy matters.

- Your media is indexed locally.
- Search runs locally.
- The web UI is served from localhost.
- No account is required.
- No cloud media upload is required.

The first run may download model weights. Later runs reuse the cached copy from your machine.

## Current Limitations

- Video search works through sparse representative frames, not full clip understanding.
- Indexing and search are not yet optimized for very large archives.
- CLIP labels are useful but not as detailed as a large vision-language model.
- Deleted files are not fully cleaned up automatically yet.
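
To illustrate the first limitation, sparse frame selection can be as simple as sampling a fixed number of evenly spaced timestamps. This README does not specify how frames are actually chosen, so the sketch below is one plausible approach with a hypothetical function name and default, not the tool's method.

```python
from typing import List


def sparse_timestamps(duration_s: float, max_frames: int = 8) -> List[float]:
    """Pick max_frames timestamps, one at the midpoint of each equal-length segment."""
    if duration_s <= 0 or max_frames <= 0:
        return []
    step = duration_s / max_frames
    return [round(step * (i + 0.5), 3) for i in range(max_frames)]


# A 60-second clip sampled at 4 points.
timestamps = sparse_timestamps(60.0, max_frames=4)
```

Anything that happens between the sampled timestamps is invisible to search, which is why a query can miss a brief moment in a long video.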

## Development

Run tests:

```bash
PYTHONPATH=backend python -m pytest -q
```

Run the app from source:

```bash
PYTHONPATH=backend python -m image_archive.cli run
```

## Project Structure

```text
backend/image_archive/   Python package, CLI, API, indexing, search
backend/image_archive/frontend/
                         Packaged localhost web UI
frontend/                Source copy of the minimal UI
tests/                   Test suite
config.example.yaml      Example config
pyproject.toml           Python package metadata
```

## Building

Build distributions locally:

```bash
python -m pip install --upgrade build twine
rm -rf dist build *.egg-info backend/*.egg-info
python -m build
python -m twine check dist/*.whl dist/*.tar.gz
```
