Metadata-Version: 2.4
Name: sheaf-serve
Version: 0.10.0
Summary: Unified serving layer for non-text foundation models
Author-email: Alex Korbonits <alexkorbonits@gmail.com>
License: Apache-2.0
License-File: LICENSE
Keywords: foundation-models,inference,mlops,ray,serving
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Requires-Dist: httpx>=0.27.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: ray[serve]>=2.10.0
Provides-Extra: all
Requires-Dist: chronos-forecasting>=1.0.0; extra == 'all'
Requires-Dist: faster-whisper>=1.2.1; extra == 'all'
Requires-Dist: feast>=0.40.0; extra == 'all'
Requires-Dist: mace-torch>=0.3.0; extra == 'all'
Requires-Dist: open-clip-torch>=3.0.0; extra == 'all'
Requires-Dist: openai-whisper>=20240930; extra == 'all'
Requires-Dist: pymilvus>=2.4.0; extra == 'all'
Requires-Dist: sam2>=1.0; extra == 'all'
Requires-Dist: tabpfn>=2.0.0; extra == 'all'
Requires-Dist: timesfm[torch]>=1.0.0; extra == 'all'
Requires-Dist: torch>=2.0.0; extra == 'all'
Requires-Dist: transformers>=4.31.0; extra == 'all'
Requires-Dist: transformers>=4.37.0; extra == 'all'
Provides-Extra: audio
Requires-Dist: faster-whisper>=1.2.1; extra == 'audio'
Requires-Dist: openai-whisper>=20240930; extra == 'audio'
Provides-Extra: audio-generation
Requires-Dist: torch>=2.0.0; extra == 'audio-generation'
Requires-Dist: transformers>=4.37.0; extra == 'audio-generation'
Provides-Extra: batch
Requires-Dist: pandas>=2.0.0; extra == 'batch'
Requires-Dist: pyarrow>=14.0.0; extra == 'batch'
Provides-Extra: dev
Requires-Dist: cloudpickle>=3.0.0; extra == 'dev'
Requires-Dist: fakeredis>=2.20.0; extra == 'dev'
Requires-Dist: httpx>=0.27.0; extra == 'dev'
Requires-Dist: mypy>=1.9.0; extra == 'dev'
Requires-Dist: pre-commit>=3.7.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.4.0; extra == 'dev'
Requires-Dist: ty>=0.0.0; extra == 'dev'
Provides-Extra: diffusion
Requires-Dist: diffusers>=0.30.0; extra == 'diffusion'
Requires-Dist: torch>=2.0.0; extra == 'diffusion'
Requires-Dist: transformers>=4.37.0; extra == 'diffusion'
Provides-Extra: earth-observation
Requires-Dist: torch>=2.0.0; extra == 'earth-observation'
Requires-Dist: transformers>=4.37.0; extra == 'earth-observation'
Provides-Extra: feast
Requires-Dist: feast>=0.40.0; extra == 'feast'
Provides-Extra: genomics
Requires-Dist: torch>=2.0.0; extra == 'genomics'
Requires-Dist: transformers>=4.37.0; extra == 'genomics'
Provides-Extra: kokoro
Requires-Dist: kokoro>=0.9.2; extra == 'kokoro'
Requires-Dist: soundfile>=0.12.1; extra == 'kokoro'
Provides-Extra: lidar
Requires-Dist: torch>=2.0.0; extra == 'lidar'
Provides-Extra: materials
Requires-Dist: mace-torch>=0.3.0; extra == 'materials'
Provides-Extra: metrics
Requires-Dist: prometheus-client>=0.20.0; extra == 'metrics'
Provides-Extra: milvus
Requires-Dist: pymilvus>=2.4.0; extra == 'milvus'
Provides-Extra: modal
Requires-Dist: cloudpickle>=3.0.0; extra == 'modal'
Requires-Dist: modal>=0.60.0; extra == 'modal'
Provides-Extra: moirai
Requires-Dist: uni2ts>=2.0.0; extra == 'moirai'
Provides-Extra: molecular
Requires-Dist: esm>=3.0.0; (python_full_version >= '3.12') and extra == 'molecular'
Provides-Extra: multimodal
Requires-Dist: torch>=2.0.0; extra == 'multimodal'
Requires-Dist: torchaudio>=2.0.0; extra == 'multimodal'
Requires-Dist: torchvision>=0.15.0; extra == 'multimodal'
Provides-Extra: multimodal-generation
Requires-Dist: diffusers>=0.26.0; extra == 'multimodal-generation'
Requires-Dist: pillow>=9.0.0; extra == 'multimodal-generation'
Requires-Dist: torch>=2.0.0; extra == 'multimodal-generation'
Provides-Extra: optical-flow
Requires-Dist: torch>=2.0.0; extra == 'optical-flow'
Requires-Dist: torchvision>=0.15.0; extra == 'optical-flow'
Provides-Extra: pose
Requires-Dist: torch>=2.0.0; extra == 'pose'
Requires-Dist: transformers>=4.46.0; extra == 'pose'
Provides-Extra: small-molecule
Requires-Dist: torch>=2.0.0; extra == 'small-molecule'
Requires-Dist: transformers>=4.37.0; extra == 'small-molecule'
Provides-Extra: tabular
Requires-Dist: tabpfn>=2.0.0; extra == 'tabular'
Provides-Extra: time-series
Requires-Dist: chronos-forecasting>=1.0.0; extra == 'time-series'
Requires-Dist: timesfm[torch]>=1.0.0; extra == 'time-series'
Provides-Extra: tracing
Requires-Dist: opentelemetry-api>=1.20.0; extra == 'tracing'
Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.20.0; extra == 'tracing'
Requires-Dist: opentelemetry-sdk>=1.20.0; extra == 'tracing'
Provides-Extra: tts
Requires-Dist: torch>=2.0.0; extra == 'tts'
Requires-Dist: transformers>=4.31.0; extra == 'tts'
Provides-Extra: video
Requires-Dist: torch>=2.0.0; extra == 'video'
Requires-Dist: transformers>=4.40.0; extra == 'video'
Provides-Extra: vision
Requires-Dist: open-clip-torch>=3.0.0; extra == 'vision'
Requires-Dist: sam2>=1.0; extra == 'vision'
Requires-Dist: transformers>=4.37.0; extra == 'vision'
Provides-Extra: weather
Requires-Dist: dm-haiku>=0.0.12; extra == 'weather'
Requires-Dist: graphcast<1.0.0,>=0.1.0; extra == 'weather'
Requires-Dist: jax>=0.4.25; extra == 'weather'
Requires-Dist: xarray>=2024.1.0; extra == 'weather'
Provides-Extra: worker
Requires-Dist: httpx>=0.27.0; extra == 'worker'
Requires-Dist: redis>=5.0.0; extra == 'worker'
Description-Content-Type: text/markdown

# Sheaf

[![PyPI](https://img.shields.io/pypi/v/sheaf-serve)](https://pypi.org/project/sheaf-serve/)
[![Downloads](https://img.shields.io/pypi/dm/sheaf-serve)](https://pypi.org/project/sheaf-serve/)
[![CI](https://github.com/korbonits/sheaf/actions/workflows/ci.yml/badge.svg)](https://github.com/korbonits/sheaf/actions/workflows/ci.yml)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
[![Python](https://img.shields.io/pypi/pyversions/sheaf-serve)](https://pypi.org/project/sheaf-serve/)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)

**Unified serving layer for non-text foundation models.**

vLLM solved inference for text LLMs by defining a standard compute contract and optimizing behind it. The same problem exists for every other class of foundation model — time series, tabular, molecular, geospatial, diffusion, audio — and nobody has solved it. Sheaf is that solution.

Each model type gets a typed request/response contract. Batching, caching, and scheduling are optimized per model type. [Ray Serve](https://docs.ray.io/en/latest/serve/index.html) is the substrate. [Feast](https://feast.dev) is a first-class input primitive.
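
To make "typed request/response contract" concrete, here is a minimal standard-library sketch. `ForecastRequest`, `ForecastResponse`, and `naive_forecast` are hypothetical stand-ins invented for illustration; they are not Sheaf's classes (the real contracts, such as `TimeSeriesRequest`, are Pydantic models).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastRequest:
    """Hypothetical request contract: what a time-series caller must send."""

    model_name: str
    history: list[float]
    horizon: int

    def __post_init__(self) -> None:
        # Validate at the boundary so a backend never sees malformed input.
        if self.horizon <= 0:
            raise ValueError("horizon must be positive")
        if not self.history:
            raise ValueError("history must not be empty")


@dataclass(frozen=True)
class ForecastResponse:
    """Hypothetical response contract: what every backend must return."""

    model_name: str
    mean: list[float]


def naive_forecast(req: ForecastRequest) -> ForecastResponse:
    # Placeholder "model": repeat the last observed value for each step.
    return ForecastResponse(
        model_name=req.model_name,
        mean=[req.history[-1]] * req.horizon,
    )


resp = naive_forecast(ForecastRequest("demo", [1.0, 2.0, 3.0], horizon=2))
print(resp.mean)
```

Because both sides of the call are typed, any backend that accepts the request type and returns the response type can be swapped in without changing the caller.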

> *In mathematics, a sheaf tracks locally-defined data that glues consistently across a space. Each model type defines its own local contract; Sheaf ensures they cohere into a unified serving layer.*

---

## Install

```bash
pip install sheaf-serve                           # core only
pip install "sheaf-serve[time-series]"            # + Chronos2 / TimesFM
pip install "sheaf-serve[moirai]"                 # + Moirai (uni2ts)
pip install "sheaf-serve[tabular]"                # + TabPFN
pip install "sheaf-serve[molecular]"              # + ESM-3  (Python 3.12+)
pip install "sheaf-serve[genomics]"               # + Nucleotide Transformer
pip install "sheaf-serve[small-molecule]"         # + MolFormer
pip install "sheaf-serve[materials]"              # + MACE-MP
pip install "sheaf-serve[audio]"                  # + Whisper / faster-whisper
pip install "sheaf-serve[audio-generation]"       # + MusicGen
pip install "sheaf-serve[tts]"                    # + Bark
pip install "sheaf-serve[vision]"                 # + DINOv2 / OpenCLIP / SAM2 / Depth Anything / DETR
pip install "sheaf-serve[earth-observation]"      # + Prithvi
pip install "sheaf-serve[weather]"                # + GraphCast
pip install "sheaf-serve[feast]"                  # + Feast feature store integration
pip install "sheaf-serve[modal]"                  # + Modal serverless deployment
pip install "sheaf-serve[batch]"                  # + offline batch inference (Ray Data)
pip install "sheaf-serve[all]"                    # bundled set of backends (not every extra; see package metadata)
```

## Quickstart

**Direct backend inference:**

```python
from sheaf.api.time_series import Frequency, OutputMode, TimeSeriesRequest
from sheaf.backends.chronos import Chronos2Backend

backend = Chronos2Backend(model_id="amazon/chronos-bolt-tiny", device_map="cpu")
backend.load()

req = TimeSeriesRequest(
    model_name="chronos-bolt-tiny",
    history=[312, 298, 275, 260, 255, 263, 285, 320,
             368, 402, 421, 435, 442, 438, 430, 425],
    horizon=12,
    frequency=Frequency.HOURLY,
    output_mode=OutputMode.QUANTILES,
    quantile_levels=[0.1, 0.5, 0.9],
)

response = backend.predict(req)
# response.mean, response.quantiles
```

**Ray Serve (production, autoscaling):**

```python
from sheaf import ModelServer
from sheaf.spec import ModelSpec, ResourceConfig
from sheaf.api.base import ModelType

server = ModelServer(models=[
    ModelSpec(
        name="chronos",
        model_type=ModelType.TIME_SERIES,
        backend="chronos2",
        backend_kwargs={"model_id": "amazon/chronos-bolt-small"},
        resources=ResourceConfig(num_gpus=1),
    ),
])
server.run()  # POST /chronos/predict, GET /chronos/health
```
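
Once the server is running, the `POST /chronos/predict` route can be called from any HTTP client. The sketch below builds such a request with the standard library only. The base URL `http://localhost:8000` matches the one used in the client example in this README; the JSON field names follow the request body shown in the Feast example, and the exact wire format your deployment accepts should be treated as an assumption to verify.

```python
import json
import urllib.request

# Assumed base URL; adjust to wherever Ray Serve is listening.
BASE_URL = "http://localhost:8000"

payload = {
    "model_type": "time_series",
    "model_name": "chronos",
    "history": [312, 298, 275, 260, 255, 263, 285, 320],
    "horizon": 7,
    "frequency": "1d",
}

request = urllib.request.Request(
    url=f"{BASE_URL}/chronos/predict",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())


# result = send(request)  # uncomment once the server is up
```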

**Feast feature store (resolve features at request time):**

```python
from sheaf.api.base import ModelType
from sheaf.spec import ModelSpec

# ModelSpec wires Feast, so the request does not need to carry history
spec = ModelSpec(
    name="chronos",
    model_type=ModelType.TIME_SERIES,
    backend="chronos2",
    feast_repo_path="/feast/feature_repo",
)

# Client sends feature_ref instead of raw history (JSON request body)
payload = {
    "model_type": "time_series",
    "model_name": "chronos",
    "feature_ref": {
        "feature_view": "asset_prices",
        "feature_name": "close_history_30d",
        "entity_key": "ticker",
        "entity_value": "AAPL"
    },
    "horizon": 7,
    "frequency": "1d"
}
```
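
Conceptually, resolving a `feature_ref` means looking up the named feature for the given entity and substituting it for `history` before the backend sees the request. The following sketch shows that idea against an in-memory dictionary. `ONLINE_STORE` and `resolve_feature_ref` are hypothetical names, the price values are made up, and none of this is Feast's or Sheaf's actual API.

```python
from typing import Any

# Hypothetical in-memory stand-in for an online feature store, keyed by
# (feature_view, entity_key, entity_value). Values are invented.
ONLINE_STORE: dict[tuple[str, str, str], dict[str, Any]] = {
    ("asset_prices", "ticker", "AAPL"): {"close_history_30d": [189.1, 190.4, 191.0]},
}


def resolve_feature_ref(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of payload with feature_ref swapped for raw history."""
    ref = payload.get("feature_ref")
    if ref is None:
        return dict(payload)  # nothing to resolve; caller sent history directly
    row = ONLINE_STORE[(ref["feature_view"], ref["entity_key"], ref["entity_value"])]
    resolved = {k: v for k, v in payload.items() if k != "feature_ref"}
    resolved["history"] = list(row[ref["feature_name"]])
    return resolved


incoming = {
    "model_name": "chronos",
    "feature_ref": {
        "feature_view": "asset_prices",
        "feature_name": "close_history_30d",
        "entity_key": "ticker",
        "entity_value": "AAPL",
    },
    "horizon": 7,
}
resolved = resolve_feature_ref(incoming)
print(resolved["history"])
```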

**Modal (serverless, zero-infra):**

```python
from sheaf import ModalServer

# `spec` is a ModelSpec like the ones defined above
server = ModalServer(models=[spec], app_name="my-sheaf", gpu="A10G")
app = server.app  # deploy with: modal deploy my_server.py
```

**Docker:**

```dockerfile
FROM ghcr.io/korbonits/sheaf-serve:v0.10.0
RUN pip install --no-cache-dir 'sheaf-serve[time-series]==0.10.0'
COPY server.py .
CMD ["python", "server.py"]
```

The base image is sheaf-serve core only; extend with the backend extras you need.  See `examples/docker/` for a worked example with a runnable `server.py`.

**Kubernetes (KubeRay):**

`examples/k8s/` ships a `RayService` manifest that deploys the same `ModelSpec` shape via the KubeRay operator.  `sheaf.build_app(spec)` returns the Ray Serve Application directly, so it slots into KubeRay's `serveConfigV2.applications[].import_path`:

```python
# app.py — referenced by the manifest as `import_path: app:app`
from sheaf import build_app
from sheaf.spec import ModelSpec
spec = ModelSpec(name="chronos", ...)
app = build_app(spec)
```

**Typed Python client:**

```python
from sheaf.client import SheafClient
from sheaf.api.time_series import Frequency, TimeSeriesRequest

with SheafClient(base_url="http://localhost:8000") as client:
    resp = client.predict(
        "chronos",
        TimeSeriesRequest(
            model_name="chronos",
            history=[1.0, 2.0, 3.0, 4.0, 5.0],
            horizon=3,
            frequency=Frequency.HOURLY,
        ),
    )
# resp is a typed TimeSeriesResponse — same Pydantic class the server returned
print(resp.mean)
```

`AsyncSheafClient` is the async counterpart of `SheafClient`; `client.stream(deployment, request)` yields SSE events for streaming backends such as FLUX.
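
For readers unfamiliar with SSE, each event arrives as one or more `data:` lines terminated by a blank line. The sketch below parses that framing with the standard library. It assumes every event carries a JSON payload, and the `step` / `total` fields in the sample are hypothetical, since the actual event schema is not shown here.

```python
import json
from collections.abc import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE event in a stream of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # Per the SSE format, one optional space after the colon is dropped.
            value = line[len("data:"):]
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            # A blank line ends the event; multi-line data is joined with "\n".
            yield json.loads("\n".join(buffer))
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored.


sample = [
    'data: {"step": 1, "total": 4}\n',
    "\n",
    ": keep-alive comment, ignored\n",
    'data: {"step": 2, "total": 4}\n',
    "\n",
]
events = list(iter_sse_data(sample))
print(events)
```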

See [`examples/`](examples/) for time series comparison, tabular, audio, vision, and the Feast feature store quickstart.

---

## Supported model types

| Type | Status | Backends |
|---|---|---|
| Time series | ✅ v0.1 | Chronos2, Chronos-Bolt, TimesFM, Moirai |
| Tabular | ✅ v0.1 | TabPFN v2 |
| Audio transcription | ✅ v0.3 | Whisper, faster-whisper |
| Audio generation | ✅ v0.3 | MusicGen |
| Text-to-speech | ✅ v0.3 | Bark |
| Vision embeddings | ✅ v0.3 | OpenCLIP, DINOv2 |
| Segmentation | ✅ v0.3 | SAM2 |
| Depth estimation | ✅ v0.3 | Depth Anything v2 |
| Object detection | ✅ v0.3 | DETR / RT-DETR |
| Protein / molecular | ✅ v0.3 | ESM-3 (Python 3.12+) |
| Genomics | ✅ v0.3 | Nucleotide Transformer |
| Small molecule | ✅ v0.3 | MolFormer-XL |
| Materials science | ✅ v0.3 | MACE-MP-0 |
| Earth observation | ✅ v0.3 | Prithvi (IBM/NASA) |
| Weather forecasting | ✅ v0.3 | GraphCast |
| Cross-modal embeddings | ✅ v0.3 | ImageBind (text, vision, audio, depth, thermal) |
| Feast feature store | ✅ v0.3 | Any Feast online store (SQLite, Redis, DynamoDB, …) |
| Modal serverless | ✅ v0.3 | `ModalServer` — zero-infra GPU deployment |
| Diffusion / image gen | ✅ v0.4 | FLUX (schnell, dev) |
| Video understanding | ✅ v0.4 | VideoMAE, TimeSformer |
| LiDAR / 3D point cloud | ✅ v0.5 | PointNet (pure PyTorch; embed + ModelNet40 classify) |
| Pose estimation | ✅ v0.5 | ViTPose (COCO 17-keypoint, optional person bboxes) |
| Optical flow | ✅ v0.5 | RAFT (raft_large / raft_small via torchvision) |
| Multimodal generation | ✅ v0.5 | SDXL img2img + inpainting |
| Speech synthesis | ✅ v0.5 | Kokoro (voice + speed per request) |
| Offline batch inference | ✅ v0.6 | `BatchRunner` (Ray Data; tasks + actor-pool modes) |
| Async-job worker | ✅ v0.7 | `SheafWorker` (Redis Streams; pluggable queue/result ABCs) |
| LoRA adapter multiplexing | ✅ v0.8 | FLUX, SDXL via `ModelSpec.lora` (local paths + HF Hub sources) |

## Roadmap to production

**v0.2 — serving layer (complete)**
- [x] Ray Serve integration tested end-to-end
- [x] Async `predict()` handlers
- [x] HTTP API with proper request validation (422 on bad input)
- [x] Health check and readiness probe endpoints
- [x] Batching scheduler (BatchPolicy wired into `@serve.batch` per deployment)
- [x] Error handling at the service boundary (backend exceptions → structured HTTP 500)
- [x] Model hot-swap without restart (`ModelServer.update()`)
- [x] Container-friendly auth for TabPFN v2 (`TABPFN_TOKEN` env var)

**v0.3 — model types + integrations (complete)**
- [x] ESM-3 protein embeddings
- [x] Nucleotide Transformer genomics embeddings
- [x] MolFormer-XL small molecule embeddings
- [x] MACE-MP-0 materials (energy, forces, stress)
- [x] Whisper / faster-whisper audio transcription
- [x] MusicGen audio generation
- [x] Bark text-to-speech
- [x] OpenCLIP image/text embeddings
- [x] DINOv2 image embeddings
- [x] SAM2 segmentation
- [x] Depth Anything v2 depth estimation
- [x] DETR / RT-DETR object detection
- [x] Prithvi earth observation embeddings
- [x] GraphCast weather forecasting
- [x] ImageBind cross-modal embeddings (text, vision, audio, depth, thermal)
- [x] Feast feature store integration (`feature_ref` in requests, `FeastResolver`, `feast_repo_path` on `ModelSpec`)
- [x] Modal serverless deployment (`ModalServer` — zero-infra alternative to Ray Serve)

**v0.4 — generation + video (complete)**
- [x] FLUX diffusion / image generation
- [x] VideoMAE / TimeSformer video understanding

**v0.5 — observability + new modalities**

Ops / DX:
- [x] PyPI publish (v0.4.0)
- [x] Prometheus metrics endpoint per deployment
- [x] Structured logging with request IDs end-to-end
- [x] OpenTelemetry traces through the request path

Serving / infra:
- [x] Streaming responses (`POST /{name}/stream` → SSE; FLUX emits per-step progress events)
- [x] Request caching (`CacheConfig` on `ModelSpec` — in-process LRU, optional TTL)
- [x] `bucket_by` batching — group requests by field value before `@serve.batch`

New model types:
- [x] LiDAR / 3D point cloud (PointNet — pure-PyTorch, no torch-geometric; embed + ModelNet40 classify; install with `pip install 'sheaf-serve[lidar]'`)
- [x] Pose estimation (ViTPose — COCO 17-keypoint skeleton, optional person bboxes; install with `pip install 'sheaf-serve[pose]'`)
- [x] Optical flow (RAFT — raft_large/raft_small via torchvision; (H, W, 2) float32 flow field; install with `pip install 'sheaf-serve[optical-flow]'`)
- [x] Multimodal generation — text+image-conditioned (SDXL img2img + inpainting; install with `pip install 'sheaf-serve[multimodal-generation]'`)
- [x] Speech synthesis with fine-grained control (Kokoro — voice + speed per request; install with `pip install 'sheaf-serve[kokoro]'`)

**v0.6 — offline batch inference (complete)**

- [x] `BatchRunner` — same backend, same typed contract, offline batch mode; Ray Data `map_batches` substrate, stateless tasks with a worker-local backend cache so `load()` fires once per worker (not once per batch); install with `pip install 'sheaf-serve[batch]'`
- [x] `BatchSpec` — mirrors `ModelSpec` for backend selection; `JsonlSource`/`JsonlSink` in v1; new sources/sinks (S3, Parquet, Delta) slot in as additional `BatchSource`/`BatchSink` subclasses without changing the runner API
- [x] Actor-pool execution mode for warm loads on expensive backends (FLUX, GraphCast, SDXL) — opt-in via `BatchSpec.compute="actors"` + `num_actors=N`; `load()` runs once per actor at `__init__` and persists for the actor's lifetime ([#13](https://github.com/korbonits/sheaf/issues/13))
- [ ] Resumable checkpointing across process restarts ([#12](https://github.com/korbonits/sheaf/issues/12))
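
The "worker-local backend cache" idea from the first item above can be illustrated without Ray: keep loaded backends in a module-level dictionary so a stateless per-batch function reuses them. `EchoBackend`, `get_backend`, and `process_batch` are hypothetical names for this sketch, not `BatchRunner` internals.

```python
class EchoBackend:
    """Hypothetical backend that counts how often load() runs."""

    load_calls = 0

    def load(self) -> None:
        EchoBackend.load_calls += 1

    def predict(self, row: dict) -> dict:
        return {"echo": row["x"]}


# Module-level cache: one entry per worker process, shared across batches.
_BACKENDS: dict[str, EchoBackend] = {}


def get_backend(name: str) -> EchoBackend:
    backend = _BACKENDS.get(name)
    if backend is None:
        backend = EchoBackend()
        backend.load()  # expensive step, paid once per worker process
        _BACKENDS[name] = backend
    return backend


def process_batch(batch: list[dict]) -> list[dict]:
    """Stateless per-batch function, as one might pass to a map-style API."""
    backend = get_backend("echo")
    return [backend.predict(row) for row in batch]


outputs = [process_batch([{"x": i}]) for i in range(3)]
print(EchoBackend.load_calls)
```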

**v0.7 — async-job queue (complete)**

- [x] `SheafWorker` — queue-consumer pattern for long-running inference; v1 ships Redis Streams + consumer groups (horizontal scaling), pluggable `JobQueue` / `ResultStore` ABCs for SQS / Kafka follow-ups; install with `pip install 'sheaf-serve[worker]'`
- [x] Job lifecycle: enqueue → processing → result / dead-letter; at-least-once delivery via XACK-after-persist; per-job webhook on completion (best-effort POST)
- [ ] Priority lanes + per-tenant fair queuing

**v0.8 — LoRA adapter multiplexing (complete)**

- [x] `ModelSpec.lora = LoRAConfig(adapters={...}, default="...")` — declare per-deployment adapter registry; one GPU deployment serves many fine-tunes
- [x] Per-request adapter selection via `DiffusionRequest.adapters` / `MultimodalGenerationRequest.adapters` (with optional `adapter_weights` for fusion)
- [x] First targets: FLUX (FLUX.1-schnell + FLUX.1-dev), SDXL (img2img + inpaint)
- [x] Local paths and HF Hub sources both supported (`hf:org/repo[:weight_file]` convention)
- [x] Bucket-by-resolved-adapter inside Ray Serve batch windows: `set_active_adapters` is called exactly once per homogeneous sub-batch
- [ ] Hot-add adapters at runtime without `ModelServer.update(spec)` (deferred — adds VRAM-eviction / index-sync surface area)
- [ ] Expose `enable_sequential_cpu_offload` on `FluxBackend` so FLUX + LoRA fits on 16-24 GB GPUs (currently only `enable_model_cpu_offload`, which leaves ~22 GB resident — Modal LoRA quickstart needs A100 today, this would unlock A10G)
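
The "bucket by resolved adapter" item above amounts to grouping a mixed batch by adapter, switching adapters once per group, and restoring the original order. A minimal sketch follows. `FakePipeline`, `run_batch`, and the single-valued `adapter` request field are hypothetical simplifications (the real requests carry an `adapters` list with optional weights).

```python
from collections import defaultdict


class FakePipeline:
    """Hypothetical pipeline that records every adapter switch."""

    def __init__(self) -> None:
        self.switches: list[str] = []

    def set_active_adapters(self, name: str) -> None:
        self.switches.append(name)

    def generate(self, prompts: list[str]) -> list[str]:
        return [f"{self.switches[-1]}:{prompt}" for prompt in prompts]


def run_batch(pipeline: FakePipeline, requests: list[dict], default_adapter: str) -> list[str]:
    # Group request indices by the adapter each request resolves to.
    buckets: dict[str, list[int]] = defaultdict(list)
    for index, req in enumerate(requests):
        buckets[req.get("adapter") or default_adapter].append(index)

    outputs: list[str] = [""] * len(requests)
    for adapter, indices in buckets.items():
        pipeline.set_active_adapters(adapter)  # once per homogeneous sub-batch
        generated = pipeline.generate([requests[i]["prompt"] for i in indices])
        for i, item in zip(indices, generated):
            outputs[i] = item  # restore the caller's original ordering
    return outputs


pipeline = FakePipeline()
batch = [
    {"prompt": "a", "adapter": "anime"},
    {"prompt": "b"},
    {"prompt": "c", "adapter": "anime"},
]
outputs = run_batch(pipeline, batch, default_adapter="base")
print(pipeline.switches, outputs)
```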

**v0.9 — typed Python client (complete)**

Ships as `sheaf.client` inside `sheaf-serve` rather than as a separate `sheaf-client` PyPI package, so schemas stay in one tree with no codegen and no drift. It can be split into its own package later if external client contributors arrive or install footprint becomes a real cost.

- [x] `SheafClient` (sync) + `AsyncSheafClient` (async, `httpx`-backed); typed `predict(deployment, request) -> response` against the discriminated `AnyResponse` union
- [x] `health()` / `ready()` helpers; structured exceptions (`ValidationError` for 422, `ServerError` for 5xx, `ClientError` for transport / decode failures)
- [x] SSE streaming via `client.stream(deployment, request)` async generator
- [x] `RetryConfig` with exponential backoff: configurable status codes, connection-error retry toggle, and `max_attempts` cap.  Streams bypass retry by design (re-running yields interleaved progress events).
- [x] Server-side `request_id` (the UUID minted on the request) is attached to every raised `SheafError` subclass so callers can log-correlate without holding the original request object.
- [x] OpenAPI export via `python -m sheaf.openapi --specs my_module:specs > openapi.json` (or `sheaf.openapi.generate(specs)` programmatically) — backends are not loaded during generation, so it runs without GPU.
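
The retry behaviour described in the list above (exponential backoff, configurable status codes, a connection-error toggle, and an attempt cap) can be sketched generically as follows. `RetryPolicy`, its field names, and `call_with_retry` are hypothetical; they are not the fields of `RetryConfig`, and the default status codes are an arbitrary choice for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical policy; field names are illustrative only."""

    max_attempts: int = 3
    base_delay: float = 0.5
    retry_statuses: frozenset[int] = frozenset({502, 503, 504})
    retry_connection_errors: bool = True


def call_with_retry(
    send: Callable[[], tuple[int, str]],
    policy: RetryPolicy,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, str]:
    """Call send() until it succeeds, is non-retryable, or attempts run out."""
    for attempt in range(1, policy.max_attempts + 1):
        last_attempt = attempt == policy.max_attempts
        try:
            status, body = send()
        except ConnectionError:
            if last_attempt or not policy.retry_connection_errors:
                raise
        else:
            if status not in policy.retry_statuses or last_attempt:
                return status, body
        # Exponential backoff: base, 2 * base, 4 * base, ...
        sleep(policy.base_delay * 2 ** (attempt - 1))
    # Only reachable when the loop body never ran.
    raise ValueError("max_attempts must be at least 1")


responses = iter([(503, "busy"), (503, "busy"), (200, "ok")])
delays: list[float] = []
result = call_with_retry(lambda: next(responses), RetryPolicy(), sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the example testable without real waiting; production code would usually add jitter to the delay as well.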

**v0.10 — container + Kubernetes deployment**

Today sheaf ships three deployment paths: `ModelServer` (a local Ray cluster you bring), `ModalServer` (Modal serverless), and `BatchRunner` / `SheafWorker` (offline / async).  Production K8s clusters running their own Ray are common and have no first-class story yet — every team rolls their own image.

- [ ] Reference `Dockerfile` (multi-stage, uv-based; CPU base + CUDA variant) so teams aren't building this from scratch.  Pinned to a sheaf release; rebuilt on tag.
- [ ] `examples/k8s/` with a `RayService` manifest — KubeRay's canonical Ray-on-K8s shape — and a short `README.md` covering prereqs (KubeRay operator installed), `kubectl apply`, and a port-forward smoke test.
- [ ] GitHub Actions workflow that builds + pushes the Dockerfile to `ghcr.io/korbonits/sheaf-serve:vX.Y.Z` on `v*` tag push, mirroring the PyPI publish flow.

---

## Architecture

```
┌─────────────────────────────────────────┐
│           API Layer                      │  typed contracts per model type
│  TimeSeriesRequest  TabularRequest  ...  │
├─────────────────────────────────────────┤
│         Scheduling Layer                 │  model-type-aware batching
│  BatchPolicy  RequestQueue               │
├─────────────────────────────────────────┤
│          Backend Layer                   │  pluggable execution + Ray Serve
│  ModelBackend  CacheManager  Feast       │
└─────────────────────────────────────────┘
```

**Adding a new backend** takes one class:

```python
from sheaf.backends.base import ModelBackend
from sheaf.registry import register_backend

@register_backend("my-model")
class MyModelBackend(ModelBackend):
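    def load(self) -> None:
        # load_my_model() is a placeholder for your own weight-loading code
        self._model = load_my_model()

    def predict(self, request):
        # Run inference and return the typed response for this model type
        ...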

    @property
    def model_type(self):
        return "time_series"
```
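
For context on how a decorator such as `@register_backend("my-model")` can let a string like `backend="chronos2"` select a class, here is a generic registry sketch. `register`, `create`, `_REGISTRY`, and `ConstantBackend` are hypothetical names, and this is one common way to build such a registry, not a description of `sheaf.registry`.

```python
from typing import Callable, TypeVar

T = TypeVar("T", bound=type)

# Hypothetical registry: backend name -> backend class.
_REGISTRY: dict[str, type] = {}


def register(name: str) -> Callable[[T], T]:
    """Class decorator that records the decorated class under `name`."""

    def decorator(cls: T) -> T:
        if name in _REGISTRY:
            raise ValueError(f"backend {name!r} is already registered")
        _REGISTRY[name] = cls
        return cls  # return the class unchanged so it stays usable directly

    return decorator


def create(name: str, **kwargs):
    """Instantiate a registered backend by name."""
    try:
        cls = _REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown backend {name!r}; known: {sorted(_REGISTRY)}") from None
    return cls(**kwargs)


@register("constant")
class ConstantBackend:
    def __init__(self, value: float = 0.0) -> None:
        self.value = value

    def predict(self, horizon: int) -> list[float]:
        return [self.value] * horizon


backend = create("constant", value=1.5)
print(backend.predict(3))
```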

---

## Contributing

Issues and PRs welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup.

## License

Apache 2.0
