FTI Architecture
Theory
The FTI (Feature, Training, Inference) architecture divides every ML system into exactly three pipeline types, each with a single responsibility, connected through versioned shared storage.
| Pipeline | Input | Output | Stores to |
|---|---|---|---|
| Feature | Raw data | Features + labels | Feature store |
| Training | Feature store snapshot | Trained model | Model registry |
| Inference | Feature store + model registry | Predictions | Client / DB |
Data flow (one line per pipeline):

- Raw Data (crawl · ingest) → Feature Pipeline (clean · chunk · embed) → Feature Store (Qdrant, versioned)
- Feature Store (read snapshot) → Training Pipeline (SFT · DPO · merge) → Model Registry (HuggingFace, versioned)
- Model Registry (load model) → Inference Pipeline (retrieve · generate) → Response (API · batch)
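The flow above can be sketched as three plain functions that communicate only through versioned stores. Everything here is illustrative (dicts stand in for Qdrant and the HuggingFace registry; the "embedding" and "model" are toys), but the shape matches the table: each pipeline reads from one store and writes to the next.

```python
# Minimal FTI sketch: three pipelines that share state only through
# versioned stores. All names and data shapes are illustrative.

feature_store = {}   # snapshot version -> list of (feature, label) rows
model_registry = {}  # model version -> (model, feature_store_version)

def feature_pipeline(raw_docs, version):
    """Feature pipeline: clean raw data, write one immutable snapshot."""
    rows = [(doc.strip().lower(), len(doc)) for doc in raw_docs]  # toy features
    feature_store[version] = rows

def training_pipeline(fs_version, model_version):
    """Training pipeline: read a pinned snapshot, 'train', register model."""
    rows = feature_store[fs_version]
    mean_label = sum(label for _, label in rows) / len(rows)  # toy model
    # The registry records which snapshot produced this model.
    model_registry[model_version] = ({"mean": mean_label}, fs_version)

def inference_pipeline(model_version, query):
    """Inference pipeline: load a registered model, return a prediction."""
    model, _fs_version = model_registry[model_version]
    return model["mean"]  # toy prediction, ignores the query

feature_pipeline(["  Hello World ", "FTI architecture"], version="fs-v1")
training_pipeline("fs-v1", model_version="m-v1")
prediction = inference_pipeline("m-v1", "any query")
```

Note that no pipeline calls another directly: swapping the training stack, or scaling inference separately, requires no change to the other two.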
Why it matters:
- No training-serving skew — the feature pipeline writes to the store once; both training and inference read from that same store.
- Reproducibility — the model registry records which feature-store version was used; any training run can be replayed exactly.
- Isolation — each pipeline can use a different tech stack, scale independently, and be tested without touching the others.
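The reproducibility bullet can be made concrete: because each registry entry pins the feature-store version (and, ideally, a content hash of that snapshot), any run can be replayed against identical training input. The sketch below is a hypothetical illustration with a deterministic toy trainer, not a specific framework's API.

```python
# Reproducibility sketch: a registry record pins both the snapshot
# version and its content hash, so a training run can be replayed
# exactly. All names here are illustrative assumptions.
import hashlib
import json

feature_store = {
    "fs-v1": [["hello world", 11], ["fti architecture", 16]],  # frozen snapshot
}

def snapshot_hash(version):
    """Content hash of a snapshot; equal hashes mean identical input."""
    payload = json.dumps(feature_store[version], sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def train(fs_version):
    """Deterministic toy training over a pinned snapshot."""
    rows = feature_store[fs_version]
    model = {"mean_label": sum(label for _, label in rows) / len(rows)}
    return {"model": model,
            "fs_version": fs_version,
            "fs_hash": snapshot_hash(fs_version)}

run_a = train("fs-v1")
run_b = train(run_a["fs_version"])  # replay from the registry record
```

Because the trainer's only input is the pinned snapshot, `run_a` and `run_b` are identical; a hash mismatch at replay time would flag that the snapshot was mutated.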
LLMOps forward link: monitoring, drift detection, and automated retraining triggers are the operational layer that wraps FTI pipelines in production.