
MLflow
Ship a trained MLflow model from the registry to local REST, Docker, cloud managed endpoints, or batch jobs with a repeatable deployment checklist.
Install
npx skills add https://github.com/orchestra-research/ai-research-skills --skill mlflow
What is this skill?
- Compares seven deployment targets (local server, REST API, Docker, SageMaker, Azure ML, Kubernetes, batch) with complexi
- Documents `mlflow models serve` for registry paths and run artifacts, including host, port, and worker flags
- Includes curl examples for single and batch `/invocations` JSON payloads against a local server
- Covers REST API serving, Docker images, cloud managed paths, batch offline inference, and production monitoring sections
- Table-of-contents structure spans deployment options through monitoring for end-to-end ML ops
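The single and batch `/invocations` payloads mentioned in the list above share one shape: a JSON object with an `inputs` key holding a list of feature rows. The sketch below builds that body in Python. The helper name `build_invocations_payload` and its validation rules are illustrative assumptions, not part of MLflow or of this skill.

```python
import json
from typing import List, Sequence


def build_invocations_payload(rows: Sequence[Sequence[float]]) -> str:
    """Serialize feature rows into an {"inputs": [...]} JSON body.

    Hypothetical helper: one row requests a single prediction,
    several rows request a batch.
    """
    if not rows:
        raise ValueError("at least one feature row is required")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all feature rows must have the same length")
    # Coerce to float so integer inputs serialize consistently.
    inputs: List[List[float]] = [[float(v) for v in row] for row in rows]
    return json.dumps({"inputs": inputs})


# Single prediction: one row.
single = build_invocations_payload([[1.0, 2.0, 3.0, 4.0]])
# Batch prediction: several rows of equal length.
batch = build_invocations_payload([[1, 2, 3, 4], [5, 6, 7, 8]])

print(single)
print(batch)
```

The returned string can be sent as the request body with a `Content-Type: application/json` header. Whether a given model accepts this exact layout depends on its input signature, so treat the row format as an assumption to check against the model.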
Adoption & trust: 1 install on skills.sh; 9.4k GitHub stars.
Journey fit
Primary fit
Production serving and monitoring are classic operate/infra work; the skill's core is taking a registered model live, not ideation or training-only notebooks. The infra subphase covers containerized deployment, Kubernetes-style orchestration patterns, and the post-deploy monitoring hooks described in the deployment guide.
SKILL.md
# Deployment Guide

Complete guide to deploying MLflow models to production environments.

## Table of Contents

- Deployment Options
- Local Serving
- REST API Serving
- Docker Deployment
- Cloud Deployment
- Batch Inference
- Production Patterns
- Monitoring

## Deployment Options

MLflow supports multiple deployment targets:

| Target | Use Case | Complexity |
|--------|----------|------------|
| **Local Server** | Development, testing | Low |
| **REST API** | Production serving | Medium |
| **Docker** | Containerized deployment | Medium |
| **AWS SageMaker** | Managed AWS deployment | High |
| **Azure ML** | Managed Azure deployment | High |
| **Kubernetes** | Scalable orchestration | High |
| **Batch** | Offline predictions | Low |

## Local Serving

### Serve Model Locally

```bash
# Serve registered model
mlflow models serve -m "models:/product-classifier/Production" -p 5001

# Serve from run
mlflow models serve -m "runs:/abc123/model" -p 5001

# Serve with custom host
mlflow models serve -m "models:/my-model/Production" -h 0.0.0.0 -p 8080

# Serve with workers (for scalability)
mlflow models serve -m "models:/my-model/Production" -p 5001 --workers 4
```

**Output:**

```
Serving model on http://127.0.0.1:5001
```

### Test Local Server

```bash
# Single prediction
curl http://127.0.0.1:5001/invocations \
  -H 'Content-Type: application/json' \
  -d '{
    "inputs": [[1.0, 2.0, 3.0, 4.0]]
  }'

# Batch predictions
curl http://127.0.0.1:5001/invocations \
  -H 'Content-Type: application/json' \
  -d '{
    "inputs": [
      [1.0, 2.0, 3.0, 4.0],
      [5.0, 6.0, 7.0, 8.0]
    ]
  }'

# CSV input
curl http://127.0.0.1:5001/invocations \
  -H 'Content-Type: text/csv' \
  --data-binary @data.csv
```

### Python Client

```python
import requests
import json

url = "http://127.0.0.1:5001/invocations"

data = {
    "inputs": [[1.0, 2.0, 3.0, 4.0]]
}

headers = {"Content-Type": "application/json"}

response = requests.post(url, json=data, headers=headers)
predictions = response.json()

print(predictions)
```

## REST API Serving

### Build Custom Serving API

```python
from flask import Flask, request, jsonify
import mlflow.pyfunc

app = Flask(__name__)

# Load model on startup
model = mlflow.pyfunc.load_model("models:/product-classifier/Production")


@app.route('/predict', methods=['POST'])
def predict():
    """Prediction endpoint."""
    data = request.get_json()
    inputs = data.get('inputs')

    # Make predictions
    predictions = model.predict(inputs)

    return jsonify({
        'predictions': predictions.tolist()
    })


@app.route('/health', methods=['GET'])
def health():
    """Health check endpoint."""
    return jsonify({'status': 'healthy'})


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5001)
```

### FastAPI Serving

```python
from fastapi import FastAPI
from pydantic import BaseModel
import mlflow.pyfunc
import numpy as np

app = FastAPI()

# Load model
model = mlflow.pyfunc.load_model("models:/product-classifier/Production")


class PredictionRequest(BaseModel):
    inputs: list


class PredictionResponse(BaseModel):
    predictions: list


@app.post("/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    """Make predictions."""
    inputs = np.array(request.inputs)
    predictions = model.predict(inputs)

    return PredictionResponse(predictions=predictions.tolist())


@app.get("/health")
async def health():
    """Health check."""
    return {"status": "healthy"}

# Run with: uvicorn main:app --host 0.0.0.0 --port 5001
```

## Docker Deployment

### Build Docker Image

```bash
# Build Docker image with MLflow
mlflow models build-docker \
  -m "models:/product-classifier/Production" \
  -n product-classifier:v1

# Build with custom image name
mlflow models build-docker \
  -m "runs:/abc123/model" \
  -n my-registry/my-model:latest

# Build and enable MLServer (for KServe/Seldon)
mlflow models build-docker \
  -m "models:/my-model/Production" \
  -n my-mode