FastAPI Gunicorn Integration
Introduction
When deploying a FastAPI application to production, simply running a single Uvicorn process the way you typically do during development (for example with --reload) is not recommended: one process cannot take advantage of multiple CPU cores, and nothing restarts it if it crashes. This is where Gunicorn (Green Unicorn) comes into the picture.
Gunicorn is a mature WSGI HTTP server for Python that, combined with Uvicorn worker processes, acts as a process manager for ASGI applications such as FastAPI, helping to:
- Manage multiple worker processes
- Handle concurrent requests efficiently
- Automatically restart failed processes
- Provide better resource utilization
In this tutorial, we'll learn how to integrate FastAPI with Gunicorn to achieve a production-grade deployment setup.
Prerequisites
Before we begin, ensure you have:
- Python 3.7+ installed
- Basic knowledge of FastAPI
- Understanding of virtual environments
Setting Up Your Environment
Let's start by creating a virtual environment and installing the necessary packages:
```bash
# Create a virtual environment
python -m venv venv

# Activate the environment
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate

# Install required packages
pip install fastapi uvicorn gunicorn
```
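If you want to confirm the packages landed in the active environment, a quick throwaway check from Python looks like this (the file name `check_install.py` is arbitrary):

```python
# check_install.py -- confirm the packages are importable from the active environment
import fastapi
import gunicorn
import uvicorn

print("fastapi", fastapi.__version__)
print("gunicorn", gunicorn.__version__)
print("uvicorn", uvicorn.__version__)
```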
Creating a Simple FastAPI Application
First, let's create a basic FastAPI application that we'll deploy with Gunicorn:
```python
# main.py
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
async def root():
    return {"message": "Hello World"}


@app.get("/items/{item_id}")
async def read_item(item_id: int):
    return {"item_id": item_id}
```
Understanding Gunicorn Workers
Gunicorn uses a pre-fork worker model: a master process forks a configurable number of worker processes at startup and distributes incoming connections among them. With Gunicorn's default synchronous workers, each worker handles one request at a time; the Uvicorn workers described below instead run an asyncio event loop, so each worker can serve many requests concurrently.
For FastAPI applications, we need to use Uvicorn workers with Gunicorn, as FastAPI is built on ASGI (not WSGI). Gunicorn provides a way to use worker classes from other servers.
Worker Types for FastAPI
There are two Uvicorn worker classes you can use with Gunicorn:
- `uvicorn.workers.UvicornWorker`: the standard worker, which uses uvloop and httptools when they are available
- `uvicorn.workers.UvicornH11Worker`: a pure-Python worker that uses h11 for the HTTP protocol, useful on platforms where uvloop and httptools are not available (such as PyPy)
Configuring Gunicorn for FastAPI
Basic Command Line Configuration
The most basic way to run FastAPI with Gunicorn is using the command line:
```bash
gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker
```
This command:
- Starts Gunicorn
- Loads the `app` object from your `main.py` file (`main:app`)
- Creates 4 worker processes (`-w 4`)
- Uses the Uvicorn worker class (`-k uvicorn.workers.UvicornWorker`)
Creating a Gunicorn Configuration File
For more control, create a `gunicorn_conf.py` file:
```python
# gunicorn_conf.py
import multiprocessing

# Tunables (hard-coded here; see the environment-variable version later)
workers_per_core_str = "1"
max_workers_str = "8"
web_concurrency_str = None

# Gunicorn settings read by the server
worker_class = "uvicorn.workers.UvicornWorker"
loglevel = "info"
bind = "0.0.0.0:8000"
keepalive = 120
timeout = 120
errorlog = "-"  # log errors to stderr

# Worker processes: derive a sensible default from the CPU count
cores = multiprocessing.cpu_count()
workers_per_core = float(workers_per_core_str)
default_web_concurrency = workers_per_core * cores

if web_concurrency_str:
    web_concurrency = int(web_concurrency_str)
else:
    web_concurrency = max(int(default_web_concurrency), 2)
if max_workers_str:
    web_concurrency = min(web_concurrency, int(max_workers_str))

workers = web_concurrency
```
Now you can run Gunicorn using this configuration:
```bash
gunicorn main:app -c gunicorn_conf.py
```
Optimizing Worker Count
The number of workers is a critical configuration for performance. A common formula is:
workers = (2 * CPU cores) + 1
This formula ensures efficient use of CPU resources while leaving some capacity for the OS and other processes.
Example for a 4-core server:
workers = (2 * 4) + 1 = 9 workers
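Keep in mind that this formula comes from Gunicorn's synchronous-worker world; with async Uvicorn workers, each of which already serves many requests concurrently, some deployments run fewer workers (for example one per core), so treat it as a starting point to benchmark against. The calculation itself is easy to automate; here is a minimal sketch (the file name `workers.py` and the `WEB_CONCURRENCY` override are illustrative choices, not a standard):

```python
# workers.py -- suggest a Gunicorn worker count from the (2 * cores) + 1 rule of thumb
import multiprocessing
import os


def recommended_workers() -> int:
    # Allow an explicit override, mirroring the WEB_CONCURRENCY convention used later
    override = os.getenv("WEB_CONCURRENCY")
    if override:
        return int(override)
    return (2 * multiprocessing.cpu_count()) + 1


if __name__ == "__main__":
    print(recommended_workers())
```

You could then feed the result to Gunicorn, for example with `gunicorn main:app -w "$(python workers.py)" -k uvicorn.workers.UvicornWorker`.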
Creating a Start-Up Script
For better management, create a start-up script that configures and launches Gunicorn:
```python
# start.py
import subprocess
import sys


def main():
    # Command to run Gunicorn
    cmd = [
        "gunicorn",
        "main:app",
        "--workers", "4",
        "--worker-class", "uvicorn.workers.UvicornWorker",
        "--bind", "0.0.0.0:8000",
        "--timeout", "120",
    ]

    # Execute the command and fail loudly if Gunicorn exits with an error
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as e:
        print(f"Error running Gunicorn: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
```
Run the script with:
```bash
python start.py
```
Practical Example: Creating a Production-Ready API
Let's create a more realistic example with multiple endpoints and demonstrate how to deploy it with Gunicorn:
```python
# advanced_app.py
from fastapi import FastAPI, Depends, HTTPException, status
from pydantic import BaseModel
from typing import List, Optional
import time

app = FastAPI(title="Product API")

# Simulate a database with an in-memory dict
fake_db = {}
item_id_counter = 0


class Item(BaseModel):
    name: str
    description: Optional[str] = None
    price: float
    tax: Optional[float] = None


@app.middleware("http")
async def add_process_time_header(request, call_next):
    # Record how long each request takes and expose it as a response header
    start_time = time.time()
    response = await call_next(request)
    process_time = time.time() - start_time
    response.headers["X-Process-Time"] = str(process_time)
    return response


@app.get("/")
async def root():
    return {"status": "API is running"}


@app.post("/items/", status_code=status.HTTP_201_CREATED)
async def create_item(item: Item):
    global item_id_counter
    item_id_counter += 1
    fake_db[item_id_counter] = item.dict()
    return {"id": item_id_counter, **fake_db[item_id_counter]}


@app.get("/items/", response_model=List[dict])
async def list_items():
    return [{"id": id, **item} for id, item in fake_db.items()]


@app.get("/items/{item_id}")
async def read_item(item_id: int):
    if item_id not in fake_db:
        raise HTTPException(status_code=404, detail="Item not found")
    return {"id": item_id, **fake_db[item_id]}


@app.put("/items/{item_id}")
async def update_item(item_id: int, item: Item):
    if item_id not in fake_db:
        raise HTTPException(status_code=404, detail="Item not found")
    fake_db[item_id] = item.dict()
    return {"id": item_id, **fake_db[item_id]}


@app.delete("/items/{item_id}")
async def delete_item(item_id: int):
    if item_id not in fake_db:
        raise HTTPException(status_code=404, detail="Item not found")
    del fake_db[item_id]
    return {"message": "Item deleted successfully"}
```
To deploy this with Gunicorn:
```bash
gunicorn advanced_app:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
```
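One caveat before relying on this example in production: `fake_db` is an ordinary in-memory dict, and Gunicorn's pre-fork model gives every worker its own copy of the application, so each of the 4 workers keeps its own separate `fake_db`. An item created through one worker will not be visible to requests that land on another; a real deployment would use an external store such as a database or Redis. To see the process boundary for yourself, you could temporarily add a small endpoint to advanced_app.py like the sketch below (the `/whoami` path is purely illustrative):

```python
# Demonstration only: report which Gunicorn worker process handled the request
import os


@app.get("/whoami")
async def whoami():
    # Each worker is a separate OS process, so repeated calls return different
    # PIDs depending on which worker accepted the connection.
    return {"worker_pid": os.getpid()}
```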
Handling Graceful Shutdowns
Gunicorn can handle graceful shutdowns, which is important for production applications. When you send a SIGTERM signal to Gunicorn, it will:
- Stop accepting new requests
- Wait for active requests to complete (up to the timeout value)
- Shutdown workers gracefully
You can configure shutdown behavior in your Gunicorn configuration:
```python
# Add to gunicorn_conf.py
graceful_timeout = 30  # Time for workers to finish active requests after SIGTERM
```
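Graceful shutdown also matters on the application side: if your app holds resources such as database connections, you want to release them when a worker exits. Below is a minimal sketch using FastAPI's lifespan handler (this assumes a FastAPI version recent enough to support the `lifespan` parameter; older versions use `@app.on_event("startup")` and `@app.on_event("shutdown")` instead):

```python
# lifespan_app.py -- run per-worker startup and cleanup code
from contextlib import asynccontextmanager

from fastapi import FastAPI


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: runs once in each worker process when Gunicorn spawns it
    print("Worker starting up")
    yield
    # Shutdown: runs when the worker exits gracefully, e.g. after SIGTERM
    # within the graceful_timeout window configured above
    print("Worker shutting down, releasing resources")


app = FastAPI(lifespan=lifespan)


@app.get("/")
async def root():
    return {"message": "Hello World"}
```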
Advanced Configuration with Environment Variables
For production deployments, it's common to use environment variables for configuration:
```python
# gunicorn_conf_env.py
import os
import multiprocessing

# Read settings from the environment, falling back to sensible defaults
workers_per_core_str = os.getenv("WORKERS_PER_CORE", "1")
max_workers_str = os.getenv("MAX_WORKERS", "8")
web_concurrency_str = os.getenv("WEB_CONCURRENCY", None)
host = os.getenv("HOST", "0.0.0.0")
port = os.getenv("PORT", "8000")
bind_env = os.getenv("BIND", None)
use_loglevel = os.getenv("LOG_LEVEL", "info")

# Gunicorn config variables
loglevel = use_loglevel
workers_per_core = float(workers_per_core_str)
cores = multiprocessing.cpu_count()
default_web_concurrency = workers_per_core * cores

# Configure binding: an explicit BIND wins over HOST/PORT
if bind_env:
    bind = bind_env
else:
    bind = f"{host}:{port}"

# Worker processes
if web_concurrency_str:
    web_concurrency = int(web_concurrency_str)
else:
    web_concurrency = max(int(default_web_concurrency), 2)
if max_workers_str:
    web_concurrency = min(web_concurrency, int(max_workers_str))

# Gunicorn settings
workers = web_concurrency
worker_class = "uvicorn.workers.UvicornWorker"
keepalive = 120
timeout = 120
```
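You then run it exactly as before, overriding whatever you need through the environment, for example: `PORT=9000 MAX_WORKERS=4 LOG_LEVEL=warning gunicorn main:app -c gunicorn_conf_env.py`.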
Running Behind a Reverse Proxy
In production, Gunicorn should typically run behind a reverse proxy like Nginx:
Client → Nginx → Gunicorn → FastAPI
Basic Nginx configuration for FastAPI:
```nginx
server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```
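If you want to confirm that Nginx is actually passing these headers through to your application, one option is to deploy a tiny throwaway app behind the proxy that echoes them back (the file name `proxy_check.py` and the `/debug/headers` path below are just examples):

```python
# proxy_check.py -- temporary app for verifying what the reverse proxy forwards
from fastapi import FastAPI, Request

app = FastAPI()


@app.get("/debug/headers")
async def debug_headers(request: Request):
    # request.client is the direct peer (Nginx), while the X-Forwarded-* and
    # X-Real-IP headers carry the original client information set by Nginx.
    return {
        "client_seen_by_app": request.client.host if request.client else None,
        "x_real_ip": request.headers.get("x-real-ip"),
        "x_forwarded_for": request.headers.get("x-forwarded-for"),
        "x_forwarded_proto": request.headers.get("x-forwarded-proto"),
    }
```

Serve it with the same Gunicorn command as before, pointing at `proxy_check:app`, and request the endpoint through Nginx.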
Summary
In this tutorial, we've learned:
- Why Gunicorn is essential for production FastAPI deployments
- How to integrate FastAPI with Gunicorn using Uvicorn workers
- Different ways to configure Gunicorn (command line, config file, environment variables)
- Best practices for optimizing worker count
- Creating startup scripts for better management
- Running FastAPI and Gunicorn behind a reverse proxy
By using FastAPI with Gunicorn, you can achieve a production-grade deployment that efficiently handles concurrent requests, manages processes, and optimizes server resources.
Exercises
- Create a FastAPI application with multiple endpoints and deploy it with Gunicorn.
- Write a script that calculates the optimal number of workers based on your system's CPU cores.
- Create a Docker setup that runs your FastAPI application with Gunicorn.
- Implement a health check endpoint in your FastAPI app and configure Gunicorn to use it.
- Set up a complete production environment with Nginx as a reverse proxy in front of your Gunicorn-powered FastAPI application.