Python is the backend you reach for when the work touches data or AI. This track moves from a virtual environment and type hints to a typed FastAPI service, async I/O, the data/ML ecosystem, and pytest, with every step tagged by level so you can read only as deep as you need. Where Go ships a tiny binary, Python ships the unmatched library ecosystem that makes models and data tractable.
Create an isolated environment with venv
Beginner
Make a project folder, create a virtual environment with python -m venv, and activate it.
Why every project gets its own environment
A virtual environment is a self-contained directory with its own package folder, so each project pins its own
dependency versions without colliding with the system Python or other projects. The standard-library
venv ships with Python 3, so there’s nothing to install first. Many teams now use uv (a fast,
drop-in installer/resolver) instead of plain pip — it creates the same kind of environment but resolves
and installs much faster. Either is fine; this track shows venv + pip and notes the uv equivalent.
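To make the isolation concrete, here is a minimal sketch of how a script can tell whether it is running inside a virtual environment. It uses only the standard library: inside a venv, sys.prefix points at the environment, while sys.base_prefix still points at the Python installation the environment was created from. The helper name in_virtualenv is made up for this illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # In a venv, sys.prefix is the environment directory, while
    # sys.base_prefix is the installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a venv:", in_virtualenv())
```

Run it before and after activating .venv and the second line should flip, which is a quick way to confirm that pip will install into the project rather than the system Python.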
Create and activate a venv
# Verify the interpreter (3.11+ for this track)
python3 --version
mkdir helix-api && cd helix-api
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# uv equivalent (optional, faster): uv venv && source .venv/bin/activate
python -m pip install --upgrade pip

Write typed Python with type hints
Beginner
Write a small function with parameter and return type hints, then check it with a type checker.
Hints are documentation the tools can enforce
Python is dynamically typed, but type hints (def total(items: list[int]) -> int:) let editors and
checkers catch mistakes before runtime. They don’t change execution — the interpreter ignores them — but
tools like mypy or pyright read them to flag bugs, and frameworks like FastAPI and Pydantic use
them to drive real behaviour (validation, docs, serialisation). Hints are how a fast-to-write language
keeps large codebases honest.
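The point that the interpreter ignores hints is easy to see for yourself. The sketch below reuses the total signature from the paragraph above; it reads the hints back with typing.get_type_hints, then passes floats to a function annotated for ints to show that nothing stops the call at runtime.

```python
from typing import get_type_hints


def total(items: list[int]) -> int:
    return sum(items)


# Hints are stored as metadata that tools and frameworks can read back.
hints = get_type_hints(total)
print(hints)

# The interpreter does not enforce them: floats pass straight through.
# A checker such as mypy would flag this call; Python itself just runs it.
result = total([1.5, 2.5])
print(result)  # 4.0
```

This is why a type checker belongs in your workflow: the annotation alone documents intent, and only the checker turns it into a guarantee.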
A typed function, checked
# money.py
def line_total(unit_price_cents: int, quantity: int) -> int:
    """Total in integer cents; never use float for money."""
    return unit_price_cents * quantity

python -m pip install mypy
python -m mypy money.py # reports type errors, exits non-zero on failure

Chat prompt: paste into a chat to get the code
Role: Python teacher. The reader has no repo access here — return complete code.
Task: Show a small, fully type-hinted module with a function that groups a list of orders by customer id.
Requirements:
- Use modern built-in generics (list[...], dict[...]), no `typing.List`.
- Signature: group_by_customer(orders: list[Order]) -> dict[str, list[Order]].
- Define Order as a typed dataclass with id: str, customer_id: str, total_cents: int.
- No third-party dependencies; standard library only.
Tests / acceptance (describe, since no repo):
- Two orders sharing a customer_id land in the same list.
- `python -m mypy <file>` reports no errors.
Output: the complete module, no commentary.

Stand up a FastAPI app with uvicorn
Beginner
Install fastapi and uvicorn, write an app with a GET /healthz route, and run it.
Why FastAPI and what ASGI means
FastAPI is the modern Python web framework for typed APIs: you declare routes as functions with
type-hinted parameters, and it derives request parsing, validation, and OpenAPI docs from those hints.
It runs on ASGI (the Asynchronous Server Gateway Interface), the async successor to WSGI, served by
uvicorn. Because it’s async-native, a single process can hold thousands of concurrent connections that
are mostly waiting on I/O. Open /docs after starting and you get interactive Swagger UI for free.
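If ASGI still feels abstract, it helps to see the interface with the framework stripped away. The following is a rough sketch of a bare ASGI application (a single coroutine taking scope, receive, and send) answering /healthz, plus a tiny stand-in for the server so it runs with the standard library alone. The names healthz_app and call_app are invented for this illustration; in practice uvicorn plays the role of call_app and FastAPI generates the application callable for you.

```python
import asyncio
import json


async def healthz_app(scope, receive, send):
    """A bare ASGI application: one coroutine, called once per connection."""
    assert scope["type"] == "http"  # this sketch ignores lifespan and websocket scopes
    ok = scope["path"] == "/healthz"
    body = json.dumps({"status": "ok"} if ok else {"detail": "Not Found"}).encode()
    await send({
        "type": "http.response.start",
        "status": 200 if ok else 404,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


async def call_app(app, path):
    """Hypothetical stand-in for a server: build a scope, collect what the app sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent


if __name__ == "__main__":
    start, body = asyncio.run(call_app(healthz_app, "/healthz"))
    print(start["status"], json.loads(body["body"]))
```

Because the application is just a coroutine, the server can keep many of them suspended on I/O at once, which is where the concurrency described above comes from.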
Minimal FastAPI app
python -m pip install "fastapi[standard]" uvicorn

# app/main.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/healthz")
def healthz() -> dict[str, str]:
    return {"status": "ok"}

uvicorn app.main:app --reload
# in another terminal:
curl -i localhost:8000/healthz
# interactive docs: open http://localhost:8000/docs

Agent prompt: paste into an agent with repo access
Role: Senior Python engineer working in this repo.
Context: Fresh project helix-api with an active venv; FastAPI and uvicorn installed; Python 3.11+.
Task: Create a minimal, idiomatic FastAPI app in ./app/main.py.
Requirements:
- A FastAPI() instance named `app`.
- GET /healthz returns 200 with JSON {"status": "ok"}.
- The handler is type-hinted (-> dict[str, str]); no third-party deps beyond fastapi/uvicorn.
Tests / acceptance:
- `uvicorn app.main:app` starts without error.
- `curl -s -o /dev/null -w "%{http_code}" localhost:8000/healthz` prints 200.
Output: a unified diff plus a one-paragraph summary of the design.

Validate request bodies with Pydantic v2
Beginner
Define a BaseModel for the request, type a POST handler with it, and let FastAPI validate.
Pydantic v2 validates at the edge
Pydantic v2 turns a typed class into a validator. Subclass BaseModel, declare fields with hints, and
add constraints with Field(...) (e.g. gt=0, min_length=1). When a FastAPI handler takes that model
as a parameter, FastAPI parses the JSON body, validates it, and returns a structured 422 with field-level
errors automatically — you never hand-write that code. Note the v2 spelling: use model_config /
model_validate (the v1 class Config and .parse_obj are deprecated).
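You can watch the validator work without a web server. This sketch assumes Pydantic v2 is installed and mirrors the constraints described above (min_length=1, gt=0); the helper invalid_fields is a made-up name that calls model_validate directly and collects the location of each failure, which is roughly the information FastAPI puts into its 422 response.

```python
from pydantic import BaseModel, Field, ValidationError


class ProductIn(BaseModel):
    name: str = Field(min_length=1)
    unit_price_cents: int = Field(gt=0)


def invalid_fields(payload: dict) -> list[str]:
    """Return the dotted location of every field that failed validation."""
    try:
        ProductIn.model_validate(payload)
    except ValidationError as exc:
        # Each error carries a "loc" tuple naming the offending field.
        return [".".join(str(part) for part in err["loc"]) for err in exc.errors()]
    return []


if __name__ == "__main__":
    print(invalid_fields({"name": "Widget", "unit_price_cents": 1999}))
    print(invalid_fields({"name": "", "unit_price_cents": 0}))
```

The first call prints an empty list; the second reports both name and unit_price_cents, since Pydantic collects every failure rather than stopping at the first.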
A validated POST endpoint
# app/main.py
from fastapi import FastAPI
from pydantic import BaseModel, Field

app = FastAPI()

class ProductIn(BaseModel):
    name: str = Field(min_length=1)
    unit_price_cents: int = Field(gt=0)  # money as integer cents

class ProductOut(ProductIn):
    id: str

@app.post("/products", status_code=201)
def create_product(body: ProductIn) -> ProductOut:
    return ProductOut(id="p1", **body.model_dump())

# valid -> 201; empty name or non-positive price -> 422 with field errors
curl -s -X POST localhost:8000/products \
  -H 'content-type: application/json' \
  -d '{"name":"Widget","unit_price_cents":1999}'

Agent prompt: paste into an agent with repo access
Role: Senior Python engineer in this repo.
Context: FastAPI app in ./app/main.py; Pydantic v2; Python 3.11+.
Task: Add a POST /products endpoint that validates input with a Pydantic v2 BaseModel.
Requirements:
- ProductIn: name (str, min_length=1), unit_price_cents (int, gt=0).
- ProductOut extends ProductIn with id: str; respond 201 on success.
- Use Pydantic v2 idioms only (Field, model_dump); no deprecated v1 Config or parse_obj.
Tests / acceptance:
- POST {"name":"","unit_price_cents":10} returns 422.
- POST {"name":"Widget","unit_price_cents":1999} returns 201 with an "id" field.
- `python -m mypy app` is clean.
Output: a unified diff plus a one-paragraph summary.

Do concurrent I/O with asyncio
Intermediate
Write async def handlers and call multiple awaitables together with asyncio.gather.
async is for waiting, not for CPU
asyncio runs an event loop on a single thread: an async def coroutine can await an I/O operation
(a database call, an HTTP request to a model API) and yield control while it waits, so the loop services
other requests meanwhile. asyncio.gather(*coros) runs several awaitables concurrently and returns their
results in order. This shines for I/O-bound work — exactly what an API gateway in front of databases and
model providers does. It does not speed up CPU-bound code (see the GIL step); for that, use processes.
Use an async HTTP client like httpx so the calls actually yield.
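Before bringing in a real HTTP client, a standard-library sketch can show the two properties of gather that matter here: the waits overlap, and results come back in argument order rather than completion order. In this illustration asyncio.sleep stands in for a network call, and the names fake_io and run_demo are invented for the example.

```python
import asyncio
import time


async def fake_io(name: str, delay: float) -> str:
    """Stand-in for an I/O call: waits without blocking the event loop."""
    await asyncio.sleep(delay)
    return name


async def run_demo() -> tuple[list[str], float]:
    start = time.perf_counter()
    # "b" finishes first, but gather returns results in argument order.
    results = await asyncio.gather(
        fake_io("a", 0.20),
        fake_io("b", 0.10),
        fake_io("c", 0.15),
    )
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_demo())
    print(results, f"{elapsed:.2f}s")
```

The total time lands near the longest single wait (about 0.2 s) instead of the 0.45 s a sequential loop would need. Swap asyncio.sleep for time.sleep and the overlap disappears, which is the reason blocking calls have no place inside a coroutine.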
Concurrent awaits with gather
# app/fetch.py
import asyncio
import httpx

async def fetch_one(client: httpx.AsyncClient, url: str) -> int:
    resp = await client.get(url)
    return resp.status_code

async def fetch_all(urls: list[str]) -> list[int]:
    # One shared client reuses connections; the timeout bounds every request.
    async with httpx.AsyncClient(timeout=5.0) as client:
        return await asyncio.gather(*(fetch_one(client, u) for u in urls))

python -m pip install httpx
python -c "import asyncio, app.fetch as f; print(asyncio.run(f.fetch_all(['https://example.com'])))"

Agent prompt: paste into an agent with repo access
Role: Senior Python engineer in this repo.
Context: FastAPI app; Python 3.11+ with httpx installed. We need to enrich N items via an external API concurrently.
Task: Implement async def enrich(ids: list[str]) -> list[Product] that fetches each id concurrently.
Requirements:
- Use a single httpx.AsyncClient and asyncio.gather; results preserve input order.
- A timeout on the client; on any fetch failure, raise (do not silently drop results).
- No blocking calls (requests, time.sleep) inside the coroutine — those would stall the event loop.
Tests / acceptance:
- A pytest-asyncio test with a mocked transport asserts order is preserved and a failure propagates.
- `pytest -q` passes.
Output: a unified diff plus a one-paragraph rationale for using gather over a sequential loop.

Crunch data with numpy and pandas
Intermediate
Load tabular data into a pandas DataFrame and compute an aggregate with vectorised operations.
Why this is Python's home turf
This is what Python is genuinely best at. numpy gives you fast, C-backed n-dimensional arrays;
pandas builds labelled tables (DataFrames) on top of them. Vectorised operations push the loop into
optimised native code, so df.groupby("customer_id")["total_cents"].sum() is both faster and clearer than
a hand-written Python loop. No other mainstream backend language has an equivalent data ecosystem — this is
the core reason teams choose Python for anything analytical.
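To see what "pushing the loop into native code" replaces, the sketch below computes the same per-customer totals twice: once with a hand-written Python loop and once with the groupby expression quoted above. It assumes pandas is installed; the sample rows and the helper name loop_totals are made up for the illustration.

```python
from collections import defaultdict

import pandas as pd

rows = [("x", 10), ("x", 5), ("y", 3)]
df = pd.DataFrame(rows, columns=["customer_id", "total_cents"])


def loop_totals(pairs: list[tuple[str, int]]) -> dict[str, int]:
    """The hand-written version: one Python-level iteration per row."""
    totals: dict[str, int] = defaultdict(int)
    for customer_id, cents in pairs:
        totals[customer_id] += cents
    return dict(totals)


# The vectorised version: a single expression, with the loop run in native code.
vectorised = df.groupby("customer_id")["total_cents"].sum()

print(loop_totals(rows))
print(vectorised.to_dict())
```

Both print the same totals. On three rows the speed difference is invisible; the gap tends to open up as row counts grow, because the Python loop pays interpreter overhead per row and the groupby does not.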
Group and aggregate a DataFrame
# analyze.py
import pandas as pd

def revenue_by_customer(df: pd.DataFrame) -> pd.Series:
    # df has columns: customer_id (str), total_cents (int)
    return df.groupby("customer_id")["total_cents"].sum().sort_values(ascending=False)

python -m pip install pandas numpy
python -c "import pandas as pd, analyze as a; \
print(a.revenue_by_customer(pd.DataFrame({'customer_id':['x','x','y'],'total_cents':[10,5,3]})))"

Agent prompt: paste into an agent with repo access
Role: Senior Python data engineer in this repo.
Context: pandas and numpy installed; Python 3.11+. Orders arrive as a CSV with columns id, customer_id, total_cents.
Task: Implement revenue_by_customer(path: str) -> pd.Series returning total revenue per customer, descending.
Requirements:
- Read the CSV with pandas; do the aggregation with a vectorised groupby (no Python for-loop over rows).
- total_cents must be treated as int; coerce and fail loudly on non-numeric values.
- Return a Series indexed by customer_id, sorted by revenue descending.
Tests / acceptance:
- A pytest test with a small in-memory CSV asserts the top customer and the exact totals.
- `pytest -q` passes.
Output: a unified diff plus the before/after of how you avoided a row-wise loop.

Test everything with pytest
Intermediate
Install pytest and FastAPI’s TestClient, then write assertions against your endpoints.
pytest is the de facto Python test runner
pytest discovers any test_*.py file and any test_* function, and uses plain assert — it rewrites
the assertion to print a helpful diff on failure, so you rarely need special matchers. FastAPI ships a
TestClient (built on httpx) that drives your app in-process, no running server needed. Fixtures
(@pytest.fixture) provide reusable setup, and dependency_overrides lets you swap a real dependency for
a fake in a test. Parametrise cases with @pytest.mark.parametrize to keep them dense.
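Parametrisation is easiest to appreciate on a pure function. This sketch assumes pytest is installed and reuses line_total from the type-hints step (copied inline so the file stands alone); the case values are made up for the illustration.

```python
import pytest


def line_total(unit_price_cents: int, quantity: int) -> int:
    """Total in integer cents (same function as in the type-hints step)."""
    return unit_price_cents * quantity


@pytest.mark.parametrize(
    ("unit_price_cents", "quantity", "expected"),
    [
        (1999, 1, 1999),
        (1999, 3, 5997),
        (250, 0, 0),
    ],
)
def test_line_total(unit_price_cents: int, quantity: int, expected: int) -> None:
    # Plain assert: pytest rewrites it to show both sides on failure.
    assert line_total(unit_price_cents, quantity) == expected
```

Saved as test_money.py, this is collected as three separate tests, so a failure names the exact case that broke instead of hiding it inside a loop.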
An endpoint test
# tests/test_products.py
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

def test_create_product_rejects_empty_name() -> None:
    resp = client.post("/products", json={"name": "", "unit_price_cents": 10})
    assert resp.status_code == 422

def test_create_product_happy_path() -> None:
    resp = client.post("/products", json={"name": "Widget", "unit_price_cents": 1999})
    assert resp.status_code == 201
    assert resp.json()["id"]

python -m pip install pytest
pytest -q

Agent prompt: paste into an agent with repo access
Role: Senior Python engineer in this repo.
Context: FastAPI app in ./app with /healthz and /products; pytest installed.
Task: Add a tests/ package covering both routes using FastAPI's TestClient.
Requirements:
- Use TestClient(app); no live server or network sockets.
- Parametrise the /products validation cases (empty name, non-positive price -> 422; valid -> 201).
- Cover /healthz returning 200 and {"status": "ok"}.
Tests / acceptance:
- `pytest -q` passes with all cases green.
Output: a unified diff plus a summary of the cases covered.

Call a model SDK to add an AI feature
Advanced
Add a route that calls a hosted LLM through its official Python SDK and returns the result.
Why the model SDKs live in Python first
Every major model provider ships a first-class Python SDK, usually before any other language — this is the practical reason AI products are built in Python. The official SDKs expose async clients that fit straight into a FastAPI handler. Keep API keys in environment variables (read via your Settings dependency), never in code. Use the async client so the network wait yields the event loop, and always set a timeout. Check the provider’s own docs for the exact client and method names, since SDKs evolve — see the Gemini API track for the Gemini specifics this curriculum uses.
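One possible way to arrange the pieces this paragraph names (key from the environment, an async call, a timeout) is sketched below using only the standard library. Everything provider-specific is deliberately left out: ModelClient is a hypothetical protocol standing in for whichever official async client you choose, MODEL_API_KEY is an assumed variable name, and FakeClient exists only so the sketch runs without a network. Treat it as a shape to adapt rather than any SDK's actual interface.

```python
import asyncio
import os
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Settings:
    model_api_key: str
    timeout_seconds: float = 10.0


def get_settings() -> Settings:
    # MODEL_API_KEY is an assumed name; a missing key raises KeyError loudly.
    return Settings(model_api_key=os.environ["MODEL_API_KEY"])


class ModelClient(Protocol):
    """Hypothetical seam: wrap your provider's real async client to fit this."""

    async def generate(self, prompt: str) -> str: ...


async def call_model(prompt: str, client: ModelClient, settings: Settings) -> str:
    # wait_for cancels the call and raises TimeoutError if the provider stalls.
    return await asyncio.wait_for(
        client.generate(prompt), timeout=settings.timeout_seconds
    )


class FakeClient:
    """Test double: no network, deterministic answer."""

    async def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


if __name__ == "__main__":
    os.environ.setdefault("MODEL_API_KEY", "not-a-real-key")
    print(asyncio.run(call_model("hello", FakeClient(), get_settings())))
```

Keeping the SDK behind a small protocol like this means route tests can pass in the fake, and a change in the provider's method names touches one adapter instead of every handler.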
An async LLM-backed route (shape)
# app/ai.py (illustrative shape; use your provider's official async client and current method names)
from fastapi import APIRouter, Depends
from pydantic import BaseModel, Field
from app.deps import Settings, get_settings

router = APIRouter()

class AskIn(BaseModel):
    prompt: str = Field(min_length=1)

class AskOut(BaseModel):
    answer: str

@router.post("/ask")
async def ask(body: AskIn, settings: Settings = Depends(get_settings)) -> AskOut:
    # Construct the provider's async client with settings (API key from env),
    # await the generate/messages call, then map its text into AskOut.
    answer = await call_model(body.prompt, settings)  # your thin wrapper
    return AskOut(answer=answer)

Agent prompt: paste into an agent with repo access
Role: Senior Python AI engineer in this repo.
Context: FastAPI app with a Settings dependency that reads the model API key from env; Python 3.11+.
The provider and its official async Python SDK are chosen separately — do not invent SDK names or methods.
Task: Add a POST /ask route that takes {"prompt": str} and returns {"answer": str} from a hosted LLM.
Requirements:
- async handler using the provider's official async client; API key comes from Settings (env), never hardcoded.
- Validate the prompt with Pydantic v2 (min_length=1 -> 422 on empty).
- Set a request timeout; on provider error, return 502 with a JSON error (do not leak the raw exception).
- Isolate the SDK call behind a thin wrapper so it can be faked in tests.
Tests / acceptance:
- A pytest test fakes the wrapper (no real network) and asserts /ask returns 200 with the faked answer.
- Empty prompt returns 422.
- `pytest -q` passes.
Output: a unified diff plus a one-paragraph note on where the API key lives and why.

Work with the GIL, not against it
Advanced
For CPU-heavy work, move it off the event loop: use a process pool or run it in a thread executor.
What the GIL does and doesn't stop
CPython has a Global Interpreter Lock: only one thread executes Python bytecode at a time. So threads
do not give you parallel speedup for CPU-bound code — they help only when threads are waiting on I/O
(which asyncio already handles more cleanly). For real CPU parallelism, use multiprocessing /
ProcessPoolExecutor, which run separate interpreters across cores. In an async handler, never run a heavy
loop inline — it blocks the whole event loop; offload it with loop.run_in_executor (a process pool for
CPU work, a thread pool for blocking I/O without a native async API). Many numeric libraries (numpy) also
release the GIL inside their C code, so vectorising is often the simplest fix.
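The thread-pool half of that advice (blocking I/O with no native async API) can be sketched with the standard library alone. Here time.sleep stands in for a blocking driver call, and a heartbeat coroutine shows that the event loop keeps running while the call sits in the default thread executor. The names blocking_read, heartbeat, and run_demo are invented for this illustration.

```python
import asyncio
import time

BLOCK_SECONDS = 0.5


def blocking_read() -> str:
    """Stand-in for a blocking call with no async API (e.g. a legacy driver)."""
    time.sleep(BLOCK_SECONDS)
    return "done"


async def heartbeat(started: float) -> float:
    """Tick a few times; only possible if the event loop is not blocked."""
    for _ in range(5):
        await asyncio.sleep(0.02)
    return time.perf_counter() - started


async def run_demo() -> tuple[str, float]:
    loop = asyncio.get_running_loop()
    started = time.perf_counter()
    # None selects the loop's default ThreadPoolExecutor.
    result, heartbeat_elapsed = await asyncio.gather(
        loop.run_in_executor(None, blocking_read),
        heartbeat(started),
    )
    return result, heartbeat_elapsed


if __name__ == "__main__":
    result, heartbeat_elapsed = asyncio.run(run_demo())
    print(result, f"heartbeat finished after {heartbeat_elapsed:.2f}s")
```

The heartbeat finishes in roughly 0.1 s even though the blocking call takes 0.5 s. Call blocking_read() inline inside the coroutine instead and the heartbeat cannot start until the sleep is over. A thread is enough here because sleeping (like waiting on a socket) releases the GIL; CPU-bound work would still need the process pool.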
Offload CPU work to a process pool
# app/cpu.py
import asyncio
from concurrent.futures import ProcessPoolExecutor

def heavy(n: int) -> int:
    return sum(i * i for i in range(n))  # CPU-bound; would block the event loop

async def compute(n: int) -> int:
    loop = asyncio.get_running_loop()
    # heavy is a top-level function, so it can be pickled and sent to a worker process.
    with ProcessPoolExecutor() as pool:
        return await loop.run_in_executor(pool, heavy, n)

python -c "import asyncio, app.cpu as c; print(asyncio.run(c.compute(10_000_000)))"

Agent prompt: paste into an agent with repo access
Role: Senior Python engineer in this repo.
Context: An async FastAPI handler currently runs a CPU-bound transform inline and stalls under load. Python 3.11+.
Task: Move the CPU-bound work off the event loop using ProcessPoolExecutor and loop.run_in_executor.
Requirements:
- The function offloaded must be top-level/importable (picklable) so it works with multiprocessing.
- The async handler awaits the executor result; it must not call the heavy function directly.
- Explain in a comment why a thread pool would NOT help here (the GIL).
Tests / acceptance:
- A pytest-asyncio test asserts the offloaded compute returns the same value as the direct call.
- Under a concurrent load test, /healthz still responds while a compute is in flight (describe the check).
- `pytest -q` passes.
Output: a unified diff plus a one-paragraph explanation of GIL vs process vs thread here.

Package and ship a lean container
Advanced
Pin dependencies, then build a slim multi-stage Docker image that runs uvicorn.
Why pin, and why multi-stage
Reproducible deploys start with pinned dependencies — freeze exact versions
(pip freeze > requirements.txt, or a lockfile with uv/Poetry) so the image you test is the image you
ship. A multi-stage Dockerfile installs dependencies in a build stage, then copies only the result into a slim runtime base
(python:3.12-slim), keeping the image small and the attack surface low. Run as a non-root user and start
uvicorn bound to 0.0.0.0. Python images are larger than a Go static binary — that’s the trade-off for the
ecosystem — but -slim plus good layer caching keeps them reasonable.
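A pinned file is easy to check mechanically. The sketch below is a deliberately simple, hypothetical helper (the name unpinned and the sample version numbers are placeholders, not recommendations) that lists every requirement line lacking an exact == pin. It skips comments, blank lines, and pip options, and makes no attempt to handle URLs or environment markers.

```python
def unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that do not pin an exact version with ==."""
    loose: list[str] = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options like -r / -e
            continue
        if "==" not in line:
            loose.append(line)
    return loose


if __name__ == "__main__":
    # Placeholder versions, for illustration only.
    sample = "fastapi==1.0.0\nuvicorn>=0.30\n\n# tooling\nhttpx\n"
    print(unpinned(sample))  # ['uvicorn>=0.30', 'httpx']
```

Run as a CI step against requirements.txt, a non-empty result is a signal that a rebuild could pull different versions than the ones you tested.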
Multi-stage Dockerfile
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

FROM python:3.12-slim
WORKDIR /app
COPY --from=build /install /usr/local
COPY app ./app
RUN useradd -m appuser
USER appuser
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

pip freeze > requirements.txt
docker build -t helix-api .
docker run -p 8000:8000 -e DATABASE_URL=... helix-api

Agent prompt: paste into an agent with repo access
Role: Senior Python platform engineer in this repo.
Context: FastAPI app in ./app served by uvicorn; dependencies in the active venv; Python 3.12.
Task: Produce a pinned requirements.txt and a multi-stage Dockerfile that runs the app as a non-root user.
Requirements:
- requirements.txt holds exact pinned versions (from pip freeze or a lockfile).
- Multi-stage build: install in one stage, copy into a python:3.12-slim runtime; no build tools in the final image.
- Final image runs as a non-root user and starts uvicorn on 0.0.0.0:8000.
Tests / acceptance:
- `docker build -t helix-api .` succeeds.
- `docker run -p 8000:8000 helix-api` then `curl -s -o /dev/null -w "%{http_code}" localhost:8000/healthz` prints 200.
Output: a unified diff plus a one-paragraph note on what you pinned and why.

Where to take it next
- Build the AI service this track points at in Helix Assistant, where a FastAPI backend turns prompts into structured, validated responses.
- Pair this backend with a model provider in the Gemini API track — the AI lane this curriculum builds on.
- Need leaner, faster request serving with a tiny deploy footprint instead? Compare against Go.