alex@vesa:~/portfolio · zsh · 184×56 · v10.0.26 · BUC · open
[ok] boot sequence complete
[ok] loaded alex.vesa <senior_ai_engineer>
[ok] mounted ~/portfolio · 10y of uptime
[··] awaiting your_brief.md

Building AI systems
that earn their keep.

Ten years architecting, shipping and maintaining machine‑learning products across automotive, medtech and consumer services. I work with teams that need an outside operator — not a slide deck.

10 years in production AI
23 models shipped
4 verticals
1 book in progress
$ cat about.md · 378 bytes · md
# four positions

The product I sell is the boring discipline that makes the interesting work possible — retraining loops, eval infrastructure, observability, on‑call.

  1. Trends are noise. Best practices are evidence. I pay attention to the second.
  2. An AI system is a contract with reality, not a demo. If it can't be observed, retrained and rolled back, it doesn't ship.
  3. MLOps is the boring discipline that makes the interesting work possible.
  4. Strategy that doesn't survive contact with a Dockerfile isn't strategy.
$ ls writing/*.essay · 9 files · All
01 · Apr 2026 · MLOps · 11 min
The retraining loop is the product.
Most ML projects fail at week 14, when the data drifts and nobody notices. A field guide to building the loop before you build the model.

02 · Mar 2026 · Systems · 8 min
Why I stopped writing LangChain in production.
After three migrations I wrote down what an LLM application actually needs, and what frameworks keep getting in the way of.

03 · Feb 2026 · Medtech · 14 min
Segmenting organs at risk: a 2.5D postmortem.
Lessons from a year on a CNN that had to satisfy radiologists, regulators and a CT scanner from 2009. Mostly the third one.

04 · Jan 2026 · Hiring · 6 min
Hire the engineer who has been on call.
A short, unsentimental note on the difference between people who train models and people who run them.

05 · Dec 2025 · Architecture · 12 min
Vector databases are filesystems with opinions.
An honest comparison of Qdrant, Chroma, pgvector and Elasticsearch, and when not to use any of them.

06 · Nov 2025 · Consulting · 9 min
The fixed‑bid AI project is a moral hazard.
Why I bill for the loop, not the launch, and how to write a statement of work that survives the first model run.

07 · Oct 2025 · Evaluation · 7 min
Eval sets are the only honest thing in your repo.
Building eval infrastructure that catches regressions before your users do, and what that costs.

08 · Sep 2025 · Automotive · 10 min
Night‑vision CNNs, three years later.
What survived from the original architecture, what didn't, and what I'd build differently if the headlights were on me again.

09 · Aug 2025 · Notes · 5 min
Against the AI roadmap.
Roadmaps assume the territory holds still. Mine doesn't. A short defence of running an AI team like a research group with a budget.
$ open books/ · 2 titles · 1 in progress
I · In progress · Manning, 2026

Production AI

A field manual for shipping machine learning into systems that have to keep working.

Eighteen chapters on the parts of ML that don't fit in a notebook — retraining, eval, drift, on‑call, cost, and the politics of model rollback. Drafts ship to subscribers monthly.

≈ 420 pp
II · Essay collection · 2024

The Boring Discipline

Twenty‑two essays on MLOps as a craft.

A self‑published collection of long‑form pieces written between 2021 and 2024, gathered and rewritten. Read by ~14,000 engineers and three regulators.

186 pp
$ talks --since 2025 · 7 rows · CSV
date      venue                   location       title                                    format
Jun 2026  PyData Bucharest        Bucharest      Eval‑first development for LLM systems   Keynote
May 2026  MLOps World             Berlin         When to throw the model away             Talk
Mar 2026  Stanford MedAI Seminar  Remote         Segmenting organs at risk: a postmortem  Lecture
Nov 2025  NeurIPS Workshop        Vancouver      The retraining loop is the product       Paper + talk
Sep 2025  AI Engineer Summit      San Francisco  Production patterns for retrieval        Workshop
Jun 2025  Devoxx Romania          Bucharest      MLOps for backend engineers              Talk
Mar 2025  EuroPython              Prague         Three years of PyTorch in production     Talk
$ engagements --list · 3 engagements · billable by the loop
01

AI Systems Architecture

4–12 weeks

From greenfield to legacy. I design the model, data and infrastructure layers as one system — not three separate procurement decisions.

  • Architecture review & technical due diligence
  • Model + data + serving topology
  • Cost, latency and observability budgets
  • Hiring plan for the team that will run it
02

MLOps Engagement

Retainer, 3–9 months

I embed with your team and build the boring scaffolding — eval, retraining, monitoring, on‑call — that makes the interesting work possible.

  • CI/CD for models, data and prompts
  • Eval infrastructure & regression gates
  • Drift, cost and quality dashboards
  • Runbooks and on‑call rotation design
03

Executive Advisory

Monthly retainer

For founders, CTOs and product leaders who need a working AI engineer in the room when the decisions are made. Quietly, on a schedule.

  • Monthly working sessions
  • Roadmap and hiring review
  • Vendor and architecture sanity checks
  • On‑call for 'should we?' decisions

The right time to call me is before the proof of concept is dressed up as the product.

$ ps aux | grep prod · 4 processes · status RUN
P‑01 · 2024–25 · Medtech

Workspace OS for clinical data

An AI workspace that turns scanned referral chaos into structured, queryable patient data. LLM extraction + OCR + a stubborn schema. Deployed across 14 hospital networks.

GPT‑4o · Tesseract · Postgres · Qdrant · FastAPI · AWS
P‑02 · 2022–24 · Medtech

Organs‑at‑risk segmentation

2.5D and 3D CNN segmentation pipeline used in radiation therapy planning. Sub‑millimetre tolerance, traceable retraining, and a paper trail that survives an FDA conversation.

PyTorch · MONAI · CUDA · MLflow · Weights & Biases
P‑03 · 2021–22 · Automotive

Night‑vision driver assist

Real‑time pedestrian and animal detection on infrared imagery, running on automotive‑grade silicon. Three OEM integrations, one of which still ships today.

TensorFlow Lite · OpenCV · C++ · ONNX · Embedded Linux
P‑04 · 2020–21 · Consumer

Social media post generator

Early LLM product, built on first‑gen GPT‑3. Brand voice fine‑tuning, scheduled generation, human approval queue. A useful lesson in what LLM products were actually for.

GPT‑3 · Django · Redis · Celery
$ which --all
Languages
  • Python
  • Go
  • Node.js
  • SQL
ML / DL
  • PyTorch
  • TensorFlow
  • Keras
  • MONAI
  • OpenCV
  • CUDA
Data & infra
  • Postgres
  • Elasticsearch
  • Redis
  • Kafka
  • Qdrant
  • Chroma
Cloud & ops
  • AWS
  • Docker
  • Terraform
  • CDK
  • MLflow
  • W&B
  • Langfuse
Frameworks
  • FastAPI
  • Django
  • Flask
$ grep -r vesa /press · 6 mentions · grep
$ ssh alex.vesa@cube-digital.io · public key on request
→ open a session

Write to me plainly.
I read everything.

A short note about what you're building and where it's stuck is the fastest way to start. I usually reply within 48 hours.

location: Timisoara · Remote
engagements: Architecture · MLOps · Advisory
languages: English, Romanian
reply: ~ 48 hrs
timezone: EET / UTC+2