Lab notebook · vol. III · Sheet 001 / 087
— Personal Log of an AI Engineer at Work —
UTC −6 Edition · current
Subject: A. Ponce
Discipline: AI engineering · backend infra
Years: 10 · production
Pages: 001–087
Status: OPEN · ACCEPTING
SHIPPED PROD · '15 — '26
Personal log · entry 087 · open notes from a working engineer

I ship the plumbing that makes LLMs production-ready.

Senior AI engineer, ten years in production. Gateways, RAG, agent orchestration, evals — the unsexy infrastructure that keeps a model honest. The notebook below is how I think — fewer numbers, more decisions.

Abel Ponce¹
¹ Senior AI Engineer · Go · Python · AWS · Temporal · pgvector · freelance, US/LatAm.

§1. What I do

— in plain language, before the diagrams.

I write the unsexy half of an AI product: the gateway that doesn’t drop the request when a provider degrades, the retrieval layer that doesn’t lie to the model, the orchestrator that survives a three-minute LLM timeout, the evals that catch a regression before a user does. Most of my work is invisible by design — when it’s working, nothing happens.

I care about failure modes more than features. I write tests against real infrastructure, not in-memory imitations — and TDD matters more in the era of agents, not less. I’d rather defend a small decision in writing than ship a clever one in silence. The work is the pitch — the rest of this notebook is the work.

I’m available for senior, freelance engagements across the United States and Latin America. Contracts, fractional, or one hard AI problem you’d like me to take off your plate.

§2. Five things I believe

— opinions worth defending in writing.
i. The right tool is the one whose failure mode you can describe in one sentence.
If you can’t name what breaks first, you don’t understand the choice — you’re repeating marketing. This is the filter behind every decision in §3.
ii. TDD matters more in the era of agents, not less.
Non-deterministic systems need more tests, not fewer. Real Postgres in containers, factories over mocks, tests that read like English.
iii. Ask before you build.
The best optimization for an agent is building the right one from the start. Most production wins come from a question, not a clever prompt.
iv. Adaptability is a technical skill.
React, Rails, Go, Python, agents — the stack changes; how you absorb a new one shouldn’t. Learning is the engineering, not the reward for it.
v. A short notes document outlives a long meeting.
Decisions written down survive turnover. Decisions said out loud rarely survive the week. The notebook in your hand is the proof.

§3. One thing I built

— an AI infrastructure platform, drawn here once, and then we move on.
[Fig. 1: A single platform. Drawing no. PLAT-001/v4, scale 1 : 1, drawn by A. Ponce, 04 · 2026. An Envoy layer (auth with Redis, TTL 60s; routing /v1/*) sits in front of five modules: M1 Gateway (multi-provider router, cost fallback; Go · Envoy · Redis), M2 RAG (hybrid retrieval + rerank, async ingest; Python · pgvector), M3 Orchestrator (durable agent workflows, survives waits; Python · Temporal), M4 Eval (LLM judges + drift, aligned · CI; Python · S3), M5 Mesh (streaming inference, replayable; Kafka · Protobuf). Shared infrastructure: Postgres · PgBouncer · Redis · Kafka · Prometheus. Field note: "tutorial branch first. then feat/. then notes." (a.p.)]
Fig. 1. Five services, one Envoy data plane, one set of shared infrastructure. Each module solves a distinct problem and was preceded by a six-hour tutorial branch: a small, defensible draft before any production line of code.
Table 1. Modules, abridged. Stack only; see §6 to talk specifics.
id | Service | Function | Stack
M1 | AI Gateway | Multi-provider LLM router with cost-based fallback & per-tenant rate limits. | Go · Envoy · Redis
M2 | RAG Platform | Hybrid semantic + BM25 retrieval, cross-encoder reranking, async ingestion. | Python · FastAPI · pgvector · Kafka
M3 | Agent Orchestrator | Durable, multi-tenant agent workflows; pgvector memory; OPA guardrails. | Python · Temporal · pgvector · OPA
M4 | LLM Eval | LLM-as-judge scoring with TPR/TNR alignment before any judge enters CI. | Python · Streamlit · S3
M5 | Event Mesh | Real-time inference on streaming events for fraud and personalization. | Kafka · Protobuf · Schema Registry
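
The "TPR/TNR alignment" in M4's row is easier to see in code than in a table cell. Below is a minimal sketch of the idea, not the platform's actual code: compare an LLM judge's pass/fail verdicts with human labels on the same examples, compute the true positive rate and true negative rate, and only let the judge into CI if both clear a threshold. The names `judge_alignment` and `judge_may_enter_ci`, the 0.9 thresholds, and the sample labels are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    tpr: float  # share of human-labelled passes the judge also passes
    tnr: float  # share of human-labelled fails the judge also fails


def judge_alignment(human: list[bool], judge: list[bool]) -> Alignment:
    """Compare judge verdicts with human labels on the same examples."""
    if len(human) != len(judge):
        raise ValueError("label lists must have the same length")
    positives = sum(human)
    negatives = len(human) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one human pass and one human fail")
    true_pos = sum(h and j for h, j in zip(human, judge))
    true_neg = sum((not h) and (not j) for h, j in zip(human, judge))
    return Alignment(tpr=true_pos / positives, tnr=true_neg / negatives)


def judge_may_enter_ci(a: Alignment, min_tpr: float = 0.9, min_tnr: float = 0.9) -> bool:
    """Hypothetical gate: the 0.9 defaults are illustrative, not from the source."""
    return a.tpr >= min_tpr and a.tnr >= min_tnr


# Made-up labels: True means "response is acceptable".
human = [True, True, True, True, False, False, False, False]
judge = [True, True, True, False, False, False, False, True]

alignment = judge_alignment(human, judge)
print(alignment, judge_may_enter_ci(alignment))
```

Reporting the two rates separately, rather than a single accuracy figure, keeps a judge that waves everything through from looking good on a dataset that is mostly passes.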

§4. Field log

— ten years in three chapters, qualitative.
CH.03

AWS

Senior AI Engineer · 2025 →

Designed and shipped an AI infrastructure platform: gateway, retrieval, durable orchestration, evaluation, event mesh — five services sharing one Envoy data plane and one control plane.

Established a discipline I now apply everywhere: each module begins with a tutorial branch — a small, time-capped draft whose only output is a notes document defending the architectural choices to follow.

field note
“tutorial first.
production second.
notes always.”
CH.02

Lovevery

Senior · RAG & on-call

Authored the RAG on-call playbook — what to look at, in what order, when retrieval starts behaving badly at 3 a.m.

The VP of Engineering picked it up and rolled it out across the wider organization. A single document escalating from team to org taught me that good notes travel further than good code.

field note
“a playbook is
a service —
it has uptime.”
CH.01

Nuvocargo

Senior · Cross-border platform

Mexico–US freight platform. Daily QA cycles for the PM team had grown into a quiet productivity tax — repetitive, error-prone, run by hand.

Built and maintained the Cypress automation that absorbed it. The task didn’t go away — it stopped costing anyone’s morning. A reminder that the best engineering often shows up as something nobody notices anymore.

field note
“invisible work
is still work —
especially then.”

§5. Method & reagents

— how I learn, how I ship, how I write it down.

When the technology is new (a new model, a new agent runtime, an unfamiliar retrieval technique), I start with a tutorial branch. Time-capped, never merged. Its only deliverable is a notes document: what I understood, what cost me time, what I’d apply in production. It’s how I absorb the speed of AI without breaking anything along the way.

When I’m in production — TDD by layer, tests against real Postgres in containers (not in-memory imitations), factories over mocks so tests read like English. For AI work the rule extends: evals before features. If I can’t measure whether it improved, I don’t ship it.
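
A minimal sketch of what "factories over mocks" can look like, under stated assumptions: `Tenant`, `make_tenant`, and `can_spend` are hypothetical names invented for this example, and the real Postgres container is left out so the snippet stays self-contained. The factory builds a real object with sensible defaults, and each test overrides only the field it cares about, which is what lets the test body read close to plain English.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Tenant:
    id: int
    name: str
    plan: str = "free"
    monthly_token_limit: int = 100_000


_ids = count(1)  # simple id sequence so every built tenant is distinct


def make_tenant(**overrides) -> Tenant:
    """Factory: a valid Tenant with defaults; tests override only what matters."""
    tenant_id = next(_ids)
    defaults = {"id": tenant_id, "name": f"tenant-{tenant_id}"}
    return Tenant(**{**defaults, **overrides})


def can_spend(tenant: Tenant, used: int, requested: int) -> bool:
    """Hypothetical code under test: a per-tenant budget check."""
    return used + requested <= tenant.monthly_token_limit


def test_tenant_at_limit_cannot_spend_more():
    tenant = make_tenant(monthly_token_limit=1_000)
    assert not can_spend(tenant, used=1_000, requested=1)


def test_tenant_under_limit_can_spend():
    tenant = make_tenant(plan="pro", monthly_token_limit=5_000)
    assert can_spend(tenant, used=100, requested=400)
```

Because the tests exercise a real object rather than a mock configured to return canned answers, a change to `Tenant` or `can_spend` breaks them for the right reason.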

When I’m done — I write. Decision records, on-call playbooks, this notebook. Notes travel further than code; the playbook I wrote at Lovevery escalated from team to org without me in the room.

Reagents (selected)

  • Python: RAG · agents · evals
  • Go: backends, M:N goroutines
  • TypeScript: edge & tooling
  • Temporal: durable execution
  • pgvector: vectors with ACID
  • Kafka: replayable log
  • Envoy: auth + routing
  • FastAPI: typed py services
  • Kubernetes: k3d local · EKS prod
  • testcontainers: real infra in tests

§6. Correspondence

— the easiest way to start a conversation.

Address all correspondence to the author.

Senior AI engineer, freelance: United States and Latin America. Contracts, fractional, or one hard AI problem you’d like me to take off your plate. Full working-hours overlap with US Pacific time; 0–3 h offset from LatAm time zones.

A. Ponce · 04 · 2026