Available for opportunities · Mexico

Bernardo Vega

Machine Learning Engineer

I build end-to-end ML systems, from raw data to a service in production.

About

From data to product

Machine Learning Engineer focused on shipping models to production with engineering rigor: reproducible pipelines, honest evaluation, and deployable services. I come from full-stack development, so I design both the model and the infrastructure that serves it.

My approach combines two worlds: the rigor of machine learning (honest evaluation, reproducibility, data at scale) and the software-engineering discipline that turns a model from an experiment into a reliable service.

Machine Learning

Learning-to-rank · Clustering · Feature engineering · Evaluation / metrics · MLflow

Languages

Python · TypeScript · SQL

Backend / Serving

FastAPI · NestJS · PostgreSQL · MCP · Claude Agent SDK

Frontend

Next.js · React · Tailwind CSS

Platform

Docker · GitHub Actions · Azure · Cloudflare

Career

Experience

Machine Learning Engineer · Neural GT

Design and production deployment of end-to-end machine learning models, from data ingestion to a running service.

2024 - Present

Remote

Built reproducible ML pipelines with experiment tracking and versioned evaluation.
Shipped containerized inference services (FastAPI) with CI/CD.
Worked on large-scale data with streaming feature engineering and leak-free splits.

Python · PyTorch · LightGBM · MLflow · FastAPI · Docker

Machine Learning Engineer · Fintech

Applied ML in the financial domain, focused on numerical correctness, traceability, and reliable deployment.

2022 - 2024

Remote

Predictive modeling over transactional data with an emphasis on robust validation.
Integrated models into product flows with monitoring and automated tests.
Full-stack collaboration: from the data backend to the interface consuming the model.

Python · TypeScript · PostgreSQL · scikit-learn · Next.js

Selected work

Projects

GeoPlay Recommender

Completed · 2025

Player segmentation and geospatial recommendation for location-based mobile games.

Repo

Synthetic events: 146M
NDCG@10 (ranking): 0.63 · +137%
Clustering ARI (HDBSCAN): 0.864
Phases completed: 10 / 10

End-to-end ML system for a synthetic Pokémon GO-style game. It generates realistic behavior data, segments players into archetypes via clustering, and ranks geographic hexes (H3) by visit probability using learning-to-rank. Built as a demonstration of production-grade ML engineering.

Streaming feature engineering over 146M partitioned events, without loading everything into memory.
Leak-free temporal splits: the ranker features are computed only over the training period.
Hard negative mining: negatives are drawn from each player’s visited-hex universe rather than sampled at random.
LightGBM Ranker (LambdaRank) tuned by random search; MLflow for tracking and registry.
Containerized FastAPI service that serves from an exported bundle, with no MLflow dependency at runtime.
Documented, honest analysis of the model plateau (NDCG@10 ≈ 0.63) and of the features that were discarded.
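
For readers less familiar with the ranking metric quoted above, here is a minimal sketch of how NDCG@k is commonly computed. It is written in TypeScript only to match the snippet style used on this page (the project itself is Python). The helper names `dcgAtK` and `ndcgAtK` are hypothetical, and the exponential gain `2^rel - 1` is an assumption; it is not taken from the repo.

```typescript
// Illustrative sketch only: these helpers are hypothetical, not code from the repo.
// `relevances` holds the true relevance label of each item, in the order the model ranked them.

function dcgAtK(relevances: number[], k: number): number {
  // Gain (2^rel - 1) is discounted by log2(position + 1), with positions starting at 1.
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + (Math.pow(2, rel) - 1) / Math.log2(i + 2), 0);
}

function ndcgAtK(relevances: number[], k: number): number {
  // The ideal ordering puts the most relevant items first.
  const ideal = [...relevances].sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  // A list with no relevant items has no meaningful ranking; score it as 0 by convention.
  return idealDcg === 0 ? 0 : dcgAtK(relevances, k) / idealDcg;
}
```

Note that this version derives the ideal ordering from the ranked list itself, so it assumes every relevant candidate for the query appears somewhere in that list.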

Python 3.12 · LightGBM · H3 · HDBSCAN · MLflow · FastAPI · Docker · pytest · Ruff

LedgerLens

In progress · 2026

AI-native agentic financial analyst: extracts, categorizes, reconciles, and answers questions about your finances.

Repo

Boundary: Determinism-first
Core: Custom MCP server
Quality: Eval gate in CI
Status: Phase 0 done

An AI-native financial analyst built TypeScript-first. Upload statements and an agent extracts, categorizes, reconciles, and answers natural-language questions, with deterministic money math and a rigorous evaluation harness behind every LLM feature.

Determinism-first boundary (ADR-0004): the model decides what to compute and explains; pure functions do the money math. Wrong numbers are never acceptable in finance.
A custom MCP server exposing typed domain tools the agent calls.
An eval harness with regression tests for prompts, agent behavior, and tool use, wired into CI as a gate.
Strict TypeScript monorepo (Next.js web + NestJS API), specs before code, and ADRs for big decisions.
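
One way to picture the determinism-first boundary described above: the model only emits a typed request naming what to compute, and a pure function produces the number. The sketch below is an assumption-laden illustration, not code from the repo. The `ComputeRequest` shape, the operation names, and the use of integer cents are all hypothetical, and a hand-written guard stands in for a schema library to keep the example self-contained.

```typescript
// Hypothetical sketch of a determinism-first boundary; none of these names come from the repo.
// Amounts are integer minor units (cents) so that addition is exact.

type Transaction = { category: string; amountCents: number };

// The only operations the model is allowed to request.
type ComputeRequest =
  | { op: "netBalance" }
  | { op: "totalByCategory"; category: string };

// Validate untrusted model output before anything is computed.
function parseRequest(raw: unknown): ComputeRequest {
  if (typeof raw === "object" && raw !== null) {
    const { op, category } = raw as { op?: unknown; category?: unknown };
    if (op === "netBalance") return { op: "netBalance" };
    if (op === "totalByCategory" && typeof category === "string") {
      return { op: "totalByCategory", category };
    }
  }
  throw new Error("Unsupported compute request");
}

// Pure function: the same request and transactions always yield the same number.
function execute(req: ComputeRequest, txs: Transaction[]): number {
  const sum = (items: Transaction[]) =>
    items.reduce((acc, t) => acc + t.amountCents, 0);
  switch (req.op) {
    case "netBalance":
      return sum(txs);
    case "totalByCategory": {
      const category = req.category;
      return sum(txs.filter((t) => t.category === category));
    }
  }
}
```

In this arrangement the model can choose the wrong question, but it cannot produce a wrong sum: anything outside the allowed operations is rejected before any arithmetic runs.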

TypeScript · Next.js · NestJS · PostgreSQL · Claude Agent SDK · MCP · Zod · Vitest · Azure · Docker

Method

How I work

Production, not notebooks

A model only matters if it can be served, versioned, and reproduced. I design the full pipeline: ingestion, features, training, registry, and a deployable service.

No data leakage

Temporal splits respect time: features are computed only over the training period. If a metric looks too good, there is almost always leakage.
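
As a small illustration of the principle, the sketch below splits events at a cutoff and fits a simple count feature on the training period only. The `VisitEvent` shape and the function names are hypothetical, chosen for the example; they are not taken from any of the projects above.

```typescript
// Hypothetical sketch: the event shape and the names below are illustrative only.

type VisitEvent = { playerId: string; hexId: string; timestamp: number };

// Everything strictly before the cutoff is training data; the rest is held out.
function temporalSplit(events: VisitEvent[], cutoff: number) {
  return {
    train: events.filter((e) => e.timestamp < cutoff),
    test: events.filter((e) => e.timestamp >= cutoff),
  };
}

// The feature is fitted on training events only, so nothing from the
// evaluation period can leak into it.
function visitCounts(trainEvents: VisitEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of trainEvents) {
    const key = `${e.playerId}:${e.hexId}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

Computing `visitCounts` over the full event list instead would fold evaluation-period visits into the feature, which is exactly the kind of leakage that makes a metric look too good.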

Honest evaluation

I report what the model actually does, including its limits. I document the plateau, the discarded features, and why one algorithm beats another.

Determinism where it matters

In sensitive domains (finance), the LLM decides what to compute and explains; pure functions compute the numbers. Correctness is never delegated to a probabilistic model.

Quality as a gate, not a wish

Lint, strict types, tests, and evals run in CI and block the merge. Quality is a property of the pipeline, not a good intention.
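
A gate of this kind can be very small. The sketch below is a hypothetical example, assuming each eval reports a name and a pass/fail flag and that the gate is a minimum pass rate; the `EvalResult` shape, the `evalGate` name, and the threshold are assumptions for illustration.

```typescript
// Hypothetical sketch of an eval gate; the result shape and threshold are assumptions.

type EvalResult = { name: string; passed: boolean };

function evalGate(results: EvalResult[], minPassRate: number) {
  const failures = results.filter((r) => !r.passed).map((r) => r.name);
  // An empty suite gets a pass rate of 0: no evidence is not the same as passing.
  const passRate =
    results.length === 0 ? 0 : 1 - failures.length / results.length;
  return { ok: passRate >= minPassRate, passRate, failures };
}
```

A CI step would call a function like this and exit with a non-zero status when `ok` is false, which is what actually blocks the merge.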

Full-stack, end to end

I come from web development, so I don’t stop at the model: I also build the API, the UI, and the infrastructure that put it in front of users.

determinism-first.ts

// The model decides; a pure function computes the money.
function netBalance(tx: Transaction[]): Money {
  return tx.reduce((sum, t) => sum.add(t.amount), Money.zero());
}

Contact

Let's build something

Open to Machine Learning Engineering opportunities and collaborations. Email is the best way to reach me.

[email protected] · GitHub

Also at: [email protected]