Realtime Captioning (Prod)
Low-latency ASR pipeline, streaming inference, and a React control panel for live monitoring. Deployed on K8s with autoscaling.
I build scalable AI systems, production-grade ML microservices, and delightful front-ends. I bridge models and products: APIs, pipelines, inference infrastructure, and user-facing apps.
I combine deep learning knowledge with engineering practice: robust inference endpoints, monitoring, and intuitive front-ends that expose model capabilities to users. My focus is reliability, low latency, and polished UX.
FastAPI, Node, DDD, REST/GraphQL, gRPC
PyTorch, Transformers, Fine-tuning, Serving
React, Three.js, GSAP, Tailwind
K8s, Docker, Terraform, CI/CD
ETL, Feature Stores, Airflow
Prometheus, Grafana, SLOs
Vector search service that fuses image & text embeddings, with a fast API and a polished front-end experience.
SDK for building task-specific agents, with model orchestration, retry policies, and unified metrics.
Interactive Three.js visualizations for model introspection and embedding projection, with animated transitions.
Available for remote roles and freelance work. Typical engagements: prototype → production.