@drxim

AI / LLM ops lead · UTC+3

Founder of ZATVA. 12 years in SRE and DevOps. I lead the team, keep a hand in the on-call rotation, and own architecture for client deploys. On-call for GPU inference and LLM serving.

vLLM · GPU ops · Triton · CUDA · Ray · Kubernetes

what I do at ZATVA

GPU inference fleets and LLM serving: on-call for vLLM and Triton, plus autoscaling and cost review in prod. I hold the UTC+3 on-call shift (00:00 to 08:00 UTC). On discovery calls I talk directly with CTOs and founders, no sales layer.

What I've worked on most over the last few years: vLLM and Triton inference on A100/H100, multi-GPU autoscaling, KV-cache and batching tuning, GPU cost optimization.

background

Classic sysadmin since the 90s, on FreeBSD, BSDi and Red Hat at an internet provider. Then infra and SRE on high-traffic backend services, and later Web3 and AI infrastructure. Since 2024 we have worked on contract as the ZATVA team.

The fastest way to talk tech or scope a stack for your project: GitHub or Telegram, via the contacts page.