~/gurunath  ·  senior ml · data · inference engineer  ·  chennai, in
github · linkedin · x · substack
// GENERALIST ENGINEER  ·  8+ YEARS  ·  DATA → ML → PLATFORM → INFERENCE

Gurunath L V

I build the machinery that makes data and intelligence move — from billion-row pipelines to the inside of an LLM's KV cache.

Data Engineering · ML & MLOps · Platform & Distributed Systems · AI & LLM Systems · Inference Engineering · Web3 & DePIN

The short version

01 / WHO

I'm a generalist by instinct. Over the last eight years I've moved up and down the stack — wrangling billions of records a day through data lakehouses, shipping ML systems to production, building the platforms that hold them up, and lately optimizing the guts of LLM inference. I'm drawn to the hard, ambiguous problems that live between disciplines — the ones where nobody's quite sure whose job it is. Right now that means inference infrastructure at IO.net, and a growing fascination with web3, DePIN, and what happens when compute itself becomes a marketplace.

What I work on

02 / DOMAINS
01

Inference Engineering

Squeezing latency and cost out of LLM serving — KV cache optimization, distributed cache offloading, disaggregated prefill / decode.

vLLM · Aibrix · LMCache
02

Data Engineering

Lakehouses and streaming frameworks moving 1–5B records/day at sub-second query latency, benchmarked to 100B.

Spark · ClickHouse · Iceberg · Trino
03

ML & MLOps

End-to-end ML systems following MLOps best practices — research-to-production pipelines, retraining, model promotion.

PyTorch · XGBoost · MLflow · Ray
04

AI & LLM Systems

RAG pipelines, MCP servers, and production AI agents with end-to-end observability, tool use, and fault tolerance.

LangChain · RAG · MCP · OpenRouter
05

Platform & Distributed

Managed compute platforms on EMR, EKS and bare metal — Ray clusters, container-as-a-service, orchestration with Temporal.

Kubernetes · Ray · SkyPilot · AWS
06

Web3 & DePIN

Block-reward systems for DePIN GPU suppliers — designing and A/B testing distribution formulas behind a token launch on Solana.

DePIN · Solana · token economics

Currently

03 / NOW
ACTIVE  ·  Senior ML Software Engineer & Data Research Analyst  ·  IO.net  ·  Feb 2024 → now

Driving inference infrastructure for a decentralized GPU cloud.

Hosted & fine-tuned open LLMs (Qwen, GLM, DeepSeek, LLaMA) on vLLM + Aibrix for low-latency enterprise inference.
KV cache optimization & distributed offloading to cut memory footprint and latency at scale.
Disaggregated prefill–decode cluster deployments improving TTFT and token throughput.
Integrated io-intelligence as a provider in OpenRouter, expanding ecosystem reach.
Built a ChatGPT-style unified interface — web search, image & video gen, RAG.
Shipped an MCP server letting agents create & manage GPU clusters programmatically.
Pioneered a block-rewards system for DePIN GPU suppliers behind the IO-COIN launch on Solana.
Contributed to the cloud platform — Ray clusters, CaaS, bare metal, SkyPilot marketplace integrations.

The track record

04 / PATH
2024 — now
IO.net · Inference & DePIN
LLM inference infra, AI agents, and GPU-supplier reward systems for a decentralized compute network.
2022 — 2025
Chargebee · Data Platform
Enterprise lakehouse & streaming frameworks processing 1–5B records/day with sub-second latency.
2021 — 2022
Nike · Platform Team
Managed big-data service on AWS EMR + Spark; org-wide job orchestration on EKS.
2020 — 2021
Mercedes-Benz · Analytics
Built an analytics platform end to end — ingestion through interactive dashboards and PDF reporting.
2017 — 2020
Prodapt · ML & Data
Airflow ETL, streaming + batch systems, anomaly detection & time-series forecasting in production.

Out in the open

05 / OSS

Contributions and published packages across distributed computing and LLM tooling.

The archive

06 / WRITING
HEADS UP  ·  These are older notes from my early ML days, kept here for the curious. Fresh, deeper writing is brewing over on guruengineering.substack.com. New posts coming soon.