Builder, operator2024–PresentLocal-first, zero telemetry
CUDA llama.cpp Open WebUI Python
Why Local
The line between convenience and surveillance is thinner than most people think. Every prompt you send to a cloud LLM is logged, analyzed, and potentially used for training. For work-related queries — architecture decisions, code review, process improvements — that’s a real concern.
Modern GPUs are capable enough, and models like Llama have made local inference practical. The setup isn’t complicated.
Setup
llama.cpp server running on a CUDA GPU with Open WebUI as the frontend. The stack is intentionally simple — one inference server, one UI, zero orchestration. No vLLM, no Triton, no Ray. Overkill is a feature when you’re serving a cluster, not when you’re serving yourself.
The models are quantized to fit in VRAM with acceptable quality loss. GGUF format, mostly Q4_K_M quantization. Good enough for reasoning, code review, and drafting. Not good enough for anything that needs precision, but that’s not the use case.
What Works
Code review and refactoring suggestions
Drafting technical documents and emails
Explaining unfamiliar codebases
Brainstorming architecture approaches
What Doesn’t
Anything requiring factual accuracy without verification
Long-context tasks beyond the model’s window
Anything you’d trust without reading the output first
The local-first constraint means I can experiment freely. If a model hallucinates or gives bad advice, it doesn’t leak to anyone. That freedom is the whole point.
{"menu":[{"name":"Pages","items":[{"label":"Home","subtitle":"Overview","action":"navigate:/","icon":"page"},{"label":"Career","subtitle":"Timeline & principles","action":"navigate:/career","icon":"page"},{"label":"Projects","subtitle":"All projects","action":"navigate:/projects","icon":"page"},{"label":"About","subtitle":"About Meher","action":"navigate:/about","icon":"page"}]},{"name":"Settings","items":[{"label":"Toggle Theme","subtitle":"","action":"toggleTheme","icon":"theme"}]}],"fuse":{"threshold":0.6,"minMatchCharLength":2,"keys":["label","subtitle","searchableText"]},"projects":[{"title":"Global Payroll Platform","description":"Led end-to-end deployment of in-house payroll platform across 6 APAC markets processing $1.2B annually. Defined technical requirements for 2 engineering teams. Coordinated payments integrations via API/SFTP with ISO 20022 XML specifications.","tags":["Python","SQL","API Integration","ISO 20022"],"link":"#","image":null,"techStack":["Python","SQL","API Integration","ISO 20022"],"size":"large","domain":"fintech","icon":null,"featured":true},{"title":"Fraud Detection Engine","description":"Drove fraud detection initiative analyzing 1.5M payment transactions. Reduced manual validation touchpoints by 80% through automated rule-based detection and QuickSight analytics dashboards.","tags":["Python","QuickSight","Analytics"],"link":"#","image":null,"techStack":["Python","QuickSight","SQL"],"size":"medium","domain":"analytics","icon":null},{"title":"Background Check Revamp","description":"Cross-functional policy revamp with Legal, Compliance, and Business. Reduced 90th percentile turnaround time from 4 months to 1 month. Re-engineered verification workflows, cutting manual review time by 50%. Impacted 120K annual hires across India.","tags":["Operations","Compliance","Process Engineering"],"link":"#","image":null,"techStack":["SQL","QuickSight","VBA"],"size":"medium","domain":"data","icon":null},{"title":"Sovereign Homelab","description":"49-container self-hosted infrastructure: Caddy reverse proxy, DNSGuard local resolver, NetBird VPN, Vaultwarden, Redis, PostgreSQL, CouchDB. Full observability via Grafana, Loki, Prometheus, and Alloy. Running on bare metal—no cloud provider.","tags":["Docker","Linux","Caddy","NetBird"],"link":"#","image":null,"techStack":["Docker","Caddy","NetBird","Vaultwarden"],"size":"large","domain":"infrastructure","icon":null,"featured":true},{"title":"AI Node","description":"Local-first LLM orchestration on CUDA GPU. Running llama.cpp server with Open WebUI for private, on-device AI inference—no cloud APIs, no telemetry, no data leaving the box.","tags":["CUDA","llama.cpp","Open WebUI"],"link":"#","image":null,"techStack":["Python","CUDA","llama.cpp"],"size":"medium","domain":"ai","icon":null,"featured":true},{"title":"Matrix Citadel","description":"Self-hosted Matrix federation with full MatrixRTC voice/video via dual LiveKit servers. Twunnel federation bridges, JWT auth services, OpenClaw Gateway for AI agent orchestration. Running on cloudcitadel.in.","tags":["Matrix","LiveKit","Twunnel"],"link":"https://cloudcitadel.in","image":null,"techStack":["Matrix","LiveKit","Twunnel","OpenClaw"],"size":"medium","domain":"infrastructure","icon":null},{"title":"Media & Photo Stack","description":"Self-hosted media ecosystem: Jellyfin streaming, Immich photo library with ML-powered face recognition and reverse image search, full *arr automation pipeline (Radarr, Sonarr, Lidarr, Prowlarr, Bazarr), Jellystat analytics dashboard.","tags":["Jellyfin","Immich","*arr Stack"],"link":"#","image":null,"techStack":["Jellyfin","Immich","PostgreSQL"],"size":"medium","domain":"infrastructure","icon":null},{"title":"Observability Stack","description":"Full infrastructure telemetry: Grafana dashboards, Loki log aggregation, Prometheus metrics, Alloy collector, node_exporter, process_exporter, smartctl_exporter. Monitoring 49 containers and bare-metal health in real time.","tags":["Grafana","Prometheus","Loki","Alloy"],"link":"#","image":null,"techStack":["Grafana","Prometheus","Loki","Alloy"],"size":"medium","domain":"infrastructure","icon":null}],"skills":{"tools":["Python","SQL","QuickSight","VBA","Docker","Linux","Kubernetes","Grafana","Prometheus","Loki","Caddy","Matrix"],"standards":["ISO 20022"],"domains":["Payments","Compliance Engineering","Product Management"]},"contact":{"email":"hi@meherchaitanya.com","channels":[{"label":"Email","url":"mailto:hi@meherchaitanya.com","displayText":"hi@meherchaitanya.com","icon":"mail","external":false},{"label":"LinkedIn","url":"https://linkedin.com/in/meherchaitanya","displayText":"meherchaitanya","icon":"linkedin","external":true},{"label":"Matrix","url":"https://matrix.to/#/@meher:hanumara.online","displayText":"@meher:hanumara.online","icon":"matrix","external":true}]}}