Selected talks, presentations, and written work on technology, product management, and cloud native innovation.
AKS Engineering Blog
September 2025
Announcing native Grafana dashboards in the Azure portal for AKS clusters at no additional cost. This integration eliminates the complexity of maintaining separate visualization tools while delivering comprehensive cluster observability with Container Insights, Prometheus, and Azure Monitor metrics out of the box.
Topics: AKS, Grafana, Observability, Azure Monitor
AKS Engineering Blog
August 2025
Introducing the CLI Agent for AKS, an AI-powered command-line experience for troubleshooting, optimizing, and operating AKS clusters. Built on the open-source HolmesGPT (a CNCF Sandbox project) and the AKS Model Context Protocol server, this human-in-the-loop tool brings agentic workflows directly to your terminal with a focus on security and transparency.
Topics: AI, AKS, Troubleshooting, Open Source, HolmesGPT
CNCF Blog
January 2026
Introduction to HolmesGPT, an open-source agentic AI framework for root cause analysis in cloud native environments. Co-authored with the Robusta.dev team, this post explores how AI agents can revolutionize Kubernetes troubleshooting through extensible toolsets, natural language prompts, and intelligent diagnostics.
Topics: Cloud Native, AI, Troubleshooting, Open Source
KubeCon EU 2024 - Azure Day
March 2024
Deep dive into the latest AKS advancements for improved troubleshooting, including new AI-based features. This session covers the AKS monitoring and troubleshooting stack, common monitoring challenges, and best practices, and includes live demos of Retina for network observability. Co-presented with Pavneet Ahluwalia and Neha Aggrawal.
Topics: AKS, AI, Observability, Troubleshooting, Retina, Network Observability
YouTube
2025
Deep dive into AKS cluster troubleshooting techniques: using node saturation metrics for performance optimization, leveraging Kubernetes events as real-time cluster signals, and fine-tuning resource allocation with cluster autoscaler metrics.
Topics: AKS, Troubleshooting, Metrics, Performance, Autoscaling