Jeevanantham P

Site Reliability Engineer | DevOps Engineer

Professional Summary

Site Reliability Engineer with 6+ years of expertise in building and maintaining resilient, high-availability systems for enterprise-scale platforms. Currently ensuring 99.99% uptime for Cisco Webex Meetings, managing 50+ Kubernetes clusters across global data centers. Reduced MTTD by 40% and MTTR by 35% through proactive monitoring and automation. Skilled in CI/CD orchestration, infrastructure as code, and cloud platforms (AWS, Azure). Passionate about GenAI and AIOps innovation.

Experience

Site Reliability Engineer

Cisco Systems · Webex Meetings

2023 - Present

Bangalore, India

  • Maintain 99.99% uptime for Cisco Webex Meetings, managing 50+ Kubernetes clusters across 8 global data centers
  • Automate CI/CD workflows using Jenkins, GitLab CI/CD, GitHub Actions, and ArgoCD, reducing deployment time by 40%
  • Design 25+ Grafana dashboards tracking service availability, error rates, latency, and capacity metrics
  • Use AppDynamics for transaction tracing and ThousandEyes for monitoring global endpoint reachability
  • Reduce MTTD by 40% and MTTR by 35% through proactive alerting and automated runbooks
  • Develop 20+ Python and Bash automation scripts saving 15+ hours weekly in manual operations
  • Conduct RCA for 100+ production incidents, reducing repeat incidents by 60%
  • Built and published the 'Copilot Chat History Search' VS Code extension on the VS Code Marketplace
  • Lead AIOps initiatives integrating ML models for anomaly detection

Software Engineer

Torry Harris Integration Solutions · Enterprise Solutions

Sep 2021 - 2023

Bangalore, India

  • Developed 10+ enterprise Java applications using AWS SDK for cloud integrations
  • Built CI/CD pipelines using Jenkins and Docker, achieving 95% deployment success rate
  • Designed Kafka clusters processing 500K+ messages daily on Kubernetes
  • Built a PDF Q&A chatbot using LangChain and LLMs with 85% query accuracy
  • Integrated Hugging Face models for NLP tasks, reducing document processing time by 50%
  • Optimized deployment workflows, reducing release cycles by 30%

Associate Software Engineer

Torry Harris Integration Solutions · Cloud & Automation

Aug 2019 - Sep 2021

Bangalore, India

  • Provisioned Azure cloud infrastructure using Terraform, reducing provisioning time by 60%
  • Containerized 15+ Kafka applications using Docker for consistent deployments
  • Implemented configuration management to standardize infrastructure across 3 environments
  • Developed 10+ RPA workflows using UiPath, automating 200+ hours of manual work monthly
  • Participated in agile development with 95% sprint completion rate

Technical Skills

Cloud Platforms

AWS (EC2, EKS, VPC, IAM, ECR) · Azure (DevOps, AKS, VMs)

Containers & Orchestration

Docker · Kubernetes · ArgoCD · Helm

CI/CD

Jenkins · GitLab CI/CD · GitHub Actions

Infrastructure as Code

Terraform · Ansible

Monitoring & Observability

Prometheus · Grafana · ELK Stack · EFK Stack · AppDynamics · ThousandEyes

Programming

Python · Bash · Shell Scripting · Java

Version Control

Git · GitHub · GitLab

Operating Systems

Linux (RHEL, Ubuntu, CentOS) · System Administration

Messaging

Apache Kafka

Projects

Cloud Instance Manager (CIM)

Full-stack application with a Spring Boot backend that uses the AWS SDK for EC2 instance lifecycle management. Features JWT-based authentication with RBAC, instance start/stop/reboot, security group management, and multi-region AWS resource management.

Java · Spring Boot · AWS SDK · MySQL · JWT · REST API

Copilot Chat History Search

VS Code extension for searching and managing GitHub Copilot chat history, published on the VS Code Marketplace.

TypeScript · VS Code API · Node.js

PDF Q&A Chatbot

AI-powered chatbot using LangChain and LLMs for PDF document analysis. Implements document chunking, embedding generation, and a vector store for semantic search, with natural-language Q&A capabilities.

Python · LangChain · OpenAI/LLM · Hugging Face · Vector DB · Streamlit

SLO Dashboards & Observability Platform

Comprehensive SLO/SLI dashboards in Grafana for tracking service reliability metrics. Includes error budget burn-rate alerts, availability tracking, and custom PromQL queries for SLO compliance.

Grafana · Prometheus · PromQL · AlertManager

Education

Bachelor of Engineering - Computer Science

Velammal Engineering College, Anna University

Chennai, India

Jun 2015 - May 2019

GPA: 74%

High School

Adhiyaman Matric H.R. Secondary School

Uthangarai, India

Jun 2014 - Apr 2015

GPA: 94%

Interests

Open Source · GenAI & LLMs · AIOps · Cloud-Native Technologies · Technical Writing

Get In Touch

I'm always open to discussing new opportunities, interesting projects, or just having a chat about technology and SRE practices.

© 2026 Jeevanantham P. All rights reserved.
