Nikhil Jha

Software Engineer | Site Reliability Engineer

LinkedIn

About

Software Engineer with 5+ years of experience in SRE, DevOps, and R&D, skilled in Python, SQL, Linux, and DevOps and cloud tools. Seeking a Software, DevOps, or Data Engineering role that draws on automation, API, and data skills for ETL, insights, and visualization.

Work Experience

Research Engineer

Nokia Solutions and Networks

Sep 2024 - Present

Contributing to the Research and Development team as a Software Engineer, focusing on enhancing system reliability, monitoring, and automation solutions for cutting-edge networking technologies.

  • Led comprehensive analysis of Commvault backup solutions, documenting advantages over traditional methods to support strategic adoption decisions and optimize data management.
  • Engineered and integrated Provisioner Node with Prometheus and Grafana, establishing real-time system monitoring, metric visualization, and automated alerting capabilities.
  • Developed robust VM health check scripts for OpenShift Container Platform (OCP) VMs, enhancing system stability through automated liveness and readiness probe validation.
  • Automated Keycloak client configuration and identity/access management processes through scripting, significantly streamlining setup and modification workflows.
  • Designed scalable Helm templates for generic and dynamic parameterization of OpenShift network policies, improving deployment flexibility and consistency.

Senior Associate Consultant

Infosys

Sep 2022 - Aug 2024

Served as a Site Reliability Engineer providing lower environment support for the end-to-end order placement flow across the client's trading applications and channels, with a focus on root cause identification and rapid issue resolution.

  • Identified root causes and resolved issues reported by QA/testing teams across diverse client trading applications.
  • Proactively monitored email and Splunk alerts and dashboards to detect and address failures across the entire trade order lifecycle, ensuring operational continuity.
  • Authored comprehensive Confluence Runbook entries detailing remediation strategies and potential root causes, serving as a vital resource for team knowledge sharing and future troubleshooting.
  • Automated service management across multiple remote Windows/Linux VMs by implementing PowerShell scripts, significantly reducing manual effort and improving operational efficiency.
  • Developed a Python-based mailer service with HTML/CSS, automating the delivery of consolidated reports on application service status and API health checks during critical pre-market hours.
  • Engineered Python scripts for end-to-end flow testing of diverse financial orders (equity, mutual funds, contingent, stock bundles) across various channels; identified microservice issues and validated database persistence in MongoDB, DB2, and Aerospike, providing automated reports.
  • Orchestrated the implementation of automated password rotation for multiple service accounts using Safeguard, including comprehensive pre- and post-impact analysis to ensure seamless transitions.
  • Developed Python scripts leveraging Swagger APIs to automate the purging of open orders from Aerospike and MongoDB, optimizing database performance and data hygiene.
  • Leveraged GCP Cloud Logging and Monitoring to conduct real-time analysis, generate alerts on log data, and track performance metrics, enhancing system observability.
  • Deployed a Flask application on GCP App Engine to monitor on-premise VMs, providing critical CPU, disk, and memory usage metrics for proactive resource management.

DevOps Engineer

Wipro Limited

Nov 2019 - Sep 2022

Worked with banking-domain clients on the DevOps platform, providing automated solution templates and add-ons using Python, PostgreSQL, Grafana, TeamCity, and other DevOps tools.

  • Collaborated with banking clients to understand their DevOps platform needs and delivered tailored automated solution templates and add-ons.
  • Designed and implemented a self-service workflow for assessing the feasibility of migrating EAR/WAR files to OpenShift Container Platform, leveraging RHAMT, Shell and Python scripting, and TeamCity as the pipeline engine.
  • Led the implementation of business logic, backend flows, and database schema designs for Data Collectors and Analysis Engines, enhancing DevOps KPI tracking across CODE, CI/CD, and IaC categories.
  • Engineered Python scripts to automate data collection from various DevOps tool REST APIs into PostgreSQL, and scheduled cron jobs within TeamCity and OpenShift Container Platform.
  • Developed comprehensive Grafana dashboards to visualize DevOps KPIs, trends, and patterns across 3000+ client applications, providing actionable insights into DevOps adoption and maturity.
  • Designed and deployed a Flask application on OpenShift with API endpoints, enabling the primary application to retrieve data dumps from TeamCity and Bitbucket webhooks into PostgreSQL/MongoDB for downstream microservice consumption.
  • Developed an automated pipeline for proactive product/tool upgrades across servers by integrating Python with pre-written IaC configurations and TeamCity build templates, ensuring system currency and compliance.
  • Built a Python/HTML/CSS mailer service to alert application managers about expiring products/applications and automated JIRA ticket creation for tracking upgrade statuses.
  • Orchestrated an automated pipeline for EOVS patching and Couchbase upgrades across servers using TeamCity and Ansible playbooks, ensuring consistent and efficient deployments.
  • Developed Python scripts to parse XML configuration files from IBM WebSphere Application Server (WAS), facilitating automated configuration management.
  • Designed a QlikView dashboard to visualize and showcase the inventory of DevOps and testing tools, improving asset visibility.
  • Managed production releases for non-critical applications using UDeploy, provided issue support, and monitored system health (engines, pipelines, servers, file integrity, CPU usage, logs).

Education

Computer Science & Engineering

ABES Institute Of Technology

69%

Aug 2015 - Jun 2019

Science

St. Karen's High School, Patna

82%

Aug 2013 - Mar 2015

General Studies

St. Karen's High School, Patna

88%

Mar 2000 - May 2013

Certificates

Google Cloud Certified Associate Cloud Engineer

Google Cloud

Mar 2024

Google Cloud Certified Cloud Digital Leader

Google Cloud

Jan 2024

AWS Certified Cloud Practitioner

Amazon Web Services (AWS)

Oct 2023

Languages

English (Fluent), Hindi (Native)

Skills

Programming Languages

  • Python
  • SQL
  • HTML
  • CSS
  • Java

Databases & Frameworks

  • PostgreSQL
  • MongoDB
  • Flask
  • MySQL
  • DB2

DevOps & Cloud Tools

  • JIRA
  • Grafana
  • TeamCity
  • Postman
  • Bitbucket
  • SonarQube
  • Jenkins
  • OpenShift Container Platform
  • UDeploy
  • AWS
  • Docker
  • Pivotal Cloud Foundry
  • Ansible
  • Splunk
  • GCP
  • Kubernetes
  • Prometheus
  • Helm

Technical Skills

  • Site Reliability Engineering (SRE)
  • API Development
  • Automation Scripting
  • Dashboard Visualization
  • End-to-End Testing
  • Identity and Access Management (IAM)
  • Configuration Management
  • System Monitoring
  • Troubleshooting