IT Infrastructure Site Reliability Engineer

3 - 5 Years

you’ll be our: Site Reliability Engineer

you’ll be based at: IBC Knowledge Park, Bengaluru

you’ll be aligned with: DevOps Lead

you’ll be a member of: IT CloudOps Team

What you’ll do at Ather:

  • Design, develop, build and deploy world-class Cloud Native Infrastructure for Enterprise SaaS

  • Improve infrastructure stability, reliability, performance and scalability of Cloud Native Platform Infrastructure & Applications to meet ever-increasing customer demands

  • Build the observability stack using a combination of open-source and industry-standard tools

  • Write code and apply engineering best practices and tools to automate operational tasks

  • Be responsible for the overall reliability and stability of Cloud Services

  • Build end-to-end diagnostics and tooling to troubleshoot complex issues affecting performance and scaling.

  • Refactor existing code and service infrastructure to ensure scalability and reliability.

  • Identify process gaps and implement process improvements to increase operational efficiency.

  • Participate in the development of tools, systems and processes aimed at improving product supportability and overall support productivity.

  • Document processes and applications for use by the TechOps team

  • Participate actively in detecting, remediating and reporting on production incidents, ensuring that SLAs are met and driving Problem Management for permanent remediation.

  • Participate in on-call rotation to ensure coverage for planned/unplanned events.

  • Engage with other Engineering organizations to implement processes, identify improvements, and drive consistent results

Here’s what we are looking for: 

  • Experience working as a developer and/or Site Reliability Engineer/DevOps engineer

  • Experience with DevOps practices

  • Experience with Golang or Python

  • Proven track record of building, supporting and scaling a high-transaction 24x7 SaaS solution on any cloud platform (GCP preferred)

  • Experience with Security as it applies to infrastructure, systems and network engineering

  • Strong Linux administration, internals, and network troubleshooting

  • Experience with infrastructure automation tools such as Terraform, Ansible and Helm, and with building, using and deploying containers.

  • Experience with containerization technologies such as Docker and Kubernetes

  • Experience with logging and monitoring tools such as Grafana, Prometheus and Splunk

  • Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions

  • Strong fundamentals in HTTP including HTTP headers and web servers

  • Experience with queued or pipelined cloud services.

  • Experience with Agile development, DevOps models or similar methodologies

  • Experience working with Agile methodologies (Scrum) and cross-functional teams

  • Understanding of SLAs, SLIs and SLOs

  • Passionate about SRE, DevOps, automation and infrastructure platforms. Must excel with agile and lean development practices and be able to manage multiple priorities and multiple roles.

  • Understanding of basic monitoring principles such as the RED and USE methods

  • Experience in DevOps, Site Reliability, or infrastructure engineering

  • Proficiency with a programming language such as Python, and with shell scripting, to automate tasks

  • Strong experience with CI/CD pipelines and tools such as GitHub, Jenkins and Artifactory

You bring to Ather: 

  • Minimum of 3 years of work experience

  • Experience with source control, including pull requests, branching and merging (GitHub).

  • Knowledge of AWS Lambda and React JS.

  • Experience with cloud security concepts and tools

  • Familiarity with OpenTracing/OpenTelemetry
