Requirements
- Technical expertise: Experience in roles such as Site Reliability Engineer (SRE), Platform Engineer, Cloud Operations Engineer, or Systems Administrator
- Kubernetes proficiency: Strong hands‑on experience with Kubernetes (k8s) and containerization in production environments
- Linux expertise: Proficiency in maintaining and troubleshooting Linux operating systems and distributed systems at scale
- Infrastructure knowledge: Solid understanding of core infrastructure services such as DNS, Identity Management, load balancers, and web servers
- Networking basics: Foundational knowledge of computer networking
- IaC and scripting: Experience with Infrastructure‑as‑Code tools such as Terraform or CloudFormation and configuration management tools such as Puppet, plus proficiency in at least one programming or scripting language
- Proactive problem‑solver: A willingness to take on new challenges, deliver results, and learn new technologies quickly
- Collaboration skills: Outstanding communication and social skills, with the ability to connect with others and problem‑solve effectively
What the job involves
- This is a hands‑on role in a team responsible for ensuring Mimecast’s Hybrid Cloud infrastructure remains secure, resilient, and scalable
- You’ll work with cutting‑edge technologies like Kubernetes, AWS, and Infrastructure‑as‑Code, all while collaborating with a supportive and talented team
- Ensure platform reliability: Configure and maintain infrastructure to ensure optimal performance, security, and availability
- Manage containerization platforms: Operate and improve Kubernetes and AWS EKS to support Mimecast’s microservices architecture
- Automate infrastructure: Build reusable Infrastructure‑as‑Code (IaC) and automation to standardize infrastructure provisioning and deployment processes
- Troubleshoot and resolve issues: Identify and resolve infrastructure issues across Linux and Kubernetes systems in a hybrid‑cloud environment, minimizing downtime
- Enhance security: Strengthen the security posture of infrastructure configurations, aligning with the latest cybersecurity standards
- Improve processes: Collaborate with the team to continuously refine operational processes and documentation
- Maintain observability tools: Manage and operate monitoring and observability tools like Graphite, Prometheus, Grafana, Elastic, Nagios, and LogScale
- Support engineering teams: Provide exceptional support to internal Product and Engineering teams, meeting their requirements for the Mimecast Cloud Platform
- Participate in on‑call rotations: Support the team by participating in on‑call rotations and performing out‑of‑hours maintenance as necessary
What you’ll gain
- Develop deep expertise in hybrid‑cloud infrastructure, Kubernetes, and automation
- Build strong Infrastructure‑as‑Code and automation skills in a production environment that demands them
- Grow into senior engineering or team lead roles within Cloud Platform
- Work alongside experienced engineers and leaders who are committed to your development
About the team
- The Site Reliability Engineering team exists to ensure the reliability and resiliency of Mimecast’s Hybrid Cloud infrastructure. The team’s purpose is to deliver a platform that is secure, performant, and scalable, enabling Product and Engineering teams to innovate faster and deliver more value to customers
- Day‑to‑day, you’ll work in a collaborative environment, tackling complex challenges and contributing to the continuous improvement of Mimecast’s infrastructure. The expectation is hands‑on technical credibility combined with a proactive approach to problem‑solving and teamwork
