Senior/Staff Site Reliability Engineer

Team: IT
Locations: Berlin, Mountain View, Portland, San Francisco

The Mozilla IT team is looking for a Site Reliability Engineer to join our growing SRE team. Are you someone who lives and breathes automation? Do you enjoy building tools for increased reliability? Do you love to engage deeply with engineering teams to ensure the success of their products and services? If writing code and working closely with product teams, including sharing responsibility for the entire stack of a service, is what you live for, we would love to talk to you.

We're looking for new team members who are excited about working across a broad range of product areas, both those that we've bought or integrated, like our data platform, and those we've built in-house, such as our user-facing web properties. Through partnerships across the entire organization, we find ways to embed people like you into other teams to help bring engineering and operational excellence to projects and products. You will find yourself helping with the development and deployment of specific technologies, as well as bringing back key insights and creating reusable patterns for future engagements. We also value flexibility, so you're someone who is open to rotating across products and learning new skills regularly.

At Mozilla you will…

  • Partner with, and embed in, product development teams to build their services quickly, reliably, and securely. You’re comfortable with one or more programming languages, as well as system design and security fundamentals
  • Tackle the engineering and operations responsibilities of services, from making changes to the OS all the way through to writing application code
  • Enhance automation of deployments, particularly insisting on CI/CD wherever possible, and creating repeatable processes. You’re always looking to find a way to code a manual step out of existence
  • Improve the reliability of services through service design reviews, good architecture, and continual improvement. You’ll stress test environments and apply chaos engineering regularly, always ensuring that the system works the way it should when failures occur
  • Write runbooks, improve documentation, and act as an authority in a domain. While a lot of the code you write will be self-documenting, you also value the ability to share information and provide transparency into all of the phenomenal work you do

Your Professional Profile

  • Proven fluency in coding in Python, Go, or Ruby
  • Three or more years of experience with AWS or GCP and a strong grasp of using cloud platforms to deploy infrastructure, including serverless patterns
  • Experience developing solutions using containers, particularly with Kubernetes
  • Hands-on experience with Infrastructure as Code tooling such as Terraform and CloudFormation
  • Experience working in an Agile fashion, using Scrum, Kanban, or similar, and delivering iteratively
  • Familiarity with data management, data analytics, and data tooling
  • Experience with configuration management tooling, such as Chef, Puppet, or Ansible
  • Knowledge of monitoring and logging solutions, such as Prometheus, New Relic, Datadog, CloudWatch, and CloudWatch Logs

About Mozilla

Mozilla exists to build the Internet as a public resource accessible to all because we believe that open and free is better than closed and controlled. When you work at Mozilla, you give yourself a chance to make a difference in the lives of Web users everywhere. And you give us a chance to make a difference in your life every single day. Join us to work on the Web as the platform and help create more opportunity and innovation for everyone online.

We are an equal opportunity employer and value diversity. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Level: P4
