Careers at Facebook

Software Engineering

Cluster Operations Engineer

Location: Menlo Park, CA
Facebook was built to help people connect and share, and over the last decade our tools have played a critical part in changing how people around the world communicate with one another. With over a billion people using the service and more than fifty offices around the globe, a career at Facebook offers countless ways to make an impact in a fast growing organization.
Cluster Operations Engineers at Facebook are hybrid software/systems engineers who ensure that Facebook's core infrastructure runs smoothly and has the capacity for future growth. They work with our core infrastructure teams and own cluster deployments. Our team includes people with varying levels of experience and backgrounds, from new grads to industry veterans. Relevant industry experience is important, but ultimately less so than your demonstrated abilities and attitude. We sail into uncharted waters every day at Facebook and we are always learning. This position is full-time and located in our Menlo Park office.

Responsibilities

  • Own cluster deployments, decommissions, and testing, along with the core infrastructure it takes to run them
  • Write and review code, develop documentation and capacity plans, and debug the hardest problems, live, on some of the largest and most complex systems in the world
  • Share an on-call rotation, backed by our 24x7 Site Reliability Operations team
  • Partner with the best engineers in the industry on the coolest stuff around; the code and systems you work on will be in production and used by millions of users all around the world

Requirements

  • BS or MS in Computer Science, Engineering, or a related technical discipline or equivalent experience
  • Extremely sound knowledge of UNIX and TCP/IP network fundamentals
  • Ability to code really well in at least one language (even if it is not one that Facebook uses)
  • Ability to rapidly learn new development languages (PHP and Python are both in heavy use)
  • Ability to pick up new software, frameworks and APIs quickly
  • Sharp and tenacious troubleshooting skills: you can fix anything
  • Good knowledge of basic large-scale internet service architectures (such as load balancing, LAMP, CDNs), even if you haven't worked on one
  • Experience configuring and maintaining common applications and services such as Apache, Memcached, Squid, MySQL, NFS, DHCP, NTP, SSH, DNS, Chef, and SNMP
  • A healthy respect for our motto 'Move Fast and Break Things,' while always making sure you know how to fix them too
  • Good communication skills
  • Detail-oriented and careful
EOE Minorities/Females/Protected Veterans/Individuals with a disability.
Apply now
Please limit to 3 applications.