Meta (formerly Facebook) (501+ Employees, 34% 2 Yr Employee Growth Rate)
The Site Operations team is responsible for the delivery of data center compute and storage at Meta, enabling our family of apps and services to support a growing global community. We are seeking a forward-thinking individual skilled across multiple disciplines to lead global initiatives on this team. The Infrastructure Services Engineer will take on complex technical problems, deliver effective and impactful solutions, and work and communicate with distributed teams and key stakeholders across multiple disciplines. The person should enjoy working in a complex, highly technical environment where innovative design, planning, execution, and communication are key to success. The candidate must be able to work collaboratively with cross-functional teams to bring innovative infrastructure designs and initiatives from engineering concept to solution, implementing them in new and operational data centers across the globe.
Data Center Infrastructure Services Engineer Responsibilities:
- Represent Site Operations in leading work to define and architect new solutions on global initiatives, working with key partner teams across multiple disciplines.
- Assemble and lead cognitively diverse teams to address complex engineering challenges that require deep technical expertise as well as a broad understanding of Meta’s overall infrastructure.
- Act as a key Subject Matter Expert and mentor in the design, operation, and troubleshooting of tools, technologies, and processes used within the Site Operations environment.
- Understand and assess the risks and challenges associated with emerging hardware, data center, and software technologies, and define plans to address and mitigate them.
- Effectively bridge the logical and physical worlds, ensuring a holistic understanding of the full infrastructure stack.
- Act as a global communication and advisory point of contact for the design, implementation, and delivery of projects that affect our global data center and server fleet, and facilitate the resolution of issues by drawing on local expertise and global support partners.
- Address issues that are often ambiguous and global in nature, requiring leadership and collaboration across time zones, teams, and technical domains.
- Leverage data-driven methodologies to understand a problem at the outset, define a plan, and measure progress throughout a project. Provide data-supported narratives and ensure a strong focus on continuous improvement.
- Build and support strong cross-functional connections with teams across the globe and serve as an advocate for the Site Operations team with key partners, influencing policies and procedures to improve global data center operations.
Data Center Infrastructure Services Engineer Qualifications:
- Ability to travel 20% to 30% of the time is required.
- BS, BEng, or BA in a technical field, or commensurate experience.
- 10+ years of technical experience in a large-scale data center or IT infrastructure environment.
- Experience building globally scalable solutions and translating global strategic initiatives into local executable projects.
- Knowledge of the interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, network, server and storage systems.
- Experience building, operating, and scaling with Linux or Unix operating systems.
- Strong understanding of the full stack of infrastructure, with experience building or operating logical infrastructure on top of a complex, distributed physical infrastructure.
- Coding or scripting experience in languages such as Bash, PHP, Python, SQL, or Perl.
- Experience providing technical guidance to external vendors and partners.
- Knowledge and experience with virtualization, containerization, distributed systems, fault tolerance, and incident management.
- Experience with data center design and expansion, including high-level data center design, operations, basic electrical/mechanical infrastructure, and scaling physical infrastructure.
- Strong knowledge of storage and AI/ML-related services and the hardware that supports them.
- Very strong communication skills and experience working in a highly distributed environment, across team and department boundaries.
- Experience communicating the results of analysis and insights to cross-functional teams and influencing the strategy of these teams.