About The Role
This engineering role at Cloudflare provides an opportunity to address some big challenges at scale. In this role, you will focus on ensuring the stability of our global network. You'll work closely with Cloudflare’s SRE (Site Reliability Engineering) team, the Network Engineering team, and various vendors and partners (including hardware vendors, datacenter and network providers, and ISPs) to maintain and improve our global infrastructure. You will also be responsible for developing and implementing consistent processes and visibility measurements for effective management of our infrastructure. This is a highly visible position that requires a deep technical understanding of datacenter infrastructure, physical and logical networking, and Linux, along with basic experience in data analysis and project management.
To be successful in this position, you should have excellent technical and communication skills and be able to navigate a range of challenges and constraints (e.g. schedule adherence, time zones, and cultures). You will have the opportunity to (literally) build a faster, safer Internet for our millions of users and the billions of web surfers that visit their sites each month.
Other Responsibilities May Include
- Monitoring, repairing, and maintaining hardware, software, and network in Cloudflare data centers.
- Creating documentation and managing remote contractors (including hardware manufacturers, datacenter and network providers, logistics partners and other service providers) to complete datacenter-related work in support of our 250+ growing datacenter locations.
- Aggressively seeking opportunities to introduce cutting-edge technology and automation solutions that are effective, efficient and scalable in order to improve our ability to maintain our global infrastructure.
- Providing technical leadership and guidance during deployment activities.
- Creating and maintaining documentation, plans, SOPs, MOPs, etc.
- Collaborating with internal teams (infrastructure engineering, network engineering and SRE) for day-to-day activities.
- Assisting with the definition, documentation and implementation of consistent processes across all regions.
- Limited travel
Required Experience
- Minimum of 2 years of related data center or Linux systems administration experience
- Network hardware administration
- Linux/Unix systems administration
- Basic experience with a configuration management tool such as SaltStack, Chef, Puppet or Ansible
- Familiarity with day-to-day tasks and projects common in Data Center Operations
- Ability to write scripts for internal tools
Examples Of Desirable Skills, Knowledge And Experience
- Bachelor’s degree; technical background in engineering, computer science, or MIS a plus.
- Direct experience executing on datacenter / infrastructure projects with many moving parts.
- Previous experience installing / maintaining datacenter (and other IT) infrastructure and DCIM tools.
- Professional-level network certification(s) (JNCIP, CCNP, etc.) or higher
- Good working knowledge of Junos, NX-OS and EOS
- Strong understanding of BGP and anycast routing
- Experience with optical transport technologies such as CWDM/DWDM
- Experience running and improving operational processes in a rapidly changing environment.
- Strong verbal and written communication skills, problem-solving skills, attention to detail, and interpersonal skills.
- Must be proactive with proven ability to learn fast and execute on multiple tasks simultaneously.
- Ability to work with spreadsheets in MS Excel and Google Sheets.
- Comfortable handling basic program management responsibilities (prioritization, planning, scheduling, status reporting) using tools such as JIRA
- Experience managing remote contractors
- Must be a team player.
Bonus Points
- Experience working in a 24/7/365 service environment
- Comfortable with remote “lights-out” and out-of-band access to data center resources
- Linux certifications
- Knowledge of the OSI model and experience isolating network, hardware and software issues