HPC Storage Engineer (System), NSCC
3 weeks ago
Job Summary:
The HPC Storage Engineer will be responsible for managing the storage infrastructure within HPC environments. This role involves monitoring storage performance and optimizing it through tuning and troubleshooting. Candidates with demonstrated experience in the HPC storage field may be considered for a Senior position.
Responsibilities:
Storage administration and optimization
- Work with Managed Services teams in managing and administering the storage infrastructure.
- Ensure storage reliability and availability within the HPC system.
- Provide support for technical queries and troubleshoot storage-related issues.
- Implement best practices for storage monitoring and reporting.
- Monitor utilization rates and allocation, and analyse the resulting reports for capacity planning.
- Conduct storage performance tests and analysis.
- Collaborate with other teams to enhance storage performance and scalability.
- Develop and maintain comprehensive documentation for HPC storage infrastructure and processes.
Data management
- Create and execute a data placement strategy to optimize storage performance.
- Maintain data protection and disaster recovery strategies.
- Implement and maintain security measures to protect data integrity and prevent unauthorized access.
Designing and planning
- Assess future requirements and plan for storage expansion.
- Assist in the design of future HPC system acquisitions.
- Explore and evaluate emerging solutions in HPC storage technology.
Qualifications:
- Degree in Computer Science, Engineering, IT or another relevant area.
- At least 3 years of experience in managing storage infrastructure.
- Proficient in UNIX/Linux environments and the command-line interface (CLI).
- Good knowledge of HPC storage principles.
- Experience in managing parallel file systems (Lustre, GPFS, BeeGFS).
- Good knowledge of RDMA-based interconnects (InfiniBand, RoCE).
- Demonstrated ability to analyse complex issues and develop effective solutions.
- To be considered for a Senior position, candidates should have at least 5 years of experience in roles that involve the deployment of HPC storage infrastructure, covering key areas such as design, installation, configuration, documentation and admin/user training.
Official account of Jobstore.