
Senior Infrastructure Engineer (HPC)

Ncs Pte. Ltd.


RESPONSIBILITIES:

  • Support High-Performance Computing (HPC) and IT Service Management (ITSM).
  • As a Systems Engineer specializing in High-Performance Computing (HPC), you will be a key contributor to the design, implementation, and optimization of complex computational systems. Drawing on your expertise in HPC technologies, you will collaborate with cross-functional teams to ensure the seamless integration and performance of HPC environments.

System Design and Implementation:

  • Design, implement, and maintain high-performance computing systems to meet the organization's computational needs.
  • Collaborate with stakeholders to understand performance requirements and hardware specifications.

Parallel Computing:

  • Implement and optimize parallel computing techniques to enhance system performance.
  • Leverage parallel programming languages and frameworks for efficient task execution.

Cluster Management:

  • Manage and optimize HPC clusters, ensuring scalability and reliability.
  • Implement and maintain cluster management tools for efficient resource utilization.

Performance Tuning:

  • Analyze and fine-tune system configurations, hardware, and software for optimal performance.
  • Identify and resolve performance bottlenecks in HPC applications.

Job Scheduling:

  • Utilize job scheduling systems to allocate computational resources and manage workloads efficiently.
  • Collaborate with users to understand job requirements and prioritize computing tasks.

Networking and Interconnects:

  • Configure and optimize high-speed interconnects, such as InfiniBand, for fast data transfer between nodes.
  • Collaborate with network administrators to ensure seamless communication within HPC environments.

Distributed File Systems:

  • Implement and manage distributed file systems for efficient data storage and retrieval.
  • Optimize data access and transfer mechanisms to support large-scale computations.

Fault Tolerance and Reliability:

  • Implement strategies for fault tolerance to ensure system reliability during long-running computations.
  • Troubleshoot and resolve system issues to minimize downtime.

Documentation:

  • Create and maintain detailed documentation of HPC system configurations, processes, and best practices.
  • Develop user guides and training materials for HPC users.

Stay Updated:

  • Keep abreast of emerging trends and advancements in HPC technologies.
  • Evaluate and recommend new hardware and software solutions to enhance system capabilities.


REQUIREMENTS:

  • Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
  • Proven experience as a Systems Engineer with a focus on High-Performance Computing.
  • Knowledge of HPC architectures, technologies, and parallel programming languages.

Technical Proficiency:

  • Familiarity with cluster management tools, job scheduling systems, and distributed file systems.
  • Experience with high-speed interconnects (e.g., InfiniBand) and networking in HPC environments.

Problem-Solving Skills:

  • Strong analytical and problem-solving skills to address complex HPC challenges.

Communication:

  • Excellent communication and collaboration skills to work effectively in interdisciplinary teams.
✱   This job post has expired   ✱
