Only Singaporeans need apply.
Job Scope:
System Reliability and Performance:
Maintain the availability and reliability of services across distributed systems.
Implement monitoring and alerting systems to detect issues before they affect users. Proactively analyze and resolve performance bottlenecks, ensuring optimal system performance.
Automation and Process Improvement:
Develop automation scripts for tasks such as deployments, monitoring, backups, and scaling. Automate the provisioning and scaling of infrastructure using Infrastructure as Code (IaC) tools. Create self-healing systems that detect and recover from failures automatically.
Monitoring and Performance Management Tools:
Hands-on experience with system and application monitoring tools such as AWS CloudWatch and AWS CloudTrail.
Capacity Planning and Scaling:
Monitor resource usage and plan for future capacity based on growth projections. Implement systems to dynamically scale resources based on traffic patterns and system load.
Log Management and Analysis:
Expertise in managing logs for audit, security, and performance purposes using tools such as AWS CloudWatch Logs.
Security and Compliance:
Knowledge of security best practices for system hardening, patching, and vulnerability management. Familiarity with AWS security tools such as AWS Security Hub, AWS IAM, and Amazon Inspector.
Skillset Requirements:
- AWS resource utilization analysis, capacity planning, and forecasting
- Proficiency in AWS administration
- Proficiency in at least one scripting language (e.g. Python, Bash)
- Strong understanding of Linux/Unix systems, including performance tuning and kernel optimization
- Experience with load balancing, caching, and database performance optimization
- Minimum 1-2 years of experience