Your responsibilities will include, but are not limited to:
- System Reliability: Design, build, and maintain scalable and reliable infrastructure to ensure high availability, performance, and resilience of systems and applications.
- Automation and Tooling: Develop automation tools and scripts to streamline operational tasks, deployment processes, and monitoring of system health.
- Incident Response: Lead and participate in incident response and resolution, including root cause analysis, post-incident reviews, and proactive measures to prevent recurrence.
- Performance Optimisation: Identify and resolve performance bottlenecks, carry out capacity planning, and optimise system resources to meet service level objectives.
- Monitoring and Alerting: Implement and maintain monitoring systems to proactively detect and respond to system anomalies, performance degradation, and security threats.
- Collaboration: Work with development teams to ensure reliability considerations are integrated into the software development lifecycle and infrastructure design.
- Continuous Improvement: Drive continuous improvement through the implementation of best practices, reliability engineering principles, and the adoption of new technologies.
We regret that only shortlisted candidates will be notified.
Interested applicants, please send your updated resume to noga.lim@peopleprofilers.com.
Noga Lim Wei Loong
Registration Number: R1329872
EA License Number: 10C3804
People Profilers Pte Ltd, 20 Cecil St, #08-09, PLUS Building, Singapore 049705
http://www.peopleprofilers.com
Required Skills:
Root Cause Analysis, High Availability, Automation, Security, Software Development, Reliability, Continuous Improvement, Infrastructure Design