SITE RELIABILITY ENGINEER

CTOS Data Systems
Kuala Lumpur
MYR 200,000 - 250,000
2 days ago
Job description

We are Malaysia’s leading Credit Reporting Agency (CRA). We are expanding aggressively and are looking for dynamic, driven, and motivated individuals to join our team. Our Direct-to-Consumer (D2C) segment is one of our fastest-growing product areas, with an abundance of expansion plans and innovative ideas in hand.

A Site Reliability Engineer (SRE) is an advanced DevSecOps role that combines software engineering and system administration to ensure the scalability, performance, and reliability of large-scale, cloud-based applications and infrastructure.

Objectives of the Role:

  1. Run the production environment by monitoring availability and taking a holistic view of system health.
  2. Build software and systems to manage platform infrastructure and applications, such as:
     • CI/CD deployment (such as GitLab)
     • Monitoring systems (such as ELK, Grafana, etc.)
  3. Provide solution architecture across product, business, and delivery teams, mostly for cloud infrastructure, before it is provisioned automatically via Infrastructure as Code (IaC).
  4. Improve reliability, quality, and time-to-market of our suite of software solutions.
  5. Measure and optimize system performance by anticipating customer needs and fostering innovation for continual improvement.
  6. Provide primary operational support and engineering for multiple large-scale distributed software applications.

Responsibilities:

  1. Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding.
  2. Partner with development teams to improve services through rigorous testing and release procedures.
  3. Participate in system design consulting, platform management, and capacity planning.
  4. Create sustainable systems and services through automation and uplifts.
  5. Balance feature development speed and reliability with well-defined service-level objectives.

Minimum Requirements:

  1. Bachelor’s degree (or equivalent) in Computer Science or a related field.
  2. Experience implementing DevOps/DevSecOps CI/CD pipelines for both deployment and operations automation.
  3. Experience with cloud provisioning automation tools such as Terraform.
  4. Experience with containerization and orchestration tools such as Docker, Docker Swarm, Amazon ECS, and Amazon EKS, as well as dynamic resource management frameworks (such as Kubernetes).
  5. Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.

Preferred Skills and Qualifications:

  1. Consistent track record in solution architecture, with technical engineering experience.
  2. Programming experience, including shell scripting.
  3. Familiarity with basic Linux commands.
  4. Ability to troubleshoot issues and identify root causes.