Site Reliability Engineer (remote)

Iqtalent
South Africa
Remote
ZAR 400 000 - 500 000
4 days ago
Job description
SUMMARY: InvestEdge is seeking a Site Reliability Engineer. Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other InvestEdge production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our operating environments and the InvestEdge codebase.

SREs specialize in systems (web application stacks, operating systems, storage subsystems, networking) and implement best practices for availability, reliability, and scalability, with varied interests in algorithms and distributed systems.

SREs work on the Production Support Team. The team’s experience feeds back into other Engineering groups within the company, as well as to InvestEdge resellers running self-managed installations.

SRE Responsibilities

  1. Join an on-call rotation to respond to incidents that impact InvestEdge.com availability, and support Customer Success staff with customer incidents. On-call shifts may include weekend and overnight work.
  2. Use your on-call shift to prevent incidents from ever happening.
  3. Investigate incidents using MSSQL, log analysis, RDP, and monitoring tools.
  4. Build monitoring that alerts on symptoms rather than on outages.
  5. Document every action so your findings turn into repeatable actions and then into automation.
  6. Improve operational processes (such as deployments and upgrades) to make them as boring as possible.
  7. Design, build, and maintain core infrastructure with our Infrastructure, Engineering, and DevOps teams so that InvestEdge can scale to support hundreds of thousands of concurrent users.
  8. Debug production issues across services and levels of the stack.
  9. Develop and debug configurations for our ETL tooling.
  10. Debug and troubleshoot logical issues in database code for a large existing relational data set.
  11. Debug and troubleshoot performance issues in database code.
  12. Understand the business domain of the application.
  13. Work in an Agile environment on a cross-functional team.

Required Skills:

  • Ability to work 12am-9am EST, Monday to Friday (e.g. 5am-2pm GMT, 6am-3pm CET, etc.)
  • Strong programming skills in shell scripting and MSSQL.
  • Ability to collaborate and communicate asynchronously.
  • Desire to document all processes to avoid repetitive learning.
  • Enthusiastic, proactive attitude towards fixing issues.
  • Ability to deliver quickly and effectively, with a focus on rapid iteration.
  • Familiarity with VS Code, SSMS, and other IDEs.
  • Previous deployment experience leveraging CI/CD practices and toolchains.
  • Ability to use GitLab or other VCS (Git, SVN, etc.).