Senior Site Reliability Engineer | London, UK

Tradeweb Markets

London

Remote

GBP 50,000 - 90,000

Full time

15 days ago

Job summary

An established industry player is seeking a skilled Site Reliability Engineer to join their dynamic team. This role involves enhancing platform reliability through effective incident management, observability tools, and automation practices in a collaborative environment. With a focus on security and cloud technologies, you will play a crucial role in maintaining high system availability and performance. If you are passionate about technology and thrive in a fast-paced setting, this opportunity offers the chance to make a significant impact while working remotely with a talented group of professionals.

Qualifications

  • 6+ years of technology operations and engineering experience.
  • 4+ years of scripting experience in a modern language (Python preferred).

Responsibilities

  • Participate in an agile environment, contributing to platform improvements.
  • Ensure system reliability and availability through monitoring tools.

Skills

Scripting/Coding
Agile Methodology
Reliability Engineering
Incident Triage and Resolution
Communication and Collaboration
Observability Tools
Cloud Architecture
Linux/Unix Tools

Education

Bachelor's Degree in Computer Science

Tools

AWS
GitSecOps

Job description

ICD is treasury's trusted provider of investment technology and the corporate client channel of Tradeweb, a leading global operator of electronic marketplaces for rates, credit, equities and money markets. ICD provides tools for organizations to independently research, trade, analyze, and report on investments. With ICD Portal, over 500 organizations across 65 industries in more than 45 countries gain unbiased access to the market for managing liquidity. Organizations can manage risk across their entire investment portfolio with the AI-driven solution, ICD Portfolio Analytics. All of ICD's award-winning technology solutions are co-innovated with clients, making ICD a preferred provider among corporate treasury professionals.

At ICD, our team of dedicated professionals is passionate about fostering a creative and collaborative culture that leads to company success. As part of Tradeweb, we share a commitment to prioritizing the needs of our clients so that we continually deliver innovative, best-in-class solutions. Our work environment is fast-paced, dynamic, and fun, and it is filled with individuals from diverse backgrounds and experiences.

Job Responsibilities

  • Practicing an Agile Methodology: Participate in an agile environment and contribute to epic planning, backlog grooming, and the creation of stories and sprints that lead to iterative improvements of our platform.
  • Security: Prioritize security in all aspects of work, ensuring that it is the foundational consideration in every task performed.
  • IaC Automation and Tooling: Practice GitSecOps by contributing to the development and delivery of a highly available platform through automation. Continually improve the reliability and efficiency of systems through iterative processes while reducing toil.
  • Reliability Engineering: Work to ensure the reliability and availability of systems. Develop and maintain monitoring tools, analyze system performance, and implement solutions to improve overall system reliability.
  • Incident Triage and Resolution: Triage issues, assess risk, and prioritize remediation with service teams. Take full ownership and drive resolution of production, quality engineering and development-related infrastructure issues.
  • Communication and Collaboration: Effectively communicate issue statuses to both R&D and non-technical audiences. Manage context switching when required. Collaborate closely with software development teams to influence architecture and design decisions that impact the reliability and performance of systems.
  • Observability: Develop observability tools to fulfill the needs of SLOs. Define and measure Service Level Objectives (SLOs) to ensure that the systems meet reliability standards.
  • On-call Responsibilities: Fulfill regular on-call duties to enable high system availability.

Qualifications
  • 6+ years of equivalent technology operations and engineering experience.
  • 4+ years of scripting/coding experience in any modern language (Python preferred).
  • 4+ years as an SRE or in a similar individual-contributor role supporting public cloud (AWS) and cloud-native technologies (Lambda, EKS, SNS, SMS, etc.).
  • Bachelor's Degree or higher in Computer Science or related field.
  • Cloud-based virtualization expertise, particularly with AWS native services.
  • Strong multitasking skills in a dynamic environment.
  • Proven ability to work independently with a proactive, task-ownership approach, applying critical and creative thinking.
  • Collaborative mindset, adept at negotiating, influencing, and developing partnerships within a team environment.
  • Knowledge of information, network, and Internet security, including threat modeling, cloud architecture, web protocols, and common attack surfaces.
  • Deep understanding of Linux/Unix tools and architecture.
  • Demonstrated proficiency in designing, implementing, and troubleshooting diverse network infrastructures, with comprehensive knowledge of protocols, performance and routing.

Work Environment
This position will be primarily remote.

Equal Opportunity Employer
Tradeweb Markets LLC ("Tradeweb") is proud to be an EEO Minorities/Females/Protected Veterans/Disabled/Affirmative Action Employer.

Privacy Policy
Privacy Policy Statement Link: View Policy