We are in search of a highly motivated reliability engineer to work in the Service Reliability team of the Application Delivery section at ECMWF. In this role, you will support some of ECMWF’s most essential platform services in the areas of observability and identity and access management.
The role requires experience of both IT systems and software development, with a focus on maintaining effective operations. Key skills include cloud-native technologies, automation, IT observability, service and application performance monitoring, logging, and analytics.
Our reliability engineers engage with, advise, steer and support services relevant to the lifecycle of application deployment and hosting, including technical strategy, design, infrastructure, software development, tooling, service transition, service operation and use.
Day-to-day, you will act as a bridge between the ECMWF Computing Department, in-house and community system/service providers and application developers, and your technical peers at our WEkEO partners (EUMETSAT, Mercator Ocean, and EEA). You will advocate good practice and build a greater understanding of architecture and design to enable reliable and performant operation of the WEkEO distributed platform.
The role sits in the Service Reliability team, within the Application Delivery section of our Computing Department. The Section provides platforms and services that enable ECMWF teams to consume computing resources at different levels (PaaS, SaaS) and to consistently deploy applications with different levels of support to a high degree of quality and reliability.
The section achieves this through innovation in the areas of computer systems administration automation, application deployment and operation, reliability engineering, identity and access management, container orchestration, observability (monitoring, logging, and tracing), and PaaS/SaaS application development.
Within the Section, the Service Reliability Team is responsible for IT observability, service monitoring, configuration management, centralised logging and analytics, identity and access management, and helping services to run reliably and with good performance.
Demonstrable knowledge and skills in some of the following:
A working knowledge in some of the following is desirable:
Candidates must be able to work effectively in English and interviews will be conducted in English. A good knowledge of one of the Centre’s other working languages (French or German) is an advantage but not required.
Grade remuneration: The successful candidate will be recruited at the A2 grade, according to the scales of the Co-ordinated Organisations. ECMWF also offers a generous benefits package, including a flexible teleworking policy. The position is assigned to the employment category STF-C or STF-PL, as defined in the ECMWF Staff Regulations.
Starting date: As soon as possible
Contract duration: 4 years for STF-C, or approximately 3.5 years (to 30 September 2028) for STF-PL
Location: Bologna, Italy (Candidates are expected to relocate to the duty station)
As a multi-site organisation, ECMWF has adopted a hybrid working model that gives staff the flexibility to combine office working and teleworking.
Applicants are invited to complete the online application form by clicking on the apply button below.
At ECMWF, we consider an inclusive environment key to our success. We are dedicated to ensuring a workplace that embraces diversity and provides equal opportunities for all.
Applications are invited from nationals of ECMWF Member States and Cooperating States (for STF-C), as well as of all EU Member States (for STF-PL).
In these exceptional times, we also welcome applications from Ukrainian nationals for this vacancy.
Applications from nationals of other countries may be considered in exceptional cases.