Software Developer - Reliability

Robinhood
Old Toronto
CAD 80,000 - 100,000
Job description

About the team + role

Join our Reliability Engineering team, focused on designing, evolving, and maintaining large-scale distributed systems. As a Software Developer, you'll collaborate across teams to build robust, scalable systems that ensure high availability and low latency.

The Reliability team currently has two significant areas of focus:

  1. Building our company-wide software system that tracks all outages/SEVs for the organization: a sophisticated tool where all product/infra teams collaborate to identify, remediate, and track significant outages.
  2. Tracking and monitoring our most critical business workflows, identifying issues early and serving as a critical component of the long-term reliable performance of these core workflows.

As a Software Developer, you will combine your software and systems knowledge to engineer distributed systems that are reliable, scalable, and fault-tolerant for Robinhood. You will be part of the larger infrastructure organization, where you will work cross-functionally with other infra teams to make that a reality.

Our technology stack is primarily built using Python/Go and implemented using container orchestration technologies such as Kubernetes. We also build our systems using microservice-oriented architectures and related OSS technologies (e.g. Kafka, Celery/RabbitMQ, nginx, Redis, Postgres, Airflow, Consul). Our systems are primarily built within AWS.

What you’ll do

  • Design and implement new features and services with a focus on high availability, low latency, and scalability.
  • Continually optimize systems and workflows by improving architecture, infrastructure, automation, CI/CD and observability.
  • Act as an owner and leader of Robinhood's infrastructure by ensuring project infrastructure needs are met and working proactively with customer teams to help them improve reliability.

What you bring

  • Fluency in one or more programming languages (e.g. Go, Python, Java).
  • Experience authoring and operating high-scale services.
  • Experience with scalable distributed systems, either built from scratch or on public cloud (e.g. AWS) primitives.
  • Experience with Python/Django/Go and AWS is a plus.

Our team is here to enable an inclusive and welcoming interview experience for all candidates. If you need additional assistance throughout the interview process related to a physical or mental condition, or if there is something our team can do to enable a more accessible experience at any time, please notify our team by completing this Applicant Accommodation Form.
