Develop software and software fixes to integrate internal systems. Ensure code quality, test and distribute code updates, and monitor the health and stability of the servers.
What you'll do:
Site availability and incident response
Monitoring, responding to, building, and automating code deployment systems
Create deployment systems and tools that simplify development work for Chrome River’s SDLC as it relates to infrastructure
Scripting, networking, some database functionality, and reading, writing, and optimizing highly distributed systems
Daily scripting in Bash, shell, and Python
Monitoring and reporting of issues/incidents
Be an advocate for reliability, scalability and system availability
Accountable for post-incident documentation
Monitoring and providing remediation of production incidents
Provides high-level technical synopses and shares technical details of incidents in Slack
Communicating daily with engineers and non-engineering resources
Manages Sev2 incidents
Other duties as assigned
Mentoring associate developers in best practices on all aspects of reliability, scalability and high availability
What we're looking for:
Bachelor’s degree in Computer Science, Information Technology, or similar field required
Minimum of 5 years’ experience in an engineering role required
Working knowledge of Ansible and Terraform tools highly desirable
Experience with JIRA, programming and scripting languages such as Java and Python, knowledge of enterprise-level applications, and basic accounting knowledge highly preferred