We are looking for a Site Reliability Engineer who can help us operate and scale a reactive, event-driven system based on bleeding-edge technologies (Scala, Akka, Spray, Reactive Programming, iOS, Swift, Docker…), a modern architectural style (Microservices, CQRS, Event Sourcing, Eventual Consistency), and a clean codebase (Clean Code, Domain-Driven Design…).
We currently use the following technologies:
On the DevOps side of things, we use Docker and Ruby to automate everything on AWS/EC2. We orchestrate our Docker-ready microservices using an internal Ruby tool that we call OMfleet, based on the ideas of CoreOS’s fleet and consul.io. We use InfluxDB for time series and metrics, and Elasticsearch for monitoring.
Operational constraints are first-class citizens in our development process, and taking care of them is part of our definition of DONE. In terms of culture, we take inspiration from the Open Source model to achieve high cohesion (within teams) and low coupling (between teams): small, empowered teams, systematic pull requests, and developer autonomy.
Your qualifications and experience:
We are not recruiting based on specific technology experience, but you are expected to be able to contribute quickly.
Show us what you can do:
If you have a GitHub/Bitbucket account, we would love to take a look at what you like to do. (And if you’re not thrilled with it in retrospect, don’t worry! Simply explain to us what you’d improve if you were to do it now.)