Site Reliability Engineer (Kubernetes)
Imagine writing the code at the core of your company’s success
G-Research is a leading quantitative research and technology company. By using the latest scientific techniques, we produce world-beating predictive research and build advanced technology to analyse the world’s data.
Software Engineering is core to our business. By designing and implementing real-time systems, our engineers are solving some of the world’s most complex financial problems.
We are a fast-growing software organisation with large distributed systems and extensive testing requirements. We are looking for operationally minded software engineers or infrastructure specialists who can help us make our developer environments and workflows as effective as possible.
Our current infrastructure is mostly .NET on Windows, but we are migrating to a heterogeneous Windows and Linux environment with an emphasis on open source and on using the best tools and platforms for the job. The team will focus on automating testing and infrastructure, breaking large projects into smaller components, and helping development and core infrastructure teams increase the pace of delivery.
As an SRE, you will be responsible for:
- Acting as a conduit between technical operations and development teams, being sympathetic to the concerns and priorities of both
- Operational support and engineering for multiple large distributed software applications
- Improving all aspects of software reliability, release management and integration testing
- Engaging with our software engineering teams on support issues and improvements to our tools, processes, and software
- Gathering and analysing metrics from both infrastructure and applications to assist in performance tuning and fault finding
Who are we looking for?
The ideal candidate will have:
- Experience with dynamic resource management frameworks, ideally Kubernetes
- In-depth knowledge of, and experience in, at least one of the following, with a desire to learn more: host-based networking, systems administration, systems programming, distributed systems, databases or cloud computing
- The ability to use off-the-shelf and open source systems and utilities to rapidly provision production systems in a variety of domains, especially for multi-tenant use
- A proven track record of automation
- A proactive approach to spotting problems, areas for improvement and performance bottlenecks
- An understanding of the operational concerns in a demanding environment, ideally, but not necessarily, in finance
- The ability to understand the inherent trade-offs between various software architectures as they relate to performance, resiliency/fault tolerance, load balancing and data consistency
- The ability to profile and debug applications in real time
Why should you apply?
- Highly competitive compensation plus annual discretionary bonus
- Informal dress code and excellent work/life balance
- Comprehensive healthcare and life assurance
- 25 days holiday
- Contributory pension scheme
- Cycle-to-work scheme
- Subsidised gym membership
- Monthly company events
- Central London office close to 5 stations and 6 tube lines