Codacy is a code quality platform that helps thousands of developers ship billions of lines of code per day by automating and standardizing code reviews. Our mission is to help software development teams make great engineering decisions and create productivity through quality.
We are a small team of highly dedicated and ambitious people. We are curious, funny, radically honest yet kind, and we thrive on collaboration and transparency. Our main focus is on creating value for our customers.
Whether you’re skilled in building, selling, marketing, or supporting, we want you to help us change the developer tools industry.
We are looking for a Site Reliability Engineer to join our Product Team.
What will be your day-to-day?
Monitoring: contribute to the improvement of the monitoring and measurement systems that support our operational scale and continuous delivery. This ranges from setting up and maintaining the right tools to helping the different engineering teams instrument their code correctly;
Availability: work to measure and increase the mean time between failures and decrease the mean time to repair of public-facing systems;
Operations: help the engineering team to operate their systems;
Performance, Efficiency & Latency: contribute to the measurement techniques that assist in the performance tuning of the application stack, use the monitoring systems to help maintain application performance at acceptable levels, and recommend and implement performance improvements across the stack;
Security & Risk: participate in the ongoing process to identify and mitigate risk in our systems;
Capacity Planning: use our monitoring to advise on capacity requirements;
Engineering Tools: create and maintain tools that can help engineering teams improve their day-to-day work.
What are the skills and experience needed to do the job successfully?
Experience with monitoring tools such as Datadog, APM, Grafana, Prometheus, CloudWatch, or similar;
Application development experience with at least one programming language (Java, Scala, Go, Python, etc.);
Experience managing systems with daily deployments that have to handle millions of requests;
An understanding that managing systems at scale requires end-to-end infrastructure tools and automation;
Broad knowledge of system administration, networking, databases, security, storage, and performance, with expertise in at least one of these disciplines;
Experience working in line with the goals of the DevOps movement, in the sense that teams own the full cycle of the development process, from design to operation;
Has contributed positively to both operations-focused and development-focused work;
Has built and maintained cloud-based applications and infrastructure;
Has worked with tools and frameworks for automating infrastructure;
Passion for, and experience with, best practices in systems operations tools and techniques.