Grafana Labs

Staff Software Engineer - Grafana Cloud k6

Grafana Labs · Canada (Remote)

About the Role

Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. Grafana Cloud, our fully managed observability platform, is flexible and built for scale. With Grafana Cloud's actually useful AI, organizations can see, understand, and act on all their disparate data to move at the speed of their ambitions. Today, more than 35 million users and 7,000+ customers – including Anthropic, Bloomberg, NVIDIA, Microsoft, and Salesforce – trust Grafana Labs to ensure reliability of their applications and systems, resolve incidents quickly, and optimize their telemetry to reduce noise and cost.

We are a 100% remote company with 1,600+ team members across 40+ countries, and we're backed by leading investors including Lightspeed Venture Partners, Sequoia Capital, GIC, Coatue, J.P. Morgan, CapitalG, and Lead Edge Capital. Learn more at grafana.com and follow us on LinkedIn and X.

We're scaling fast and staying true to what makes us different: an open-source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything we do.

You may not meet every requirement, and that's okay. If this role excites you, we'd love you to raise your hand for what could be a truly career-defining opportunity.

This is a remote opportunity, and we would be interested in applicants in Canadian time zones (EST or CST).

Staff Software Engineer - Grafana Cloud k6

The Opportunity

We are the team behind Grafana k6, Grafana Cloud k6, and Grafana Cloud Synthetics, used by teams globally to ensure resilient, high-performing systems. This opportunity is with the Grafana Cloud k6 squad, who build and operate our performance testing product. Grafana Cloud k6 is built around the OSS k6 and targeted at users looking to run performance tests at scale.

Our enterprise and SaaS offerings allow customers to load test their systems by running distributed tests from 15+ regions worldwide, using hundreds of thousands of virtual users sending millions of requests per second. We ingest huge volumes of data generated by k6, which can be used to view, correlate and analyze metrics from each test.

k6 is a product used by other engineers, and as such, we are looking for people enthusiastic about building high-quality tools they would want to use themselves. Due to our small teams and fast development pace, you will have a substantial and immediate impact on how the end product is architected, developed, and how the engineering team operates.

Your role will focus on establishing and scaling a cross-team culture of engineering excellence by setting standards and guiding adoption of strong engineering practices that improve reliability and operational ownership. As this foundation matures, the role is expected to expand into broader application and product development leadership, contributing architectural and technical depth beyond operational excellence.

What will you be doing?

• Contribute hands-on to the codebase by designing and implementing production-quality software.
• Guide teams in the design, development, evolution, and operation of large-scale, distributed cloud systems.
• Build and scale a strong culture of operational excellence by defining standards and coaching teams to own reliability and availability.
• Help mature SRE practices, including incident response and PIRs, on-call readiness, runbooks, alerting, observability, and release/change management.
• Establish reliability frameworks such as SLIs/SLOs and error budgets, and use them to guide prioritization and engineering trade-offs.
• Provide visibility into system health through clear operational metrics and reliability reporting.
• Participate in the on-call rotation as a primary escalation point and contribute to incident resolution.
• Influence product and system direction through design reviews, architectural discussions, and cross-team collaboration.
• Share knowledge through clear, high-quality documentation and technical communication (internally and, where appropriate, externally) to help teams build and operate systems more effectively.
• As the reliability foundation matures, grow into broader application and product development leadership, contributing architectural and technical depth beyond operations.

We invest heavily in developer productivity. You can use modern AI coding assistants as part of your daily workflow (your choice of tools, within security guidelines), backed by a company-funded usage budget so you can iterate quickly without unnecessary friction. We encourage pragmatic AI-assisted development: faster prototyping, test generation, refactors, documentation, and incident follow-ups, always paired with strong code review and quality standards. You'll also have access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.6, Gemini 3 Pro).

Re
Observability · 500-1000 employees · New York, NY · Founded 2014 · 💰 Series D

Go · TypeScript · React · Prometheus · Grafana
Fully remote · Equity · Flexible PTO