
Site Reliability Engineer

Location: San Francisco
Compensation: Not disclosed
Level: Senior
Type: Full-time

Requirements

Experience: 7+ years

Joblaze summary

In this role, the Site Reliability Engineer will manage the infrastructure for Airbyte's Data Replication platform, ensuring reliability and efficiency across millions of sync jobs. Key skills include expertise in Kubernetes, Terraform, and observability tools, along with a strong background in infrastructure or DevOps. This position suits seasoned engineers who thrive in fast-paced startup environments and are eager to use AI tools to improve operational processes. Airbyte's focus on product-led growth and innovation creates a dynamic atmosphere for engineering talent.

Joblaze insights

Quick facts

Is the Site Reliability Engineer role remote?
No — this is an on-site role in San Francisco.
How much experience is required?
At least 7 years of relevant experience for this Site Reliability Engineer role.
Where is the role based?
Airbyte is hiring for this position in San Francisco.
What's the tech stack?
Joblaze extracted these technologies from the posting: Terraform, LLMs, Datadog, Grafana, Prometheus, CI/CD.
What seniority level is this role?
Airbyte targets senior candidates for this position.
Is this full-time or contract?
Full-time for this Site Reliability Engineer role at Airbyte.

From the original posting

Airbyte is the open‑source standard for data movement. We've enabled data teams to move data from applications, APIs, unstructured sources, and databases to data warehouses, lakes, and AI applications. With tens of thousands of connectors built and hundreds of thousands of companies adopting Airbyte, we've proven the economics of data integration at scale. Airbyte is now building frontier agentic data infrastructure, purpose-built for AI agents that need fast, accurate access to data across hundreds of sources. Our mission: make data available and actionable, everywhere.

We've raised $181M from the world's top investors (Benchmark, Accel, Altimeter, Coatue, Y Combinator, etc.) and we believe in product-led growth, where we build something awesome that all our users love. We’ve raised enough capital to explore boldly, but we still choose to move quickly, stay scrappy, and experiment constantly as we find the right paths in an AI-native landscape.

The Role:

You'll be the infrastructure and reliability engineer on the Data Replication team - a full-stack product team running over 3 million sync jobs a week powering thousands of data use cases across multiple regions and clouds. You’ll build and maintain the infrastructure, set reliability standards, drive down incidents, and make it easier and safer for engineers to ship through tooling. You're equally comfortable in a Terraform file, a Kubernetes cluster, and a postmortem doc.


We expect engineers here to actively use AI as a force multiplier - agentic tools to automate toil, augment incident response, and build smarter internal tooling. If you're not already doing this, you should be excited to start. We care as much about how you work as what you build. Trust, directness, and craftsmanship matter here.

What You’ll Do:

  • Own the infrastructure underpinning the Data Replication platform - Kubernetes clusters, CI/CD pipelines, secrets management, networking, and cloud resource configuration across AWS and GCP.

  • Partner with product engineers to reliably integrate product features with infrastructure.

  • Maintain and enhance observability, alerting, and anomaly detection with an eye towards LLM automation.

  • Maintain and enhance AI-augmented release and internal tooling: canary deployments, progressive rollouts, automated release qualification, and rollback automation - with an eye towards LLM automation.

  • Set the infrastructure bar for the team - build self-serve tooling, write runbooks, and coach engineers to own more of their stack.

What You’ll Need:

  • 7+ years in infrastructure, platform engineering, SRE, or DevOps.

  • Hands-on ownership of Kubernetes, Helm, and Terraform in production environments.

  • Deep experience with observability stacks (Prometheus, Grafana, Datadog) and on-call operations.

  • Experience with CI/CD pipeline ownership and developer tooling.

  • Ability and willingness to read backend code to understand how systems break and instrument them correctly.

  • Fluency with AI tools - LLMs and agentic frameworks to automate, debug faster, and reduce toil.

  • A startup-ready mindset: comfortable with ambiguity, moving fast, and owning problems end-to-end.

Nice To Have:

  • Experience with data pipelines, replication systems, or ETL/ELT platforms.

  • Familiarity with control plane / data plane architectures or internal developer platforms.

  • Experience with Airbyte, CDKs, or connector-based architectures.

Location:

  • Onsite 5 days/week in San Francisco, CA

If you find this role exciting, we encourage you to apply even if you think you don’t meet all of the requirements!

Airbyte is an equal opportunity employer that does not discriminate on the basis of actual or perceived race, creed, color, religion, national origin, ancestry, age, physical or mental disability, pregnancy, genetic information, sex, sexual orientation, gender identity or expression, marital status, familial status, domestic violence victim status, veteran or military status, or any other legally recognized protected basis under federal, state or local laws. Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Airbyte is committed to providing reasonable accommodations for qualified individuals with disabilities in our job application procedures. Please let us know if you need assistance or accommodations due to a disability.
