Technical Account Manager (TAM), GPU Cluster

Location
San Francisco, CA or New York, NY
Compensation
$260k–$290k/yr OTE
Level
Senior
Type
Full-time · Hybrid

Requirements

Experience
5+ years

Benefits

Health Insurance · Equity/Stock Options · Remote Work

Joblaze summary

The Technical Account Manager at Together AI owns the technical relationship with one strategic enterprise customer and is responsible for the operational health of that customer's large-scale GPU deployments. The role requires deep expertise in GPU infrastructure, networking, and storage systems, along with strong incident management and customer communication skills. Candidates need 5+ years in customer-facing technical roles, including 2+ years in technical account management or solutions architecture for large-scale AI or HPC infrastructure. Together AI describes itself as a research-driven company that favors open and transparent AI systems, and it positions this role as a driver of both customer success and company growth.

Joblaze insights

Quick facts

Is the Technical Account Manager (TAM), GPU Cluster role remote?
It's hybrid: Together AI lists the role as hybrid in San Francisco, CA or New York, NY.
What's the salary range?
Together AI lists $260,000–$290,000 OTE, plus equity and benefits, for this role.
How much experience is required?
At least 5 years in a customer-facing technical role, including 2+ years in technical account management or solutions architecture for large-scale AI or HPC infrastructure.
Where is the role based?
Together AI is hiring for this position in San Francisco, CA or New York, NY.
What's the tech stack?
Joblaze extracted these technologies from the posting: Grafana, Prometheus, AI/ML, Bash, Python, GPU.
What seniority level is this role?
Together AI targets senior candidates for this position.
Is this full-time or contract?
This Technical Account Manager (TAM), GPU Cluster role at Together AI is full-time.

From the original posting

About the role

As a TAM at Together AI, you will serve as the named technical owner for one of our most strategic customer relationships. You will be the primary technical point of contact across all infrastructure domains — compute, networking, storage, and facilities — ensuring flawless delivery and operational health of large-scale GPU deployments. This role sits at the intersection of deep infrastructure expertise and high-stakes customer partnership, making you a critical driver of both customer success and company growth.

Responsibilities

  • Serve as the named technical point of contact for a dedicated strategic customer, owning the end-to-end technical relationship across compute, networking, storage, and facilities
    • Drive structured engagement through regular cadences including status reporting, technical steering meetings, and executive business reviews
    • Translate customer operational feedback into actionable input for Engineering, Product, and Infrastructure roadmaps
  • Lead issue lifecycle management, escalation, and RCA authorship across all infrastructure domains in partnership with Support, SRE, DC Ops, and Engineering teams
  • Own end-to-end RMA coordination and hardware lifecycle management, including acceptance testing, spare inventory management, and hardware health reporting for large-scale GPU deployments
  • Maintain deep technical expertise across the customer's infrastructure stack — GPU compute, high-speed fabric, and large-scale storage systems — advising on configuration, operational best practices, and incident resolution
  • Own the observability strategy for the customer estate, including alert policy definition, dashboard development, and proactive health management across all infrastructure layers
  • Coordinate DC operations and facilities events in partnership with internal teams and hosting providers, ensuring SLA compliance and cluster availability
  • Act as project manager for all capacity expansions, owning the full node deployment lifecycle from freight receipt through production acceptance

Qualifications

  • 5+ years in a customer-facing technical role, with 2+ years in dedicated technical account management or solutions architecture for large-scale AI or HPC infrastructure
  • Deep expertise in GPU infrastructure — GPU health diagnostics, RMA workflows, and hardware acceptance testing
  • Hands-on experience with large-scale Ethernet and InfiniBand fabric architecture
  • Working knowledge of enterprise storage systems, including high-density NVMe, parallel file systems, and metadata infrastructure
  • Experience with DC operations, facilities coordination, and hosting provider SLA management
  • Strong ownership mindset for incident management, RCA authorship, and executive-level customer communication
  • Proficiency in infrastructure monitoring and observability tooling (Prometheus, Grafana, or equivalent)
  • Proven ability to manage multiple concurrent workstreams with hyperscaler-level rigor and communication standards
  • Proficiency in Python, Bash, or infrastructure automation tools preferred

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey in building the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is $260K–$290K OTE, plus equity and benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Location

San Francisco, CA (Hybrid) or New York, NY (Hybrid)

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our Privacy Policy at https://www.together.ai/privacy

Similar positions

Sr. Technical Program Manager (TPM)
Together AI · San Francisco
Infrastructure Accounting Manager
Together AI · San Francisco
Infrastructure Design Engineer
Together AI · San Francisco
Customer Support Engineer (GPU Cluster)
Together AI · San Francisco