Low Latency Inference Jobs in San Francisco

32 open positions · Updated 2 months ago

Average salary: 190.5k–304.2k/yr

Showing 20 of 32 positions

Together AI

Join Together AI as a Research Intern to work on cutting-edge distributed inference and optimization for large foundation models.

Together AI · San Francisco · $58–$63/yr · Published 2 weeks ago
Flexible on stack
Together AI

Join Together AI as a Staff ML Engineer to optimize voice model serving for real-time applications on a high-impact team.

Together AI · San Francisco · $220k–$280k/yr · Published 1 month ago
Flexible on stack · 60% coding
Anthropic

Join Anthropic as a Staff Engineer to lead the technical direction of the Inference Runtime for AI systems serving millions of users.

Anthropic · Remote-Friendly (Travel-Required) | San Francisco, CA | Seattle, WA | New York City, NY · $405k–$485k/yr · Published 2 weeks ago
Flexible on stack
Together AI

Join Together AI as a Research Intern to explore efficient reinforcement learning and post-training systems for large language models.

Together AI · San Francisco · $58–$63/yr · Published 1 week ago
Flexible on stack
Anthropic
Anthropic · San Francisco, CA | New York City, NY | Seattle, WA · $280k–$850k/yr · Published 2 years ago
Together AI

Design and deliver multi-petabyte storage systems for AI workloads at Together AI, optimizing performance and cost.

Together AI · San Francisco · $250k–$300k/yr · Published 3 weeks ago
Flexible on stack
Mercury

Join Mercury as a Senior Machine Learning Operations Engineer to build and operate real-time inference services for risk decisioning.

Mercury · San Francisco, CA, New York, NY, Portland, OR, or Remote within Canada or United States · $166.6k–$208.3k/yr · Published 1 week ago
Flexible on stack
Anthropic
Anthropic · San Francisco, CA | New York City, NY | Seattle, WA · $280k–$850k/yr · Published 1 year ago