Is the Prompt Engineer, Agent Prompts & Evals role remote?

It's hybrid: Anthropic expects staff to be in one of its offices (San Francisco, CA or New York City, NY for this role) at least 25% of the time.

What's the salary range?

Anthropic lists an annual salary of $320,000–$405,000 USD for this role.

How much experience is required?

The posting asks for 5+ years of software engineering experience with Python or similar languages.

Where is the role based?

Anthropic is hiring for this position in San Francisco, CA or New York City, NY.

What's the tech stack?

Joblaze extracted these technologies from the posting: LLMs, A/B Testing, NLP, AI/ML, Statsig, Python.

Does Anthropic sponsor work visas for this role?

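Yes, with caveats: the posting says Anthropic sponsors visas, though not for every role or every candidate, and will make every reasonable effort to secure a visa for anyone who receives an offer.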

What seniority level is this role?

Anthropic targets senior candidates for this position.

Is this full-time or contract?

This Prompt Engineer, Agent Prompts & Evals role at Anthropic is full-time.

Prompt Engineer, Agent Prompts & Evals at Anthropic

Joblaze summary

In this role, the prompt engineer designs, tests, and optimizes the system prompts and feature-specific prompts that shape Claude's behavior across Anthropic's consumer and API products, and builds the evaluation suites that guard quality across launches. The posting requires 5+ years of software engineering experience with Python or similar languages, demonstrated experience with LLMs and prompt engineering, and a strong grasp of evaluation methodologies. It suits senior engineers who work well across teams and can manage several concurrent projects while keeping quality and safety in view.

Joblaze insights

Listed about a month ago on Joblaze — tracked directly from Anthropic's career page.
Anthropic has 401 other open roles, including: Finance & Strategy, EMEA GTM (Dublin or London location); Director, Infrastructure Compute Procurement; Engineering Manager, Cybersecurity Products.
2358 active Python roles on Joblaze right now.
1581 active AI/ML roles on Joblaze right now.
57 active LLM roles on Joblaze right now.
Salary band is above the typical range for AI/ML (median ~$190,000).

From the original posting

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the Role

We’re looking for prompt and context engineers to join our product engineering team to help build AI-first products, features, and evaluations. Your mission will be to bridge the gap between model capabilities and real product experience, working with product teams to build consistent, safe, and beneficial user experiences across all product surfaces.

You will be deeply involved in new product feature and model releases at Anthropic, combining engineering expertise with an understanding of frontier AI applications and model quality. You’ll become an expert on Claude’s behavioral quirks and capabilities and apply that knowledge to deliver the best possible user experience across models and domains. You’ll be the first resource for product teams working on Claude’s AI infrastructure: system prompts, tool prompts, skills, and evaluations.

This role requires someone who can effectively balance caring deeply about making Claude the best it can be while also supporting a wide variety of concurrent projects and efforts across many product teams.

Key Responsibilities

Prompt Engineering Excellence: Design, test, and optimize system prompts and feature-specific prompts that shape Claude’s behavior across consumer and API products.
Evaluation Development: Build and maintain comprehensive evaluation suites that ensure model quality and consistency across product launches and updates.
Cross-functional Collaboration: Partner closely with product teams, research teams, and safeguards to ensure new features meet quality and safety standards.
Model Launch Support: Play a critical role in model releases, ensuring smooth rollouts and catching regressions before they impact users.
Infrastructure Contribution: Help build and improve the frameworks and tools that allow teams to develop and test prompts and features with confidence.
Knowledge Transfer: Mentor product engineers on prompt engineering best practices and help teams build their first evaluations.
Rapid Iteration: Work in a fast-paced environment where model capabilities advance daily, requiring quick adaptation and creative problem-solving.

What We’re Looking For

Required Qualifications

5+ years of software engineering experience with Python or similar languages.
Demonstrated experience with LLMs and prompt engineering (through work, research, or significant personal projects).
Strong understanding of evaluation methodologies and metrics for AI systems.
Excellent written and verbal communication skills – you’ll need to explain complex model behaviors to diverse stakeholders.
Ability to manage multiple concurrent projects and prioritize effectively.
Experience with version control, CI/CD, and modern software development practices.

Preferred Qualifications

Experience with Claude or other frontier AI models in production settings.
Background in machine learning, NLP, or related fields.
Experience with A/B testing and experimentation frameworks (e.g., Statsig).
Familiarity with AI safety and alignment considerations.
Experience building tools and infrastructure for ML/AI workflows.
Track record of improving AI system performance through systematic evaluation and iteration.

You Might Thrive in This Role If You…

Get excited about the nuances of how language models behave and love finding creative ways to improve their outputs.
Enjoy being at the intersection of research and product, translating cutting-edge capabilities into user value.
Are comfortable with ambiguity and can define success metrics for novel AI features.
Have a strong sense of ownership and drive projects from conception to production.
Are passionate about building AI systems that are helpful, harmless, and honest.
Thrive in collaborative environments and enjoy teaching others.

The annual compensation range for this role is listed below.

For sales roles, the range provided is the role’s On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.

Annual Salary: $320,000–$405,000 USD

Logistics

Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience

Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience

Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position

Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.

Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you're ever unsure about a communication, don't click any links—visit anthropic.com/careers directly for confirmed position openings.

How we're different

We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.

The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.

Come work with us!

Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.

Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.