Regional hiring • Published • External employer

Anthropic • AI/ML

Full-Stack Software Engineer

Reinforcement Learning

Location

San Francisco, California, United States

Work type

Hybrid

Employment

Full Time

Experience

5-8 years

Compensation

$300K - $405K per year

Posted

3d ago

Summary and responsibilities

Role overview

Summary

As a Full-Stack Software Engineer in Reinforcement Learning, you will develop the platforms, tools, and interfaces that support environment creation, data collection, and training observability for the next generation of Claude. The role involves owning product surfaces end-to-end, from backend services and APIs to web UIs, and rapidly shipping polished, reliable products in a fast-paced, ambiguous environment.

About the Role

As a Full-Stack Software Engineer in RL, you'll build the platforms, tools, and interfaces that power environment creation, data collection, and training observability. The quality of Claude's next generation depends on the quality of the data we train it on — and the systems you build are what make that data possible.

You'll own product surfaces end-to-end — from backend services and APIs to the web UIs that researchers, external vendors, and thousands of data labelers use every day. You don't need a background in ML research. What matters is that you can take an ambiguous, high-stakes problem and ship a polished, reliable product against it, fast.

This team moves very quickly. Claude writes a lot of the code we commit, which means the bottleneck isn't typing — it's judgment, taste, and the ability to react to what researchers need next. You'll iterate on data collection strategies to distill the knowledge of thousands of human experts around the world into our models, and you'll do it in a loop that closes in hours and days, not quarters or months.

Anthropic's Reinforcement Learning organization leads the research and development that trains Claude to be capable, reliable, and safe. We've contributed to every Claude model, with significant impact on the autonomy and coding capabilities of our most advanced models. Our work spans teaching models to use computers effectively, advancing code generation through RL, pioneering fundamental RL research for large language models, and building the scalable training methodologies behind our frontier production models.

The RL org is organized around four goals: solving the science of long-horizon tasks and continual learning, scaling RL data and environments to be comprehensive and diverse, automating software engineering end-to-end, and training the frontier production model. Our engineering teams build the environments, evaluation systems, data pipelines, and tooling that make all of this possible — from realistic agentic training environments and scalable code data generation to human data collection platforms and production training operations.

What You'll Do

Build and extend web platforms for RL environment creation, management, and quality review — including environment configuration, versioning, and validation workflows
Develop vendor-facing interfaces and tooling that let external partners create, submit, and iterate on training environments with minimal friction
Design and implement platforms for human data collection at scale, including labeling workflows, quality assurance systems, and feedback mechanisms that surface reward signal integrity issues early
Build evaluation dashboards and observability UIs that give researchers real-time insight into environment quality, training run health, and reward hacking
Create backend services and APIs that connect environment authoring tools, data collection systems, and RL training infrastructure
Build and expand scalable code data generation pipelines, producing diverse programming tasks with robust reward signals across languages and difficulty levels
Develop onboarding automation and documentation tooling so new vendors and internal users ramp up in hours, not weeks
Partner closely with RL researchers, data operations, and vendor management to translate ambiguous requirements into well-scoped, well-designed products

You May Be a Good Fit If You

Have strong software engineering fundamentals and real full-stack range — you're comfortable owning a surface from database schema to frontend
Are proficient in Python and a modern web stack (React, TypeScript, or similar)
Have a track record of shipping systems that solved a hard problem, not just shipped on time — e.g. you built the thing that made your team 10x faster, or the internal tool nobody thought was possible
Operate with high agency: you identify what needs to be done and drive it forward without waiting for a ticket
Have found yourself wondering

Updated 2d ago

Candidate fit

Skills and qualifications

Additional skills

Software Engineering • 1+ yrs

Full-stack development • 1+ yrs

Python • 1+ yrs

React • 1+ yrs

TypeScript • 1+ yrs

Cloud Infrastructure • 1+ yrs

Communication • 1+ yrs

UX • 1+ yrs

Experience

5-8 years

How this role is positioned

Role classification

Job domains

Software Engineering

Industries

Technology & IT

Software & SaaS

Employment

Full Time

Contract duration

Permanent

Hiring type

Direct

Global hiring

Location specific

Offer details

Compensation and benefits

Compensation

$300K - $405K per year

Visibility

Shared on listing

Currency

USD

Period

Yearly

Benefits and perks

Paid Parental Leave

Flexible Working Hours

Visa Sponsorship

Location, schedule, and role shape

Work setup

Work conditions

Primary location

San Francisco, California, United States

Work type

Hybrid

Global hiring

No

Bandwidth profile

People

Medium • 7/10

Physical

Low • 2/10

Cognitive

High • 9/10

Execution

High • 9/10

Creativity

High • 8/10

Uncertainty

High • 9/10

Communication

High • 8/10

Context on the employer

Company snapshot

Company

Anthropic

Team size

Growing team

Location

San Francisco, California, United States

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
