AI Engineer Job Description Template (2026)

A great AI engineer job description in 2026 does 4 things: it names the exact model and stack instead of "experience with LLMs", it separates AI engineering from ML research, it states that the hire will own the eval harness, and it is honest about the ambiguity of the role. Get those 4 right and the wrong applicants filter themselves out before you read a single resume.

The role looks nothing like it did in 2023. You are not hiring someone to fine-tune BERT or stand up a Jupyter notebook. You are hiring someone who can design agentic systems with Claude or GPT-class models, ship RAG pipelines that survive real production traffic, write evals that catch regressions before customers do, and reason about latency and cost the way a senior backend engineer reasons about databases. And the title now covers at least 5 very different jobs. A founding AI engineer at a Series A startup, a senior agents engineer at a scale-up, and a junior who needs mentorship should never get the same JD.

This guide gives you 5 AI engineer job description templates you can copy-paste and customize: Senior AI Engineer (LLM/Agents), AI Engineer (RAG/Production), Junior AI Engineer, Founding AI Engineer, and AI Engineer (Contractor / Fractional). Each one has a real responsibilities list, must-haves, nice-to-haves, what success looks like in the first 90 days, and an honest compensation range based on what teams are actually paying in 2026.

What makes a great AI engineer JD in 2026

A great AI engineer job description in 2026 does 4 things that most JDs still get wrong.

It names the model and the stack. "Experience with LLMs" is meaningless. Write "Claude 3.5 Sonnet via Anthropic API and Bedrock," "OpenAI Responses API," "LangGraph or the Claude Agent SDK," "pgvector or Pinecone." Specific stacks attract specific engineers. Vague stacks attract people who have read the OpenAI quickstart.

It separates AI engineering from ML research. If the role does not involve training, do not list PyTorch, transformer architecture, or "publications at NeurIPS." A JD that reads as "we want a PhD" screens out the engineers who are great at shipping LLM products. Most AI engineering work in 2026 is API orchestration, retrieval design, evals, and product integration, not gradient descent.

It states the eval and observability expectation. The single biggest differentiator between a junior and a senior AI engineer in 2026 is whether they instinctively write evals before shipping. Your JD should explicitly say "you will own the eval harness for the features you build". That one line filters the field aggressively.

It is honest about ambiguity. AI engineering is still a frontier role. There are no settled patterns for memory, no agreed answer on agent frameworks, no stable best practice for long-context retrieval. The candidates worth hiring want to know that. Write "you will be making calls without a playbook". The right people read that as a feature.
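
For hiring managers who have not seen one, the "eval harness" named above can start very small. The following is a minimal sketch, not a reference implementation: it assumes a golden dataset of prompt and expected-substring pairs, and every name in it (`GoldenCase`, `run_regression`, `gate`, `fake_model`) is hypothetical. A real harness would call an actual model and usually score with something richer than substring matching.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class GoldenCase:
    prompt: str
    must_contain: str  # substring the answer is expected to include


def run_regression(generate: Callable[[str], str], cases: List[GoldenCase]) -> float:
    """Return the pass rate of `generate` over the golden cases."""
    if not cases:
        raise ValueError("golden dataset is empty")
    passed = sum(
        1
        for case in cases
        if case.must_contain.lower() in generate(case.prompt).lower()
    )
    return passed / len(cases)


def gate(pass_rate: float, baseline: float, tolerance: float = 0.02) -> bool:
    """True if the pass rate has not dropped more than `tolerance` below baseline."""
    return pass_rate >= baseline - tolerance


# Hypothetical stand-in for a real model call, so the sketch runs offline.
def fake_model(prompt: str) -> str:
    canned = {
        "What is our refund window?": "Refunds are accepted within 30 days.",
        "Which plan includes SSO?": "SSO is part of the Enterprise plan.",
    }
    return canned.get(prompt, "I do not know.")


GOLDEN = [
    GoldenCase("What is our refund window?", "30 days"),
    GoldenCase("Which plan includes SSO?", "Enterprise"),
    GoldenCase("Do you support on-prem?", "I do not know"),
]

if __name__ == "__main__":
    rate = run_regression(fake_model, GOLDEN)
    print(f"pass rate: {rate:.2f}, ship: {gate(rate, baseline=0.95)}")
```

A candidate who "instinctively writes evals" will typically recognize this shape at once and be able to say what it is missing.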

If you want a deeper breakdown of what to screen for, see our guide on how to hire AI engineers and the AI engineer skills that matter in 2026.

Template 1: Senior AI Engineer (LLM / Agents focus)

Use this when you have a product in market, you are building agentic features (tool-calling, multi-step workflows, autonomous loops), and you need someone who can own architecture without supervision.

Job summary

We are hiring a Senior AI Engineer to own the design and delivery of our agentic product features. You will architect multi-step LLM workflows that take real customer actions (not demos), and you will be responsible for their reliability, latency, cost, and eval coverage. You will work directly with our product and infrastructure teams and report to the Head of Engineering.

This role is for someone who has shipped LLM features to production at scale, has opinions about agent frameworks earned through pain, and can explain to a non-technical exec why a particular tool-calling pattern is failing 4% of the time.

What you will do

Design and ship agentic workflows using Claude 3.5 Sonnet (Anthropic API + Bedrock), GPT-class models via OpenAI Responses API, and orchestration frameworks like LangGraph or the Claude Agent SDK
Own end-to-end delivery of 2-3 major AI features per quarter, from spec to eval to production rollout
Build and maintain the eval harness for every feature you ship: golden datasets, regression suites, LLM-as-judge pipelines
Lead architecture reviews for AI features built by other engineers on the team
Reduce per-request cost and p95 latency on existing features by 30%+ year over year through prompt engineering, caching, model routing, and structured output design
Partner with product to translate fuzzy customer problems into testable, evaluatable LLM specs
Mentor 2-3 mid-level engineers on prompt engineering, retrieval design, and agent debugging

Must-haves

5+ years of professional software engineering experience, including 2+ years shipping LLM-powered features to production users
Deep, hands-on experience with at least one frontier model API (Anthropic, OpenAI, or Google) and at least one agent framework (LangGraph, Claude Agent SDK, OpenAI Agents SDK)
Strong Python or TypeScript: you write the production code, not notebooks
Demonstrated experience designing eval pipelines (you have written LLM-as-judge prompts, maintained golden datasets, caught regressions before users)
Experience with at least one production vector database (pgvector, Pinecone, Weaviate, Qdrant) and hybrid retrieval
Comfortable with cloud infrastructure (AWS, GCP, or Vercel): you can deploy, observe, and debug your own work

Nice-to-haves

Experience deploying Anthropic models on Amazon Bedrock, structured outputs / tool-calling at scale, or extended thinking modes
Background building developer tools, internal copilots, or B2B SaaS AI features
Open-source contributions to LLM/agent tooling
Comfort speaking publicly: blog posts, conference talks, or office-hours format

What success looks like (first 90 days)

Day 30: You have shipped one small but customer-visible improvement to an existing AI feature and you have proposed an eval harness improvement
Day 60: You own one feature end-to-end. You have published an internal architecture doc and run at least one eval-driven debugging session with the team
Day 90: You have shipped a new agentic feature to production with full eval coverage, and you are leading architecture review for at least one peer's work

Compensation and benefits

Base salary: $180,000 - $230,000 USD (or local equivalent)
Equity: 0.1% - 0.4% depending on stage and experience
Remote-first with quarterly on-sites
$3,000 annual learning budget
Top-tier health, dental, and vision

Template 2: AI Engineer (RAG / Production focus)

Use this when your core challenge is retrieval over a real corpus (docs, support tickets, contracts, codebases) and the model is the easy part. You need an engineer who can make retrieval reliable at scale.

Job summary

We are hiring an AI Engineer to own our retrieval-augmented generation (RAG) stack. You will design ingestion pipelines for messy real-world data, build hybrid retrieval that actually returns the right chunks, and make sure our LLM-powered answers are grounded, cited, and trustworthy.

If you have ever debugged a "but the chunk was in the index" production incident at 11pm, this role will feel familiar.

What you will do

Own the end-to-end RAG pipeline: ingestion, chunking, embedding, indexing, retrieval, reranking, and answer synthesis
Design and run retrieval evals: precision@k, recall@k, MRR, and end-to-end answer faithfulness scores
Build ingestion pipelines for diverse source data (PDFs, web pages, ticketing systems, internal wikis, structured DBs)
Tune chunking strategies, embedding models (OpenAI, Voyage, Cohere, BGE), and rerankers (Cohere Rerank, Voyage Rerank, cross-encoders) per content type
Implement citation, grounding, and refusal patterns so the model says "I do not know" instead of hallucinating
Optimize cost and latency across the retrieval and generation steps
Partner with the data team on schema evolution and freshness guarantees
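
If the retrieval metrics in the list above are unfamiliar, here is a rough sketch of how precision@k, recall@k, and MRR are commonly computed. It assumes each query has a ranked list of unique retrieved chunk IDs and a labeled set of relevant chunk IDs. The function names and the sample IDs are illustrative, not taken from any particular library.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for chunk_id in retrieved[:k] if chunk_id in relevant)
    return hits / k


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant IDs that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    retrieved = ["c3", "c1", "c9", "c4"]  # ranked output for one query
    relevant = {"c1", "c4", "c7"}         # labeled ground truth
    print(precision_at_k(retrieved, relevant, 3))
    print(recall_at_k(retrieved, relevant, 4))
    print(reciprocal_rank(retrieved, relevant))
```

End-to-end answer faithfulness is a separate measurement, usually scored by a human or an LLM judge, and is not covered by this sketch.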

Must-haves

3+ years of professional software engineering, including 1.5+ years shipping RAG or search systems
Hands-on experience with at least 2 embedding providers and at least one reranker
Strong Python; comfort with async pipelines and queue-based ingestion
Production experience with a vector database (pgvector, Pinecone, Weaviate, Qdrant, Turbopuffer)
You have written retrieval evals from scratch and used them to make product decisions
Familiarity with at least one frontier LLM API (Anthropic, OpenAI, Google) and structured outputs

Nice-to-haves

Experience with hybrid search (BM25 + dense), query rewriting, HyDE, or multi-step retrieval
Background in information retrieval or search engineering pre-LLM
Experience with PDF/OCR pipelines, layout-aware parsing, or table extraction
Knowledge graphs, entity linking, or GraphRAG patterns

What success looks like (first 90 days)

Day 30: You have documented our current RAG pipeline end-to-end, identified the top 3 failure modes, and proposed measurable fixes
Day 60: You have shipped retrieval improvements that move at least one eval metric (recall@10, answer faithfulness) by 15%+
Day 90: You own the RAG roadmap and have shipped at least one ingestion improvement and one retrieval improvement, both backed by evals

Compensation and benefits

Base salary: $150,000 - $200,000 USD
Equity: 0.05% - 0.2%
Remote-first
Conference + learning budget $2,500/year
Health, dental, vision, 401(k) match

Template 3: Junior AI Engineer / AI Engineer I

Use this when you can invest in mentorship and you want to grow talent rather than fight over the same 200 senior engineers. The mistake most teams make is writing a junior JD that demands senior experience.

Job summary

We are hiring a Junior AI Engineer to join our applied AI team. You will work alongside senior engineers shipping LLM-powered features, you will own small features end-to-end with mentorship, and you will grow into a fully independent AI engineer within 12-18 months.

This is a real engineering role. You will write production code, ship to real users, and be expected to learn fast. It is not a research role and it is not an internship.

What you will do

Implement well-scoped LLM features under the guidance of a senior AI engineer
Write prompts, structured outputs, and tool definitions and iterate on them with eval feedback
Maintain and extend our eval harness (golden datasets, regression tests)
Help with prompt debugging, error analysis, and small retrieval improvements
Pair-program with senior engineers on larger features
Contribute to internal docs, runbooks, and onboarding material

Must-haves

0-2 years of professional software engineering experience (internships, bootcamp, junior roles, or strong self-taught portfolio all count)
Comfortable writing Python or TypeScript and reading other people's code
You have built at least one personal or open-source project that calls an LLM API, and you can explain its failure modes
You have read at least one foundational piece on prompt engineering or RAG (Anthropic's prompt engineering docs, the original RAG paper, or equivalent) and can discuss it
Strong written communication; you can explain what you tried and what you observed
Genuine curiosity: you do not need a PhD, you need to be the kind of person who reads model release notes the day they drop

Nice-to-haves

A side project that uses tool-calling, agents, or RAG
Open-source contributions
Familiarity with one frontend framework (React, Next.js)
Comfort with git, GitHub PRs, and code review

What success looks like (first 90 days)

Day 30: You have shipped your first PR (small, mentored) and you have shadowed at least 3 eval debugging sessions
Day 60: You own a small feature end-to-end with senior review. You are writing your own prompts and evals
Day 90: You can independently ship a well-scoped feature with light review, and you are starting to spot regressions in other people's PRs

Compensation and benefits

Base salary: $90,000 - $130,000 USD (varies widely by location)
Equity: 0.01% - 0.05%
Mentorship from senior engineers, dedicated learning time (4 hours/week)
Full benefits package

Template 4: Founding AI Engineer (startup)

Use this when you are pre-seed, seed, or early Series A. The founding AI engineer is part architect, part product manager, part on-call. The JD needs to be honest about that.

Job summary

We are hiring our Founding AI Engineer. You will be the first dedicated AI hire, you will work directly with the founders, and you will own everything AI from model selection to production observability to writing the first internal eval harness on a whiteboard.

This is a role for someone who has shipped LLM products before, has scars from at least one production incident, and wants to build a category-defining AI product from a blank page rather than maintain one.

What you will do

Make the foundational technical decisions: model providers, agent framework (or no framework), vector DB, eval tooling, observability stack
Build the first version of every AI feature, then hand off and rebuild as we hire the next engineers
Own the relationship with model providers (Anthropic, OpenAI, Google) including rate-limit escalations and early-access programs
Write the first eval harness, the first runbook, and the first onboarding doc for future AI hires
Partner directly with the founders on product strategy: you will say "no" a lot and you will be right most of the time
Be on-call for AI-related incidents; build the systems that make on-call boring within 12 months

Must-haves

4+ years of professional software engineering, including 2+ years shipping LLM-powered features to real users
You have shipped to production with at least 2 of: Anthropic, OpenAI, Google, open-source models
You have experience with agents, tool-calling, and structured outputs at production scale
You are a generalist: you can wire a Postgres + pgvector setup, deploy a Next.js frontend, and configure a CI pipeline in the same sprint
You have opinions about agent frameworks formed by shipping with them, not by reading Hacker News
You thrive in ambiguity. You make decisions, document them, and revisit when new data arrives

Nice-to-haves

Previous founding engineer or early-stage experience
Experience hiring and onboarding other engineers
Public technical writing or speaking
Background in the vertical we operate in

What success looks like (first 90 days)

Day 30: You have audited everything, made the call on the stack, and shipped one customer-facing AI improvement
Day 60: The core AI feature is in production with first-pass evals, and you have a written technical roadmap for the next 2 quarters
Day 90: At least one customer is using the AI feature in anger, eval coverage is in place, and you have a hiring plan for engineer #2

Compensation and benefits

Base salary: $160,000 - $220,000 USD
Equity: 0.5% - 2.5% (this is the trade-off)
Co-located or remote, founder's call
The chance to design the technical culture from day one

Template 5: AI Engineer (Contractor / Fractional)

Use this when you have a defined project (a RAG MVP, an agentic feature, a migration from one model provider to another) and you do not want to commit to a full-time hire yet. The contractor JD should look very different from a full-time one.

Job summary

We are hiring a contract AI Engineer for a defined 3-6 month engagement. You will scope, build, and ship one specific AI feature, hand it off with full documentation and evals, and exit cleanly. This is not a try-before-you-buy role.

The project

[Describe in 2-4 sentences. Be specific: "Build a RAG-powered support copilot over our Zendesk corpus, integrated into our internal admin dashboard. Must handle 5,000 queries/day at p95 < 3s with citation grounding."]
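
A target like "p95 < 3s" is only useful if both sides agree on how it is measured. The sketch below shows one possible definition, the nearest-rank percentile over a sample of request latencies, plus the average throughput implied by a daily query count. The helper names and sample latencies are made up for illustration. Other percentile definitions (for example interpolated ones) give slightly different numbers, so the spec should say which one applies.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_latency_target(samples: Sequence[float], target_seconds: float, pct: int = 95) -> bool:
    """True if the chosen percentile is strictly below the target."""
    return percentile_nearest_rank(samples, pct) < target_seconds


def average_qps(queries_per_day: int) -> float:
    """Average queries per second if traffic were spread evenly over 24 hours."""
    return queries_per_day / 86_400


if __name__ == "__main__":
    # Hypothetical latencies in seconds for 20 requests.
    latencies = [0.8] * 18 + [2.5, 4.0]
    print(percentile_nearest_rank(latencies, 95))   # 2.5
    print(meets_latency_target(latencies, 3.0))     # True
    print(round(average_qps(5_000), 3))             # 0.058
```

Note that 5,000 queries/day averages well under one query per second, so the spec should also state the expected peak load if traffic is bursty.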

What you will do

Scope the project in week 1: written spec, eval plan, architecture doc, milestone schedule
Build the feature end-to-end, including ingestion, retrieval, generation, and basic UI
Write evals that your replacement (us, internally) can run and extend
Document every decision: why this embedding model, why this chunking strategy, why this prompt
Train 1 or 2 internal engineers on the system before exit
Be available for 2 weeks of bug fixes post-handoff

Must-haves

4+ years of professional software engineering, including significant production LLM experience
Proven track record of shipping contract or consulting engagements (references required)
Strong Python or TypeScript, comfort across the full stack
Experience with the specific stack we use (or fast ramp time on it)
Excellent written communication: you will be documenting everything for the internal team

Nice-to-haves

Previous engagements with companies at our stage
Experience training internal teams
A portfolio of public projects or case studies

What success looks like

Week 1: Written spec, eval plan, milestone schedule approved
Week 4: First demo to internal stakeholders
Week 8: Production rollout to a beta cohort with evals in place
Week 12: Full handoff complete; internal engineer can extend the system

Rate and terms

Day rate: $1,200 - $2,500 USD/day depending on seniority and scope
Or fixed-price per milestone (preferred for scoped projects)
Remote
US/EU/MENA time zones acceptable

Common JD mistakes

The 5 mistakes we see most often in AI engineer JDs in 2026:

Listing 14 frameworks "or equivalent": pick 2. Real engineers ignore JDs that read like a buzzword bingo card.
Demanding a PhD for a product engineering role: if you do not have a research team and you are not training models, do not require a PhD. You will lose every great applied engineer to a competitor with a sharper JD.
No mention of evals: the single fastest tell that the hiring company does not yet know what good looks like. Senior engineers read "no eval mentioned" as "they will ship vibes-based AI and blame me when it breaks."
Salary range hidden: in 2026, hiding the range filters out senior candidates first. They have options. Post the range.
Generic "experience with LLMs": name the model, the API, the framework, and the use case. Specificity attracts specificity.

Where to post your JD

Where you post the JD matters almost as much as what is in it. The candidate pools are different on each surface.

LinkedIn Jobs: broadest reach, best for senior + mid-level full-time roles. Sponsor the post for the first 7 days.
Wellfound (formerly AngelList Talent): best for startup and founding-engineer roles. Equity-aware audience.
Y Combinator's Work at a Startup: only if you are YC-backed, but the candidate quality is high
Hacker News "Who is hiring?" monthly thread: strongest signal-to-noise for senior engineers
AI-specific boards: aijobs.net, aijobslist.com, the AI Engineer World's Fair job board, and various Discord communities (LangChain, LlamaIndex, Anthropic Builder, OpenAI Devs)
Twitter/X: still the highest-bandwidth channel for senior AI engineers; post from founders' or engineering leaders' accounts, not the company account
Targeted outreach: for senior and founding roles, no job board beats engineering leaders DM'ing candidates whose GitHub or blog work matches the role

A team building production AI systems should expect senior hires in 2026 to come roughly 30% from inbound applications and 70% from outbound sourcing. The good engineers are not browsing job boards: they are shipping.

Need help hiring or just building it yourself?

If you read these templates and realized the role you are trying to hire for is actually 3 roles, or that you do not have anyone internally to evaluate AI engineer candidates, you have 2 options. Option 1 is to hire a fractional CTO or technical advisor to run the interview loop with you. Option 2 is to skip the hire entirely and bring in a team that already does this work.

That is what we do at AY Automate. We design and ship agentic systems, RAG pipelines, and production LLM features for companies that need the work done now and want to hire internal engineers later, when the patterns are settled. If that fits, see our AI agent development services or book a consultation and we will tell you honestly whether you should hire or partner.

If you do hire, our deep dives on how to hire AI engineers and the AI engineer skills that matter in 2026 cover the interview loop, technical screen, and offer stage in detail.

FAQ

What is the difference between an AI engineer and an ML engineer in 2026?

In 2026, AI engineer typically means someone who builds products on top of frontier model APIs: prompt engineering, RAG, agents, evals, product integration. ML engineer typically still means someone who trains, fine-tunes, or deploys custom models. The skill overlap is smaller than it sounds. Hire for the work you actually have.

Do I need a PhD to be an AI engineer in 2026?

No, and most production AI engineering roles do not require one. A PhD helps for research, foundation-model training, and some applied research roles. For product engineering (which is the majority of open roles), strong software engineering plus 12-18 months of hands-on LLM product experience matters more.

What programming language should I require for an AI engineer role?

Python is still the default for AI engineering work. TypeScript is increasingly common, especially for full-stack AI products and roles that touch the frontend. For most roles in 2026, "strong in Python or TypeScript" is a reasonable bar. Avoid demanding both at senior level: you narrow the funnel for no real gain.

What is a fair salary range for a senior AI engineer in 2026?

Senior AI engineers (5+ years total, 2+ years shipping LLM products) typically land between $180,000 and $250,000 base in the US, with frontier-lab compensation and FAANG bands going significantly higher. EU bands run 30-45% lower. MENA and LATAM remote roles for US companies typically sit at 50-70% of US base. Equity adds meaningfully at startups.

How do I screen for AI engineers without a take-home project?

The most effective screen in 2026 is a 90-minute live working session where the candidate debugs a real prompt or retrieval problem with you. You learn more in 90 minutes of pair-debugging than from any take-home. If you must use a take-home, make it a paid 4-hour exercise and review it as code, not as a research paper.

Should I require experience with a specific agent framework?

No. The agent framework market is still moving. Hire for the underlying skills (tool-calling, structured outputs, eval design, debugging non-deterministic systems) and let the candidate pick up your framework in the first 2 weeks. Hard requirements on a specific framework narrow the funnel sharply with little real upside.

How long does it take to hire a senior AI engineer in 2026?

Plan for 8-14 weeks end-to-end for a senior full-time hire, assuming a sharp JD, an outbound effort, and a 4-stage interview loop. Founding engineer searches often take longer (12-20 weeks) because fit matters more. Contractor engagements can start in 1-3 weeks.

Can I hire an AI engineer if I do not have one already on the team to interview them?

You can, but you should not interview them alone. Bring in a fractional CTO, a trusted advisor, or a partner agency to run the technical screen and architecture interview with you. The cost of a bad senior AI hire (6 months of wasted runway and a feature that never ships) is far higher than a few hours of advisory time. For a fully generated job description customized to your role (with salary range, interview questions, and evaluation rubrics), try the AI Job Description Generator.

About the Author

Taha

AI Engineer

Taha builds and ships custom AI agents and workflow automations for AY Automate clients across SaaS, finance, and professional services.