min read

ROADMAP to become the ultimate AI Engineer

The AI field is booming, but most roadmaps focus on theory over practice. This comprehensive guide provides a practical pathway for software engineers to become AI engineers in 2025 without needing deep ML expertise. Unlike traditional ML roles, AI engineering focuses on building functional AI systems with existing LLMs rather than training models from scratch. You'll learn core skills like prompt engineering, RAG systems, agentic workflows, and evaluation techniques, plus advanced topics like fine-tuning and self-hosting. The roadmap progresses from foundation prerequisites through specialization areas including knowledge management systems, multi-agent architectures, and monitoring techniques. Perfect for developers ready to build AI systems that solve real-world problems.

Motivation :

Since the rise of LLM models, the "AI" field has been booming. I see a lot of people getting into AI engineering. But no clear definition of what it is nor a clear roadmap on how to become one. Often, AI Engineering is more "Gen" AI Engineering.

I skimmed through a lot of roadmaps where the focus was more on the theoretical part - Which is totally fine for a researcher or a scientist. But for an AI Engineer, it's more about the practical part.

I read, watched and spent countless hours experimenting with a lot of stuff in the field.

And I wanted to share my point of view on how the "AI" field can be way more tangible and accessible than it is often made out to be for tech people and specifically software engineers.

Especially after reading an incredible article by Eugene Yan about how to Hire Hire ML/AI Engineers where he highlighted the main skills that are needed for the role.

AI Projects are no longer about building your own ML model from scratch and gathering data for first milestones. the focus has shifted to "Engineering" a viable product with current Generative Pre-trained transformer models and then push it boundaries by fine-tuning it and specialising it on a specific task.

No more 6 months iterations of data gathering, data curation, and model training just to get a model that doesn't even work.

But keep in mind that this domain is still evolving very quickly, so this roadmap might not be 100% accurate in 1 year from now on.

AI engineering in 2025 is very much linked to GenAI, LLMs and transformer models as it bridges the gap and gives some sort of "entry point" for people to get into the field.

And as said before, this is my own point of view on the subject, after a lot of research and exploration of the field, feel free to disagree and share your own.

AI Engineering Definition

let's start with the basics by defining "Engineering" : "the action of working artfully to bring something about"

That's what we're doing here, we're building something, we're creating something. AI engineering will be about creating AI systems/applications that will be fully functional.

Paradoxaly about what every one thinks, AI engineers proved to not be so focused on "AI" as per say.

It's more of an "AI system developper" if we can say so. Less research and more practical use of AI.

As said Andrej Karpathy, Sr. Director of AI at Tesla:

In numbers, there's probably going to be significantly more AI Engineers than there are ML engineers / LLM engineers. One can be quite successful in this role without evertraining anything.

So, as opposed to "pure" Data Scientists, LLM researchers and so on, AI engineers are more focused on the practical use of AI.

That is why there is no need to be an expert in Machine Learning, Deep Learning, Transformers, NLP, CV, etc. A Basic understanding of concepts in ML(basic understandings of algebra, stastics, cookie cutter ML & deep learning concepts such as neural network, basic view of transformers are more than sufficient in many cases).

Word of notice : I'm not saying that you shouldn't be interested in the latest research in ML, DL, LLM, NLP, CV, etc. Quite the opposite, as this field evolves very quickly. but the focus here is more on the practical part. diving deep into the theoretical part is not necessary for this role.

You might have guessed it by now, the roadmap will be more focused on the practical use of AI.

So you won't be seeing hands-on guides on how to build a new LLM architecture or a new ML model as mandatory in this path.

All that being said, this doesn't drop by a bit the complexity of the role.

For a starter because it requires very strong competences in various fields : dev, system design, infrastructure but also AT LEAST a basic understanding of concepts in ML( even if as said before, no need to be an expert).

I remember following my first hands-on building a transformer architecture back in 2k21 right before the public rise of GPT. But i never used it for anything except for my own "joy".

I'm not specialized in LLM research or DL research. But today entry-level fine-tuning of LLM models is utterly simple, thanks to the abundance of tools and libraries - Training & Inference Engines for Large Language Models like Hugging Face Transformers or Unsloth make it feel too easy.
But the most difficult part of it remains the usual suspect :DATA gathering, generating good datasets, and mostly EVALUATING it to avoid know issues such as catastrophic forgetting etc....
For practical use, i feel like it's more than enough.

To summerize : You won't revolutionize the world by inventing a new Deep learning architecture. But you can build something really useful.

Of course, each AI Engineer will have their own path, their own focus, their own expertise. Some will be way more focused on the model part, pushing boundaries of fine-tuning with reinforcement learning, DPO and so on.

Others will be way more focused on the system part, designing and building scalable and efficient systems, optimizing latency and throughput, building monitoring and observability, etc.

But at the end of the day, it's all about building AI systems that should bring REAL VALUE !

Challenges

One of the main challenges of AI engineering is the inherent uncertainty and opacity of ML.

To those that are too used to "Database" and "API" predictable systems, ML is a black box. (Even if some later research are trying to make it more transparent, it's still a work in progress)

And that's totally fine.

But it's important to understand that ML models are not deterministic.

And to adapt your systems to this, you need to understand that.

That's why EVALs is a CORE COMPETENCE for AI Engineers.

Because it's the only way to "understand" if your model is working as expected. And the quickest and most efficient way to do so is to have a good EVALS strategy and error analysis. as stated in the following very inspiring article : https://hamel.dev/blog/posts/field-guide/

It's a must read for all AI Engineers in my opinion.

Now ...

Let's dive into the roadmap. Shall we ?

‍

AI Engineering Roadmap

‍

🔍 You should Zoom IN 🔎 - Details and links are below the visual Roadmap - If you have any idea how to make this roadmap "interactive", please contact me :)

‍

‍

Ok now that we have a clear vision of the roadmap, let's dive into the details by startint with the AI Core skills.

Roadmap details

Core Skills

LLM Fundamentals :

Understanding the basics of LLMs and transformers.

Amazing reading material from Machine learning to LLMs evolution : https://medium.com/data-science-at-microsoft/how-large-language-models-work-91c362f5b78f
a great Video for visualizing the transformer architecture : https://www.youtube.com/watch?v=wjZofJX0v4M

--> Key take away :

Input and output must be a sequence of tokens. https://www.youtube.com/watch?v=hL4ZnAWSyuU
The transformer architecture is a very efficient way to process this sequence of tokens.
The main innovation of the transformer architecture is the self-attention mechanism.

Prompt engineering :

Prompt engineering is the art of crafting the input to a model to get the desired output.

All the following resources are great to get started and understand a lot of what you need to know about prompt engineering.

OPENAI Prompt Engineering Guide
Anthropic Prompt Engineering Guide
My own article on the subject that summarizes the most important techniques : Prompt Engineering
Another full comparison of the different techniques by major models providers: Prompt Engineering according to different LLM providers
Prompt Management :

Amazing new tool from Github : Models including Prompt Management

OPENAI API :

Understanding the basics of the OPENAI API and how to use it.

Why OPENAI API ?

It's the defacto standard for LLM API. First to market, first to dominate. Every Provider is trying to be compatible with it. You understand it? you can literally use any model from any provider with the OPENAI API only by changing the API key and Provider Base URL.
OpenAI API Documentation
OpenAI Cookbook

--> The goal is to explore Structured Outputs and Function Calling, start experimenting with it and prompting to get the hang of it.

Warning :

OPENAI just released their new responses API.

RAG Systems :

Understanding the basics of RAG systems and how to use them. https://www.youtube.com/watch?v=u47GtXwePms

--> understanding the main concepts behind RAG :

Text Embeddings -> give a vector representation of the text
Vector Databases -> store and retrieve embeddings efficiently
Chunking Strategies -> split data smartly for better retrieval quality. ( fixed size chunking, dynamic chunking, semantic chunking, etc.)
Ranking Strategies -> not all retrieved docs are equal. rank them to return only most relevant ones.
Hybrid Search -> combine different search strategies to get the best of both worlds.

That's all you need to know to get started with RAG. Later there will be a section on RAG Advanced Techniques.

What you should keep in mind :

Vector embeddings do not magically solve search that what you should explore in the RAG Advanced Techniques section.

Agentic Workflows :

Understanding the basics of agentic workflows and how to use them.

Some reading materials : ReAct Article Anthropic Building Effective Agents

Start exploring the different tools and libraries available to build agentic workflows. Maybe LangChain or PydanticIA are a good starting point. just to get a grasp of the concepts.

Evaluation & Monitoring :

Understanding the basics of evaluation and monitoring of LLM models.

For easy and quick monitoring of self-hosted models, you can use the following guide i made: Monitoring self-hosted models

Specialization Areas

Knowledge Management Systems :

Understanding the basics of knowledge management systems and how to use them.

Chunking & Embedding : Advanced chunking strategies, Fine-tuning embedding models, etc.
Knowledge Graphs - ( mixing Graphs and LLMs )
Multi-Modal Knowledge Bases : multimodal embeddings - PaliGemma --> ColPali --> ColQwen and the likes. This is a good starting point to understand it.
Retrieval Strategies ( Hybrid Search, Reranking, etc.) great implementation yt video
RAG Evaluation

Key take away :

Except from the Shiny new embedding models, there is very efficient state of the art keyword search (BM25) that is still very efficient and mixing it with semantic search is a good way to start any RAG system.

Agentic Systems Patterns :

Understanding the basics of agentic systems patterns and how to use them.

Single Agent Architecture
Multi-Agent Systems : collaborative system between specialized agents that bring each specific details that would have been missed by a single agent. i do like the following explanation from PydanticAI :

Multi-agent systems are a type of AI system that consists of multiple agents working together to achieve a common goal. Each agent is designed to perform a specific task, and the agents work together to solve complex problems.

Rag Agent : Agent dynamically retreiving data based on the tools it has access to ( mail, websearch, drive, etc.) and using it to enrich context and answer the query.
Tool Use & Function Calling : Routing & Orchestration
Memory & State Management : Amazing article on the subject by Mem0, about memory in agents
MCPs from Anthropic, ACPs, A2A from Googles ...

Monitoring & Evaluation Techniques :

Understanding the basics of monitoring and evaluation of LLM models.

Response Drift Detection
LLM as Judge Techniques https://eugeneyan.com/writing/llm-evaluators/
Hallucination Management OR Actually, the biggest challenge of them all : Latest OPENAI benchs on their latest models got o4-mini and o3 -as of 11/05/2025- OPENAI O3 and O4 Mini System Card

Latest models hallucinated more and sounded more confident when doing so !

Advanced Techniques

Fine-tuning :

Understanding the basics of fine-tuning of LLM models. the Unsloth notebooks and HuggingFace’s How to fine-tune open LLMs are a great starting point that can be used to an initial quick start.

PEFT - LoRA/QLoRA/Prompt Tuning
Direct Preference Optimization (DPO) / Proximal Policy Optimisation (PPO)
Reinforcement Learning with Human Feedback (RLHF)
Memory fine-tuning, Mixture of Experts (MoE) and Mixture of Agents (MoA)

Fine-tuning LLMs - Complete overview & Best Practices

The hole HuggingFace Transformers section is a great resource to get started with fine-tuning as it showcases the whole transformers pipeline and a practical guide on how to fine-tune a model. https://huggingface.co/docs/transformers/index

Unsloth Google Colabs can can also be a great starting point to get started with fine-tuning to experiment with the different techniques and see by yourself the main challenges one can face during the process. https://docs.unsloth.ai/get-started/unsloth-notebooks

Self Hosting :

Understanding the basics of self-hosting of LLM models.

Inference Engines for Large Language Models My own article on the subject : LLM Self-Hosted Deployment Roadmap

Compression - Quantization
Caching - Prompt, KV Caching
Inference Engines
Attention Optimization
Structured Outputs

Conclusion

I hope this roadmap will be useful to you and that it will help you in your journey to become an AI Engineer.

I will be updating this roadmap as I go along and as I learn new things.

I will start a new learning article series to go along with this roadmap. ( except for the core skills as they are quite obvious) A course series that will be fully focused on the practical use of AI. So don't hesitate to subscribe to the newsletter to be notified when it's ready. and like the article if you enjoyed it so that it can reach more people. ( And also to motivate me to keep going haha)

Feel free to share your own roadmap or feedback in the comments below.

‍

Want receive the best AI & DATA insights? Subscribe now!

•⁠ ⁠Latest new on data engineering
•⁠ ⁠How to design Production ready AI Systems
•⁠ ⁠Curated list of material to Become the ultimate AI Engineer

Latest Articles

View All Articles

Controlling AI Text Generation: Understanding Parameters That Shape Output

Control LLM probability distributions using temperature to modify softmax, top-k/top-p sampling methods, and frequency penalties for precise text generation.

AI Engineering

6

min read

VLM vs OCR Benchmark Part 2: Self-Hosted Quantized Models - The Reality Check

Building upon our [initial OCR vs VLM benchmarking study](https://www.dataunboxed.io/blog/ocr-vs-vlm-ocr-naive-benchmarking-accuracy-for-scanned-documents), this follow-up investigation tests the practical reality of self-hosted VLM deployment. While Part 1 established that Bigger commercial VLMs significantly outperform traditional OCR methods in accuracy, Part 2 addresses the critical question: Can quantized Qwen 2.5 VL models and tiny VLMs deliver production-ready OCR performance with reasonable hardware constraints?

AI Engineering

7

min read

Monitoring vLLM Inference Servers: A Quick and Easy Guide

Running vLLM in production without proper monitoring is like flying blind. You need visibility into request latency (P50, P95, P99), token throughput, GPU cache usage, and error rates to optimize performance and costs. This step-by-step guide walks you through building a complete observability stack using Prometheus and Grafana—the same tools used by companies like Uber, GitLab, and DigitalOcean. In 10 minutes, you'll have professional dashboards tracking 8 key metrics that matter for LLM inference performance. 💡 **Perfect for:** MLOps engineers, platform teams, and anyone running vLLM servers who wants production-ready monitoring without the complexity.

AI Engineering

4

min read

ROADMAP to become the ultimate AI Engineer

Erraji Badr

Table of Contents

Motivation :

AI Engineering Definition

Challenges

AI Engineering Roadmap

Roadmap details

Core Skills

LLM Fundamentals :

Prompt engineering :

OPENAI API :

RAG Systems :

Agentic Workflows :

Evaluation & Monitoring :

Specialization Areas

Knowledge Management Systems :

Agentic Systems Patterns :

Monitoring & Evaluation Techniques :

Advanced Techniques

Fine-tuning :

Self Hosting :

Conclusion

Want receive the best AI & DATA insights? Subscribe now!

Latest Articles

Controlling AI Text Generation: Understanding Parameters That Shape Output

VLM vs OCR Benchmark Part 2: Self-Hosted Quantized Models - The Reality Check

Monitoring vLLM Inference Servers: A Quick and Easy Guide