AI Is Replacing Data Engineers
(But Not the Way You Think)
A lot of people are asking this question right now:
Is AI going to replace data engineers?
I think the problem is that people look at this far too simply. It’s not a yes-or-no question, because what’s actually happening is a shift in what we do rather than the whole job disappearing.
Think in Waves, Not in One Big Change
What helps a lot here is to think about AI in waves.
We already had the first wave, which was basically coding with AI. Everybody started using ChatGPT, Copilot, Cursor, all these tools, and that already changed how we work quite a bit.
If you are mostly coding, especially things like:
SQL
Spark jobs
basic pipeline logic
then a lot of that is already being taken over by AI.
Not completely, but enough that it matters. And especially if that’s the only thing you are good at, then that’s where it becomes a problem.
Wave 1: Coding Is Getting Automated
We already see this in tools like Databricks with Genie, where you can just describe what you want and it creates the code for you.
That’s not some future thing, that’s already happening.
So if your job is mainly writing code, then yes, AI is going to replace a big part of that work. It’s just not that difficult anymore to generate code, especially for standard use cases.
Wave 2: Agents and Automation
Now we are moving into the next wave, which is agents and automation.
This is not just about writing code anymore, but about systems that:
collect information from different tools
check logs and metrics
connect APIs
help you solve problems
And that removes a lot of the work that engineers used to do manually.
Take debugging, for example. Instead of going through different tools, checking logs, checking metrics, and trying to understand what’s going on, you have systems that gather all that information for you automatically.
That’s a big shift.
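To make that a bit more concrete, here is a minimal sketch of the "gather everything in one place" idea. It is only an illustration under my own assumptions: the sample logs, the sample metrics, and names like `collect_error_logs`, `collect_metrics` and `gather_context` are made up for this example, and in a real setup each collector would call the API of whatever logging or monitoring tool you actually use.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical in-memory stand-ins for real tools (a log store and a
# metrics system). Real collectors would call those tools' own APIs.
SAMPLE_LOGS = [
    {"job": "orders_daily", "level": "INFO", "msg": "read 10000 rows"},
    {"job": "orders_daily", "level": "ERROR", "msg": "schema mismatch on column 'price'"},
    {"job": "users_daily", "level": "INFO", "msg": "done"},
]
SAMPLE_METRICS = {"orders_daily": {"rows_written": 0, "duration_s": 42}}


def collect_error_logs(job: str) -> list[str]:
    """Return the ERROR messages logged for one job."""
    return [e["msg"] for e in SAMPLE_LOGS if e["job"] == job and e["level"] == "ERROR"]


def collect_metrics(job: str) -> dict:
    """Return the metrics recorded for one job (empty if none exist)."""
    return SAMPLE_METRICS.get(job, {})


@dataclass
class IncidentContext:
    """Everything the collectors found for one job, keyed by source name."""
    job: str
    findings: dict = field(default_factory=dict)

    def summary(self) -> str:
        lines = [f"Context for job '{self.job}':"]
        for source, data in self.findings.items():
            lines.append(f"- {source}: {data}")
        return "\n".join(lines)


def gather_context(job: str, collectors: dict[str, Callable[[str], object]]) -> IncidentContext:
    """Run every collector for the job and put the results in one place."""
    ctx = IncidentContext(job=job)
    for name, collect in collectors.items():
        try:
            ctx.findings[name] = collect(job)
        except Exception as exc:
            # One broken source should not hide what the others found.
            ctx.findings[name] = f"collector failed: {exc}"
    return ctx


COLLECTORS = {"error_logs": collect_error_logs, "metrics": collect_metrics}

if __name__ == "__main__":
    print(gather_context("orders_daily", COLLECTORS).summary())
```

Notice what this does and what it doesn't do: it collects the information for you, but deciding what to do about the schema mismatch is still up to you.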
So Is Data Engineering Going Away?
No.
But it is changing.
And I think that’s the important part that people need to understand.
We are moving away from:
doing simple tasks
writing code all day
building small pieces
towards:
building end-to-end pipelines
thinking about architecture
solving actual problems
The Important Part: AI Is Not Good at Decisions
One thing that I’ve seen myself is that AI is not good at making decisions.
If you take the same requirements and give them to two separate AI chat sessions, you will often get two completely different solutions. That is really eye-opening.
So if you think that AI will just take over and make the right decisions, that’s not how it works.
It can help you explore options, but you still need to decide what actually makes sense.
This Is the Worst AI Will Ever Be
Another thing that people underestimate is that what we have right now is the worst version of AI we will ever see.
It’s only going to improve from here.
So if you already feel like parts of your work are being replaced or automated, then you should expect that this will continue and not go away.
Final Thought
AI is not replacing data engineers completely.
But it is replacing the parts of the job that are easy to automate.
So if you stay in that space where you are only doing coding or small tasks, then yes, you will have a problem.
But if you move towards:
understanding systems
making decisions
building things end-to-end
then you are moving in the right direction.
***
Hello Andreas,
It would be a very good thing for AI to replace people who call themselves "Data Engineers". No man who calls himself a "Data Engineer" has earned the right to use the title "Engineer". "Engineer" has meaning. It is a title to be earned, not a title to be adopted by men who are not willing to earn it.
In terms of ETL and data warehousing?
I already invented the future of ETL in 2002 when I invented "typeless, codeless, mappingless" ETL. That enabled us to build much more reliable ETL. In 2009 we moved from using C++ as the software to run ETL to using generated SQL. The rate was 1000 fields mapped in a 220-hour work month. That rate stood from 1996 to 2017, more than 20 years. In 2017 I upped it to 6-8K fields mapped per 220-hour work month. And two years ago I upped that again to 12-15K fields mapped per work month. So AIs for ETL development don't help people using my software. We are already much faster than any AI can generate code and my code is more reliable anyway.
Once I got to those productivity figures I changed focus a little to the time and effort of running multiple ETL systems and looked into the ability to have "mega models" implemented even on cheap databases like SQL Server SE. Turned out it's possible. By mega models I mean one data warehouse with one data model housing many customers' data from a common large operational system like an ERP or telco billing system.
So with these two inventions data warehousing development costs will be reduced by 5-6X.
This is all free and published. You and anyone else can get my data models and software for free via the link in my bio. For those men who are the smartest men in our sector? They are very likely to make a lot of money if they build a mega model data warehouse for a large operational system they are very familiar with. That's what my colleagues and I are doing, we shall see how it works out.
AI only does things that are so easy that they were very easily automated a long time ago by those of us who know what we are doing.