Data Science Is Dead Again

I get where the sentiment comes from. The job looks very different today than it did five or ten years ago. Back then, a lot of the focus was on building and training models from scratch. You collected data, labeled it, trained a model, evaluated it, and repeated the process until it worked.

Today, with the rise of LLMs, many people believe this equation has changed. Most of us aren't training large models ourselves anymore. Instead, we're increasingly spending our time designing systems that make pre-trained models useful.

My view is that agentic AI is showing that the "Sexiest Job of the 21st Century" is back again.

The Agentic Loop Feels Familiar

One thing that struck me while building AI applications with my teams is how familiar the process feels.

A traditional machine learning workflow usually looks something like this:

Model → Data → Train → Evaluate → Iterate

Building an agent-based system isn't that different:

Foundation Model → Context → Tools → Evaluate → Iterate

The mechanics are different, but the mindset is the same. Instead of improving behavior through gradient descent, you're improving behavior through prompts, retrieval, tool design, and evaluation. You're still shaping a system, just with different levers.
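To make the comparison concrete, here is a minimal sketch of the agentic loop in Python. Everything in it is an assumption for illustration: `fake_model` stands in for a real foundation model call, and `lookup_post` is a hypothetical tool. The point is only the shape of the loop: the model looks at the context, either asks for a tool or answers, and tool results are fed back in.

```python
from typing import Callable

# Hypothetical tool registry; the tool name and behaviour are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_post": lambda query: f"Post about {query}",
}


def fake_model(context: list[str]) -> dict:
    """Stand-in for a foundation model call (an assumption, not a real API).

    It requests a tool until a tool result is present in the context,
    then produces a final answer based on the latest context line.
    """
    if not any(line.startswith("tool:") for line in context):
        return {"type": "tool_call", "tool": "lookup_post", "args": "agents"}
    return {"type": "answer", "text": f"Based on {context[-1]}"}


def run_agent(question: str, max_steps: int = 5) -> str:
    """Run the loop: model decides, tools execute, results extend the context."""
    context = [f"user: {question}"]
    for _ in range(max_steps):
        decision = fake_model(context)
        if decision["type"] == "answer":
            return decision["text"]
        # Execute the requested tool and feed the result back to the model.
        result = TOOLS[decision["tool"]](decision["args"])
        context.append(f"tool: {result}")
    # A step limit keeps a confused agent from looping forever.
    return "Stopped: step limit reached"
```

With these stubs, `run_agent("What have you written about agents?")` takes one tool step and then returns `"Based on tool: Post about agents"`. The levers mentioned above map onto the parts of this sketch: prompts and retrieval shape `context`, tool design shapes `TOOLS`, and evaluation checks what `run_agent` returns.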

LangChain's research on evaluating deep agents captures this well — the loop is recognizable, even if the tools have changed.

Terms such as "no-code agents," "plug-and-play" agents, and "drag-and-drop" agents have been around for years, but they mostly come from people who aren't building these systems themselves.

The "Just Add an LLM" Myth

The systems people actually enjoy using — whether that's ChatGPT, Claude, Copilot, or anything else — depend on a lot of engineering, integration, and evaluation around the model itself:

Choosing the right model
Designing reliable tools
Providing useful context
Evaluating outputs
Monitoring behavior
Iterating constantly

Without those pieces, agents quickly become inconsistent, expensive, or simply wrong.

How expensive agents can become at scale was demonstrated vividly at the end of May 2026, when a company racked up a $500M Claude AI bill in a single month after putting no usage limits on employee licences.
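A usage limit does not have to be complicated. The following is a rough sketch of a per-user monthly budget guard, under stated assumptions: `UsageBudget` and `BudgetExceeded` are hypothetical names, the prices passed in are made-up placeholders rather than any provider's real rates, and a real system would persist the totals instead of keeping them in memory.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a call would push a user over the monthly limit."""


@dataclass
class UsageBudget:
    """Hypothetical in-memory spend tracker, one running total per user."""

    monthly_limit_usd: float
    spent_usd: dict[str, float] = field(default_factory=dict)

    def charge(
        self,
        user: str,
        input_tokens: int,
        output_tokens: int,
        usd_per_1k_input: float,
        usd_per_1k_output: float,
    ) -> float:
        """Record the cost of one model call and return the user's new total."""
        cost = (
            (input_tokens / 1000) * usd_per_1k_input
            + (output_tokens / 1000) * usd_per_1k_output
        )
        total = self.spent_usd.get(user, 0.0) + cost
        if total > self.monthly_limit_usd:
            # Refuse the call before any spend is recorded.
            raise BudgetExceeded(f"{user} would reach ${total:.2f}")
        self.spent_usd[user] = total
        return total
```

The design choice worth noting is that the check happens before the spend is recorded, so a rejected call leaves the running total unchanged.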

The model matters, but the surrounding system matters much more when it comes to delivering real business outcomes and value.

Building a Small Agent System

My portfolio site is intentionally simple, but embedding a chatbot was a must-have. It was also a way to demonstrate hands-on capabilities. Even though it looks simple on the surface, underneath it's a small agent system. Building the full loop was important to me.

Please stress-test Simon Says.

"Simon Says" has access to information across multiple parts of the site, including blog posts, the About page, and legal content. That means responses can be grounded in actual data rather than relying purely on model knowledge or being squeezed into a massive system prompt.

It can also perform actions. For example, it can:

Subscribe or unsubscribe users by calling an API
Trigger transactional emails
Provide my CV
Apply company-specific styling to that CV (I added a web search tool for that)

Those aren't just generated responses. They're actions that change state and interact with other systems when necessary. That's what makes the chatbot feel more like an agent than a traditional assistant.
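As a sketch of what "actions that change state" can look like in code, here is a tiny tool dispatcher. It is not the actual implementation behind the site: the `SUBSCRIBERS` set is an in-memory stand-in for the real subscription API, and the function names and the shape of the `tool_call` dictionary are assumptions made for illustration.

```python
from typing import Callable

# In-memory stand-in for the real subscription API (an assumption).
SUBSCRIBERS: set[str] = set()


def subscribe(email: str) -> str:
    """Add an address to the list; returns a message the model can relay."""
    if "@" not in email:
        return "error: invalid email"
    SUBSCRIBERS.add(email.lower())
    return f"subscribed {email.lower()}"


def unsubscribe(email: str) -> str:
    """Remove an address; removing an unknown address is not an error."""
    SUBSCRIBERS.discard(email.lower())
    return f"unsubscribed {email.lower()}"


# Hypothetical registry mapping tool names to Python callables.
TOOLS: dict[str, Callable[..., str]] = {
    "subscribe": subscribe,
    "unsubscribe": unsubscribe,
}


def dispatch(tool_call: dict) -> str:
    """Execute a model-requested tool call and return its result as text."""
    name = tool_call.get("name")
    if name not in TOOLS:
        # Unknown tools are reported back instead of raising,
        # so the model can recover in the next step.
        return f"error: unknown tool {name!r}"
    return TOOLS[name](**tool_call.get("arguments", {}))
```

For example, `dispatch({"name": "subscribe", "arguments": {"email": "Reader@Example.com"}})` adds `reader@example.com` to `SUBSCRIBERS`. The state change happens outside the model, which is exactly why these calls are worth logging and testing.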

When I look at my portfolio project, I don't see a website with a chatbot attached to it. I see a small but complete AI system:

A data layer with Postgres
An application layer built with Next.js
A model layer powered by an LLM
An agent layer with tool calling and context retrieval
Observability through LangSmith from LangChain
An evaluation testing harness
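The context retrieval part of the agent layer, which grounds answers in blog posts, the About page, and legal content, can be sketched in a few lines. This is a deliberately naive illustration, not how the site does it: the `DOCUMENTS` entries are placeholder text, and the keyword-overlap scoring is a stand-in for whatever search the real system uses against Postgres.

```python
import re

# Placeholder site content; the ids and text are hypothetical.
DOCUMENTS: dict[str, str] = {
    "blog/agents": "Agents need tools, context, and evaluation to be reliable.",
    "about": "This page describes the author's background and experience.",
    "legal/privacy": "We store your email address only for the newsletter.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question: str, top_k: int = 1) -> list[str]:
    """Return the ids of the documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc_id: len(question_words & tokenize(DOCUMENTS[doc_id])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str) -> str:
    """Put only the retrieved passages into the prompt, not the whole site."""
    passages = "\n".join(
        f"[{doc_id}] {DOCUMENTS[doc_id]}" for doc_id in retrieve(question)
    )
    return f"Answer using only these passages:\n{passages}\n\nQuestion: {question}"
```

Asking "How do you store my email address?" retrieves `legal/privacy` here, so the model sees one relevant passage instead of a system prompt that contains every page.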

Observability Matters More Than You Think

One lesson that surprised me was how important observability becomes once agents start using tools. It's easy to think a system is working because the final response looks correct.

But what actually happened? Google Research's work on scaling agent systems asks exactly this question — when and why do agent systems work, and how do you know?

Which tools were called?
How long did they take?
What context was provided?
And, more importantly — when I change the underlying model, add new context, or introduce new tools, does the agent still perform previous tasks as well as before?

To answer those questions, I integrated LangSmith and trace every interaction. From time to time I add new labels, and whenever I make a larger update I run automated evaluations and experiments.
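To show what a trace needs to capture, here is a small standard-library sketch. It does not use LangSmith's API and is not a substitute for it. The `traced` decorator, the `TRACE` list, and the `get_cv` tool are all hypothetical; the sketch only records the answers to the questions above: which tool was called, with what inputs, how long it took, and whether it failed.

```python
import functools
import time
from typing import Any, Callable

# Hypothetical in-memory trace store; a real setup would ship these records
# to an observability backend instead.
TRACE: list[dict[str, Any]] = []


def traced(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record name, inputs, duration, and status of every call to `func`."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise
        finally:
            # Runs on both success and failure, so failed calls are traced too.
            TRACE.append(
                {
                    "tool": func.__name__,
                    "args": args,
                    "kwargs": kwargs,
                    "duration_ms": (time.perf_counter() - start) * 1000,
                    "status": status,
                }
            )

    return wrapper


@traced
def get_cv(company: str) -> str:
    """Illustrative tool: pretend to return a CV styled for a company."""
    return f"CV styled for {company}"
```

After a call such as `get_cv("Acme")`, `TRACE` holds one record with the tool name, its arguments, a duration in milliseconds, and the status `"ok"`. A final response that looks correct says nothing about these details, which is why they have to be recorded separately.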

Figure: LangSmith Experiments dashboard for Simon Says, showing 13 evaluation runs that track continuous improvement before launch.

To make improvements with confidence, I score conversations on:

Hallucination
Safety
Helpfulness
Instruction following

In many ways, it feels very similar to classic ML evaluation. The difference is that we're evaluating behavior instead of model predictions.
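The parallel with classic ML evaluation shows up in how a regression check can be written. The sketch below rests on several assumptions: each conversation has already been scored between 0 and 1 on the four dimensions above, higher is always better (so a hallucination score of 1.0 means no hallucination was found), and the function names and the 0.05 tolerance are arbitrary choices for illustration.

```python
from statistics import mean

# The four scoring dimensions from the list above, as dictionary keys.
DIMENSIONS = ("hallucination", "safety", "helpfulness", "instruction_following")


def summarize(scored_conversations: list[dict[str, float]]) -> dict[str, float]:
    """Average each dimension over all scored conversations in a run."""
    return {
        dimension: mean(scores[dimension] for scores in scored_conversations)
        for dimension in DIMENSIONS
    }


def regressions(
    baseline: list[dict[str, float]],
    candidate: list[dict[str, float]],
    tolerance: float = 0.05,
) -> list[str]:
    """List the dimensions where the candidate run is worse than the baseline.

    A dimension counts as regressed when its average drops by more than
    `tolerance`, which leaves room for small run-to-run noise.
    """
    baseline_avg = summarize(baseline)
    candidate_avg = summarize(candidate)
    return [
        dimension
        for dimension in DIMENSIONS
        if candidate_avg[dimension] < baseline_avg[dimension] - tolerance
    ]
```

Running `regressions` on the scores from before and after a model swap, a new tool, or a prompt change gives a short list of dimensions to investigate, much like comparing validation metrics between two model checkpoints.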

So Is Data Science Dead?

I don't think so.

Is it reasonable to expect data scientists to have a stronger understanding of system design and software engineering? I think so.

Even if I'm wrong about no-code, plug-and-play, or drag-and-drop agentic AI, my experience has been that working with data scientists is a genuine pleasure. You're curious, smart, and open-minded, and you think deeply. Who isn't looking for that kind of profile?

And to all the software engineers who found themselves in the same boat after the "software is dead" moment in January: keep learning, stay motivated, and keep building. Headlines fade and often age poorly. The builders who keep learning usually age very well.

Let's continue the conversation

Whether you have thoughts on this article, are an AI enthusiast, or just want to grab a coffee to exchange ideas and network — I'd love to connect.

Sources & References

1. Data Scientist: The Sexiest Job of the 21st Century. Harvard Business Review, 2012. hbr.org
The original HBR piece that coined the phrase, and whose prediction is looking more relevant than ever in the age of agentic AI.

2. Evaluating Deep Agents: Our Learnings. LangChain, 2025. langchain.com
LangChain's practical findings on evaluating agent behavior, covering how the classic ML evaluation loop maps onto agentic systems.

3. Company Racks Up $500M Claude AI Bill in One Month. The Economic Times, 2026. economictimes.indiatimes.com
A real-world case study in what happens when agents are deployed without usage limits, governance, or cost controls.

4. Towards a Science of Scaling Agent Systems: When and Why Agent Systems Work. Google Research, 2025. research.google
Google Research's investigation into the conditions under which agent systems reliably deliver value, and why observability is critical.