Opportunities and Limitations of Computational History

By now, we have all heard of computational humanities. We knew of digital humanities and digital history, but now, all of a sudden, it seems that computational history is a thing. I will not get into defining how these four things differ (we all know the field has a habit of both obsessing over and avoiding self-definitions), but here’s a short history of computational history approaches in case you are new to the field or have come across the term through the recent wave of AI tools.

This blogpost may or may not be an ad for my recent article on the topic, which you may check out in case you want to know more: Lang, Sarah A. “(Doing) Computational History: The Role of Data Work in Computational Approaches.” Histories 6, no. 2 (2026): 26. https://doi.org/10.3390/histories6020026

Where it began and where we are now

Computational history didn’t appear overnight. Its roots go back to the 1960s and 1970s, when historians began experimenting with quantitative methods in fields like historical statistics and economic history. These early efforts were often quite narrow in scope, focused on specific, well-defined problems. Even at the time, critics pointed out that such approaches seemed limited in what kinds of historical questions they could answer.

Things started to shift with the rise of digital humanities. As digital tools became more common, historians began exploring methods like distant reading—analysing large collections of texts computationally rather than through close reading—and later distant viewing for visual materials. These approaches expanded what historians could study, especially at scale, but they still required a certain level of technical expertise and were not always widely adopted.

The late 2010s marked a turning point. Machine learning tools became more stable, accessible, and easier to use. While some scholars had already been working with these methods, this period saw them gain much more visibility within the discipline. The launch of the Computational Humanities Research Conference in 2020 is one sign of this shift, reflecting a growing sense that computational work needed its own space—even if others argued it had already been well represented within digital humanities.

Then came the real acceleration. By 2023, large language models like ChatGPT brought computational methods into everyday academic awareness. What had once been the domain of specialists quickly became accessible to a much broader group of historians. Researchers who had previously relied on more traditional, qualitative approaches began experimenting with AI tools, not just for writing and publication but also for data analysis.

So what is computational history, actually?

If the short history above makes computational history sound interesting to you, the next obvious question is: what does it actually look like in practice? (Note that this may actually be my hot take and not an opinion shared by all who consider themselves practitioners of this art.)

At the most basic level, computational history tends to happen in very specific contexts. It works best when there is already a large body of digitised sources and when there is a team with the technical skills to do something meaningful with them. That immediately makes it selective. You need the data, the infrastructure, and the expertise—and all three are unevenly distributed.

Because of that, a lot of computational history right now is exploratory. Researchers are testing what these methods can and cannot do with historical material. And one thing becomes clear quite quickly: these tools don’t magically “discover” things, just as a chemical test doesn’t “test for everything”. They only detect what they are designed to look for. If something isn’t captured in the research design, it simply won’t show up.

This is where some of the hype runs into reality. Yes, computational methods can reveal patterns at a scale that would be impossible to see otherwise. But that doesn’t mean they produce sweeping, general insights. In fact, many of the most successful projects focus on very narrow, carefully defined questions and take years of work to get there.

Think of projects that try to reconstruct lost books from fragmentary evidence, or that model the transmission of scientific texts across centuries. These are impressive, but they depend on a lot of groundwork: identifying suitable sources, building or adapting tools, training models, and often working across disciplines. They are not quick wins, and they are not methods you can easily replicate with your own data.

This also means that computational history is, somewhat paradoxically, both labour-intensive and narrow in outcome. There’s a common misconception that “using AI” (or other computational tasks that people nowadays like to lump together) somehow reduces effort while increasing scale. In practice, it often shifts the labour instead of removing it. Time goes into cleaning data, designing models, checking outputs, and interpreting results. The work doesn’t disappear, it just changes shape.

Another important point is that computational methods require unusually precise questions. You need to know, in advance, what you are trying to measure and whether it can even be measured with the available data. (Of course, traditional historians are now going to pretend that they, too, always already know exactly what they’re looking for from the very first step, but I’m calling bullshit, respectfully.) This can lead to analyses that feel more binary or confirmatory than the open-ended interpretation many historians are used to. Unsurprisingly, this is sometimes where friction with more traditional approaches appears.

At the same time, it’s worth noting that the results of computational analysis are not, by themselves, historical interpretations. Maybe this was already obvious to you, but I’m just putting it out there in case you hadn’t yet made up your mind. The results of algorithms are outputs (patterns, classifications, probabilities). Turning those into historical arguments still requires careful, qualitative work. If the model is poorly aligned with the historical question, you may end up learning more about your dataset than about the past.

There are also broader concerns that come with using these methods. Historical datasets are rarely complete or representative—they are shaped by what has survived, what has been preserved, and what has been digitised. Computational approaches inherit these biases. Questions about how corpora are constructed, documented, and audited are therefore central.
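To make the auditing point a bit more concrete, here is a minimal sketch of what one very small corpus check could look like. It assumes you have a list of publication years for the items in your corpus and a second list for everything known from a catalogue or bibliography. Both lists below are made up for illustration, and the function name coverage_by_decade is my own invention, not an established tool.

```python
from collections import Counter


def decade(year: int) -> int:
    """Map a year to the first year of its decade, e.g. 1623 -> 1620."""
    return year - year % 10


def coverage_by_decade(corpus_years, catalogue_years):
    """Share of catalogued items per decade that made it into the corpus."""
    in_corpus = Counter(decade(y) for y in corpus_years)
    known = Counter(decade(y) for y in catalogue_years)
    # Only decades present in the catalogue are reported, so known[d] is never 0.
    return {d: in_corpus.get(d, 0) / known[d] for d in sorted(known)}


# Hypothetical data: years of all items known from a catalogue ...
catalogue_years = [1601, 1605, 1612, 1618, 1623, 1627, 1629, 1634]
# ... and years of the subset that has actually been digitised.
corpus_years = [1601, 1605, 1623]

report = coverage_by_decade(corpus_years, catalogue_years)
for d, share in report.items():
    print(f"{d}s: {share:.0%} of known items are in the corpus")
```

A table like this does not fix any bias, of course. It only makes visible which decades your corpus can and cannot speak for, which is the kind of documentation that should travel with the dataset.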

And then there is the issue of scale versus effort. While some argue that AI automates research, others point out that it often creates new kinds of work. Even though not all computational history is AI, the broader idea still applies to most of it. Historians may find themselves checking model outputs, maintaining workflows, or learning technical skills that are fields in their own right. Doing this well (especially outside of well-funded projects) can be demanding.

All of this leads to a slightly sobering conclusion: computational history is not a shortcut. It is a different way of doing historical research, with its own constraints and trade-offs. It tends to produce precise, targeted insights rather than broad generalisations, and it requires significant investment to do so.

That said, this experimental phase is still valuable. Carefully chosen case studies are helping us understand where these methods work, where they don’t, and what they might become. They are also forcing the field to ask bigger questions: what role should computational methods play in historical research? How do they relate to digital history more broadly? And what counts as meaningful historical knowledge in this context?

One area that must become increasingly important (at least according to me) is corpus criticism, i.e. thinking critically about the datasets we build and use. That includes standardising workflows to facilitate such critical analysis, so that it can become a habit and an expected part of every single computational study. (And don’t pretend everybody’s already doing that anyway. I hear that each time I put a critical DH paper through peer review, and please, let’s stop lying to ourselves about this: we don’t yet do it.) If computational history depends on data, then understanding that data, with its limits and biases as well as its construction history, may turn out to be just as important as the methods we apply to it.

So long.
Best,
the Ninja


LaTeX Ninja'ing and the Digital Humanities

The verb "to ninja" means "to act or move like a ninja, particularly with regard to a combination of speed, power, and stealth." LaTeX adventures, demystifying digital tools for Humanists, one tutorial at a time.
