Teaching Materials: Intro to basic NLP in CLTK for Classicists

Dear people, today I want to point you to a new GitHub repository where I have started to share some of my teaching materials. In this case, it’s a basic intro to NLP, programming, Python, NLTK and CLTK for Classicists who have never heard about any of this before. To get directly to the repo, click here.

My teaching materials usually include slides (at the moment using the beautiful metropolis template) and a cheatsheet based on my colourful cheatsheet template. Some of the cheatsheets probably still use the version from before I shared it as a template, which is a bit more complicated than the simplified version in the template. These repos are all linked to my Overleaf account, which is why the cheatsheet and the slides from the same class live in separate repos. To spare you the inconvenience, I created a landing page called “Teaching Materials” where you can get an overview.

This is what the cheatsheet looks like.

So far, I have only added the slides and cheatsheet for one workshop I gave in November 2019. It was a preparatory workshop for the ‘real’ workshop, “Using the Classical Language Toolkit”, given by Eleftheria Chatziargyriou and Clément Besnier on 2019/11/06; my workshop took place the evening before. Their materials can be found here (repo). The repo includes Jupyter Notebooks and slides. It’s great material, so be sure to check it out!

This is what the slides look like. They also include some installation info.

The cheatsheet and slides contain a lot of information, not all of which was meant to be covered in the 2-3 hours we had. The extra detail is there so the materials can serve as a reference afterwards, when people want to continue working on this on their own.

By the way, some of the slides include the content from the popular blog post Algorithms, Variables, Debugging? Intro to Programming Concepts.

The slides for the great ‘main’ workshop ‘Using the Classical Language Toolkit’, given by Eleftheria Chatziargyriou & Clément Besnier on 2019/11/06, can be found here (repo).

My original workshop was called ‘Digitale Sprachverarbeitung für historische Disziplinen’ (‘Digital language processing for historical disciplines’) and was held at Karl-Franzens-Universität Graz in November 2019. Somewhat annoyingly, the materials are part German, part English, so they are probably better suited to German speakers. But large parts might still be interesting for those who don’t speak German. I’ll translate the whole thing if there is demand.

I hope some of you can use these materials. They are suitable for self-directed learning, and teachers are free to reuse them (see the details regarding reuse on GitHub). The LaTeX source for all the materials is available. I might add the code snippets as separate .py files if desired (please let me know). But then again, when teaching, I encourage you to have people type them out themselves for maximum learning effect!


the Ninja

PS: In the long run, I am also planning to publish the same sort of materials from the classes I taught at the University of Passau (Lehrstuhl für Digital Humanities) in 2018-2020. There are two classes: one intro to XML and annotation in general, and one intro to quantitative text analysis (mostly about using tools and understanding what’s going on in QTA, but also some R at the end). Not all of these materials are ready for publication yet, so please be patient with me 😉


