Machine Learning is one of those hot topics at the moment. It’s even starting to become a really hot topic in the Humanities and, of course, also in the DH. But Humanities and Machine Learning are not the most obvious combination for many reasons. Tutorials on how to run machine learning algorithms on your data are starting to pop up in large quantities, even for the DH. But I find it problematic that they often just use those methods, just show you those few lines of code to type in and that’s it. Frameworks have made sure that ML algorithms are easy to use. They actually have a super-low entry level programming-wise thanks to all those libraries. But the actual thing about ML is that you need to understand it or it’s good for nothing. (Ok, I admit there are some uses which are pretty straightforward and don’t need to be fully understood by users, such as Deep Learning powered
beginner friendly
A category to post “beginner level” posts to make them easy to find. The topics get specified by other categories and tags.
Amateurishly beautifying event photographs
Do people think that you, as a DH person, are also responsible for your project’s outreach activities yet nobody considers
An amateurish but functional tutorial to make (logo) vector graphics from photos
Do you need to make a vector logo for your project but have no idea how? Do you want to
Easy and quick strategies to #scicomm your DH project
Your digital project is great, I’m sure of that – but does it even exist if nobody knows about it? Science communication is the way out of this philosophical dilemma. In this short post, I want to share a list of quick-and-easy-to-implement ideas for adding some science communication to your projects. These are just ideas, not tutorials on how to do it. However, I am open to any tutorial requests you might have on the topics involved. As for the Twitter bot, there is a short post available already. So let’s get to it! Quick and easy strategies to #scicomm your DH project Create a better / thematic / faceted search interface. Maybe people aren’t using your data because the interface is not intuitive and they can’t find things or don’t know what to look for and where to look. This is the basic building block on which all the following things build.
Create your Tweepy/AWS-powered Twitter bot in a day
This post wants to convince you to try creating a Twitter bot using the Python library Tweepy and AWS Lambda because it’s easy and fun. Of course, you can use any other utilities, but Tweepy and AWS Lambda are the ones I tried. This is not a full tutorial but I can make one if anyone is interested. Inspired by the #100DaysofDH challenge In this post, I will just give you some basic Twitter knowledge, links for what you need to know to get it done and a link to the GitHub repository of my #100DaysofDH challenge, for which I implemented such a bot. If you want more guidance, please let me know. Also, read the post on the challenge because there I noted down some restrictions which, as I realized along the way, the Twitter automation guidelines impose on bots. In my example, I think I’m in fact doing one or two things which you actually shouldn’t do (I think bots shouldn’t like
Join the #100DaysofDH Challenge!
I have been following the #100DaysofCode community for a while now and thought that it was sad that there didn’t seem to be a connection with the DH community. 100 Days of Code is such a great project: it motivates those willing to learn and is also a great way to foster a community. So I thought, why not start #100DaysofDH? And I did. Looking forward to your contributions! The main activity around this will be happening on Twitter (account is @100DaysofDH, hashtag #100DaysofDH) but there is also a minimalist github.io page: https://100daysofdh.github.io/ On GitHub, you can also find the current state of the Tweepy and AWS-powered bot. The story behind the creation of this challenge Before getting into the details of how the challenge works, let me share some thoughts that I had in mind for the adaptation of the 100 days challenge to the DH (skip this part if you just want the rules which can
Looking at data with the eyes of a Humanist: How to apply digital skills to your Humanities research questions
In my recent post on how to get started doing DH, I basically said that the essence of being DH is looking at data with the eyes of a Humanist and gave some tips on how to get started in just 10 days. However, it’s not that easy. Learning digital skills and the problem of skill transfer A problem I see a lot is that Humanities people fail to transfer their newly acquired practical DH skills to their own research questions. They don’t know how to look at their own material as data. They don’t know how to leverage digital methods to help answer their own research questions. But if these skills aren’t compatible with their own research, they’ll never deepen their knowledge enough to actually profit from them. If you don’t use them, they are forgotten quickly. So how do you make this transfer which I think is, so far, being neglected as a skill which has to
Formulating Research Questions For Using DH Methods
In the feedback forms for the DH classes I have taught over the last years, I got one piece of feedback I didn’t expect: People were extremely grateful I had practiced with them how to formulate valid research questions which, apparently, no one had ever (really) done with them before. I found that quite astonishing because the DH are all about methods and methods are like specialized tools. You need to know what you can use them for. So here’s the crash course. The Hammer and the Nail I want to start off with an analogy. A hammer is a specialized but not an extremely specialized tool. You can use it for a range of tasks; however, not all tasks are going to work equally well. Some might work but would actually require a more specialized tool if you had one. You can really use the hammer on about anything and almost always, something is going to happen. For example, you
Teaching Materials: Intro to basic NLP in CLTK for Classicists
Dear people, today I wanted to point you to a new github repository where I started to share some of
An easy intro to 3D models from Structure from Motion (SFM, photogrammetry)
Using photogrammetry to obtain 3D models has become one of those ‘hot topics’ lately. For that reason, I wanted to
Understanding Scalability and Relative Values
What is the difference between 12pt and “format as heading”? Between 50px and 0.5\textwidth? Most of us know that we should always prefer relative to absolute values. But many who are new to web design or LaTeX don’t really get why. All of us who typeset papers and conference proceedings know that years of using MS Word do not necessarily teach you that difference either. This short post will try to remedy this in a quick and painless way 😉 In a WYSIWYG text editor: Font size 12pt or “Format as Heading” In the case of a text editor, it is advisable to use the format templates rather than manually changing headings and so on, for a simple reason: The information is stored as markup and if we tell the program what we want formatted as a heading, the machine gets semantic information about the text. Most people will understand that something is meant to be a heading when the font size is manually
Automating XML annotation: Get more done using RegEx Search&Replace and xsl:analyze-string
Annotation is a fundamental part of the DH. But often, we DH people don’t actually do the annotation. We do
How to historical text recognition: A Transkribus Quickstart Guide
Today I wanted to share a little quickstart tutorial for the Transkribus Software. Its purpose is Handwritten Text Recognition (HTR)
Algorithms, Variables, Debugging? Intro to Programming Concepts
Since I am about to prepare a workshop on natural language processing and a pre-workshop workshop where I need to give my (non-digital) Classicist friends a quick crash course in some basics of programming, let me share with you a list of programming concepts I compiled. I would be happy to receive your suggestions and comments regarding mistakes. I will probably publish this together with some key concepts of quantitative text analysis (blogpost to come) on a cheatsheet or as slides for you later 😉 Intro to key concepts of programming This list of concepts is not super-structured and meant to work as a ‘reference tool’ as well as a text to be read, so I tried to give it a more or less useful ‘chronology’, meaning that later parts kind of build on earlier ones. I start off with what a computer program or algorithm actually is and how we translate between source code (the code we write) and the code which gets fed
Academic Posters – How to design: My favourite tips
This is a post with some simple tips to keep in mind when making academic posters. I have gotten into the habit of not posting very regularly over my fellowship this summer, but I will get back into the rhythm of approximately one post a week 😉 So I decided to give you a quick post with some tips on academic posters here. There will be follow-ups on how to make them: one using Google Slides (if you’re really stressed and can’t learn LaTeX first) and another where I explain how to create a poster in LaTeX using an Overleaf template for complete beginners. So, without further ado. How to get a nice poster without a lot of skill and with little effort: Have a color scheme. Probably best start with your project colours (from the logo, project website) if you have some. This ensures you have some sort of brand identity and your project is recognized more easily.
Cheat Sheets and Study Summaries
This is quite a long post about cheatsheets and also about effective studying. When you need a cheatsheet, chances are
Typesetting Code in LaTeX
Since I recently pulled a few all-nighters to prepare code slides to teach my students R and they were less
Two basic image manipulation life-savers
In this post, I wanted to share a few tricks for simple image manipulation (with the goal of making pictures
Floating minipages and other wizardry
Inspired by a current issue from my friend the LaTeX Noob, I wanted to give a short explanation on how you can combine floats (i.e. figures) and minipages. Why should you care? Well, you will if you need tikzpictures or images placed beside each other or beside text. So most people will probably need this at some point 😉 A great resource is the WikiBook, as always. If you want the lengthy account – that’s the way to go. For everybody else, an explanation of my own. Floats and non-floating boxes What are floats? Some fundamental explanations first: A figure is a float. A minipage is not a float but a box which sits at its fixed place. These are two fundamentally different things. When you combine them in a bad way, LaTeX might get fed up with this. So when planning your minipaging or floating situation, ask yourself which effects are really important to you and which aren’t. Do I even need
Simple XML to LaTeX Transformation Tutorial
Today, I wanted to share this super simple XML to LaTeX tutorial. Using XSLT, you are going to transform XML data to LaTeX output which you can then go on to compile into your desired output PDF. There will be no fancy stuff whatsoever in this post, just the basics and what to keep in mind with these transformations. It is the quick intro to XML to LaTeX I did with my students a while ago which was done one day after they had their first contact with XSLT, so it should really be beginner-friendly. I labeled it “Advanced LaTeX” anyway because I think starting to automate things is always a step in the right direction 😉 Configuring the transformation scenario in Oxygen I am going to assume you use Oxygen now because that’s what a lot of people in the DH do and this post is directed towards my friends in the DH. Especially those who think print editions