Two basic image manipulation life-savers

In this post, I want to share a few tricks for simple image manipulation (with the goal of making pictures look better) that I’ve picked up over the years. While tip one will always be to learn to take better pictures, maybe these little tricks will help you when you are asked to put the latest conference pictures online and realize they’re not good and really need a makeover first. So one topic will be how to beautify event photos. The other is how to easily create a nice little vector graphic from a photo of some historical object or person.

First things first

This might seem obvious, but make sure you only use images you actually have the right to use. Check explicitly, don’t just assume! Get written permission if it’s not 100% clear what your rights are. If you are supposed to take pictures at an event: most organizers now mention in the invitation that pictures will be taken and that, by attending, you agree to have your picture taken. Check what the practices are at your institution.

Secondly, I am not a graphic designer or anything and fully self-taught in this realm. These tips work well for me and might work for you too. But just to make sure: I’m not actually competent to teach anyone on this topic 😉

Vector graphic from photo

I want to give you a few tips on how to create a nice vector graphic from a historical photo. Vector graphics make for nice logos, for example. Adding one can be all the spicing up a simple template or poster needs. It’s an eye catcher. A good one takes a while to make (a few hours), but this can be done on a train ride or during a meeting 😉 I’ve never regretted making one, and people loved them, even though I’m not a professional logo / graphic designer and my results are far from perfect.

Image: A TikZ poster for the exhibit “Lyrik des Widerstands – Richard Zach zum Gedenken”, PH Steiermark / Graz / Austria. It illustrates what I mean by “a nice vector graphic can go a long way”. At least I hope it does.

How to go about it

Scan at very high quality if the image is from a book. Or take a high-quality photo, reducing background elements if possible (this will save lots of time later). Get rid of noise (Gaussian blur, 10px or so; try it out).

Then vectorize it, for example using Vectorizer.io: they will let you do quite enough if you sign up for a free account and disable your adblocker. You can’t do very many images at the same time, but since you’ll have other image manipulation tasks to do in between, this will be just fine. Or, if you’re pressed, use a trial account. Or use Vectorization.org. I also like the Imaengine app, but you can only save the results when you pay a few bucks for the full version (worth it though). In this case, save in high quality. But maybe only get back to Imaengine for final touches after you’ve cleaned out the background elements you don’t need.

Play around with reducing the number of colours or adding some blur. Reducing the number of colours is the main thing you want to do. Maybe start with the smallest possible number of colours and go back up until you think you’ve got the “feel” of the original image back, but without the details. (For example, with just two colours, you’re likely to have lost the original “feel”.)
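If you want to get a feeling for what “reducing the number of colours” does before uploading anything, here is a minimal sketch in plain Python. It is not what Vectorizer.io or Imaengine do internally (they most likely use smarter colour clustering); it just snaps every colour channel to a few evenly spaced levels. The function names and the sample pixels are made up for illustration.

```python
def posterize_channel(value, levels):
    """Snap one 0-255 channel value to one of `levels` evenly spaced steps."""
    if levels < 2:
        raise ValueError("need at least 2 levels per channel")
    step = 255 / (levels - 1)
    return round(round(value / step) * step)


def posterize(pixels, levels):
    """Reduce the colours in a list of (r, g, b) tuples."""
    return [tuple(posterize_channel(c, levels) for c in px) for px in pixels]


def count_colours(pixels):
    """Number of distinct colours left in the pixel list."""
    return len(set(pixels))


# Hypothetical sample: a soft gradient, standing in for a scanned photo.
sample = [(200 + i, 150 + i // 2, 120) for i in range(0, 50, 5)]

# Start with very few levels and go back up, as described above.
for levels in (2, 3, 4, 6):
    print(levels, "levels per channel ->", count_colours(posterize(sample, levels)), "colours")
```

With very few levels the gradient collapses into a handful of flat areas, which is the “without the details” look you are after; raise the number until the image feels right again.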

Then the work starts: Manually get rid of patchy sections as well as all the background in GIMP. It will look a lot better and way more professional if you just take the time to do this (even if it’s not perfectly done). This is where you can invest the 1-2 hours I mentioned this would take. As I said, I am by no means an expert but I feel that most image manipulation things just take a lot of time and effort – they aren’t necessarily difficult to do. So even a beginner can achieve an ok result if they’re just willing to put in the time.

Edit: For getting rid of the background, you could also use services like Remove.bg to speed things up, as the #TeXLaTeX community on Twitter kindly pointed out. It works super well with many types of photos. However, with my early modern print thingy, it didn’t know what the foreground was supposed to be: “Please select an image with a somewhat clear distinction between foreground and background. For instance, try a photo of a person, product, animal, car or another object.”

Image: This image is a direct result of a vectorization tool and now you probably understand why we need the manual labour. The simplicity of the end result is the result of lots of work. But tweaking the vector output can help, too. This was made from a scan of an engraving depicting the early modern iatrochemist Michael Maier.

Manually drawing on the computer in an image manipulation program is not everybody’s thing and might take some getting used to. Many people will want to use a mouse for this. Don’t get discouraged by initial failures. It is still a question of practice and routine. You will, however, get instantly better and more failsafe results if your base image is fairly big and you zoom in very closely. Start with baby steps. It will get easier once the patchiest areas are gone. Once you’re done, run it through the vectorizer once again to get rid of some of the small patchiness you might have overlooked or created while manually drawing around.

Delete the background. This can be done in multiple ways. If it’s all in the same colour, there is the option ‘Color to Alpha’ in GIMP, which turns a chosen colour transparent. Otherwise, you might want to use the wand tool (fuzzy select) and press delete or the like. Save (= export) the result. Voilà, your vector graphic is done.
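To show the idea behind turning a uniform background transparent, here is a small sketch in plain Python. Note that this is a simple hard threshold and an assumption on my part, not GIMP’s actual ‘Color to Alpha’ algorithm, which is more refined and also produces partially transparent pixels. The function names and the tolerance value are made up.

```python
def colour_distance(a, b):
    """Largest per-channel difference between two (r, g, b) colours."""
    return max(abs(x - y) for x, y in zip(a, b))


def background_to_transparent(pixels, background, tolerance=10):
    """Return (r, g, b, a) pixels; anything close to `background` gets alpha 0."""
    result = []
    for px in pixels:
        alpha = 0 if colour_distance(px, background) <= tolerance else 255
        result.append((*px, alpha))
    return result


# Hypothetical pixels: pure white, almost white (scan noise), dark ink.
pixels = [(255, 255, 255), (250, 252, 255), (30, 30, 30)]
print(background_to_transparent(pixels, background=(255, 255, 255)))
```

The tolerance is the reason cleaning up patchy areas first pays off: the more uniform the background, the smaller the tolerance you need, and the less of your actual motif gets deleted by accident.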

Image: After you’re finally done with the manual editing and marvel at your finished vector graphic, you’ll be so relieved you’ll start making pop art out of it. Believe me, I know what I’m talking about 😉 Oh, those train rides…

Some more tips

  1. To take into account: If the logo is supposed to be displayed very small, be sure to reduce a lot of detail. Always opt for 30% less detail than you think you need.
  2. Maybe make multiple logos or draw stuff on paper. Multiple quick sketches always yield better results than trying to get it perfect the first time. Fail fast and learn from your failures.
  3. Another win when using vectors is that you can output them as SVG or enlarge their size without things getting pixely. This might also be a reason you want to create a vector graphic in the first place.

Summary of a few GIMP shortcuts (learn some to significantly speed things up)

  • O: pick a colour (click on it)
  • P: switch to pencil
  • CTRL + C: copy
  • CTRL + V: paste
  • CTRL + SHIFT + V: paste as new image
  • CTRL + SHIFT + E: export as (this is the ‘saving option’ in GIMP which will save the actual product, not the project itself; ‘save’ will save the project)
  • CTRL + Z: undo

Beautifying event photographs

Use the GIMP Retinex filter

From my research into the retinex filter, you should only change the bottom-most setting to adapt the filter. The stronger you make it, the less natural the result: more noise will be created and the colours will be enhanced more strongly.

Image: With retinex, adjust only the bottom-most option “dynamic”.

Usually, the point will come quickly where you crank up the colours and sharpness too much so the result looks weird and gets a “steely” feel (try it out to see what I mean).

Image: Overdone retinex. Colours ruined. “Steely” feel.

For more or less failsafe use, if you want results that still look more or less natural: apply the filter, then copy the result (CTRL+C). Press CTRL+Z to undo the effect in the base image. Then paste (CTRL+V) the filtered version on top of the image, which is now back in its initial state. Open the layers dialog, right-click the pasted selection and turn it into a new layer. Then scale down the filtered layer’s opacity to 50% (or as you like). The original picture will be improved without looking all unnatural. And you won’t have to spend a lot of time looking for the perfect retinex setting. Just take one that looks ok. With the overlay, details will get lost anyway. And by the way, retinex will always create noise. The default settings reduce this to a minimum, but it’s still pretty noticeable. So maybe use a relatively fine-grained Gaussian blur afterwards to make up for it.
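What the 50% overlay does, numerically, is just a weighted average of the original and the filtered pixel. Here is a minimal sketch in plain Python, assuming a normal-mode layer and 8-bit RGB values; the function name and the sample values are made up, and the “filtered” pixel simply stands in for whatever retinex produced.

```python
def blend(original, filtered, opacity=0.5):
    """Lay `filtered` over `original` in normal mode at the given opacity (0.0-1.0)."""
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0 and 1")
    return [
        tuple(round(opacity * f + (1 - opacity) * o) for o, f in zip(po, pf))
        for po, pf in zip(original, filtered)
    ]


# Hypothetical values: one dull original pixel and its over-enhanced retinex version.
original = [(100, 100, 100)]
filtered = [(200, 50, 100)]
print(blend(original, filtered, opacity=0.5))
```

This also explains why the exact retinex setting matters less with this approach: whatever the filter overdid is halved again by the original underneath.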

And maybe combine the effect with the “three layer trick” explained below. Play around with it. Find something that works for you. I’m no expert either but I found these tricks quite useful.

The three layer trick to bring out colours and contrast

Copy your initial picture into three layers. Desaturate the first (topmost) one to grayscale and invert it. Then apply a light Gaussian blur to it, set its opacity to 30-35% and ‘merge down’. Then (you’re ‘on’ the middle layer now) set this layer’s mode from ‘Normal’ to ‘Grain merge’ (‘Faser mischen’ in German), then merge down again. This will bring out colours and contrasts better. However, if the colours are not nice to begin with, they will just be intensified as if you’d cranked up the saturation too much.
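For the curious, here is a rough per-pixel sketch of why this trick increases contrast, in plain Python. It rests on several assumptions: desaturation is done as a simple channel average (GIMP offers several modes), the Gaussian blur is skipped entirely, the 30-35% refers to GIMP’s opacity slider, and grain merge uses the legacy formula from the GIMP documentation (result = lower layer + upper layer - 128). The function names are made up.

```python
def clamp(value):
    """Round and limit a channel value to the 0-255 range."""
    return max(0, min(255, round(value)))


def three_layer_pixel(pixel, top_opacity=0.33):
    """Apply the three layer trick to a single (r, g, b) pixel, without the blur."""
    # Top layer: desaturate (simple average, an assumption) and invert.
    grey = sum(pixel) / 3
    inverted = 255 - grey
    # Merge the top layer down onto the middle copy at 30-35% opacity.
    middle = [top_opacity * inverted + (1 - top_opacity) * c for c in pixel]
    # Middle layer in grain merge mode over the bottom copy (legacy formula).
    return tuple(clamp(b + m - 128) for b, m in zip(pixel, middle))


# Hypothetical pixels: dark, mid and bright grey, plus one colour.
for px in [(50, 50, 50), (128, 128, 128), (200, 200, 200), (200, 100, 50)]:
    print(px, "->", three_layer_pixel(px))
```

Bright values get pushed up, dark values get pushed down, and mid grey stays roughly where it is. For a coloured pixel, the channels drift apart, which is the “colours are intensified” effect described above.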

If this is no good, maybe try the retinex filter to bring out the colours and sharpen instead. I found this three layer trick on the internet somewhere in the past and sadly can’t find it anymore. If you happen to know the source, feel free to point it out to me 😉

Other things you could do to enhance your images would be to tamper with the white balance or colour curves. But I decided not to add other stuff here to keep it simple and actionable.

Image: Left: initial image. Not bad, but I’m not happy because these wooden rooms always give nasty light. Right: image after having used both techniques explained above. First the three layer trick, then retinex. The result is not so different that it looks “obviously manipulated”, but it does kind of look better and more professional, I feel.

Very basic ‘mini’ retouch for dummies

Select the eyes and increase colour and contrast. Select the skin regions where there are no lines (like the nose, the crook of the arm, etc.) and blur them, either manually or with the blur ‘pencil’. This will smooth out imperfections. Don’t overdo it though or it will be very noticeable. Don’t smooth over lines or the result will look comical at best.

The best image manipulation is taking better pictures in the first place

But the most important thing is: Try to get hold of a good camera and take pictures in good light. I personally find it easy to take good pictures when the light is good, but when you need flash, human beings always look bad. At least if you don’t have a clue what you’re doing. So I try to avoid situations where flash is necessary when taking pictures of people. Maybe other people are better at this. But if people are not wearing tons of foundation and powder to mattify their skin, flash is not advisable and you might get better people pictures without it.

Also, ever wondered why models spend so much time on makeup before photoshoots? That’s because image manipulation can only do so much. The better the pictures you start out with, the better the result.

Conclusion

So, these were some simple tips. I’m not really good at this myself but I’ve been doing it for a few years. These little tips have helped me in situations where, for example, I was asked to beautify some snapshots of the last conference. My experience is that your suffering will be minimized when you have good light and can bring a good camera. This is the most important part. “Repairing” pictures taken under bad circumstances and making them look good is a really hard task. But making mediocre shots a little better is something you can learn. But then again, if you only need small pictures for social media, phone apps offer filters to improve your pictures with a few clicks. So maybe it’s not even relevant for you to “go traditional” and use GIMP at all.

Hope this helps someone,

best,

the Ninja

Buy me coffee!

If my content has helped you, donate 3€ to buy me coffee. Thanks a lot, I appreciate it!


Don’t call it a database!

When I started this blog, one of my promises and goals, apart from LaTeX-Ninja’ing, was to demystify the Digital Humanities for non-DH people. I have been watching for a long time, and I think one of the big mysteries of the DH still persists in Normal Humanists’ heads and thus really needs demystifying. You might have guessed it: I want to explain why DH people will cringe if you call digital resources ‘databases’ which are not, technically speaking, databases.

Is it ok to call any digital resource / corpus a ‘database’?

We know, that’s what you tend to call a digital corpus. But in most cases it’s not correct, it’s a pars pro toto. A database is just one possible technical implementation, but the term is used more broadly for any ‘digital base of data’. By laypeople, at least. A pars pro toto is a stylistic device, and that is a Humanities thing, right? You do get stylistic devices. So you can also understand why you shouldn’t use imprecise terminology. You don’t like it when people misuse your field’s terminology either and probably make quite a religion out of it.

If you want to work with the DH, you need to understand their terminology and respect it by using it correctly. Even though it might initially feel unintuitive to you. Believe me, you will adapt quickly if you give it a try.

I’ve caught myself so many times now educating my Normal Humanist friends about digital resources and why my (DH) colleagues won’t take you, as a Humanist, very seriously if the word “database” slips out of your mouth at inappropriate moments. It’s kind of like the Tourette’s of NH-moving-in-a-DH-world. Which probably is not a politically correct analogy. No offense to people who actually have Tourette’s, I don’t want to devalue or disrespect your struggle in any way! It’s just analogous in the sense of spluttering out inappropriate words at inappropriate moments.

Everybody has their cringe-prone terminology item, right?

To be honest, I am not sure how strict the English speaking DH world is in this, but I can guarantee you that this distinction is very valid concerning the German language use of “Datenbank”. When a quick web search yielded this result, I wasn’t sure anymore if it’s actually a thing in English too. Digital Humanities at King’s College define a database as follows:

Database is the term we use for any large collection of online material.

( https://libguides.kcl.ac.uk/dighum/dighumdbase )

This, however, is exactly the way I don’t suggest you use this term. I am aware that this is the association linked with it in many people’s minds. But hey, you are Humanists. You do have a sense for the intricacies of terminologies, right? I, for one, really hate it when people use the wrong grammatical gender for the term corpus (in German: neuter (!) for a collection of documents, so always neuter, unless you mean an actual body like that of a musical instrument). You probably have a thing like that, too, where you get furious at laypeople saying it wrong, don’t you? Well, the DH equivalent of this thing is the misuse of the term database.

Using terminology correctly is a sign of respect towards the DH community. It shows you respect us as researchers and don’t think of us as the ‘idiot who does the tech stuff’

Well, to be exact, it’s not even a misuse. You sure can use the term database in this way and it’s not, strictly speaking, completely incorrect. It’s just misleading and, most importantly for the subject of this post, it is a strong pointer to the fact that you are not very tech-savvy and either unaware or else disrespectful of digital terminology. It will be seen as either a lack of respect and esteem towards the digital field or, I don’t know which is worse, a lack of competence in general. You would deem it impolite, too, and probably take it as a sign of general incompetence or lack of intellectual ability/openness if a DH person came along and persistently misused your terminology, right?

Edit/addition 2019/06/04: I think this issue is less about whether it is technically or theoretically correct to use a term like this or like that. It’s a question of being ‘politically correct’ and of not hurting people’s feelings. To show the point on an extreme example (which is maybe exaggerated applied to databases but illustrates the point): you could theoretically argue that the term ‘nigger’ has been used historically to mean ‘person of color’, ergo it would – terminologically speaking – not be incorrect to use it, right? Wrong. In this case, it’s obvious (to everyone, hopefully) that it would be extremely rude and not ok to call a person of color a ‘nigger’ nowadays. Nobody would be confused if people’s reaction to this was to feel insulted because the above explanation does not take connotations into account.

Like you could say that before the advent of the DH, it maybe wasn’t a big deal to throw around the term ‘database’ to mean any digital ‘base of data’, but since the DH is starting to be established as a discipline and not only as a tool like it might have been in the beginning, things have changed. DH people sometimes feel like their competencies are not taken seriously because their part of the job is seen as the ‘handiwork’ whereas the non-DH input data is the actual research. I think that this latent inferiority complex, or maybe rather some sort of struggle for recognition, is the reason non-precise use of DH-related terminology is sometimes taken bitterly.

So ultimately, it’s not about being right or wrong. It’s about being respectful and not hurting other people’s feelings. Also, non-DH people insisting on using the term in a non-DH way while simultaneously wanting to participate in a DH project will cause a clash of terminology. It might be ok for a non-DH person to use the term like this, but DH people are kind of bound to use the term in a strictly technical way or else they might be seen as incompetent in their own field. In this case, I think the non-DH person should give in because, even though they will not be judged by their use of DH-specific terminology, a DH person will. You don’t want your imprecise language to reflect negatively on your cooperation partners.

Since the initial publication of the post, I received the feedback that some people with technical backgrounds are quite open to non-technical uses of the term ‘database’. But from my own experience of the DH overall, I feel this is not necessarily representative. And only because people will accept that it is theoretically valid to use the term to one’s own judgement, that doesn’t mean people will condone it in practice.

If you want collaboration, start actually collaborating by learning about DH terminology

Especially if you are trying to get a DH collaboration or a “label DH” project, I suggest you prune your language a little bit here. After all, DH people usually have lots of people queueing to get a project with them. They will tend to take the ones interesting for them (in terms of subject) and/or those where the applicants seem nice. And it is deemed basic politeness to research your collaboration partners’ field so you don’t draw a complete blank. You want your partners to be understanding and reasonably well-educated on the baseline of your field too, right? And you probably catch yourself sometimes, secretly saying to yourself in indignation or disbelief ‘How can any academic not know that?! This is completely basic!’

Well, it happens to DH people, too. Often concerning so-called ‘databases’ which are not, in fact, databases. If you persistently use the term wrongly, it’s seen as a lack of trying or plain incompetence. Don’t be rude. Now that you are aware of the problem, you have no excuse to continue saying it wrong.

How to know if it’s ok to call it a database?

Two questions to ask to get a feel for whether what you mean might actually be a database, followed by two rules of thumb:

  1. Would it make sense to represent this data in an (Excel) spreadsheet? Then it is likely someone chose a database format to represent it digitally.
  2. Are there any other fitting means to represent it? Just because it outwardly looks like it stores spreadsheet-formatted data, this doesn’t mean that’s the way the data is stored “behind the scenes”.
  3. In case of any doubt whatsoever, just refrain from calling it a database. Just say ‘digital resource’, ‘digital corpus’ or basically anything else which seems half appropriate. Anything else is way less stigmatized and cringe-worthy than the misuse of ‘database’.
  4. So to be on the safe side: Just don’t call it a database unless you’re sure it is one (technically speaking). When not 100% sure, just term it a digital resource or digital data collection. I know you just mean a ‘digital base of data’. But please respect that the wide category ‘digital base of data’ doesn’t mean the same thing as the narrow term ‘database’ in a technical field.

Just a little thought – hope this helps!

Best,

the Ninja

PS: Can someone tell me whether you think it is valid to inform people they should not use the term database or would it be ok with you when they use the term in a non-technical way? I only know that people around me react quite aggressively when you do and will think you’re a technical layperson, thus not trust you much once you did. You’d basically be ‘disqualified’ after that unless you really have some very interesting other assets or extremely good grant acquisition records or splendid networking connection value.


How to improve at programming when your current position doesn’t require it & Online Learning Resources

Have you ever felt like you would like to get better at programming, maybe even get a position involving more programming some day but the fact that you currently don’t really need it at your current position seems to hold you back? This post is for you.

Daily practice is key for improvement

You need daily practice if you actually want to improve. You already need daily practice just to keep your skills sharp during a time where you don’t need to use them. Also, if you don’t even have programming skills yet, you probably are too tired after work to sit down and work on a private programming project.

But you should. Programming is a skill which takes a long time to learn. That is, if you want to reach a decent skill level. This means that you have to start regular practice long before you actually need that skill or need to apply for a job, if possible.

The common advice: Find a starting out project to program in your free time

When I first read up on this, the most common advice was to find a cool project and try to program it in your free time. But I found that when you don’t have an idea for such a project which makes complete sense to you, you’re not going to go through with it. Without an actual, urgent need you probably won’t sit down for the number of hours necessary to actually make progress. At least the great majority of people wouldn’t.

Then I tried setting myself mini-challenges. This was a good idea. But there wasn’t a lot of guidance (obviously). This, in turn, was discouraging and wasted a lot of time. Of course, it was time well spent learning something. But since I am really interested in effective learning, I felt that I was wasting time. Effective learning is always better than just playing around. You need a curriculum. At least some. That’s when I found out that there are a lot of competitive programming sites which offer useful short exercises, so you can do one per day.

Before that, I had tried all sorts of “Learn programming” sites like SoloLearn, freeCodeCamp, the Enki app, just to name a few of my favourites. But really, I didn’t like them all that much. They went through the syntax of the language and that was it. Like learning the vocabulary of a living language but never using it.

Review of some of the materials out there

Here, I want to share links to all sorts of learn-to-code sites out there. Of course, the list is not exhaustive, but I think it does cover a good few of the most important ones. And maybe it has a bit of a different perspective from most other “here are 30 sites to learn coding for free” blog posts out there.

A constraint for me was that the site had to be completely free of charge. So, for example, DataCamp is a no-go. Although, I have still linked to some pay-sites in case you are interested.

But especially as there are so many sites where you can get a similar service for free, I don’t see why I should pay for one of them. If it were really brilliant, I would probably pay for it, in theory. HackerRank, for example, I like so much at the moment that I would probably buy access for a one-time payment. But most of those apps and sites ask for monthly payments of up to, or even starting at, 10€. That is a crazy amount of money when you can already get books starting from 10€. I personally would always put more trust in a book in terms of quality and the hope that there might be a logical progression to the teaching, and thus rather go for that. Also, I just don’t do monthly subscriptions. They eat so much of your money and mostly, when you sum it all up, are not really worth it.

Also, the number of sites (plus all the apps!) out there has become so huge that it’s really a full-time job to check them all. That time might be better invested in just picking one and learning to code. These other sites that I found initially focused more on interview prep for experienced programmers or programming contests, but many of them have actually developed training tracks. They also tend to offer a broader range of subjects than many other very popular sites which focus mainly on web development (“coding”).

Enki app

I liked the Enki app’s daily workouts, but the learning progression was not consistent. They give you a random tutorial every day. This was not very effective and I was quickly through with all of their material. Ergo, the workouts started repeating very quickly. For more, you would now have to pay monthly for access, which I am not willing to do since, learning-wise, it is not that well done. It was really good for a free site (for a while I used it a lot), but not good enough to be paying for it. Sadly, I have experienced this with most of the sites I have tried (and I have tried quite a few).

App-wise, you will just get the Enki review here. I tried probably all the most important apps out there. But mostly, in their way of just explaining the syntax of a language, the progress was slow and it was ineffective for me who already knew most of the syntax. You usually can’t skip much or speed up if you’re getting bored. Still you never get any actual programming done which I found both useless and frustrating. Having tried all the apps and “learn the syntax of XY” (disguised as “Learn programming language XY” which is not the same thing, in my opinion), I now found competitive programming sites to be more what I had been looking for. 

Some of them don’t have all the languages you might want, especially the smaller ones and the web dev focused ones.

General-purpose learning sites (video-based, MOOCs, etc.)

Many popular sites (Coursera, Udemy, Udacity, Khan Academy, MOOCs like edX, OpenMIT) are video-based and I don’t personally like that. I prefer interactive sites where you can type your code directly. But well, now you have the links to those resources as well.

Online books or blog tutorials

Even though I have a tutorial blog myself, I personally would not try to learn a programming language from a blog. Sometimes you find useful posts for a specific problem you need to solve, that’s mainly what they are good for. I just think that the available interactive things are cooler for actually getting programming experience as a novice programmer.

Youtube tutorials or channels

Some posts on free “learn coding” resources recommend YouTube tutorials or channels. This is, I think, a valid point if there really is an excellent tutorial video out there for exactly what you want. So, if what you want isn’t uncommon, there is likely to be one. Sometimes a 5-minute video can save you an hour of reading a tutorial. But I find that hardly any channels offer a sensible curriculum for a motivated learner, so I’m not sure how much you would get out of it in the long run. That’s why I won’t recommend any here.

Sites just/mostly teaching the syntax of languages

Many of them are gamified as well

Sites I really like for practice / or of the type I like (challenge-based)

“Learn the language” tracks are available on, for example, HackerRank and HackerEarth (the latter teaches algorithms).

With HackerRank, for example, I really like their test cases. They don’t just ask you to write a solution on your own (sites like SoloLearn or Codecademy hardly do that in their regular curriculum), they also provide test cases where, for example, overflow is bound to occur. So with every single test you are reminded to think of that. This is a good reinforcement method in teaching, I think 😉 Also, have I mentioned that supposedly, the only thing which really works wonders in teaching is getting tested? So forget about what learning type you are (visual, audio-visual-bla, etc.) and become a tester. You can skip some of the test cases of course, but the frequent reminder still works wonders. Also, you can learn from other users’ good (high-ranked) solutions. Especially in algorithms, it’s really worth checking how more experienced programmers did it. However, these competitive programming platforms do kind of encourage bad programming style (and dirty hacks to improve speed), so be sure to take care of that yourselves. Be persistent and disciplined when it comes to using good style! And remember to still work on bigger projects every once in a while, since the daily practice from these sites is just one single function without context.
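To give an idea of what such an overflow-provoking test case looks like, here is a small sketch. Since Python integers never overflow, the 32-bit behaviour of languages like C or Java is simulated by hand; the function names and test inputs are made up and not taken from any actual HackerRank challenge.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def wrap_int32(value):
    """Simulate what a 32-bit signed integer would hold after wrapping around."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


def naive_sum_int32(numbers):
    """Sum as a fixed-width 32-bit accumulator would do it."""
    total = 0
    for n in numbers:
        total = wrap_int32(total + n)
    return total


def safe_sum(numbers):
    """Sum with arbitrary-precision integers (what a 64-bit type would fix in C or Java)."""
    return sum(numbers)


# A harmless test case and one built to push the accumulator over the edge.
small_case = [1, 2, 3]
overflow_case = [INT32_MAX, 1]

for case in (small_case, overflow_case):
    print(case, "naive:", naive_sum_int32(case), "safe:", safe_sum(case))
```

Both versions agree on the harmless input and disagree on the second one, which is exactly the kind of hidden test case that makes you remember to pick a big enough number type next time.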

Learn Capture the Flag / Hacking

Also, if you’re interested, these are some sites where you can learn CTF (capture the flag), something like competitive ethical hacking.

Other / Tutorial-based

More tutorial-based sites which are useful but not for the kind of rapid learning I recommend:

Resources

Also, check these other posts on free coding resources:


Should I start doing DH?

My non-DH colleagues and friends ask me more and more often if I think they should start doing Digital Humanities and if yes, where to start? Since this seems to be an interesting topic for many, I thought I’d quickly elaborate on it.

Disclaimer: Even though I’ll put on my “career advisor” hat right now, I want to remind you that I am in no way qualified to advise you on your career. So if it all goes downhill from now, I am not the one to blame. All opinions are my own and should be treated as such.

So, now that we’ve got the legal part over with (essentially: don’t sue me), let’s get to my opinion on the topic. I think there is no question that you should start doing DH. In my prognosis, almost all Humanities research is going to be at least part DH in the near future. If you ask me. And you did.

So, the point is: if everybody is going to do DH anyway, so should you. You don’t want to fall behind. This is good – and bad. If everybody is going to do DH in the future, there is no way around the extra work for you to learn it. But then again, hey, you’re already at the right site for getting awesome DH help – so I’m not too worried for you.

Doing DH is going to be normal soon enough

In fact, I think we’re almost at the point where it already is. So for one thing, if you don’t learn the basics and do at least some DH, you will be sub-standard and below average. If you learn and do some DH, however, it won’t be a door opener either, because everybody is starting to do DH now, ergo it won’t be special anymore in 5 years.

So yes, you should do DH, if only so you don’t fall behind. But also don’t expect it to get you very far. Learning what you can now is merely the entrance barrier. If you want your DH affiliation to count in the years to come, this will only be possible if your approach is super innovative or you’re really good at technical stuff. And technical probably way beyond what is common now (XSLT and stuff). It is my opinion that if you want to make a career in the DH a few years from now, XSLT and web development might not be enough anymore. Maybe if you get lucky. At the very least, those now-standard basic DH technologies will be the foundation everybody is expected to have. Alongside the 500 other skills on top of that.

“Label-DH”

If you are a Normal Humanist now, you might not want to completely change course and become a very tech-savvy Digital Humanist unless you already have the programming foundations. You might just want to add a pinch of DH to spice up your regular Humanities research or be eligible for certain grants. Then you are what some call “label DH”.

First of all, I have to add that I am a bit biased when it comes to so-called “label DH”. “Label DH” are people who label themselves as “DH” but don’t really do DH, or are ‘only’ the Humanist part in a DH project, or are merely affiliated with a DH project. Essentially they have no legitimate DH skills whatsoever but aggressively label themselves as DH for the advantages of it. If you’re only in it for the benefits but not ready to put in the work, obviously everybody is going to hate you and you might or might not get lucky with this approach. I wouldn’t recommend it. I think that not so many people are successful with it now. Never overstate your DH abilities, especially if you have none. People will know and you’ll basically be out of the race. I don’t like label DH. Some great DH thinkers, like Patrick Sahle from what I gathered from a talk of his, believe that label DH is just as important for DH as a discipline as is “hardcore DH”. Because it popularizes the discipline more widely. Maybe it is. I’m not particularly fond of it anyway.

Well, I’m a hardliner. I believe that “real DH” would mean to be just as hardcore at programming as you are at your Humanities research. All while not losing touch with your Humanities research, for then you would turn into a “mere programmer” (not meant in a pejorative way). Because the whole point of DH is that you’re not a programmer XOR a Humanities scholar. It’s the combination of both. Most people see that combination as some sort of 30/70 or 40/60 kind of thing. I think it has to be 100/100. And yes, that means you’ll have to be a freak with a 200% workload. I’m pretty alone with this opinion, however, so don’t panic. Most people don’t see it like that at all. I’m generally a bit of an eccentric and maybe some might perceive my opinion to be extreme. Well, sorry, but I like extreme. I think that “real DH” should mean 200%, or even better: 300%. 150% programmer and 150% Humanities. Be hardcore at both. At least that’s my personal goal.

Half-assed just probably won’t do the trick anymore

Like I said, I’m no expert. But my view of the field is that already now there is a lot of half-assed stuff. A lot of people do DH and not all of it is good. So far, the field has been pretty chill but I’m not so sure it’s going to stay that way. Competition will get harder and harder. In fact, it already is harder than it used to be and the boom is extreme. DH used to be marginalized but now it has become mainstream. I can’t even imagine the masses of people starting to do DH from all over the Humanities. And then, there is formal education in the DH now, which is booming, so we soon will be “flooded” with certified Digital Humanists. I put “flooded” in quotes because, of course, there is more work than ever. Seeing as everyone everywhere is going to do at least some DH from now on, the demand is high too. But still, as a non-DH-certified Humanities scholar you will probably have a harder time benefitting from the DH in the near future without going all-in.

I can’t really judge if this will cease, as it was feared a few years back when people thought the DH were yet another hype, to pass as quickly as it had come. They were wrong about that. The DH have come to stay. And they are the cool kids in school now. The Geeks get the girls or whatever. (In case anybody noticed, this is an American Hi-Fi reference but you probably have to Google to find out what that is).

People around me think the demand is not going to sink in the next few years and probably not in the next decades either. Digitization is everywhere and it gets ever more extensive. So no, the demand is probably not going to cease. But new generations of scholars might soon start to learn the DH basics you lack as part of their normal curriculum. So yes, I very much believe you might be at risk of getting left behind. Unless you’re revolutionarily good at your Humanities stuff. Like “excellent” or whatever they call it. So, as a guideline, you probably will need to learn DH. Applying for grants will also require you to have at least a basic overview of what’s going on in the DH. You don’t want to be left behind. For the normal scholar, knowing your way around the DH basics will be a prerequisite for “excellence”, not the easy way to an excellence award.

What I think you really should learn as fast as you can

Annotation in XML and at least one XML-standard relevant to your research

Learn annotation in XML now because it is easy. Like I said before, this won’t get you very far anymore but it is the foundation on which you can build, and it will be a gatekeeper. If you don’t even have this basic building block, no more doors will be open to you, even in label-DH projects. I see this starting to become reality already for everyone who is not an important Humanities professor or otherwise super-important. Also, if you ask for cooperation and have a working knowledge of these basics, DH people will be a lot more willing to talk to you because it shows that you did your homework. DH centres can’t accept all projects. This is a way you can stand out from competitors.
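
To give you an idea of what “annotation in XML” looks like in practice, here is a minimal sketch in TEI, one of the XML standards you might pick. The sentence and the people in it are invented for illustration; persName, placeName and date (with its when attribute) are standard TEI elements, but which ones matter for you depends entirely on your research question.

```xml
<!-- A made-up sentence with three kinds of annotation:
     persons, a place and a machine-readable date. -->
<p xmlns="http://www.tei-c.org/ns/1.0">
  <persName>Jane Doe</persName> wrote to
  <persName>John Doe</persName> from
  <placeName>Graz</placeName> on
  <date when="1900-01-01">1 January 1900</date>.
</p>
```

The text stays readable as it is; the tags just make explicit what a human reader would infer anyway, so that a program can later list all persons or sort letters by date.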

How can I start?

Formal education

  • Get a certificate (from a summer school up to a year’s worth of classes).
  • Do a DH master’s.

Teach yourself

Well, of course there is your favourite go-to resource for everything DH (and LaTeX): The LaTeX Ninja – yaaaay! 😉 With many more tutorials to come (soon, hopefully).

Pause to think whether you’re already doing DH

You would have noticed, you think? Well, DH is not only XML and annotation. There are many aspects to it and maybe you have already done something digitally that doesn’t strike you as DH or doesn’t come to mind right away.

Learning DH will only really work for you if it fits your research. So, find a way of going digital which is compatible with what you already do (like a “digital update” of your current work) rather than trying to force yourself to do DH in ways which don’t immediately make sense to you. Take some time to brainstorm this, however. The good ideas might not come to mind straightaway. Google digital projects from your field. What are they doing? How does the digital serve their research purposes? What can you take away from that for your own research? If DH and you are not a good fit, people will know. You have to find something you like. If you hate what you do, you’ll never get good. If you like what you do, learning something new will be fun.

Learn something new

I have an extreme drive to always learn and do new things. People usually comment they can’t really understand that. They don’t get me. I think it’s all a question of perspective. If you feel like you have to learn something new, it will be “hard work”. If you want to, it can be an adventure and a nice challenge. Rise to the challenge.

The power plant doesn’t have energy; it transforms one form to another. It generates energy and transmits it. We are the same. (Brendon Burchard)

Life-long learning sounds like a burden to many, but somewhere deep down, past the coziness of our comfort zone, we do have a natural child-like curiosity for learning new things. Try to reactivate that if you’ve lost it. Use the DH as your trial project.

Cheers,

the LaTeX Ninja

How do I get to do task XY for the first time at the job

Today I want to talk about how you convince others to let you do XY for the first time as an official job responsibility, even though you might not have experience or any formal training doing so. And also, why you have probably come across a situation where one of your colleagues has been chosen to do task XY and not you. Even though you are both equally qualification-less. Now you feel left out. New tasks are opportunities for growth you probably really need if you want to stay in academia. It is all the more detrimental that bosses often don’t take the personal/CV growth of their young colleagues into account and hardly ever give out those tasks strategically. You can end up the lucky one – or you end up left out.

 

Disclaimer: Again, as always, these are my personal opinions and they might not apply to your situation. Use your brain.

 

New skills are always needed in your institution

Especially in the Digital Humanities, it happens a lot that there suddenly is a demand at your institution for a certain skill that nobody has yet acquired. Then somebody gets chosen to do it, often basically by chance, and after they have done so, they are the expert on the topic. Which is good if you were the lucky one chosen (teaching yourself probably was quite the struggle so you’d deserve it). But if you weren’t – congratulations, the possibility of you ever being able to do this same thing (like programming in a certain language, teaching, shouldering a certain responsibility) might have just shrunk to zero. Often, DH centres are not big enough to need more than one person for a less-mainstream specialty skill. It will from now on be incredibly hard for you to prove yourself in that area although you might be just as qualified. You have officially become invisible and somebody else has officially become the guy who does XY. I have experience on both sides. For one, I have had a responsibility thrust upon me, not really voluntarily or because I wanted it, but because there was a demand and nobody there who was actually qualified to do the task.

 

If you were chosen

You’ll have to teach yourself and might end up with patchy skills

Meaning you will probably end up with pretty stitched-together knowledge and might have to relearn the skill in a more systematic way after the project is done if you really want to go on doing what you were asked to do in a professional way. Projects are often time-sensitive and deadline-driven, so you won’t have the time to really learn the skill in a systematic way. Unlearning bad practices acquired like this can be really hard afterwards.

You are now officially the default person for the task

You might not exactly be more qualified than your colleagues but you are still going to be the default option to do the task. Whether you want it or not. So be careful accepting these jobs if it’s a task you genuinely dislike or consider out of line with your own personal development goals. For me, personally, I want to become something of a ‘real programmer’ in the Digital Humanities. I am a girl, but I don’t want to end up being the web designer. Not that web design is bad, inferior to ‘real programming’ or anything. I just prefer ‘real programming’ but since I am a girl, people tend to hand me the ‘soft bits’ and give the ‘hardcore programming’ to a man. Which the man might not even want. Sadly, unconscious gender stereotypes are still very effective in workplaces. Women often get discriminated against by ‘non-events’, i.e. not being asked to take up a challenge while male colleagues are, which ends up harming their success in the long term. If you want the challenge, you might just have to take it up in your private life or compete hard with your male colleagues. If you accept a specialty you don’t really want, you might seriously harm your ability to start something else afterwards. You will end up with that label. So be careful which label you choose. Also, your time resources for personal development will go into this task completely. If you were planning on learning something else, that’ll have to wait for a long time. Choose wisely and turn it down if you have to and can.

You probably can’t say no

In many situations when you’re asked to do this daunting task nobody else has ever done before, it is probably because you are not the most important member of your organization. You are probably young or new and are deemed to be a hard worker and able to learn. These are good things. But it might still not be something you do voluntarily. But do accept the task. If you don’t, you might come across as unwilling to take up new challenges or learn new skills. Not an impression you want to leave for further job openings and ending contracts – which are never very far away in academia. Also, this new skill your institution is trying to acquire through you might be a reason they hold onto you later or the basis for a new grant proposal, etc. So this might just be a golden ticket, even though you had always imagined those would look more glamorous. Also, it might just be that you are the only idiot they dared to hand this stupid task to. You never know.

 

If you were not chosen but would have been interested

This is really stupid and happens a lot because these informal decisions are not discussed with everyone on the team (which they probably should be and bosses should be aware of this once they have read this post.) But the sad truth is that this decision will probably be made by the bosses in a back room in a discussion you are not allowed to join.

So even if you knew it was about to happen, there isn’t much you can do except maybe inform people beforehand that you would be interested. This by no means guarantees your success but since these decisions can be very spontaneous, maybe it will even get you the job. Definitely try it if you get the chance or overhear a discussion. Butting in on other people’s discussion is rude but it is also rude of bosses to make seemingly inconsequential decisions in private which actually are very consequential to their young employees and can make or break a career in the long term. In this case I would say, better sore than sorry.

But what if it’s already too late? It might even have happened to you that somebody else was chosen as a “new expert” for a job (in a backroom decision) which they are not qualified for – but for which you, in fact, are qualified. Of course you can tell people that you think you would have been more qualified or at least wish you had been asked. But once the job is already given out, it’s unlikely they’ll take it back. Unless the other “chosen one” has expressed that they will only do it if nobody else is found and would rather not do it if they didn’t have to. Probably try and say something anyway.

Official responsibilities make you more trustworthy than actual skills

This is an especially stupid situation for you because it undermines your skill and legitimates the other person who actually didn’t have any legitimate skill up to now. If you already have some experience in these matters, you will probably know that often, experts are not made by skill. They are made by decisions of their superiors. In the end, your skill doesn’t count. What counts is solely the fact that your bosses trust you to do a job. This can lead to major unfairness, of course. And you are virtually powerless once it does. The only thing you really can do is show your experience and skill elsewhere. Join an expert society. You will need very bold action and extremely solid credentials if you ever want to make up for this misguided decision again. Also remember that bosses hardly think about this. They are probably completely unaware of the detrimental effect this will have on your career.

Boost your CV, exaggerate your skills a little bit and be over-confident of yourself (because, sadly, everybody else is and you will be left out otherwise)

So show off your skills as much as you can. Drop your knowledge whenever appropriate. Especially if you are a girl or shy, this is not like you. But you will notice how (even misguided) self-confidence goes a long way. Men tend to be much more bold in their statements in the workplace and also in what they write in their CV. I have seen a CV where someone said they were a C1 or C2 in English when they really were so bad that they made tons of typos in basic programming commands. And programming language English is hardly the real deal. If they already can’t spell ‘length’ properly, how can they have a C2 level? It was not a one-time typo and by no means the only type of error I observed in the very short time span I paid attention to this either. In their defense, they probably didn’t even know what C2 meant. It is still a bold claim. What I have learned from this is that the impression you convey is all that counts. Be a bit more self-confident than you really are. Pretend you have some more skill than you do. By this, I don’t mean you should exaggerate wildly. But ask yourself whether you could learn the basics of a certain skill in a week or a weekend. Then you probably are good to state it in your CV. (Then go on and actually learn the requested skill since people will probably test this by asking some general questions on the topic. And, of course, this only goes for minor skills but many DH skill requirements are actually quite basic).

 

You have to have done it once

If you actively seek to try out new things or want to be challenged in your job but were not chosen, you are out of luck. Since somebody else is the default option now, you are going to have an incredibly hard time getting yourself seen or heard from now on. Even if you do everything you can to learn the skill along with your colleague, they are always going to be the one who has the practical experience. Even if you should also manage to get some practical experience, they are going to be the one who won your institution’s trust and showed results on a concrete job-related project. Unless there is a great need for the skill, you might never be able to do this at your job. Sorry, but it’s the truth. 😦 The only thing you can do now is to get real job experience with the task outside of your institution or go freelance (if your job allows that on the side). Or create a truly mind-blowing hobby project and share it online.

This is one of the reasons why I have this blog. I don’t really like the idea of sharing my life with strangers but at the same time, I still want my private technology- and teaching-related activities to be visible. People will only trust you once you’ve “done it once” because it is seen as proof that you can do it. That’s why people often say that you should teach exactly one class in your PhD time: it takes up the least possible amount of time and energy, but from then on, whenever you apply for a position which includes teaching, you are credible when you say that you can do it. If you haven’t – well, good luck to you. It is highly unlikely someone who doesn’t know you will take the risk. Especially since they probably have 50 other applications from people who did get that chance. So you kind of depend on getting the experience from your own institution. If they have chosen to ask somebody else, all you can do is be annoying or follow that default person along. Tag along and offer to help as much as you can. Drop knowledge you have whenever appropriate. This is by no means guaranteed to help – you might just get ignored. But then you can say you have at least tried to get people’s attention. And maybe it will turn out for the better at some point. Maybe they will remember you the day they need a lab rat for a new task nobody is qualified to do.

 

Conclusion: On the importance of learning from new responsibilities for your CV

“So grow your own CV and decorate your own skills, instead of waiting for someone to bring you opportunities for personal growth.” – based on a quote by J. L. Borges

So, as we have seen, this informal way of giving out new tasks to people can be a great opportunity if you are chosen. But it can also be a way of preventing eager people from taking up new tasks. Once somebody has done it, they are the default person and probably nobody else is needed. So nobody else will be given this opportunity for personal development. This can be a real problem in academia where you are expected to constantly grow your CV and tend to your skills. Some people even say that you should add one line to your CV every month if you want to be successful in academia. What line did you add last month? What will you add next month? Plan this strategically!

I hope that maybe some bosses read this post and become aware of the problem. Maybe people get inspired to hand out these opportunities more strategically and more consciously. It also often happens that it’s always the same people who get the opportunities (because they have already proven their potential to rise to the challenge) and others continuously get left out. This is bad for the ones left out and can lead to overwork in the others. If you are responsible for early career scholars, please make conscious choices with anything which could affect their careers. If you are affected, you have my sympathy. Try to prove your skills in a side project or join a society.

 

Hope this helps someone,

best,

the Ninja

[Guest Post] Confessions of a LaTeX Noob

I am happy to introduce my first guest post on this blog. It’s from my archaeologist friend whom we decided to call “the LaTeX Noob” here. She will give her perspective on how using LaTeX in the Humanities feels for her and the problems she has encountered. Like how getting help can be tricky, you don’t want to look like an idiot and how you constantly have to defend your choice to use LaTeX (to users and non-users alike). “Why would a Humanities person want to use LaTeX anyway? You don’t need it and you’re not up for it” are the most common insults a Humanities person might have to endure after choosing LaTeX.

 

Here come the confessions of a LaTeX Noob:

Confessions of a LaTeX Noob

Okay, here I am, the LaTeX noob. Well, not that noob-noob, but noob nonetheless. I am an archaeologist and I am trying to write my thesis in LaTeX. Well, my catalogue, to be exact, because we archaeologists like pictures of our stones and potsherds and whatsoever.
So, we need a good tool to achieve a really nice looking result document. And no, I don’t have the money or the skills for Adobe or anything like that.
Have you ever tried to get a nice document done with Word, including lots of pictures? Yes, there is a good chance you will kill yourself trying. So, that is why I am here, busy reading the Ninja’s blog and her tutorials. I like the way she explains things (to me, also in our lunch and coffee breaks at university) and I really trust her on that issue. Mainly, because I cannot talk computer-talk and she understands my archaeologist-talk, because she is one of us.
So, why am I writing my first guest post on this blog? As a LaTeX noob, you are willing to try this new way of getting nice results in your writings, articles, and so on, BUT: It is neither simple nor easy to ask for help. Or trying to. Or showing your code to others.

Fear of being dumb

Okay, this might sound a little like overreacting, but yeah, asking for help in some online forum or even asking colleagues and friends who are skilled with LaTeX (and some of my friends and colleagues did indeed write their theses in LaTeX) is not that easy. Especially not if you are like me – always trying to find your own way, your own solution. But it is obvious that you will reach the point where you have to ask for help. After all, this is teamwork, right? I know, maybe this is sort of naive, but I always learned that I have to ask others, because they have different skills and knowledge and can help me, and I can help them in return. So, I just asked my questions… And then, a lot of nasty answers followed.

Sometimes, people are just suggesting what they mean using short terms or abbreviations – so, I have to google it, because simply writing that I am a LaTeX noob does not get across to them. If I repeat my question stating (the obvious) that I have no idea what they are talking about, some get really upset, writing that noobs and beginners should not even try because they have no idea of the matter. It gets worse when I tell them that I am an archaeologist, so, a girl of bones and sherds. This means, humanities. And hello, one of the greatest cliches around us: What do we need LaTeX for? We are no technicians. I will not argue about that. I took some university courses in programming, I can use QGIS and I know how a database should work and how I would have to build it, but I would not strictly need that stuff in my area of research, that is correct.

So, I might have heard about some things, but not all. And not all things I have heard about are logical to me. Arrogance does hurt. Noobs and beginners like me are trying to learn. So, be kind. I sometimes try to talk to this kind of people… never mind. It does not help very often. But, I have to say, there are a lot of friendly people in the same online forum who tried to help and who sent a lot of tutorials. There is hope!
But mainly, I just felt dumb and stupid and I quit my experiments every time, because I thought that it was not worth my time trying to get better and to do something good with LaTeX if there are that many unfriendly people out there. It sometimes felt like they did not want to be nice or help at all.

 

Confusion

No one ever said that LaTeX was easy-cheesy, but you have to admit that some coding is just not logical. How to put text and figures in the right connection, wondering why the whole thing is just a mess – until I stumbled upon the problem of floats (where it got even worse). Floating and non-floating, figures and minipages, and so on. I got really stuck there, it took me lots of hours to figure out what the differences were and how I can use both in combination.

It gets even harder when skilled LaTeX people ask you about your work. You describe your way and your solution – and find yourself bombarded with suggestions and new variations of packages you have never ever heard of. And even trying to make your point clear does not really help sometimes. So, I just remain quiet, listening, and somehow my brain gets totally wound up in knots and funny question marks in very bright colors. I asked them to write their suggestions down or to send me an example – most of them promised, but never did, and we never ever talked about that subject again. So, if you just want to show me how good you are, fine. But I really wanted your help, otherwise, I would not have started the conversation at all. If I do not want your help, I will tell you.

Feeling misunderstood

And then, there are people who think that you are exaggerating by trying to use LaTeX, mainly because no other person they know has used it. They will even tell you every time they see you that you are wasting your time. You should be writing your thesis, not surfing the net and looking up funny lines of code. I had to invest all the energy of my genetically predisposed stubbornness to lock them out of the room and to figure out the piece of code I needed. Then I present it and they are really surprised (well, most of them). Some remain sceptical and even tear your work apart, because they have worked and are working in a different way and, of course, it is only correct if done their way.

 

Fear of doing things wrong

I still have to fight that one. As for now, my document looks nice and is working, so after all, it is functioning – no errors mean everything’s good, right? Still, there are people who are telling you that your code is not clear, not logical or far too complicated. BUT: It can’t be completely wrong. It works, after all.
Okay, I might have a very strange style, but that is okay for me, as long as everything works that way I want it to work. I somehow think about coding style as a fashion issue. So my coding may not be fashionable, but I think it is like cosy underwear: I might not want to show it to people. But hey, it feels good to me.

 

Getting rid of my image as a LaTeX-Noob

I sometimes wonder when I will get rid of my image as a total noob. I might not be a beginner anymore because I can read tutorials, even advanced ones, and I can understand them and I can follow them and reproduce them myself on my terms and conditions. But, I am still a noob. Maybe I will always be a noob, at least to some people out there who are basically LaTeX-gods. But at least I have a LaTeX Ninja among my friends. She will show me how I can survive.
So, this is my first post here. In my next one, I will write about my writing and my time management – and my motivation vs. my iron discipline on how to write 130 pages a month without losing my nerves or my mind, or killing anyone (including my laptop).

 


 

So, this is it for now. Thanks for reading, as always.
Cheers,

the LaTeX Ninja

PS: By the way, she prepared this post in LaTeX 😉

XML to LaTeX (simple)

Today, I wanted to share this super simple XML to LaTeX tutorial. Using XSLT, you are going to transform XML data to LaTeX output which you can then go on to compile into your desired output PDF. There will be no fancy stuff whatsoever in this post, just the basics and what to keep in mind with these transformations. It is the quick intro to XML to LaTeX I did with my students a while ago which was done one day after they had their first contact with XSLT, so it should really be beginner-friendly. I labeled it “Advanced LaTeX” anyway because I think starting to automate things is always a step in the right direction 😉

Configuring the transformation scenario in Oxygen

I am going to assume you use Oxygen now because that’s what a lot of people in the DH do and this post is directed towards my friends in the DH. Especially those who think print editions are an obsolete concept in times of the Digital Edition. Maybe having a nice little intro to XML to LaTeX transformations available will change their minds 😉

To set up the transformation scenario, choose XML transformation using XSLT, then choose your XML document and your XSL stylesheet (set up and open those documents in the editor before you configure the scenario). Then choose a Saxon 9 version (whichever you like). Then ignore the FO tab and get right to the output tab.

Here you configure how to name your output. Best click the green arrow and choose cfn, then append -latex.tex. So the result is the current filename with -latex.tex appended to it. This is an important step so you don’t accidentally overwrite the original. In this case that is not so dramatic, since the output has a different file ending anyway, but if you do XML to XML transformations, this is even more crucial. Then tell Oxygen to show it in the editor as XML (even though you know it isn’t). The editor will then, of course, complain about the non-valid XML but don’t worry. Just copy all (CTRL+A, CTRL+C) and paste it (CTRL+V) into a completely empty (!) project in Overleaf or just compile the LaTeX directly if you have it installed on your machine.

There it is, you’re set. Now let’s get to the stylesheet.

The stylesheet

So, this is the whole thing. You can just grab it and go if you don’t care for the explanation, or read on to find out why things were done the way they were done. By the way, this tutorial assumes you’re already familiar with how XSLT works and just haven’t done transformations to LaTeX yet. I am also assuming your base XML is in the TEI standard.
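
If you don’t have a TEI file at hand to try this on, a minimal input could look something like the following sketch. The title, the author and the text are made up for illustration; the only things that matter are the TEI namespace and the elements the stylesheet looks for (title, author, persName inside the teiHeader, and head, p and hi inside text).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical test input; all content is invented. -->
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>A Sample Edition</title>
        <author><persName>Jane Doe</persName></author>
      </titleStmt>
      <publicationStmt><p>Unpublished sample for testing.</p></publicationStmt>
      <sourceDesc><p>Born digital.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <!-- head is turned into \section{...} by the stylesheet -->
        <head>First section</head>
        <!-- hi is turned into \emph{...}; the ampersand is there to test escaping -->
        <p>Some text with a <hi>highlighted</hi> word &amp; an ampersand.</p>
      </div>
    </body>
  </text>
</TEI>
```

The ampersand and the hi element are in there on purpose, so you can check right away whether special characters and emphasis survive the trip to LaTeX.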

 

 

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="xs"
    version="2.0">
    <xsl:strip-space elements="*"/>
    <xsl:output method="text" encoding="UTF-8" indent="no" omit-xml-declaration="yes"/>

    <xsl:template match="/">
        <xsl:text>\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\DeclareUnicodeCharacter{2060}{\nolinebreak} % might not be necessary for you

\title{</xsl:text><xsl:apply-templates select="//t:title"/>
        <xsl:text>}
\author{</xsl:text><xsl:apply-templates select="//t:author"/>
        <xsl:text>}\date{\today}
\begin{document}
\maketitle
\tableofcontents\newpage</xsl:text>
        <!-- get some metadata from the TEI header using the pull paradigm (xsl:for-each) -->
<xsl:text>\begin{itemize}</xsl:text>
<xsl:for-each select="//t:persName[ancestor::t:teiHeader]">
    <xsl:text>\item </xsl:text>
    <xsl:value-of select="." />
    <xsl:text>

    </xsl:text>
</xsl:for-each>
<xsl:text>\end{itemize}
\newpage</xsl:text>

        <!--  <xsl:apply-templates/> OR -->
        <xsl:apply-templates select="//t:text"/>
        <!-- just use the push paradigm (xsl:apply-templates) on the TEI body so you don't get a meaningless TEI header dump in your document -->

        <xsl:text>\end{document}</xsl:text>
    </xsl:template>

    <xsl:template match="t:head">
        <xsl:text>\section{</xsl:text><xsl:apply-templates/><xsl:text>} </xsl:text>
    </xsl:template>

    <xsl:template match="t:p">
        <xsl:apply-templates/>
        <xsl:text>

        </xsl:text>
    </xsl:template>

    <xsl:template match="t:hi">
        <xsl:text>\emph{</xsl:text><xsl:apply-templates/><xsl:text>} </xsl:text>
    </xsl:template>

    <xsl:template match="text()">
        <xsl:analyze-string select="." regex="([&amp;])|([_])|([$])">
            <xsl:matching-substring>
                <xsl:choose>
                    <xsl:when test="regex-group(1)">
                        <xsl:text>\&amp;</xsl:text>
                    </xsl:when>
                    <xsl:when test="regex-group(2)">
                        <xsl:text>\_</xsl:text>
                    </xsl:when>
                    <xsl:when test="regex-group(3)">
                        <xsl:text>\$</xsl:text>
                    </xsl:when>
                    <xsl:otherwise/>
                </xsl:choose>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="." />
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>

 

The XSLT declaration and the LaTeX preamble

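Put the following right after the opening <xsl:stylesheet> tag (which itself comes after your XML declaration). This makes sure the output is plain text without stray whitespace, which is what LaTeX needs.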

 

<xsl:strip-space elements="*"/> <!-- for LaTeX -->
<xsl:output method="text" encoding="UTF-8" indent="no" omit-xml-declaration="yes"/>

 

\DeclareUnicodeCharacter{2060}{\nolinebreak} was added because LaTeX complained about an undefined character (U+2060, the word joiner) in some of my students’ XML data. Personally, I had never gotten this error before, so you might as well leave it out.

Creating environments

As we had just learnt some XSLT basics, I wanted the students to use at least one pull and one push paradigm type template. So the task was to process any element from the TEI header using the pull paradigm and then use a push paradigm template for the body. To make this little template more efficient teaching-wise, I decided to introduce how to create a LaTeX environment using XSLT for the TEI header / pull paradigm and do at least one other command using the push paradigm on the TEI body. Also this demonstrates <xsl:for-each> (pull: the stylesheet fetches the nodes it wants) as opposed to <xsl:apply-templates> (push: the input is pushed through the matching template rules) for the body.

So this next piece of code sets up an itemize environment for persons present in the header. In a “real” stylesheet, it would probably be wiser to check, using <xsl:if>, whether at least one such element is present and only output the \begin and \end on that condition.

Also, as you might have noticed, all the LaTeX commands are inside <xsl:text>. This looks a bit confusing at first but really isn’t. Just make sure you don’t create invalid XSLT by shuffling them around.

 

<xsl:text>\begin{itemize}</xsl:text>
<xsl:for-each select="//t:persName[ancestor::t:teiHeader]">
    <xsl:text>\item </xsl:text>
    <xsl:value-of select="." />
    <xsl:text>

    </xsl:text>
</xsl:for-each>
<xsl:text>\end{itemize}</xsl:text>
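
For the “real” stylesheet case, here is a minimal sketch of such a check, assuming <xsl:if> is the test you want. It is a complete little stylesheet so you can try it on its own. The variable name persons and the bare \documentclass wrapper are my own additions for illustration. The reason for the guard: an itemize without a single \item makes LaTeX stop with an error.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    version="2.0">
    <xsl:strip-space elements="*"/>
    <xsl:output method="text" encoding="UTF-8"/>

    <xsl:template match="/">
        <!-- hypothetical variable: collect the header persons once so test and loop agree -->
        <xsl:variable name="persons" select="//t:persName[ancestor::t:teiHeader]"/>
        <xsl:text>\documentclass{article}&#10;\begin{document}&#10;</xsl:text>
        <!-- a non-empty node sequence counts as true, an empty one as false -->
        <xsl:if test="$persons">
            <xsl:text>\begin{itemize}&#10;</xsl:text>
            <xsl:for-each select="$persons">
                <xsl:text>\item </xsl:text>
                <xsl:value-of select="."/>
                <xsl:text>&#10;</xsl:text>
            </xsl:for-each>
            <xsl:text>\end{itemize}&#10;</xsl:text>
        </xsl:if>
        <xsl:text>\end{document}&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

If the header contains no persName at all, the whole itemize block is simply skipped.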

 

Push paradigming emphasis

When creating simple commands using the push paradigm (template rules applied via <xsl:apply-templates>), be sure that you don’t end up with too many “overlaps”, since “simple” commands in LaTeX (like \emph) don’t take multiple paragraphs as arguments. If in doubt, always use an environment and the global switches (like \bfseries). Since you are automating things, you always have to take into account that data might not always be marked up in a way which makes sense to you. There can easily be linebreaks inside a single italic highlight. If this is the case in your data, better create an environment. For simple purposes, however, this is good enough:

 


<xsl:template match="t:hi"><xsl:text>
\emph{</xsl:text><xsl:apply-templates/><xsl:text>} </xsl:text></xsl:template>
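
If your highlights can span paragraph breaks, a sketch of the switch-based variant could look like this. It assumes you use it in place of the \emph rule (two rules matching t:hi with the same priority would clash) and it uses \em, the switch counterpart of \emph, inside a group so the emphasis ends at the closing brace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    version="2.0">
    <xsl:output method="text" encoding="UTF-8"/>

    <!-- {\em ...} is a group with a switch, so blank lines inside it are allowed -->
    <xsl:template match="t:hi">
        <xsl:text>{\em </xsl:text><xsl:apply-templates/><xsl:text>}</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

The trade-off is that you lose the automatic italic correction \emph gives you, which is why the simple command is still nicer when your data allows it.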

 

With these commands (and generally when transforming to LaTeX), you need to make sure you don’t involuntarily add spaces or lose spaces you need, either of which makes the output hard to read and debug.

In this example, I made sure not to add any whitespace inside the template rule and also did not have Oxygen format or indent the XML to avoid these unwanted spaces.
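
A possible alternative, as a sketch: write line breaks as the character reference &#10; inside <xsl:text> instead of typing them literally. Whitespace-only text between XSLT instructions is ignored by the processor, so this way you can indent the stylesheet freely and still get exactly the line breaks you asked for.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    version="2.0">
    <xsl:strip-space elements="*"/>
    <xsl:output method="text" encoding="UTF-8"/>

    <xsl:template match="t:p">
        <xsl:apply-templates/>
        <!-- &#10; is a line feed; two of them give the blank line that ends a LaTeX paragraph -->
        <xsl:text>&#10;&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

Only what is inside <xsl:text> ends up in the output, so the indentation around it no longer matters.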

And finally: Escaping entities

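As you might remember, markup languages reserve certain characters, which then need to be escaped. Bad thing is, LaTeX and XML reserve different characters and escape them differently: XML uses entities like &amp;amp; while LaTeX mostly wants a backslash in front. So we need to escape the characters LaTeX cares about. I know that the OxGarage standard stylesheet does this using the translate() function but I prefer to use <xsl:analyze-string> since it’s less “messy” than a nested translate() construct.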

Ah, and side info by the way: There is this standard stylesheet from the TEI consortium which might be of help if you are looking for inspiration. Since it is very generic, however, it might not be helpful if you are a newbie at both XSLT and LaTeX. The XSLT is pretty advanced and also the LaTeX probably uses some commands you might not be aware of.

 

    <xsl:template match="text()">
        <xsl:analyze-string select="." regex="([&amp;])|([_])|([$])">
            <xsl:matching-substring>
                <xsl:choose>
                    <xsl:when test="regex-group(1)">
                        <xsl:text>\&amp;</xsl:text>
                    </xsl:when>
                    <xsl:when test="regex-group(2)">
                        <xsl:text>\_</xsl:text>
                    </xsl:when>
                    <xsl:when test="regex-group(3)">
                        <xsl:text>\$</xsl:text>
                    </xsl:when>
                    <xsl:otherwise/>
                </xsl:choose>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="." />
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
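
The three characters above are not the only ones LaTeX treats specially. As a sketch under my own assumptions (this is not what the TEI stylesheets do), the same idea can be extended to %, # and the curly braces, which can all be escaped by simply putting a backslash in front. Note that the regex attribute is an attribute value template, so literal curly braces have to be doubled there. Backslash, ~ and ^ need different replacements (for example \textbackslash{}) and are left out here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
    <xsl:output method="text" encoding="UTF-8"/>

    <xsl:template match="text()">
        <!-- after attribute value template processing the regex reads [&_\$%#\{\}] -->
        <xsl:analyze-string select="." regex="[&amp;_\$%#\{{\}}]">
            <xsl:matching-substring>
                <!-- every character in this class is escaped the same way: backslash plus itself -->
                <xsl:text>\</xsl:text>
                <xsl:value-of select="."/>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>
```

Since all matched characters get the same treatment, the <xsl:choose> with one branch per regex group is not needed in this variant.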

 

And that’s it. I hope this was useful to you.

Cheers,

the Ninja
