LaTeX for Archaeologists: An archaeological catalogue from a spreadsheet

Today I wanted to share a long-promised second workflow for typesetting an archaeological catalogue in LaTeX. There is a first post about it (LaTeX for Archaeologists: An archaeological catalogue using LaTeX) but the approach was not very advanced at the time. In the meantime (actually, it has been a while already) I coached another archaeology PhD student to typeset her archaeological catalogue in LaTeX.

Prerequisites

Unlike our friend, the LaTeX noob whose needs were described in the last post (mentioned above), this person had a catalogue which didn’t feature overly long object descriptions. These had made it very difficult to automate the typesetting for our friend, the LaTeX noob. Use case 2, however, featured data which was very much tabular in format. And this data for the archaeological catalogue was also available in the form of a spreadsheet.

Importantly, this spreadsheet was basically finished by the time we started implementing the LaTeX workflow/solution (unlike in the Noob’s case where the requirements – i.e. how exactly the finished data would look – weren’t all that clear at the time we started). The most difficult part in the csvsimple spreadsheet workflow was probably adding a file path for all the images, so that they would automatically be correctly included in our LaTeX output without us having to add them by hand for each one of the considerable number of catalogue entries.


Wait until your advisor gives you the green light before implementing and involve them in the design decisions early on

In the last post, I already mentioned that it is of utmost importance to check your layout and typesetting decisions with your advisor as soon as possible. We don’t want you to do all the work and then have to convert it all back to MS Word in the end.

In our case here, the advisor agreed to all the design decisions and was very cooperative. It was important to include them though, so as to spare yourself avoidable changes. Also clarify that you won’t be working in MS Word, so your output might look a little different from what they’re used to, and some features of MS Word will not be available to you. Maybe the advisor doesn’t care but better check beforehand anyway. Some people are very stubborn, especially concerning new stuff that they don’t fully understand. We wouldn’t want you to have a bad experience.

Things to take into account with our CSV-based catalogue

Like I said above, the situation was as follows: The person who needed the catalogue already had it practically finished in the form of a spreadsheet. This means that it was already quite clear what the result would look like and how much space everything would need. We tried out multiple layouts and came up with what is now the finished layout. Your needs and preferences will likely differ, especially with regard to the amount of text to be included and the amount and size of images.

In the end, we decided on a workflow using the csvsimple package which allowed us to process and format the data in a very efficient way (no need to re-code data with the same old commands for each single entry of the catalogue).

But working with multiple files can also have its downsides. That’s why I first wanted to make you aware of some possible problems with processing data from the csv format. (You can save your spreadsheets as csv, in case you were wondering how to get your data in this format in the first place.) As the name says, .csv is a simple format which stores cells of data separated by commas and rows separated by line breaks. Maybe somewhat confusingly, you can change the delimiter (comma) to anything else. This might be relevant if there are lots of commas in your text which you cannot get rid of by just shifting the text around:

Be careful with commas or semicolons in your CSV cells or table headings, as those can hinder the processing. A comma inside a cell will be interpreted as a column separator, which will likely ruin the code for the whole row. LaTeX won’t necessarily throw an error, though, so beware and check that all your data is there! To avoid this, you can either make sure not to include commas in your data, or save your spreadsheet with a delimiter other than the comma (one which doesn’t appear in your data) and adjust the csvsimple package settings accordingly. This takes some getting used to and might create really annoying errors which are extremely hard to find. These can sneakily infiltrate your \cites especially!
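To make the delimiter advice a bit more concrete, here is a minimal sketch, assuming a spreadsheet exported with semicolons as the delimiter. The file name finds.csv, the column headings and the macro names (\catno, \findobject, \findlit) are made up for illustration and are not taken from the template. As far as I know, csvsimple accepts comma, semicolon, pipe and tab for its separator option, and wrapping a cell in curly braces is the usual way to protect a separator inside that cell.

```latex
\documentclass{article}
\usepackage{csvsimple}

% Hypothetical example data, written to finds.csv on the first run.
% The cells contain commas, so the file uses semicolons as delimiter.
% The braces around the \cite cell are an extra safety net.
\begin{filecontents*}{finds.csv}
catno;object;literature
1;Bowl, fragmented;{\cite{mueller2001,smith2010}}
2;Fibula, bronze;{\cite{smith2010}}
\end{filecontents*}

\begin{document}
% separator=semicolon tells csvsimple to split the rows at semicolons.
\csvreader[separator=semicolon]{finds.csv}%
  {catno=\catno, object=\findobject, literature=\findlit}{%
  \noindent\textbf{\catno}\quad \findobject\ (\findlit)\par
}
\end{document}
```

The citation keys are placeholders too, so without a matching bibliography the citations will just show up as question marks.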

Also, in our solution, images get added automatically. But of course, this is only possible if you store the filename of the images in your spreadsheet in the right place. If you only have your image file names in the spreadsheet (not the path to the sub-folders they’re in as well), then the files can’t be in a sub-directory at the time of compiling your LaTeX or else they won’t be found (as the path passed to \includegraphics won’t be correct). So keep this in mind in case you want images. EDIT: Michael Piotrowski suggested on Twitter that one might use \graphicspath for handling the image locations (like so: \graphicspath{{subdir1/}{subdir2/}{subdir3/}…{subdirn/}} and then just add the images without the paths).
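Here is a rough sketch of how the \graphicspath suggestion could be combined with a csvsimple loop. The folder names, the file finds.csv, its column headings (catno, image) and the macro names are all assumptions for the sake of the example, so adapt them to your own project. The image files themselves are not part of the example and have to exist in one of the listed folders.

```latex
\documentclass{article}
\usepackage{csvsimple}
\usepackage{graphicx}

% Hypothetical sub-folders: LaTeX searches them in this order.
% Note the trailing slash and the extra pair of braces per folder.
\graphicspath{{images/pottery/}{images/metal/}}

\begin{document}
% Assumes finds.csv has the columns "catno" and "image", where
% "image" holds only the bare file name, e.g. bowl-001.jpg
\csvreader{finds.csv}{catno=\catno, image=\findimage}{%
  \noindent\textbf{\catno}\par
  \noindent\includegraphics[width=3cm]{\findimage}\par\medskip
}
\end{document}
```

This way the spreadsheet stays free of folder names, and moving the images later only means changing the one \graphicspath line.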


A simple example of what our output can look like. In the upper left corner, you can see the environment for the location, followed by the auto-formatted CSV rows. How you lay out all this is a matter of your own taste, of course, and can easily be changed for the whole document.

The setup of the catalogue

As for the general structure, we created a simple environment which holds the general information for a location (where your findings are from) as well as a csv reader for the findings in each location. This means that each location needs its own csv file, i.e. the data needs to be spread out into multiple spreadsheets (was that a bad pun?). Calling those spreadsheets into LaTeX using csvsimple isn’t a big deal. We just need to make sure that they all have the structure needed for our processing (which is ultimately a loop, much like a \foreach, which is applied to each row of the csv table).
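To give you an idea of what such a setup can look like, here is a minimal sketch of one environment plus one reader command. This is not the code from the template: the names location and \printfinds, the column headings and the file site-a.csv are all hypothetical, and the layout is kept as plain as possible.

```latex
\documentclass{article}
\usepackage{csvsimple}
\usepackage{graphicx}

% Hypothetical environment holding the general information for one
% location: #1 = name of the location, #2 = a short description.
\newenvironment{location}[2]{%
  \section*{#1}%
  \noindent\emph{#2}\par\medskip
}{\clearpage}

% Hypothetical command: typesets every row of one location's CSV file.
% Assumes the columns "catno", "object", "dating" and "image".
\newcommand{\printfinds}[1]{%
  \csvreader{#1}%
    {catno=\catno, object=\findobject, dating=\finddating, image=\findimage}{%
    \noindent\textbf{\catno}\quad \findobject\par
    \noindent Dating: \finddating\par
    \noindent\includegraphics[width=3cm]{\findimage}\par\medskip
  }%
}

\begin{document}
% One environment and one CSV file per location.
\begin{location}{Site A}{Excavated in trench 3}
  \printfinds{site-a.csv}
\end{location}
\end{document}
```

Since the formatting lives in these two definitions only, changing the look of every entry in the catalogue means editing a single place.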

In the template, I also have a short commented out section (in catalogue-commands.tex) which says that you might want to consider using the ifthen package to suppress the printing of the labels in case an entry is empty.
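A possible way to do this with the ifthen package is sketched below. The helper \printfield, the file finds.csv and its columns are made up for this example and are not the commented-out code from catalogue-commands.tex. The sketch also assumes that csvsimple reads a trailing empty cell as an empty entry.

```latex
\documentclass{article}
\usepackage{csvsimple}
\usepackage{ifthen}

% Hypothetical example data: the second entry has no dating.
\begin{filecontents*}{finds.csv}
catno,object,dating
1,Bowl,Iron Age
2,Fibula,
\end{filecontents*}

% Hypothetical helper: #1 = label, #2 = cell content.
% Prints "label: content" only if the content is not empty.
\newcommand{\printfield}[2]{%
  \ifthenelse{\equal{#2}{}}{}{\noindent #1: #2\par}%
}

\begin{document}
\csvreader{finds.csv}%
  {catno=\catno, object=\findobject, dating=\finddating}{%
  \noindent\textbf{\catno}\par
  \printfield{Object}{\findobject}%
  \printfield{Dating}{\finddating}%
  \medskip
}
\end{document}
```

With this, the second entry should simply come out without a “Dating:” line instead of showing a label with nothing behind it.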

So, all in all, this solution is super clean and simple. It took us a long time and a lot of testing and tweaking to arrive here but overall, I think something like this will be a great option for all those who aren’t intimately familiar with LaTeX. It only has one custom environment and one command to read and process the lines of the csv. Anybody using this template would only have to find out how to adapt those commands to their own needs (or get help – which shouldn’t take long).

There might be some troubleshooting required with the csv (see the comment on accidental commas above). Even in putting together this super short example template, I have experienced enough csv troubleshooting to remember that this really was a thing 😉 But overall, I think it’s manageable and gives you pretty great output, especially for data which is tabular in nature and images which tend to consistently have similar sizes and shapes. (Oh, and of course, this template only takes one image per entry as well but could be adapted to your needs.)

So far, you can get the template here on Github. I will also share it as a template on Overleaf and add the link once it’s available. (EDIT: Already here!)

EDIT: In response to this post here, Sophie Schmidt from the archaeoinformatics community came up with a post on why you should use LaTeX for archaeology.

Conclusion

So I hope this is useful for someone and/or motivates archaeologists to write their PhD theses and other publications using LaTeX. Have you ever tried typesetting an archaeological catalogue in LaTeX? What were your experiences? Do you have any suggestions how we could improve this template further?

The content of this blog post was possible only because the person in need of the archaeological catalogue discussed in this post reached out to me via email (see “About” section for the email address). So if you need anything or want to discuss things, please feel free to reach out to me. I also have a Patreon now where you can book some consulting. Please let me know if you have any questions.

Best,

the Ninja
