This MSc Computer Science project was born out of a desire to explore how literature can offer interesting datasets for computational analysis.

The project investigates whether the poem can be represented visually, with a particular interest in the use of colour, so that the final patchwork for each cantica gives some insight into the similarities and differences between the three parts of the work.

The final output is rendered in a web browser, and the user interface (UI) allows for interaction with the data: it is possible to filter by cantica and canto, and by elements of equal length, measured as the number of characters in a rhyme, a line or a tercet.
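As an illustration of the filtering described above, the following is a minimal sketch rather than the project's actual UI code. The `Line` record, the `filter_lines` function and the sample data are hypothetical, and length is assumed to mean the raw character count of the text, including spaces and punctuation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Line:
    """One line of the poem (hypothetical record layout)."""
    cantica: str   # "Inferno", "Purgatorio" or "Paradiso"
    canto: int     # canto number within the cantica
    number: int    # line number within the canto
    text: str


def filter_lines(
    lines: Iterable[Line],
    cantica: Optional[str] = None,
    canto: Optional[int] = None,
    length: Optional[int] = None,
) -> List[Line]:
    """Keep the lines matching every criterion that is not None.

    Assumption: `length` is compared against len(text), so spaces and
    punctuation are counted as characters.
    """
    return [
        line
        for line in lines
        if (cantica is None or line.cantica == cantica)
        and (canto is None or line.canto == canto)
        and (length is None or len(line.text) == length)
    ]


# Opening lines of each cantica, used only as sample data.
SAMPLE = [
    Line("Inferno", 1, 1, "Nel mezzo del cammin di nostra vita"),
    Line("Inferno", 1, 2, "mi ritrovai per una selva oscura,"),
    Line("Purgatorio", 1, 1, "Per correr miglior acque alza le vele"),
    Line("Paradiso", 1, 1, "La gloria di colui che tutto move"),
]

inferno_canto_1 = filter_lines(SAMPLE, cantica="Inferno", canto=1)
same_length_as_first = filter_lines(SAMPLE, length=len(SAMPLE[0].text))
```

The same predicate could be applied to rhymes or tercets by swapping the record type; only the unit whose characters are counted changes.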

To address the problem described, this study applies computational methods and calculations to the selected material. The goal is to add information, to understand the choices made by the original author, and to discover patterns that would be very difficult to see without automation.

From a computational point of view, the analysis of a body of text such as the Divine Comedy is of particular interest because many of its aspects relate to mathematics, as is often the case with poetry and music, given their rules and rhythmic features.

The structure of the poem is quite strict. The number of cantos is known, and the rhyme scheme follows conventions that allow for some automation in extracting the data. Word frequencies and repetitions can be counted, and sentiment analysis offers a way into some level of semantic comprehension.
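The regularity of the rhyme scheme can be illustrated with a minimal sketch. It assumes the interlocking terza rima pattern (ABA BCB CDC ..., closed by a single line), under which the rhyme group of a line follows from its position alone. The function names `rhyme_group`, `group_rhymes` and `word_frequencies` are hypothetical and are not taken from the project code.

```python
import re
from collections import Counter
from typing import Dict, List


def rhyme_group(index: int) -> int:
    """Rhyme group of a line from its 0-based position in the canto.

    Assumes terza rima: the middle line of tercet t introduces the
    rhyme used by the outer lines of tercet t + 1.
    """
    tercet, position = divmod(index, 3)
    return tercet + 1 if position == 1 else tercet


def words(line: str) -> List[str]:
    """Lower-cased words of a line, with punctuation dropped."""
    return re.findall(r"\w+", line.lower())


def group_rhymes(lines: List[str]) -> Dict[int, List[str]]:
    """Collect the final word of each line under its rhyme group."""
    groups: Dict[int, List[str]] = {}
    for index, line in enumerate(lines):
        groups.setdefault(rhyme_group(index), []).append(words(line)[-1])
    return groups


def word_frequencies(lines: List[str]) -> Counter:
    """Count how often each word occurs across the given lines."""
    return Counter(word for line in lines for word in words(line))


# First four lines of Inferno I, used only as sample data.
OPENING = [
    "Nel mezzo del cammin di nostra vita",
    "mi ritrovai per una selva oscura,",
    "ché la diritta via era smarrita.",
    "Ahi quanto a dir qual era è cosa dura",
]

rhymes = group_rhymes(OPENING)
# rhymes == {0: ['vita', 'smarrita'], 1: ['oscura', 'dura']}
frequencies = word_frequencies(OPENING)
```

Because the group index depends only on line position, no phonetic comparison is needed to pair rhyming lines. Checking that the paired endings actually rhyme would be a separate step.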