6/16/2022 - Future Project: Fictional Maps
As I wrap up my current project, I have an idea for my next one. With other commitments this summer, I don't know how soon I'll be able to start, but I'll brainstorm a little in this post.
One of my hobbies (which you may have noticed from this website) is making and looking at maps. So far, I have only dealt with real places, mostly Baltimore. But combined with my interest in literature and data, this got me wondering about fictional maps. Yes, think of The Hobbit. If I take the map from The Lord of the Rings and compare its scale to that of something like, say, Watership Down, will there be a correlation between the scale of a map and the total length of its story? The calculation itself is relatively simple: for each book, we pair its word count with its map scale. With a large enough dataset, we could even compute a correlation coefficient to see whether the trend is consistent.
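To make the word-count-versus-scale comparison concrete, here is a minimal sketch of a Pearson correlation in plain Python. The function name pearson_r and the idea of measuring "scale" as the real-world width a map covers are my own assumptions for illustration, and the numbers at the bottom are made-up placeholders, not measurements from any real book.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Placeholder rows: (label, word count, map width in km). All values are invented.
books = [
    ("Book A", 60_000, 15),
    ("Book B", 120_000, 300),
    ("Book C", 450_000, 2_000),
]
word_counts = [b[1] for b in books]
map_widths_km = [b[2] for b in books]
print(pearson_r(word_counts, map_widths_km))
```

Since map scales could easily span several orders of magnitude, it might be worth correlating the logarithms of both columns instead, but that is a choice to make once there is real data to look at.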
However, the tricky part is actually acquiring this data in a usable format. If we are limited to the public domain, we can only use older texts, which could limit the amount of data. On the other hand, that could be a good thing: there were fewer texts with maps back then, and therefore less experimentation and fewer outliers. Still, I think this would be much more interesting if we involved copyrighted works, such as Black Leopard, Red Wolf or the many other fantasy novels that have popped up in recent years. As previous posts on copyright have discussed, this would make the project somewhat tricky to organize. I would probably share some outputs from the public domain texts, but otherwise not release any unprocessed data from the copyrighted books. Beyond the copyright question, getting ahold of usable digital copies of these texts would also be tricky. I would need files from which I can pull both the maps (as images or PDFs) and the plain text of the body. Most likely, this would require a fair amount of manual effort for each novel I include, plus some decisions made up front about series and how to count them.
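Once the plain text is in hand, the word count itself is the easy part. Below is a rough sketch, assuming the body text has already been extracted into a Python string. The helper names count_words and series_word_count are hypothetical, and summing every volume of a series that shares one map is just one possible convention, not a settled decision.

```python
import re

# A "word" here is a run of letters/digits, optionally joined by
# apostrophes or hyphens (so "didn't" and "well-prepared" count once each).
WORD_PATTERN = re.compile(r"[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*")


def count_words(text):
    """Count words in a plain-text string."""
    return len(WORD_PATTERN.findall(text))


def series_word_count(volume_texts):
    """Total word count across the volumes of a series that share one map."""
    return sum(count_words(text) for text in volume_texts)
```

Front matter, appendices, and chapter headings would inflate the count, so whatever gets fed in should probably be trimmed to the story itself first.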
Of course, there are other interesting trends we could look for in such an investigation: Is there a relationship between maps or map scales and the amount of 'spatial' language used? How does the number of locations correlate with the number of characters cast in that setting?
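As a first pass at the 'spatial' language question, one could simply count how often words from a fixed list appear per thousand words. The sketch below does that. The word list is a small starter set I made up for illustration, and a real version would need a more carefully chosen vocabulary and probably some handling of words like "over" that are only sometimes spatial.

```python
import re

# Hypothetical starter vocabulary; not a vetted list of spatial terms.
SPATIAL_TERMS = frozenset({
    "north", "south", "east", "west",
    "above", "below", "beyond", "across",
    "near", "far", "toward", "towards",
})


def spatial_rate(text, terms=SPATIAL_TERMS):
    """Occurrences of spatial terms per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in terms)
    return 1000 * hits / len(words)
```

Normalising per thousand words keeps a long novel from looking more 'spatial' than a short one purely because of its length.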