6/16/2022 - Future Project: Fictional Maps
As I wrap up my current project, I have an idea for my next one. With other commitments this summer, I don't
know exactly how soon I'll be able to start it, but I'll brainstorm it a little bit in this post.
One of my hobbies (which you may have noticed from this website) is making and looking at maps. So far, I have only
dealt with real places, mostly Baltimore. But in combination with my interest in literature and data, I began to wonder about
fictional maps. Yes, think of The Hobbit. If I take the map from Lord of the Rings and compare its scale to that of
something like, say, Watership Down, will there be a correlation between the scale of the maps and the total length of the
story? This is a relatively simple calculation: we just need to compare word count to map scale. With a large enough dataset,
we could even compute a correlation coefficient to see if this is a consistent trend.
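As a rough sketch of that calculation, here is how the correlation could be computed in Python. Everything specific in it is an assumption for illustration: the rows are placeholder values (not measurements from any real book), "map scale" is taken to mean the real-world width the map covers in kilometers, and comparing on a log scale is just one reasonable choice since both quantities can span orders of magnitude.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Placeholder rows, NOT real measurements: (title, word_count, map_width_km).
# "map_width_km" is a hypothetical way of expressing map scale.
books = [
    ("Book A", 60_000, 50),
    ("Book B", 95_000, 400),
    ("Book C", 180_000, 1_500),
    ("Book D", 450_000, 3_000),
]

# Assumption: compare on a log scale so one huge epic does not dominate.
log_words = [math.log10(words) for _, words, _ in books]
log_width = [math.log10(width) for _, _, width in books]

r = pearson(log_words, log_width)
print(f"Pearson r (log-log) = {r:.3f}")
```

A value of r near 1 would suggest that longer stories tend to come with maps covering more ground, though with only a handful of books the number would not mean much.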
However, the tricky part is actually acquiring this data in a usable format. If we limit ourselves to the public domain, we are
only using older texts, which could restrict the amount of data. On the other hand, that could be
good, since fewer texts had maps back then and there would therefore be less experimentation and fewer outliers. However, I think this would
be much more interesting if we involved copyrighted works, such as Black Leopard, Red Wolf or the many other fantasy novels
that have popped up in recent years. As we have seen in previous posts about copyright, this would make the project somewhat
tricky to organize. I would probably share some outputs from the public domain texts, but otherwise not release any unprocessed
data about the copyrighted books. Copyright aside, getting hold of usable digital copies of these texts would also be tricky. I would
need files that I can pull the maps from (in either image or PDF format) as well as the plain text of the body of each
book. Most likely, this would require a fair amount of manual effort for each novel that I include, along with some decisions
made in advance about series and how to count them.
Of course, there are other interesting trends we could look for in such an investigation:
Is there a relationship between maps or map scales and the amount of 'spatial' language used?
How does the number of locations correlate with the number of characters cast in that setting?
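
For the 'spatial' language question, one simple starting point would be to count how often words from a spatial word list appear per 1,000 words of text. The sketch below assumes a tiny hand-picked word list and a made-up sample sentence, both purely hypothetical; a real study would need a properly justified lexicon.

```python
import re

# Hypothetical, hand-picked 'spatial' words; not a validated lexicon.
SPATIAL_WORDS = {
    "north", "south", "east", "west", "river",
    "road", "mountain", "miles", "across", "beyond",
}


def spatial_rate(text, lexicon=SPATIAL_WORDS):
    """Return the number of lexicon words per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in lexicon)
    return 1000 * hits / len(words)


# Made-up sentence for illustration, not a quote from any book.
sample = "They walked north along the river road."
print(f"{spatial_rate(sample):.1f} spatial words per 1,000 words")
```

The resulting rate for each book could then be compared against its map scale in the same way as word count.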