Welcome back to Seen Elsewhere, my round-up of good stuff I’ve seen in R and data journalism this week.
There are three links here for you this week:
- An introduction to tidy text used on the novels of Jane Austen
- An ebook on text analysis in R
- This amazing map of shipping patterns
Tidytext introduction
I’ve been doing some research into text analysis in R this week and it’s something I hope to be able to start doing myself in the near future. This introduction by Julia Silge and David Robinson is a good starting point. The analysis they ran sorts words into those with positive connotations and those with negative ones. It’s not a perfect system (what happens if you say something is ‘not good’, for instance?), since each word is judged on its own, without the words around it.
But it can produce some very informative results (this was from following their code):
This would probably mean more to me if I’d read any Jane Austen novels…
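To make the ‘not good’ problem concrete, here is a minimal sketch of the word-by-word idea in base R. It is not Julia Silge and David Robinson’s code: the four-word lexicon and the `score_sentiment()` helper are made up for illustration, and a real sentiment lexicon holds thousands of words.

```r
# Hypothetical mini-lexicon, invented for this example only.
lexicon <- c(good = "positive", happy = "positive",
             bad = "negative", miserable = "negative")

# Split the text into lower-case words, look each one up in the lexicon,
# and count how many matched words fall under each sentiment.
score_sentiment <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words <- words[words != ""]
  matched <- lexicon[words[words %in% names(lexicon)]]
  table(factor(matched, levels = c("negative", "positive")))
}

score_sentiment("The ball was good and everyone was happy")
score_sentiment("The ball was not good")
```

The second sentence still scores one positive word, because ‘not’ isn’t in the lexicon and each word is looked up in isolation. That’s exactly the weakness mentioned above.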
Text analysis ebook
Sticking with these two authors, they have also published a whole ebook on tidy text, which I’m really looking forward to reading in full.
The world’s shipping routes visualised
This is a truly amazing visualisation by Kiln using hundreds of millions of data points to show the movement of cargo ships around the world. You can zoom in on bottlenecks such as the English Channel, the Panama Canal and the North American Great Lakes. It comes with a great explainer video to help make sense of all the data. They have also very kindly made it embeddable. I don’t know whether this was done using R, but it’s so good that I’ll make an exception if it wasn’t…