The IHR Blog

Text Mining the Old Bailey Proceedings

Digital History
14 June 2011
Professor Tim Hitchcock (Hertfordshire)
http://commons.wikimedia.org/wiki/File:Old_Bailey_Microcosm_edited.jpg

"The Old Bailey, known also as the Central Criminal Court"

The Old Bailey Online is probably one of the most successful web-based projects produced in Britain thus far. Based on the proceedings of London’s central criminal court, it is a fully searchable edition containing some 197,745 criminal trials detailing the lives of non-elite people. One of the originators of the project, Tim Hitchcock, is looking at how to use text mining tools to examine the proceedings and discover new things about them. Text mining is the derivation of meaningful data from a large body of unstructured data, using automated methods to reveal structure and associations. Through text mining Hitchcock is able to compare patterns of prosecution over time and further examine changes in court behaviour and procedure.

Did you know, for instance, that the shortest trial in the Old Bailey Proceedings is just eight words in length whilst the longest is 320 pages and over 150,000 words? Hitchcock believes that previous attempts to average trial lengths per year to show trends disguise the mix of long and short trials contained within each year, and also the fact that the accounts are not entirely complete, as some trials are purposefully reduced in length for very interesting reasons. Through text mining Hitchcock shows that changes in the nature of the jury trial (and in which trials would reach a jury) are vital to understanding the trends, especially when looking also at the number of not-guilty versus guilty pleas and verdicts. Hitchcock argues that plea-bargaining became increasingly important.
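To make the point about averages concrete, here is a minimal Python sketch. It is not Hitchcock's method, and the sample data is invented purely for illustration (these are not real Old Bailey records); the function names `word_count` and `summarise_by_year` are likewise hypothetical. It counts the words in each trial text and then compares the yearly mean with the median and the extremes.

```python
from collections import defaultdict
from statistics import mean, median


def word_count(text):
    """Count whitespace-separated words in a trial text."""
    return len(text.split())


def summarise_by_year(trials):
    """Group trials by year and summarise their lengths in words.

    `trials` is an iterable of (year, text) pairs.
    """
    lengths = defaultdict(list)
    for year, text in trials:
        lengths[year].append(word_count(text))
    return {
        year: {
            "trials": len(counts),
            "mean": mean(counts),
            "median": median(counts),
            "shortest": min(counts),
            "longest": max(counts),
        }
        for year, counts in sorted(lengths.items())
    }


# Invented sample data (NOT real records): one year with three very short
# trials and one very long one, and another year with four mid-length trials.
sample_trials = [
    (1750, "word " * 8),
    (1750, "word " * 10),
    (1750, "word " * 12),
    (1750, "word " * 5000),
    (1751, "word " * 1200),
    (1751, "word " * 1250),
    (1751, "word " * 1300),
    (1751, "word " * 1280),
]

summary = summarise_by_year(sample_trials)
for year, stats in summary.items():
    print(year, stats)
```

In this made-up sample both years have exactly the same mean trial length, yet the medians and the shortest and longest trials differ enormously. That is the kind of mix a single yearly average hides.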

At the heart of Hitchcock’s paper is an argument that data/text mining represents the beginning of a new methodology for historians studying data and that we are very much at the beginning of an exciting process of using digital tools for new historical research.  All we have to do is rise to the challenge.

For more details about text mining, have a look at one of the IHR’s other projects, Historie, where we will be presenting case studies and training modules on various digital tools, including text mining.

To watch/listen to this podcast click here.

Comments

  1. rechtsgeschiedenis

    I looked here in vain for the link to the website with the Old Bailey Proceedings, http://www.oldbaileyonline.org/. My attention was also attracted by a small typing error in the text and the tags to this post. Surely this is a post about criminal trials, not about criminal trails. However, the Old Bailey Proceedings do certainly put you on the trails of criminal offenders!

    1. Matt Phillpott

      Thanks for the corrections. Not sure how ‘trial’ became ‘trail’ twice! I’ll put it down to a lack of sleep. Link and typos should now be corrected.
