The IHR Blog

Research Training


Managing your Data for Historians – new AHRC-funded project History DMT

Historians don’t often like to think about data management. Indeed, it is almost considered an ugly word or a taboo. Data management gets in the way of the interesting stuff: the research, the learning. Nevertheless, it is vital to the work that we do. History is data; it is the essence of the subject. Yet it is so easy to leave your folder system in a complete mess, or not to consider issues of preservation or back-up until it becomes necessary (or until your hard drive dies on you!). The material that you produce now for current use makes sense to you today, but will it still make sense six months or a year down the line? Perhaps not so much.

It is for this reason that the Institute of Historical Research in partnership with the Department of History at the University of Hull and Sheffield, as well as the Humanities Research Institute (Sheffield), applied to the AHRC Collaborative Skills Development strand late last year, to undertake a project called History DMT.  The bid was successful and work began in February.

History DMT stands for Data Management Training and Guidance.  We seek to integrate best practice, good principles, and skills of research data management within the postgraduate curriculum and among early career historians through a series of specialist workshops at London, Hull, and Sheffield and through the development of a free online training course dedicated to the research data types that historians are most likely to come across in their research.

Various pathways will enable a hands-on approach to research data management that covers the many types of data which historians generate, as well as the means with which to share that data. These will cover:

  • Textual materials
  • Visual sources
  • Oral History
  • Statistical data

Over the coming months the History SPOT blog will contain various posts about this project as it progresses, so please keep an eye out.

Further Information

This is an AHRC-funded project, as part of the Collaborative Skills Development strand. History DMT is led by the Institute of Historical Research in collaboration with the Department of History, University of Hull and the Humanities Research Institute, University of Sheffield. The principal grant holder is Professor Matthew Davies (IHR), with Dr Matt Phillpott (IHR) acting as project manager. Chris Awre (Head of Information Management within Library and Learning Innovation, University of Hull) and John Nicholls (Hull) will lead at the University of Hull, while Michael Pidd (HRI Manager, University of Sheffield) and Sharon Howard (HRI, University of Sheffield) will lead at the University of Sheffield.

Text Mining for Historians: Natural Language Processing

The Institute of Historical Research now offer a wide selection of digital research training packages designed for historians and made available online on History SPOT. Most of these have received mention on this blog from time to time and hopefully some of you will have had a good look at them. These courses are freely available and we only ask that you register for History SPOT to access them (which is a free and easy process). Full details of our online and face-to-face courses can also be found on the IHR website. Here is a brief look at one of them.

When the Institute of Historical Research began building research training modules online, we decided fairly early on that they needed to be much more than just text. In the Text Mining for Historians module we included various videos to help learners improve their knowledge of the subject. One of these was a very simple introduction to natural language processing.

This video, available on the course and on Vimeo, is very short and discusses natural language processing (or NLP for short) in very basic terms. This is intentional, as the rest of this section of the module looks at the subject in much more detail.

[vimeo http://www.vimeo.com/47013183 w=500&h=281]

What is Natural Language Processing? from History SPOT on Vimeo.

If you would like to have a look at this module, please register for History SPOT for free and follow the instructions (http://historyspot.org.uk). If you would like further information about this course, and the others that the IHR offer, please have a look at our Research Training pages on the IHR website.

Designing Databases for Historical Research

A sample page from the Databases course

The Institute of Historical Research now offer a wide selection of digital research training packages designed for historians and made available online on History SPOT. Most of these have received mention on this blog from time to time and hopefully some of you will have had a good look at them. These courses are freely available and we only ask that you register for History SPOT to access them (which is a free and easy process). Full details of our online and face-to-face courses can also be found on the IHR website. Here is a brief look at one of them.

Designing Databases for Historical Research was one of two modules that we launched alongside History SPOT late in 2011. Unlike most courses on databases, which are generic in scope, this module focuses very much on the historian and his/her needs. The module is written in a handbook format by Dr Mark Merry. Mark runs our face-to-face databases course and is very much the man to go to for advice on building databases to house historical data.

The module looks at the theory behind using databases rather than showing you how to build them. It is very much a starting point, a place to go to before committing the considerable time that databases require of their creators. Is your historical data appropriate for database use or should a different piece of software be used? What things should you consider before starting the database? Getting it right from the very beginning does save you a lot of time and frustration later on.

If you need more convincing then here is a snippet from the module, where Mark discusses the importance of thinking about the data and database before you even open up the software.

 ***

The very first step in the formal process for designing a database is to decide what purpose(s) the database is to serve. This is something that is perhaps not as obvious or as straightforward as one might expect, given that databases in the abstract can indeed serve one or more of a number of different kinds of function. In essence, however, there are three types of function that the historian is likely to be interested in:

  • Data management
  • Record linkage
  • Pattern elucidation/aggregate analysis

 

Each of these functions is a goal that can be achieved through shaping of the database in the design process, and each will require some elements of the database design to be conducted in specific ways, although they are by no means mutually exclusive. And this latter point is an important one, given that most historians will want to have access to the full range of functionality offered by the database, and will likely engage in research that will require all three of the listed types of activity. Or, to put it another way, many historians are unlikely to know precisely what it is they want to do with their database at the very beginning of the design process, which is when these decisions should be taken. This is why, as we shall see later in this section, many historians are inclined to design databases which maximise flexibility in what they can use them for later on in the project (a goal which will come at the price of design simplicity).

The data management aspect of the database is in many cases almost a by-product of how the database works, and yet it is also one of its most powerful and useful functions. Simply being able to hold vast quantities of information from different sources as data all in one place, in a form that makes it possible to find any given piece of information and see it in relation to other pieces of information, is a very important tool for the historian. Many historians use a database for bibliographical organisation, allowing them to connect notes from secondary reading to information taken from primary sources and being able to trace either back to its source. The simpler tools of database software can be used to find information quickly and easily, making the database a robust mechanism for holding information for retrieval.

 ***
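To make the three functions in the excerpt above a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from the module, which discusses design rather than any particular software. The table names, column names, and records are all invented for illustration, and linking records by an exact match on a person's name is a deliberate simplification of what record linkage involves in practice.

```python
import sqlite3

# In-memory database. Every table, column, and record below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    kind  TEXT NOT NULL CHECK (kind IN ('primary', 'secondary'))
);
CREATE TABLE note (
    id        INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES source(id),
    person    TEXT,
    content   TEXT NOT NULL
);
""")

conn.executemany(
    "INSERT INTO source (id, title, kind) VALUES (?, ?, ?)",
    [
        (1, "Parish register (invented example)", "primary"),
        (2, "Guild accounts (invented example)", "primary"),
        (3, "Modern monograph (invented example)", "secondary"),
    ],
)
conn.executemany(
    "INSERT INTO note (source_id, person, content) VALUES (?, ?, ?)",
    [
        (1, "John Smith", "Baptism recorded"),
        (2, "John Smith", "Admitted as apprentice"),
        (3, "John Smith", "Discussed as a typical apprentice"),
        (2, "Mary Brown", "Paid quarterage"),
    ],
)


def notes_for(person):
    """Data management and (naive) record linkage:
    every note about a person, traced back to its source."""
    rows = conn.execute(
        "SELECT s.title, s.kind, n.content "
        "FROM note n JOIN source s ON s.id = n.source_id "
        "WHERE n.person = ? ORDER BY s.id",
        (person,),
    )
    return rows.fetchall()


def notes_per_kind():
    """Aggregate analysis: how many notes come from each kind of source."""
    rows = conn.execute(
        "SELECT s.kind, COUNT(*) "
        "FROM note n JOIN source s ON s.id = n.source_id "
        "GROUP BY s.kind ORDER BY s.kind"
    )
    return dict(rows.fetchall())


for title, kind, content in notes_for("John Smith"):
    print(f"{kind:9} | {title} | {content}")
print(notes_per_kind())
```

Keeping sources and notes in separate, related tables is what allows a note taken from secondary reading to sit alongside information from primary sources while each can still be traced back to where it came from.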

Unlike the other courses on History SPOT, this particular module also doubles as the unofficial first part of a much more comprehensive training course, Building and Using Databases for Historians, which we have made available online. This larger course is not free but is well worth the price and effort. By the end of that course you should be ready to use databases to analyse almost any kind of historical data that you might wish to work with. There is more information on that course on the module pages and also on the IHR website (as listed below).

If you would like to have a look at this module, please register for History SPOT for free and follow the instructions (http://historyspot.org.uk). If you would like further information about this course, and the others that the IHR offer, please have a look at our Research Training pages on the IHR website.

Text mining for Historians

Example page from the Text Mining course

The Institute of Historical Research now offer a wide selection of digital research training packages designed for historians and made available online on History SPOT. Most of these have received mention on this blog from time to time and hopefully some of you will have had a good look at them. These courses are freely available and we only ask that you register for History SPOT to access them (which is a free and easy process). Full details of our online and face-to-face courses can also be found on the IHR website.

I thought that it might be useful to talk a little more about these courses on the blog and provide a brief sample. Over the coming months I will post a series of blog posts about each of our training courses, and give you a little sneak peek so that you have a better idea of what to expect.

I have chosen the Text Mining module as the first, for several reasons.  First, because it is probably the one that exemplifies what we are trying to do the best.  That is, to make digital tools accessible to historians through a series of introductory training courses.  The Text Mining for Historians module does just this, beginning from the very simple and slowly moving forward toward the more complex.

Text mining is not a tool in itself, but a series of tools that enable us to explore, interrogate, and analyse large bodies of text. Imagine, if you will, that you have gathered together a corpus of text: perhaps it’s a diary or series of diaries from a particular period, perhaps it’s a series of publications on a particular subject, or maybe it’s a set of official records spanning many decades or even centuries. Normally you would wade through these documents one at a time and take notes. Text mining allows you to automate certain elements of this task and helps you to discover trends and connections that you might never find by reading the texts through traditional methods.

This training module takes you from the theory (i.e. what is text mining all about) through to its application for historical texts, and eventually on to the more complex areas of what is called topic modelling, natural language processing, and named entity recognition.  In this post I’m going to quote from the opening section of this course as it gives a description of what historians might consider a good use for text mining.  In this example we are looking at the Old Bailey Trial accounts used on the popular Old Bailey Proceedings Online website:

 ****

Would you like to know how often the word ‘guilty’ appears in the Old Bailey trial accounts? The answer is findable using a standard search engine on the Old Bailey Online website (it’s 182612). How about how many people were found guilty? The answer is 163261. What about the number of defendants found guilty of murder? The answer is 1518. These last two figures are not possible to find through the standard search engine as they are an entirely different type of question; we are not looking for how many times the word ‘guilty’ appears in the proceedings but how many trials resulted in a guilty verdict. We want to discover something meaningful within the body of texts, automatically rather than manually checking each and every trial account.

This is a relatively simple example of text mining where the original documents have been marked up and tagged by surname, given name, alias, offence, verdict, and punishment. To calculate those results manually you would have to work your way through 197,745 criminal trial accounts (some 127 million words in total).

This form of text mining, however, is little more than an advanced search engine – useful but limited. As the creators of the Old Bailey Online themselves admit (and have attempted to redress in a subsequent project):

‘Analyzing this kind of data by decade, or trial type, or defendant gender etc., can re-enforce the categories, the assumptions, and the prejudices the user brings to each search and those applied by the team that provided the XML markup when the digital archive was first created’.

Dan Cohen et al, ‘Data Mining with Criminal Intent’, Final White Paper (31 August 2011), p. 12.

In other words the search options and text tagging were emphasising and reinforcing a pre-determined expectation of what the resource creators believed was the important data. Text mining tools can help to explore alternative questions more openly.

The Data Mining with Criminal Intent (DMCI) project has done just this by enabling researchers not only to query the Old Bailey site but also to export those results to a Zotero library to be managed, and from there to Voyeur and other text mining tools for text analysis and visualisation.

The team behind the project uses the example of an investigator trying to understand the role poison might have had in murder cases. Using the search engine brings up 448 entries for ‘poison’ but doesn’t tell us much about what this means. Using Zotero and Voyeur it is possible to filter out the stop words and legal terminology common in all entries to find out what other words commonly appear near to the word ‘poison’. Through this method of text mining it was possible to conclude that poison was probably more commonly administered through drinks such as coffee than through food (see pp. 6-7 of the white paper report ‘Data Mining with Criminal Intent’).

****
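For readers who like to see the idea in practice, the two kinds of question in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four "trials" are invented, the XML element and attribute names are hypothetical and are not the Old Bailey Online's actual markup, and the stop-word and legal-term lists are tiny stand-ins for the much fuller lists that real tools use. It does not reproduce how Zotero or Voyeur work internally.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Invented miniature corpus. The <trial> element and its attributes are
# hypothetical, NOT the real Old Bailey Online markup scheme.
TRIALS_XML = """
<trials>
  <trial offence="murder" verdict="guilty">He was found guilty of putting poison in her coffee.</trial>
  <trial offence="theft" verdict="notGuilty">The jury did not think him guilty of the theft.</trial>
  <trial offence="murder" verdict="notGuilty">No poison was found in the coffee, and he was not found guilty.</trial>
  <trial offence="theft" verdict="guilty">She stole a silver spoon from the shop.</trial>
</trials>
"""

# Tiny illustrative lists; real stop-word lists are far longer.
STOP_WORDS = {"the", "a", "of", "in", "and", "was", "he", "she", "her",
              "him", "did", "not", "no", "from"}
LEGAL_TERMS = {"found", "guilty", "jury", "verdict"}

trials = ET.fromstring(TRIALS_XML).findall("trial")


def tokens(text):
    """Lower-case a text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def word_count(word):
    """Search-engine style question: how often does a word appear?"""
    return sum(tokens(t.text).count(word) for t in trials)


def verdict_count(verdict, offence=None):
    """Markup-based question: how many trials carry a given verdict tag
    (optionally restricted to one offence)?"""
    return sum(
        1 for t in trials
        if t.get("verdict") == verdict
        and (offence is None or t.get("offence") == offence)
    )


def collocates(word, window=3):
    """Count words appearing within `window` words of `word`, measured
    after stop words and legal terms have been filtered out."""
    counts = Counter()
    for t in trials:
        words = [w for w in tokens(t.text)
                 if w not in STOP_WORDS and w not in LEGAL_TERMS]
        for i, w in enumerate(words):
            if w == word:
                before = words[max(0, i - window):i]
                after = words[i + 1:i + 1 + window]
                counts.update(before + after)
    return counts


print(word_count("guilty"))               # occurrences of the word
print(verdict_count("guilty"))            # trials tagged with a guilty verdict
print(verdict_count("guilty", "murder"))  # guilty verdicts for murder
print(collocates("poison").most_common(3))
```

The first function answers a different question from the second: a trial can mention the word ‘guilty’ without ending in a guilty verdict, which is why the tagged markup matters. The third function is a very rough stand-in for the kind of "words near ‘poison’" exploration described in the white paper.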

If you would like to have a look at this module, please register for History SPOT for free and follow the instructions (http://historyspot.org.uk). If you would like further information about this course, and the others that the IHR offer, please have a look at our Research Training pages on the IHR website.

Building and Using Databases for Historical Research – now live!

Today we are presenting the second of our recent additions to online training.

The Institute of Historical Research are very pleased to announce the launch of our first extensive and comprehensive online training course: Building and Using Databases for Historical Research.  The online course covers the entire life-cycle of creating and using a relational database and can be undertaken at any time and completed at your own pace.

Depending on the type of data that you are using to carry out historical research, databases – such as Microsoft Access – can be an essential tool for the historian.  However, few courses teach databases with historical data in mind or the needs of the historian.  We believe that this course can fill that gap.  The IHR have been running face-to-face training in Databases for a very long time, so the expansion to also provide the course online was an obvious choice for us.

Module 1 from the Databases course

 

Here is the information that we have on our website:

The aim of this training course is to equip you with the skills required to build and utilise a relational database suited to historical research. It is a non-tutor led course that can be completed at your own pace and at a time of your own choosing.

This course is a continuation of the free online course Designing databases for historical research handbook, which provides a free introduction for historians who wish to create databases. Building and Using Databases for Historical Research takes you through the entire process of creating and using databases and is, therefore, a much larger and more comprehensive course. As such, we recommend that you work your way through the Designing databases for historical research handbook before embarking on this course.

When you register for this course you will work through three modules that look at the following aspects of building and using databases:

Module 1 introduces the tools and techniques used in building a database for historical research. It covers the process of constructing related tables to accommodate your data, as well as introducing a number of practical measures that you can employ to control the quality of the data that you create. The Module also addresses what you need to do to incorporate existing data into a newly-constructed database.

 

Sample page from the Databases course

Module 2 introduces the numerous ways that database tools can help you ask research questions of your data, ranging from simply finding individual instances of information at the micro level, through to providing complex networking and record linkage overviews. This Module also provides a basic introduction to employing queries to highlight statistical patterns in large bodies of data through aggregation tools.

 

Module 3 addresses two main aspects of using a database in a historical research project: ‘managing’ the database and generating research output. The former element introduces various methods for ensuring good practice in terms of file and version control, back up and documentation – all important aspects of making sure the database is useful to your research; whilst the latter looks at ways of extracting data in various formats (including visual) to share with other historians.

 

The course costs £99, which includes access to the online materials, discussion forums and example data for a four-month period. The course ends with a final exercise where you can test the knowledge that you have gained and receive some feedback.

For further information check out the IHR research training pages or have a look at the Designing Databases for Historical Research Handbook which contains more information on the course as well.

 

InScribe goes live


After a period of tests, the introductory module of the new online course on Palaeography and Manuscript Studies is now available. InScribe provides a set of materials suitable both for someone interested in exploring Palaeography for the first time and for those in need of a refresher. Graduate students, academics and members of the general public undertaking this introductory module will become familiar with the most important writing styles (scripts) of the medieval period with particular reference to the English context; they will be able to explore a number of newly digitised manuscripts; and they will acquire some transcription practice.

Screenshot from the InScribe course

The module includes short videos with experts in the field discussing relevant topics. Moreover, transcription can be practised in the new Transcription Tool developed in collaboration with KCL.

 

Screenshot of the Transcription Tool

Later in the year, we will release new modules that will provide advanced online training on Diplomatic, Script and Translation, Codicology and Illumination. The introductory module is free of charge.

To try InScribe click here. Notice that you will need to register (for free) to gain access to the module.

 

Our Inscribe online palaeography tutorial goes live

After a period of testing, the introductory module of the new free online course on Palaeography and Manuscript Studies is now available.

InScribe provides a set of materials suitable both for someone interested in exploring Palaeography for the first time and for those in need of a refresher. Graduate students, academics and members of the general public undertaking this introductory module will become familiar with the most important writing styles (scripts) of the medieval period with particular reference to the English context; they will be able to explore a number of newly digitised manuscripts; and they will acquire some transcription practice.

The module includes short videos with experts in the field discussing relevant topics. Moreover, transcription can be practised in the new Transcription Tool developed in collaboration with King’s College London.

Later in the year, we will release new modules that will provide advanced online training on Diplomatic, Script and Translation, Codicology and Illumination. The introductory module is free of charge.
Try InScribe now. Notice that you will need to register (for free) to gain access to the module.

InScribe Palaeography Learning materials

In this blog post I would like to introduce you to our latest research training module on History SPOT. InScribe is an online course for the study of Palaeography and Manuscript Studies developed by several of the institutes within the School of Advanced Study (including the Institute of Historical Research and Institute of English Studies), with support from Senate House Library and Exeter Cathedral Library.

At present we have only released the ‘introductory’ module in a test mode, and we would very much welcome any feedback on how we could improve it.  This module describes what the course is about, gives an entry point into palaeographical conventions and processes, and gives you the chance to transcribe text from a selection of actual manuscripts (well, digital scans from those manuscripts at least).  More modules will follow sometime in the new year offering various pathways on subjects such as codicology, illumination, and diplomatic.

The view has long been held at the IHR that palaeography is one subject that translates well into the online format. Although we would hesitate to suggest it in any way as a replacement for skills learnt in a classroom (or even better with actual copies of the MSS themselves), we believe that learning and practising palaeographical skills online works well if the tools are in place to aid the student.

Example of a page from InScribe

It is hoped that InScribe will increasingly fill this role in the future, providing palaeographical training at a postgraduate level.   At present, however, InScribe is in its infancy.  We have initially launched the first module in a test-mode, by which I simply mean that we will be seeking feedback about what works or doesn’t work, and what we might be able to improve upon.

The Transcription Tool

To have a look at InScribe please log in or register to History SPOT for free and follow this link to the InScribe course.

InScribe: Palaeography Learning materials

Alternatively, for further information about the course look at the research training page on the IHR website.

GIS in the Digital Humanities: A free one day seminar

Lancaster University
Friday 30th November, 2012
Geographical Information Systems (GIS) are increasingly used by historians, archaeologists, literary scholars, classicists and others with an interest in humanities geographies. Take-up has been hampered by a lack of understanding of what GIS is and what it has to offer to these disciplines. This free workshop, sponsored by the European Research Council’s Spatial Humanities: Texts, GIS, Places project and hosted by Lancaster University, will provide a basic introduction to GIS both as an approach to academic study and as a technology. Its key aims are: to establish why the use of GIS is important to the humanities; to stress the key abilities offered by GIS, particularly the capacity to integrate, analyse and visualise a wide range of data from many different types of sources; to show the pitfalls associated with GIS and thus encourage a more informed and subtle understanding of the technology; and to provide a basic overview of GIS software and data.

Timetable:
9:30   Registration
10:00 Welcome and Introductions
10:15 Session 1: Fundamentals of GIS from a humanities perspective.
11:45 Session 2: Case studies of the use of GIS in the humanities.
13:00 Lunch
14:00 Session 3: Getting to grips with GIS software and data.
15:30 Roundtable discussion – going further with GIS.
16:30 Close

Who should come?
The workshop is aimed at a broad audience including postgraduate or masters students, members of academic staff, curriculum and research managers, holders of major grants, and those intending to apply for major grants. Professionals in other relevant sectors interested in finding out about GIS applications are also welcome. This workshop is only intended as an introduction to GIS, so will suit novices or those who want to brush up on previous experience. It does not include any hands-on use of software; this will be covered in later events to be held on 11-12th April and 15-18th July 2013.

How much will it cost?
The workshop is free of charge.  Lunch and refreshments are included. We do not provide accommodation but can recommend convenient hotels and B&Bs if required.

How do I apply?
Places are limited and priority will be given to those who apply early. As part of registering please include a brief description of your research interests and what you think you will gain from the workshop. This should not exceed 200 words.
For more details of this and subsequent events see: http://www.lancaster.ac.uk/spatialhum/training.html. To register please email a booking form (attached or available from the website) to I.Gregory@lancaster.ac.uk, who may also be contacted with informal enquiries.
