This post is from the project’s metadata specialist, Richard Gartner:

A key feature of the Dilipad project is its use of the XML schema PML (Parliamentary Metadata Language) as its core metadata format. PML was devised as part of an earlier project, LIPARM (Linking Parliamentary Records through Metadata), which constructed it as an interoperable format for components of the Parliamentary record. In that project, it was designed primarily as a discovery tool, allowing the construction of union databases of Parliamentary data; in the Dilipad project, we are extending the use we make of it to allow detailed, machine-actionable analyses of the content of this data.

PML was designed to record all the important parts of the Parliamentary record: the people in it, the roles they filled, their groupings within Parliament (party or otherwise) and, above all, their contributions to proceedings (mainly, though not exclusively, speeches). It then allows these components to be joined together to record relationships between them: a speech, for instance, is linked to information on the speaker, to the sitting in which it takes place, and to the Acts or Bills which result from the proceedings in which it is made.

The ‘glue’ that enables these components to be linked together, and to other instances of the same component in other PML documents, is the use of URIs (Uniform Resource Identifiers) to specify what they are and what type of component they belong to. A record of a speech in a debate, for instance, may be identified as such in this way:-

<pml:contribution typeURI="http://liparm.ac.uk/id/contributions/speech">

The typeURI attribute contains a URI in a controlled vocabulary to identify precisely what type of ‘contribution’ is being recorded here.

Similarly, an MP is identified through a URI which refers to their entry in a vocabulary:-

<pml:person
    regURI="http://liparm.ac.uk/id/person/person/boatengpaul1951-alive">
  <pml:label>Boateng, Paul</pml:label>
</pml:person>

Using URIs in this way allows PML records to integrate semantically with resources outside the individual PML document, particularly with the Semantic Web. Within the PML file, interlinkages between components are recorded by internal XML IDs. A political party, for instance, may be identified by a <unit> element as follows:-

<pml:unit ID="uk.proc.d.1992-01-16-parties-663" type="party"
    typeURI="http://liparm.ac.uk/id/unittype/party"
    regURI="http://liparm.ac.uk/id/party/labour-party">
  <pml:label xmlns="">Labour Party</pml:label>
</pml:unit>

An MP’s affiliation to the party is then marked by recording the ID of this party in the person’s categoryIDs attribute:-

<pml:person categoryIDs="uk.proc.d.1992-01-16-parties-663"
    regURI="http://liparm.ac.uk/id/person/person/boatengpaul1951-alive">
  <pml:label>Boateng, Paul</pml:label>
</pml:person>
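
As a rough illustration of how software might follow these internal links, the sketch below resolves each person's categoryIDs to the labels of the units they point to, using Python's standard xml.etree.ElementTree module. Several details are assumptions made for the example and are not taken from the PML schema: the namespace URI bound to the pml: prefix, the wrapping pml:root element, and the treatment of categoryIDs as a whitespace-separated list of IDs.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (assumption): the real PML namespace is not given here.
PML_NS = "http://example.org/pml"
NS = {"pml": PML_NS}

# Fragment modelled on the examples above; the pml:root wrapper is hypothetical.
SAMPLE = f"""
<pml:root xmlns:pml="{PML_NS}">
  <pml:unit ID="uk.proc.d.1992-01-16-parties-663" type="party"
            typeURI="http://liparm.ac.uk/id/unittype/party"
            regURI="http://liparm.ac.uk/id/party/labour-party">
    <pml:label>Labour Party</pml:label>
  </pml:unit>
  <pml:person categoryIDs="uk.proc.d.1992-01-16-parties-663"
              regURI="http://liparm.ac.uk/id/person/person/boatengpaul1951-alive">
    <pml:label>Boateng, Paul</pml:label>
  </pml:person>
</pml:root>
""".strip()


def affiliations(root):
    """Map each person's label to the labels of the units named in categoryIDs."""
    # Index every unit by its internal XML ID.
    units = {
        unit.get("ID"): unit.findtext("pml:label", namespaces=NS)
        for unit in root.iter(f"{{{PML_NS}}}unit")
    }
    result = {}
    for person in root.iter(f"{{{PML_NS}}}person"):
        name = person.findtext("pml:label", namespaces=NS)
        # Assumption: categoryIDs may hold several IDs separated by whitespace.
        ids = (person.get("categoryIDs") or "").split()
        result[name] = [units[i] for i in ids if i in units]
    return result


root = ET.fromstring(SAMPLE)
print(affiliations(root))  # {'Boateng, Paul': ['Labour Party']}
```

Indexing the units by ID first means each person's references can be resolved with a dictionary lookup instead of a fresh search of the document.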

In this way, a complex set of linkages may be built within and outside a PML file: the resulting body of data rapidly forms a substantial corpus open to machine analysis.
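
To give a sense of how links across files might be exploited, the following sketch groups occurrences of a person by regURI over two small documents, so that the same MP can be traced from one sitting to another. It is only an illustration: the namespace URI, the pml:root wrapper element and the file names are hypothetical, and only the regURI value is taken from the examples above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Placeholder namespace URI (assumption): the real PML namespace is not given here.
PML_NS = "http://example.org/pml"
BOATENG = "http://liparm.ac.uk/id/person/person/boatengpaul1951-alive"


def make_document():
    """Build a minimal stand-in for a PML file (the pml:root wrapper is hypothetical)."""
    return (
        f'<pml:root xmlns:pml="{PML_NS}">'
        f'<pml:person regURI="{BOATENG}">'
        "<pml:label>Boateng, Paul</pml:label>"
        "</pml:person>"
        "</pml:root>"
    )


# Hypothetical file names standing in for two separate PML documents.
documents = {
    "sitting-a.xml": make_document(),
    "sitting-b.xml": make_document(),
}


def index_by_reg_uri(docs):
    """Map each person's regURI to the names of the documents that mention it."""
    index = defaultdict(list)
    for name, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for person in root.iter(f"{{{PML_NS}}}person"):
            uri = person.get("regURI")
            if uri:
                index[uri].append(name)
    return dict(index)


index = index_by_reg_uri(documents)
print(index[BOATENG])  # ['sitting-a.xml', 'sitting-b.xml']
```

Because the key is the URI and not the label, variant spellings of a name in different files would still be gathered under the same entry.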

For a full description of the PML schema, and a sample file, please see this page on the LIPARM project website.