IESR Metadata Review September 2005

Ann Apps

Summary: Status of reviewed metadata

Metadata Changes in Version 3.0

  • 1.2 Collection: add item format and collection size (extent); drop 'format'
  • 1.3 Collection: add optional URI scheme to 'useRights' to capture Terms and Conditions by-reference
  • 1.6 Collection: add item type; change vocabulary for 'type'
  • 2.1 Service interface is repeatable
  • 2.3 Add Service language
  • 2.4 Indicate in guidelines that Service title should be its 'official' title
  • 2.5 Service: add 'useRights' to capture Terms and Conditions
  • 2.6 Service authentication information: add schemes for 'identifier' other than URI, in particular to capture Athens resource name; add property 'iesr:mediator' to capture Shibboleth Federation.
  • 2.10 Update to ZeeRex version 2
  • 2.11, 2.16 Investigate viability of using WSDL for Keys and HTTP GET/POST detail
  • 2.13 Add several new access methods
  • 2.15 Add a vocabulary to Service seeAlso
  • 3.2 Add Agent address, postcode, country
  • 3.3 Add Agent seeAlso (as a placeholder)
  • 3.4 Add scheme for Agent 'identifier' other than URI, in particular to capture Athens institution identifier
  • 5 Change to DC XML schemas
  • 7 Service-oriented interfaces: add to software development list

Background Material about: Services; and Institutional Profiling.

1. Collection Metadata

1.1 Consistency with Dublin Core Collection Description Application Profile

[Don't change these.]

These properties differ in namespace and/or name from DC CD AP. The status of the proposed namespace cld is not clear, so it is probably not a good idea to change to terms in cld. We could change the name of the first property and put it into the IESR namespace. Is it necessary to change our metadata to exactly match DC CD AP, or is it sufficient to have a one-to-one mapping?

IESR Property | DC CD AP Property
rslpcd:contentsDateRange | cld:dateContentsCreated
rslpcd:owner | marcrel:owns
iesr:logo | cld:logo

1.2 Format

[Make change as suggested]

DC CD AP have decided to drop 'dc:format' because generally the requirement is to capture the format of the items in the collection rather than of the collection per se. The intention is to introduce a new repeatable property 'cld:itemFormat'. 'dcterms:extent' captures the size of a collection.

Many IESR collections already have a 'dc:format' property. The values are free text and in several cases quite wordy. However they are all a mix of item formats and collection size. It would not be possible to automatically correct the current data. Thus a backup copy of the current data will be kept when the property is removed. The Data Entry interface will support the new properties and records can be updated when that becomes available.

IESR Property | Status
iesr:itemFormat | New
dcterms:extent | New
dc:format | Drop
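The removal of 'dc:format' with a backup copy might be sketched as follows. This is only an illustration: it assumes a record is held as a Python dictionary mapping property names to lists of values, and the function name `drop_format` and the sample record are hypothetical, not part of IESR.

```python
from copy import deepcopy


def drop_format(record):
    """Return (updated, backup) where 'updated' no longer has dc:format.

    Assumption: a record is a dict of property name -> list of values.
    The real IESR storage format is not described in this review.
    """
    backup = deepcopy(record)        # untouched copy, still holding dc:format
    updated = deepcopy(record)
    updated.pop("dc:format", None)   # drop the old free-text property
    # The new properties start empty: the old values mix item formats and
    # collection size, so they cannot be split automatically.
    updated.setdefault("iesr:itemFormat", [])
    updated.setdefault("dcterms:extent", [])
    return updated, backup


# Made-up sample record for illustration only.
record = {
    "dc:title": ["Example collection"],
    "dc:format": ["About 5000 JPEG images and some PDF reports"],
}
updated, backup = drop_format(record)
```

The backup keeps the original wording so that a person can later fill in the two new properties through the Data Entry interface.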

1.3 Terms and Conditions

[Add URI scheme to useRights]

A means of capturing machine-readable usage rights should be added. The current 'useRights' would remain for a free text statement. An optional URI scheme will be introduced, denoting a value that is a pointer a machine can follow. It is hoped that this would be to a machine readable licence, but it could also be to an HTML page that can be displayed to end users.

'rights' for Copyright statement, and 'accessRights' for information about who can access the collection are both free text. There doesn't seem to be any requirement for a by-reference version of these.
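The difference between a free text 'useRights' and one carrying the optional URI scheme could be modelled roughly as below. This is a sketch under assumptions: the class name `UseRights`, the scheme label "URI" and the sample values are illustrative, and the check only accepts URL-like values with a scheme and a host.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class UseRights:
    value: str
    scheme: Optional[str] = None   # None = free text; "URI" = by-reference

    def is_by_reference(self) -> bool:
        return self.scheme == "URI"

    def is_valid(self) -> bool:
        if self.scheme is None:
            # Free text statement: only require that it is not empty.
            return bool(self.value.strip())
        if self.scheme != "URI":
            return False
        # Assumption: a pointer a machine can follow has a scheme and a host.
        parts = urlparse(self.value)
        return bool(parts.scheme and parts.netloc)


# Made-up values for illustration only.
statement = UseRights("Free for use in UK further and higher education.")
pointer = UseRights("http://example.org/licence.xml", scheme="URI")
```

A client could display `statement.value` directly, and fetch `pointer.value` only when `is_by_reference()` is true.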

1.4 Sub-Collections

(Deferred from last review)

We left the decision about these for after the pilot. There has not been any stakeholder request. This is really a question of the data model. Do we need to model the distinction between a Catalogue (a Collection of metadata records) and the Collection of items described by those metadata records? Those two collections may be made available by different Services with different access conditions; and the Collections themselves may have different conditions of use etc. We need to look at real world examples to explore this and make any decision. However, because we are building a practical application, it would seem better not to introduce theoretical distinctions that would be very confusing to users.

  • dc:description (rslpcd:hasDescription)
  • rslpcd:isDescriptionOf
  • dcterms:hasPart

Note that we do include 'dcterms:isPartOf' but this is a URI supplied by the resource provider with no associated functionality within IESR.

It has been suggested that METS could be used to bundle together associated collections, but it is not clear how this would fit with IESR metadata.

1.5 Different Audiences

(Deferred from last review)

Should we include the possibility of having several descriptions for different audiences? I suggest we defer this for now. It needs some thought about how to design it. It could be a future enhancement.

1.6 Type

[Make this change.] (Deferred from last review)

The only scheme we have for collection type is CLDT which seems inappropriate with its 'dot' notation and its mix of collection and item types. [Note: IESR adds DCMIType Collection or Service]

It is proposed to introduce a new property 'iesr:itemType' with a vocabulary DCMIType to capture item types.

The scheme for 'dc:type' will be 'cld:CollDescType', which has terms: Catalogue, HierarchicalFindingAid, Index.

It should be possible to automatically update existing data.

Note that theoretically a catalogue has records that are metadata (and therefore text). Thus, for example, when describing a catalogue that gives access to an image collection, indicating a Collection of type 'Catalogue' with 'itemType=Image' is strictly incorrect. In a perfect world the catalogue and the collection would be modelled as two separate collections with a 'describes' relation. However, taking a pragmatic approach and acknowledging that IESR is a practical application, such modelling would be too complicated for many users to understand. What they actually want to know is that the catalogue will provide them access to images. Therefore we have decided to conflate the two collections and allow a Collection with, eg. 'type=Catalogue' and 'itemType=Image'.
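A check of the two vocabularies could look something like the sketch below. The 'cld:CollDescType' terms are those listed above and the DCMIType terms are the DCMI Type Vocabulary; the function name `check_collection` and the way values are passed in are assumptions made for illustration.

```python
# Terms of cld:CollDescType as listed in this review.
COLL_DESC_TYPE = {"Catalogue", "HierarchicalFindingAid", "Index"}

# Terms of the DCMI Type Vocabulary.
DCMI_TYPE = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}


def check_collection(coll_desc_types, item_types):
    """Return a list of problems; an empty list means both sets are fine.

    coll_desc_types: values given for dc:type with scheme cld:CollDescType
    item_types:      values given for iesr:itemType with scheme DCMIType
    """
    problems = []
    for value in coll_desc_types:
        if value not in COLL_DESC_TYPE:
            problems.append("unknown cld:CollDescType term: " + value)
    for value in item_types:
        if value not in DCMI_TYPE:
            problems.append("unknown DCMIType term: " + value)
    return problems


# The conflated case discussed above: a catalogue giving access to images.
ok = check_collection(["Catalogue"], ["Image"])
bad = check_collection(["Catalogue.Library"], ["Picture"])
```

Here `ok` is empty, while `bad` reports one problem per unrecognised value.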

1.7 Dates

[Wait for DC-Date proposals.] (Deferred from last review)

This concerns dates that are B.C.E., geological, approximate or questionable.
We should wait for the DC-Date Working Group to make proposals on this. Until then we use some ad hoc guidelines.

1.8 Audience types

(Deferred from last review)

Currently we have only education levels, and are using educationLevel (a sub-property of audience). Should we look at including audience types beyond educational ones, eg look at publishers' lists, MARC lists? We have not had any requests for this.

1.9 Local Identifiers

[Don't include.]

There have been occasional requests for capturing local identifiers or non-URI identifiers. This seems to be information extraneous to the purpose of IESR.

2. Service Metadata

2.1 Interface

[Make this change.]

The 'interface' property should be repeatable. In particular an SRW service could have 2 interface descriptions: WSDL and ZeeRex.

2.2 Serves

[Already implemented.] (Included for documentation completeness)

Include the inverse link for a service to the collection it serves.

2.3 Language

[Add this property.]

A transactional service may wish to define the language it presents to users.

2.4 Title

[No change, but indicate in guidelines that Service title should be its 'official' title.]

There has been a suggestion that IESR should capture the 'official' name of a service, ie what it should be known as in online catalogues, etc. Adding an optional 'alternative' title and requiring that 'title' be used for the official title seems over complicated for a service record. There is probably no need for a wordy title beyond the 'official' title for a service.

2.5 Terms and Conditions

[Add new property]

A new property (iesr:useRights) should be added to capture (preferably machine-readable) usage rights.

2.6 Authentication Information

[Add new scheme to identifier and new property.]

There is a requirement to capture further authentication information related to accessRights, in particular, Athens Resource Name and Shibboleth Federation.

A new scheme will be added to 'identifier', with an associated vocabulary, to capture non-URI identifiers.

A new property 'iesr:mediator' will capture the Shibboleth Federation for services with AccessRights=shibboleth.
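The relationship between 'iesr:mediator' and the access control value might be checked along these lines. This is a sketch only: it assumes a service is a dictionary of property name to list of values, and the scheme label "ExampleAthensScheme", the identifier value and the federation name are placeholders, not real IESR terms.

```python
def mediator_problems(service):
    """Report an iesr:mediator that is given without shibboleth access.

    Assumption: 'service' maps property names to lists of values, and the
    access control value is spelled 'shibboleth' as in section 2.14.
    """
    access = service.get("accessRights", [])
    mediators = service.get("iesr:mediator", [])
    problems = []
    if mediators and "shibboleth" not in access:
        problems.append("iesr:mediator given but accessRights is not shibboleth")
    return problems


# Hypothetical record: identifiers are (scheme, value) pairs so that a
# non-URI identifier can sit alongside the URI one.
service = {
    "identifier": [
        ("URI", "http://example.org/service"),
        ("ExampleAthensScheme", "EXAMPLE_RESOURCE"),
    ],
    "accessRights": ["shibboleth"],
    "iesr:mediator": ["Example Federation"],
}
problems = mediator_problems(service)
```

The check does not require a mediator for every Shibboleth service, because the text above does not say the property is mandatory.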

2.7 Subject

(Deferred from last review)

Should we include this for a transactional service? Do subject keywords make sense for a transactional service? We will defer this until after some use of IESR.

2.8 Multiple Transactional Service Interfaces

(Deferred from last review)

Should different interfaces to a transactional service be correlated in some way? They are interfaces to the same functionality. In the current model they will appear as entirely separate services.

2.9 Output

(Deferred from last review)

Output is currently applicable to webcgi and openurl services only and is not searchable. This means you cannot currently search for eg. a service providing MARC records (unless this information is in the collection or service 'abstract'). For service types like Z39.50 this is currently hidden within a ZeeRex file, which is for information not discovery. Should we have a searchable output field for all services? [Request from EDINA]

2.10 ZeeRex

[Make this change.]

Update to ZeeRex version 2

2.11 WebCGI Arguments

[Investigate viability and make this change if suitable.]

Change the specification of webcgi arguments to use WSDL rather than the current proprietary 'keys' file.

2.12 Service Type List

[Keep this in with extensions to list as necessary.] (Deferred from last review)

dc:type/SvcTypeList. In the data so far there are 3 instances: 2 'Alert'; and 1 'OpenURL Resolver'. These service types were mainly for transactional services. It is possible that new requirements for this will appear, eg. a Shibboleth Handle service.

2.13 Service Access Methods

[Add these terms.] (First item deferred from last review)

The following are suggested additions to the Service Access Method list:

The NISO Metasearch XML Gateway, and its compliance levels, will be added when the proposed standard is stable.

2.14 Access Control List (Authentication Method)

[Add this term.] (This has already been added, included for documentation completeness)

The following is a suggested addition to the Access Control list:

  • shibboleth

2.15 Further Authentication and Availability Details

[Add a vocabulary.]

A Shibboleth service that wishes to describe more details about authentication requirements when linking to a service (eg. must supply a user's identifier) can point to its own machine-readable definition using 'seeAlso'. Possibly we need a vocabulary for 'seeAlso' to capture that this is Shibboleth detail, rather than just a general document.

seeAlso could also be used to capture machine-readable SLA definitions, with a different vocabulary term. This puts the onus on the service administrator to provide the machine-readable descriptions and we just link to them, which seems like the right approach.

A new IESR vocabulary for this property would have terms: Help; Shibboleth; SLA. Existing data should all be using this for 'help'.

We could also include pointers to other details, eg. about things like opening hours, with appropriate vocabulary terms, on request. We'd need to consider these carefully. It isn't clear that opening hours have much meaning for a digital service and would anyway be covered by SLA.

Note that the value of the property will be a URI. Adding a vocabulary means that we cannot also indicate the value as a URI in the current XML format (using xsi:type), which means it will be inconsistent with other properties. However, the alternative of adding separate properties for each of the above cases would bloat the metadata, a problem that would increase with any extensions.
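How a vocabulary on 'seeAlso' might be used can be sketched as below. The three terms are those proposed above; holding each entry as a (term, URI) pair, and the function names `migrate_see_also` and `select`, are assumptions for illustration rather than the IESR data format.

```python
SEE_ALSO_TERMS = {"Help", "Shibboleth", "SLA"}


def migrate_see_also(uris):
    """Tag existing seeAlso values as 'Help', as existing data is expected
    to point at help pages."""
    return [("Help", uri) for uri in uris]


def select(entries, term):
    """Return the URIs recorded under one vocabulary term."""
    if term not in SEE_ALSO_TERMS:
        raise ValueError("unknown seeAlso term: " + term)
    return [uri for entry_term, uri in entries if entry_term == term]


# Made-up URIs for illustration only.
entries = migrate_see_also(["http://example.org/help.html"])
entries.append(("SLA", "http://example.org/sla.xml"))
help_links = select(entries, "Help")
sla_links = select(entries, "SLA")
```

One repeatable property with a term keeps the metadata small; a new kind of pointer only needs a new vocabulary term, not a new property.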

2.16 HTTP GET/POST

[Add this as part of 2.11.]

It is suggested that whether a 'webcgi' service supports GET and/or POST should be captured. This is really part of the service 'interface' rather than general discovery metadata. Consideration of this request will be included in the revision of the 'keys' interface (see 2.11).

3. Agent Metadata

3.1 Owns and Administers

[Already implemented.] (Included for documentation completeness)

Include the inverse link for an agent to the collections it owns and the services it administers.

3.2 Address

[Add new properties.]

Include Agent's postal address (iesr:address, iesr:postcode, iesr:country).

There have been several requests for this (CIE, Institutional Profiling). Should we follow some standard for this or is free text sufficient? Is there likely to be any requirement for machine processing? This is outside the current scope of IESR. To be consistent with the data model we shouldn't have sub-fields in the XML.

iesr:address and iesr:postcode will be free text. iesr:country will have vocabulary ISO3166 for country code.
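A flat Agent address with a coded country might be checked as in the sketch below. It assumes the two-letter (alpha-2) form of ISO 3166 codes, which is not stated above, and it only tests the shape of the code rather than looking it up in the full list; the function name and the sample agent are hypothetical.

```python
import re


def country_problems(agent):
    """Check that iesr:country, if present, looks like a two-letter code.

    Assumption: ISO 3166 alpha-2 codes (two upper-case letters) are meant.
    A real check would compare against the published code list.
    """
    code = agent.get("iesr:country")
    if code is not None and not re.fullmatch(r"[A-Z]{2}", code):
        return ["iesr:country is not a two-letter code: " + code]
    return []


# Flat properties, no sub-fields, with made-up free text values.
agent = {
    "iesr:address": "Example Library, 1 Example Street, Exampletown",
    "iesr:postcode": "EX1 1EX",
    "iesr:country": "GB",
}
problems = country_problems(agent)
```

Address and postcode stay as uninterpreted strings, in line with the decision that they are free text.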

3.3 Further Details (seeAlso)

[Add this property as a placeholder.]

URI. Link to further details about the Agent. This could link to the type of information suggested in the Institutional Profiling Scoping Study - a machine readable Agent Profile.

This property will be a placeholder for now because these institutional profiles don't actually exist. It will eventually need an associated vocabulary.

3.4 Information to Support Authentication

[Add new scheme to identifier.]

There is a requirement to capture information to support authentication, in particular, Athens Institution Identifier.

A new scheme will be added to 'identifier', with an associated vocabulary, to capture non-URI identifiers.

3.5 Type

[Don't add this property and vocabulary.]

Type (or role) of Agent, to support institution profiles. This would have an associated vocabulary, eg. Institution, Library. The vocabulary would be extensible on request.

Adding this could be problematic:

  • Different places use different local names for Institutions and their constituent organisations. So we'd need a definitive list. Do JISC or HEFCE have such a list?
  • It may quickly proliferate if academic departments were included. They may have collections and services, eg data repositories.
  • It would have to be clearly restricted to academia. We wouldn't want a list that included every organisation's perceived type.

Therefore it would seem better if this information were included in the XML institutional profile pointed to by 'seeAlso' rather than in IESR.

This detail is currently out-of-scope for IESR. If an explicit requirement to include further profiling of agents occurs, the requisite data modelling will be performed to inform the inclusion of new Agent properties.

3.6 isPartOf

[Don't add this property.]

URI, probably but not necessarily IESR identifier. Indication of relationship between Agents.

This is intended for the academic-specific case of institution profiles, eg Library isPartOf University. However including this property would open the door to unconstrained uses that distort the IESR model.

Therefore it would seem better if this information were included in the XML institutional profile pointed to by 'seeAlso' rather than in IESR.

The current scope of IESR has no requirement to capture the relationship between Agents. If an explicit requirement to include further profiling of agents occurs, the requisite data modelling will be performed to inform the inclusion of new Agent properties.

4. Administrative Metadata

4.1 Administrative Metadata Properties

[Already implemented.] (Included for documentation completeness)

The externally exposed IESR administrative metadata now consists of:

Property | Fixed Value | Description
dc:creator | - | URI of supplier of description
dc:publisher | http://iesr.ac.uk | -
dcterms:modified | - | Last modification date
dc:source | - | Source of metadata
dc:rights | This IESR administrative metadata must always be retained with its associated entity description | -
dc:rights | http://creativecommons.org/licenses/by-nc-sa/2.0/uk/ | -

The following fields are held internally and will be exposed for data entry: dc:creator (person creating metadata), dc:contributor, dcterms:created (dates of creation/modification included by supplier)
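The externally exposed administrative metadata could be assembled roughly as follows. The fixed values are those given above; the function name `admin_metadata`, the list-of-pairs layout (chosen because 'dc:rights' occurs twice), the ISO date format and the sample URIs are assumptions for illustration.

```python
from datetime import date

# Fixed values as given in this review.
PUBLISHER = "http://iesr.ac.uk"
RIGHTS_STATEMENT = ("This IESR administrative metadata must always be "
                    "retained with its associated entity description")
RIGHTS_LICENCE = "http://creativecommons.org/licenses/by-nc-sa/2.0/uk/"


def admin_metadata(supplier_uri, source, modified):
    """Return the exposed administrative metadata as (property, value) pairs.

    A list of pairs is used instead of a dict because dc:rights repeats.
    Assumption: dates are written in ISO form (YYYY-MM-DD).
    """
    return [
        ("dc:creator", supplier_uri),
        ("dc:publisher", PUBLISHER),
        ("dcterms:modified", modified.isoformat()),
        ("dc:source", source),
        ("dc:rights", RIGHTS_STATEMENT),
        ("dc:rights", RIGHTS_LICENCE),
    ]


# Made-up supplier and source for illustration only.
rows = admin_metadata("http://example.org/supplier",
                      "http://example.org/source",
                      date(2005, 9, 1))
```

Only the supplier, source and modification date vary between records; the publisher and both rights values are constant.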

4.2 Audience

(Deferred from last review)

Should we include (to support descriptions for different audiences, eg HE/FE)? (Related to 1.5)

5. XML Schemas

[Make this change.]

The Dublin Core XML schemas imported by the IESR schemas cause problems for some schema validation tools. They are in fact valid but are causing problems for practical use. Should we use our own copies of these schemas adapted to overcome this problem? PJ has already developed test versions. The 'con' is that we would no longer be importing 'standard' schemas, whereas the 'pro' is that applications would be able to use our schemas.

The original suggestion was that we do this fairly quietly. Would there be any impact on users? It would be possible to introduce a test OAI-PMH metadata format and ask particular users to check it out. However this would require development effort that we lack.

The XML Schema needs updating following this metadata review anyway. The OAI-PMH protocol requires that records are regarded as 'modified' when there are changes to their 'metadata' part.

6. XML Descriptions

[Defer.]

The IESR XML format may need updating to be consistent with proposed new guidelines for DC-in-XML that conform to the DC Abstract Model. This should be deferred until these guidelines are available. However making such a change would need careful consideration because of its impact on existing users.

7. IESR Service Enhancements

[Add to software development list.]

Various suggested interface enhancements:

  • Z39.50 searches that return: Services only; Agents only; Collections only; all with options: IESR, DC, SUTRS
  • Web searches that return: Services only; Agents only; Collections only
  • Implement OAI Sets to return sets of: Collections; Services; Agents
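If OAI Sets were implemented, a harvester could ask for one kind of entity with a selective ListRecords request. The sketch below builds such a request URL using the standard OAI-PMH 'verb', 'metadataPrefix' and 'set' arguments. The base URL and the set names are hypothetical, since no set specification has been decided.

```python
from urllib.parse import urlencode

BASE_URL = "http://example.org/oai"              # hypothetical endpoint
SETS = {"collections", "services", "agents"}     # hypothetical setSpec values


def list_records_url(set_spec, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL restricted to one set.

    'oai_dc' is the metadata format every OAI-PMH repository must support;
    an IESR-specific prefix could be passed instead.
    """
    if set_spec not in SETS:
        raise ValueError("unknown set: " + set_spec)
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    })
    return BASE_URL + "?" + query


url = list_records_url("services")
```

A harvester interested only in Service descriptions would then fetch `url` and follow any resumption tokens as usual.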