Service Registries Blog

03 April 2008

Register Locally - Discover Globally

On Monday 31 March a second meeting of the Global Registries Initiative took place at the University of Southampton, UK. The initial meeting was held in Washington DC, USA on December 10 2007.


With the slogan "Register Locally - Discover Globally", the initiative aims to enable sharing of collection descriptions and service details across registries, thus making them discoverable globally.


The three registries that have begun this initiative are the Australian ORCA registry of repository collections, the US OCKHAM registry of the National Science Digital Library, and the UK JISC Information Environment Service Registry (IESR) of collection descriptions and services that provide access to them.


Currently this initiative is at a discussion stage, with an international workshop planned for later this year. I expect that this workshop will include the development of use cases.


The architecture for the initiative still needs to be discussed, but it will certainly be standards-based. All three of the registries currently involved describe resources using the IESR metadata schema, although with some differences, e.g. with nationally relevant property values. The collection description part of the IESR metadata schema is based on the Dublin Core Collections Application Profile, which is now endorsed by the Dublin Core Metadata Initiative. To cope with the differences in use of the IESR schema in the implementation of the three registries, an interchange format may be developed.


The development of the ORCA registry was guided by ISO 2146, "Registry Services for Libraries and Related Organisations".


IESR already has experience of harvesting resource descriptions from other catalogues using OAI-PMH. Thus I would expect the first prototype experiments of the Global Registries Initiative will involve sharing records by OAI-PMH.
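As a rough illustration of what sharing records by OAI-PMH involves, here is a minimal Python sketch using only the standard library. It builds a ListRecords request URL and extracts identifiers and titles from a response. The base URL, the record identifier and the sample response are invented for illustration; they are not taken from IESR, ORCA or OCKHAM, and a real harvester would also need to handle resumption tokens and error responses.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(
            f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier"
        )
        # Take the first dc:title found anywhere inside the record, if any.
        title = next(
            (el.text for el in record.iter(f"{{{DC_NS}}}title")), None
        )
        results.append((identifier, title))
    return results


# Hypothetical response, shaped like an oai_dc ListRecords reply.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:registry.example.org:coll-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example Collection</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_records_url("https://registry.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

The sketch uses the oai_dc metadata prefix because every OAI-PMH repository must support it; exchanging records in the IESR schema would need a different metadataPrefix agreed between the registries.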


New Data in IESR

I'm pleased to report that data records created by DRAI (JISC Digital Repositories and Archives Inventory) (http://www.jisc.ac.uk/whatwedo/programmes/digitalrepositories2007/project_inventory.aspx) have been imported into IESR (JISC Information Environment Service Registry) (http://iesr.ac.uk).


Also IESR now includes details of ESDS International (http://esds.ac.uk/) Macro and Micro Economic datasets. This information is harvested regularly from the UK Data Archive.


11 January 2008

Atom-based service registries

There is an interesting article in the September/October issue of IEEE Internet Computing (ISSN 1089-7801), pp 68-71: Active Web Service Registries by Martin Trieber and Schahram Dustdar of the Vienna University of Technology. They are investigating building service registries that disseminate their records using RSS (at present, though they intend to upgrade to Atom). Each service provider sets up their own RSS feed describing their services at a known location, i.e. their own local service registry, also known as an active web service registry. Wider service registries are then built as aggregators of the distributed local registries, using whatever selection criteria they choose. One could envisage building higher-level, even global, registries by aggregating the distributed aggregators.
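To make the aggregation idea concrete, here is a minimal Python sketch, not the authors' implementation. It assumes each local registry publishes an Atom feed with one entry per service, merges several such feeds, keeps the most recently updated entry for each id, and applies an optional selection function. The feed contents, ids and URLs are hypothetical, and timestamps are compared as strings, which only works if every feed uses the same UTC format.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def parse_feed(atom_xml):
    """Extract id, title, updated and link from each entry of an Atom feed."""
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.findall(f"{ATOM_NS}entry"):
        link = entry.find(f"{ATOM_NS}link")
        entries.append({
            "id": entry.findtext(f"{ATOM_NS}id"),
            "title": entry.findtext(f"{ATOM_NS}title", default=""),
            "updated": entry.findtext(f"{ATOM_NS}updated", default=""),
            "link": link.get("href") if link is not None else None,
        })
    return entries


def aggregate(feeds, select=lambda entry: True):
    """Merge local registry feeds, keeping the newest entry for each id."""
    merged = {}
    for atom_xml in feeds:
        for entry in parse_feed(atom_xml):
            if not select(entry):
                continue
            current = merged.get(entry["id"])
            # Assumption: all timestamps share one UTC format, so string
            # comparison orders them correctly.
            if current is None or entry["updated"] > current["updated"]:
                merged[entry["id"]] = entry
    return sorted(merged.values(), key=lambda e: e["updated"], reverse=True)


# Two hypothetical local registry feeds.
FEED_A = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Provider A services</title>
  <entry>
    <id>urn:example:service:search</id>
    <title>Search service</title>
    <updated>2008-01-05T00:00:00Z</updated>
    <link href="https://a.example.org/search"/>
  </entry>
</feed>
"""

FEED_B = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Provider B services</title>
  <entry>
    <id>urn:example:service:search</id>
    <title>Search service (v2)</title>
    <updated>2008-01-10T00:00:00Z</updated>
    <link href="https://b.example.org/search"/>
  </entry>
  <entry>
    <id>urn:example:service:harvest</id>
    <title>Harvest service</title>
    <updated>2008-01-02T00:00:00Z</updated>
    <link href="https://b.example.org/harvest"/>
  </entry>
</feed>
"""

if __name__ == "__main__":
    for item in aggregate([FEED_A, FEED_B]):
        print(item["updated"], item["title"], item["link"])
```

A higher-level registry could reuse the same aggregate function if each aggregator republished its merged entries as an Atom feed of its own; that step is left out here.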

They are discussing registries of Web Services (i.e. SOAP) only; in part this is seen as a solution to the demise of public UDDI registries. So they have a narrower remit than IESR, which supports many service protocols and also describes collections. I also get the impression that they see the users of this information as people, namely application developers who look up details of Web Services to plug them into applications. There is no mention of machine-to-machine use, apart from the aggregation.

But using Atom, and aggregating local services registries, seems a neat idea. It would certainly lend itself to investigation if new instantiations of service registries were to be designed. It may provide a paradigm for building global registries.

One particular point they make struck me as being interesting to consider. They recommend including examples of use with each registered Web Service, for example a concrete SOAP message that will invoke the Web Service, to help application developers using the information, and so to better advertise and assist take-up of the Web Service.
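The kind of example the authors recommend might look like the following Python sketch, which builds a concrete SOAP 1.1 request and stores it alongside the other details of a registered service. The service name, endpoint, namespace, operation and parameter are all hypothetical, and the dictionary layout is just one possible way of attaching the example to a registry record.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace

# Use a readable prefix for the envelope when serialising.
ET.register_namespace("soapenv", SOAP_ENV)


def example_soap_request(operation, service_ns, params):
    """Build a SOAP 1.1 envelope that calls one operation with parameters."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical registry record carrying a ready-to-send example message.
registry_entry = {
    "name": "Example Search Service",
    "endpoint": "https://services.example.org/search",
    "example_request": example_soap_request(
        operation="Search",
        service_ns="http://example.org/search",
        params={"query": "collection description"},
    ),
}

if __name__ == "__main__":
    print(registry_entry["example_request"])
```

A developer could paste the stored message into a SOAP client and send it to the registered endpoint to see the service respond before writing any code against it.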


Another interesting article in the same journal issue is pp 17-25: Requirements and Services for Metadata Management by Paolo Missier, Pinar Alper, Oscar Corcho, Ian Dunlop, and Carole Goble of The University of Manchester. This talks about capturing annotations about data resources, here in a bioinformatics context. The mass of such annotations is in fact a valuable resource in its own right, which it would be useful to capture and associate with the data resource. Their solution is very semantic web / RDF based. But the ideas are interesting, especially in this age of social bookmarking. But maybe capturing annotations about the collections within a registry is still in the 'nice to have, but in the future when we have resource to implement it' category.


04 October 2007

Usage Guides for Collection Description

The Dublin Core Metadata Initiative (DCMI) Collection Description Community is gathering information about usage guidelines for, and examples of, collection description. A webpage that assembles the information already contributed is at:

http://epub.mimas.ac.uk/DC/dc-collections/usage-guides.html


Further contributions are welcome. Please send them to me (ann.apps@manchester.ac.uk) or to the DCMI Collection Description Community email listserv (dc-collections@jiscmail.ac.uk) (you need to be subscribed to the listserv to post to it). Following is the original request sent to the DCMI Collection Description Community for information about usage guidelines.


----------------------------------------


Please would anyone who has a collection-based project / service / application, and who is willing to share details of their usage guidelines, or examples of collection description, send them to this group (or to me). I will create a web page of links to produce an accumulated set of knowledge and expertise in this area.


We learnt at the DC2007 conference that a Dublin Core (DC) Application Profile (actually a DC Abstract Model-conformant Application Profile) is now a package, known as the Singapore Framework, which comprises:



  • Functional requirements (recommended)

  • Domain model (mandatory)

  • Description Set Profile (DSP) (mandatory)

  • Usage Guidelines (optional)

  • Encoding syntax guidelines (optional)


It has been suggested that we should create usage guidelines for collection description. It seems to me that it would be a significant job. And maybe there would need to be guidelines on different levels, eg some for library collections with strict cataloguing rules, and different guidelines for collections in other domains.


A first step would be to assess existing usage guidelines from applications, projects, services that are collection-based. Hence my request at the start of this message.


Thanks in advance for any contributions


02 July 2007

Ideas about Registries and IESR

Last week I attended two workshops held at the JISC offices in London: the Strategic e-Content Alliance Registries Workshop; and the Shared Infrastructure Services Workshop. These are various thoughts and ideas about IESR triggered by discussions at these workshops.



There was some attempt at the Registries Workshop to define a Registry. My offering, which was picked over by the meeting, was: "A collection of official metadata records that describe, and assist access to, resources". On reflection, 'official' is not the right word, and suggestions included: governed, authoritative, etc. There was discussion about whether a registry differed from a catalogue or an inventory, but the workshop didn't come up with any significant differences.



A Registry for End Users. IESR's original remit was to be a middleware registry providing machine-readable (XML) resource descriptions to be used by other services. There was an initial requirement that IESR should not have a human-readable web interface at all. In fact, IESR does have a web interface, if only so that its content can be viewed by IESR staff, but it provides just a tabular view of the XML data. It is fairly apparent that so far IESR has no significant use of these machine readable descriptions. Maybe using a middleware registry in a dynamic way is too high a barrier for application developers, or maybe they prefer to choose resources to plug into their applications. Certainly IESR's lack of use is mirrored by the sparsity of machine interfaces into resources. It now occurs to me that it may be more sensible to change tack. IESR could be a registry for human readers that also has machine interfaces. This would correspond with other similar registries, which see their primary interface as being the web interface for human users. In the JISC IE technical architecture, 'service registries' are shown as components of the 'shared infrastructure'. But it doesn't seem unreasonable to me for IESR to also provide a service within the 'presentation layer' that is a web interface into the registry.



Surfacing hidden content. This should be one of the purposes of a registry. Surfacing more obscure and less popular content fits with the 'long tail' idea of Web 2.0. Clearly such resources need to be registered in IESR, but once there they will be discoverable. This does imply use of IESR either by people for general resource discovery, or by applications in a dynamic way. This purpose could be thwarted if IESR were used by application developers only to cherry-pick resources to plug in to their applications.



Advertising and promotion of resources. This is another purpose of a registry, which relates to the point above. I heard a presentation at the ELPUB2007 conference in Vienna about how content created by projects subsequently gets little, or sometimes no, use because no-one knows about it. Registering such resources in IESR would advertise their existence.



Collection descriptions are boring. They need to be embedded in other services for display to the end user. This would seem to present a challenge to IESR if we decide to provide a user facing web interface. Maybe we need two views, one for general resource discovery, and a second for application developer users. It would at least seem sensible to move technical details onto a subsidiary web page. Displaying transactional services would be a further challenge.



Funder stakeholder requirements. It is probable that funders as stakeholders will have some requirements on data descriptions and functionality in a registry. Currently the IESR data contribution model allows a Contributor to edit only those records that they have provided. But funder requirements may imply populating data fields across Contributors' records. In particular:



  • There is likely to be a requirement to discover all resources funded by a particular funding body. This implies the need for a new data field isFundedBy. This would be repeatable (there may be more than one funder) and expected to be provided where appropriate.

  • There have been requests for a data field isEndorsedBy, which would allow a governing body to indicate which resources within the registry it endorses for use by organisations under its remit. This seems to be functionality on top of IESR. Possibly it could be an application run by the governing body. Alternatively IESR could consider providing this as an additional service on top of the basic registry, essentially an annotation layer.
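The two ideas in the list above can be sketched in Python as follows. This is only an illustration of the distinction, not a description of how IESR stores data: isFundedBy is modelled as a repeatable field on the record itself, while isEndorsedBy lives in a separate annotation layer that a governing body can write to without editing Contributors' records. All class names, identifiers and organisation names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Record:
    """A registry record; is_funded_by is repeatable (zero or more funders)."""
    identifier: str
    title: str
    contributor: str
    is_funded_by: List[str] = field(default_factory=list)


@dataclass
class EndorsementLayer:
    """Annotations held outside the records, keyed by record identifier."""
    endorsements: Dict[str, Set[str]] = field(default_factory=dict)

    def endorse(self, body: str, record_id: str) -> None:
        self.endorsements.setdefault(record_id, set()).add(body)

    def endorsed_by(self, body: str, records: List[Record]) -> List[Record]:
        return [
            r for r in records
            if body in self.endorsements.get(r.identifier, set())
        ]


def funded_by(records: List[Record], funder: str) -> List[Record]:
    """Discover all resources funded by a given funding body."""
    return [r for r in records if funder in r.is_funded_by]


# Hypothetical registry content.
records = [
    Record("rec-1", "Example Archive", "Contributor X",
           ["Funder A", "Funder B"]),
    Record("rec-2", "Example Dataset", "Contributor Y", ["Funder B"]),
    Record("rec-3", "Example Catalogue", "Contributor Y"),
]

layer = EndorsementLayer()
layer.endorse("Governing Body G", "rec-2")

if __name__ == "__main__":
    print([r.identifier for r in funded_by(records, "Funder B")])
    print([r.identifier for r in layer.endorsed_by("Governing Body G", records)])
```

Keeping endorsements in their own layer leaves the existing contribution model untouched, since a Contributor still edits only the records they provided.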



Google indexing of registry content. There was some discussion at both workshops about whether registry content should be exposed to Google et al. Does this work against the resource in terms of Google ranking because its description is available both from the resource itself and from the registry? This problem would be compounded if records are shared between registries, when there would be even more descriptions of the same content indexed by Google. However the purpose of 'surfacing hidden content' would be aided by exposing it to Google by the registry. Possibly the answer is to expose primary registry content only to Google, i.e. resource description contributed directly to IESR, but not harvested records.



Perpetual beta. This is another tenet of Web 2.0, recognising that the online world is continually developing and changing. This seems to conflict with objectives to create official services with service level agreements and (often artificial) performance indicators. I guess the problem is that a service has to be paid for in some way and funding for a 'perpetual beta' is unlikely to be forthcoming.


08 June 2007

Farewell, and Bon Voyage


Today we bid farewell to Amanda, the IESR Project Manager. Amanda has taken the bold decision to emigrate to Canada, and we wish her and her family all the best.

03 April 2007

Dublin Core Collections Application Profile is Conformant

I am very pleased to report that the Dublin Core Collections Application Profile is now 'conformant'.



The DCMI Usage Board, at their recent meeting, reviewed the updated version of the Dublin Core Collections Application Profile (DCCAP) and agreed that it conforms with the Dublin Core Abstract Model, is internally consistent, and is documented according to guidelines.



The revision of the DCCAP to produce this conformant version was undertaken by the DCMI Collections Application Profile Task Group. So I wish to acknowledge this group for all their hard work, in particular the Task Group leaders Sarah Shreeves and Muriel Foulonneau. Credit should also be given to Pete Johnston, the previous chair of the DCMI Collection Description Working Group, whose hard work over quite some time resulted in the previous, substantial versions of the DCCAP.



The conformant DCCAP is at:
http://www.dublincore.org/groups/collections/collection-application-profile/



There is a summary at:
http://www.dublincore.org/groups/collections/collection-ap-summary/



Several associated documents are:
Collections Type Vocabulary: http://www.dublincore.org/groups/collections/colldesc-type/

Collections Terms: http://www.dublincore.org/groups/collections/collection-terms/

Accrual Policy terms: http://www.dublincore.org/groups/collections/accrual-policy/

Accrual Method terms: http://www.dublincore.org/groups/collections/accrual-method/

Frequency Vocabulary: http://www.dublincore.org/groups/collections/frequency/


14 February 2007

e-Framework Services Knowledgebase

On Monday 12 February I attended the JISC e-Framework Modelling Workshop. From one of the presentations, about 'Using and Contributing to the e-Framework' (http://www.e-framework.org/), I learnt that the e-Framework activities include developing a knowledge base of Services. The e-Framework is based on a service oriented architecture (SOA), advocating this approach to save development costs by promoting re-use of existing solutions. This does not mean a restriction to SOAP Web Services: many other service protocols that are designed to share data are recognised, such as RSS and Z39.50.


Projects are encouraged to register their services in the knowledgebase, both non-technical information about what a service does and technical information about how to write such a service. Benefits of this knowledgebase are that developers and institutions can see what others have already done and can possibly re-use contributed software and intelligence, though there is no exclusivity about a particular solution. This knowledgebase should be of value to developers, both as an information source and as a channel through which to distribute their work, if it contains: information on coding with interoperable standards, such as what others have done and specifics such as message sizes; information on data semantics; and links to actual projects and services.


Service details are submitted via a template - a Word document (maybe they need some SOA here...) - and there is a QA process before publication. Services are classified by a genre taken from a vocabulary defined by the e-Framework. The genre essentially describes broadly what the service does, eg Search. Also captured is a 'service expression'. This is a specialisation of a service genre, binding to a particular service, ie. a specified way of doing something. These expressions are standards-based but with more detail for an implementer than the standard may provide, and possibly covering more than one standard (eg. Z39.50 and CQL). The template asks for details about implementation, eg. toolkit used, and also service instances (concrete usable services) and interfaces (eg. WSDL).


The knowledgebase also registers Service Usage Models (SUMs), which are composite services, composed of more basic services. The description of these will include details of process modelling, choreography and workflows, possibly using a business process modelling language, or possibly by UML diagrams. Registration of these SUMs within the e-Framework should assist in identifying commonly recurring processes that could be re-used, known as Core SUMs.


I came away from this event wondering how this relates to IESR and service registries generally. They seem to be registering services as details of their software components, rather than how to call and use them. And what is recorded is documentation, not actual software for downloading. But the registration of service interfaces and of service instances, even if these are just examples, seems to overlap somewhat with IESR's remit and content. It seems to me that there should be at least some conversation to discuss commonality. Maybe there could be some data sharing to some extent. Of course a big difference is that the e-Framework knowledgebase appears to be aimed at human readers (those who want to develop a service, or those looking for a service to re-use), whereas the primary purpose of IESR is to provide details to machines as a middleware service.


I considered the e-Framework service genre list when thinking about such a list for IESR a couple of weeks ago. But it seemed to me to be too fine-grained in places and very e-learning biased. In fact some of the finer-grained terms do not seem to fit with the aim of the service genre being the 'big picture' of a service's functionality.


At the start of the workshop we were treated to a showing of an animation that has been developed to demonstrate the benefits of the SOA approach.
See http://www.jisc.ac.uk/whatwedo/programmes/programme_eframework/soa. Enjoy...
