Schemas and Ontologies: Building a Semantic Infrastructure for the Grid and Digital Libraries Workshop Report from E-Science Institute, Edinburgh

Library Hi Tech News

ISSN: 0741-9058

Article publication date: 1 July 2003


Citation

Shiri, A. (2003), "Schemas and Ontologies: Building a Semantic Infrastructure for the Grid and Digital Libraries Workshop Report from E-Science Institute, Edinburgh", Library Hi Tech News, Vol. 20 No. 7. https://doi.org/10.1108/lhtn.2003.23920gac.003

Publisher: Emerald Group Publishing Limited

Copyright © 2003, MCB UP Limited



Ali Shiri

Introduction

Organised jointly by UKOLN and the UK e-science Core Programme, this workshop brought together GRID and digital library implementers on 16 May 2003 in Edinburgh, Scotland to consider approaches to developing, expressing and sharing schemas and ontologies[1]. The main themes of the workshop were as follows:

  • Developing core vocabularies.

  • Mapping between vocabularies.

  • Sharing languages and models for declaring schemas.

  • Mandating schemas for base-line interoperability.

  • Establishing a distributed network of schema registries.

The workshop comprised five presentations[2] and three breakout sessions. There were around 50 participants representing a wide range of communities and disciplines, drawn mainly from computing science departments, the UK Office for Library and Information Networking (UKOLN), the e-science Institute in Edinburgh and the Centre for Digital Library Research (CDLR), together with a number of librarians and information scientists from other organisations.

The first presentation, "Building a semantic infrastructure" by David de Roure from the University of Southampton, discussed issues surrounding "the semantic GRID", which he defined as "an underlying computer infrastructure which enables scientists to generate, analyse, share and discuss their insights, experiments and results in a more effective manner". The focus of the semantic GRID is on ontologies and metadata-based middleware that underpin resource sharing and interoperability. One example of GRID computing is a project called "Hyphen" (www.hyphen.info), a semantic Web project at the University of Southampton that provides ontology-based access to home pages, departmental Web sites and other information repositories of computer science departments throughout the UK.

"Why ontologies?" was the title of the presentation by Jeremy Rogers from the University of Manchester, whose particular focus was on medical terminologies and ontologies. The presentation provided an historical overview of medical terminologies and discussed issues relating to the breadth and depth of each terminology. The terminologies referred to included MeSH, UMLS and SNOMED. The presentation also shed light on the problems of using medical thesauri and ontologies from the user's point of view and indicated the challenges involved in ontology mapping and the visual representation of concepts. The major issues raised at the end of the presentation were that cross mapping, browsing and translating lists all suffer from problems of scale, and that formal ontologies with formal logic may not be the total solution, as users may approach them differently and with different mindsets.

The third presentation, "Publishing and sharing schemas overview" by Rachel Heery and Pete Johnston from UKOLN, University of Bath, provided an account of schema registries and the ways in which they can be used for resource discovery and re-use. A schema, according to the presentation, is "a structured representation that defines and identifies the data elements in a metadata element set such as MARC, Dublin Core". The Metadata for Education Group (MEG) registry was also discussed as a JISC-funded project to offer an interactive environment for schema creation and to enable metadata schema creators to declare their own schemas. The presentation also included a discussion of how ontologies could relate to metadata schemas and the ways in which ontologies and schemas could share and reuse their resources. The main points of similarity were a shared interest in integrating multiple schemas and ontologies, a shared interest in developing languages to express semantics, and the possibility of shared tools. Differences were also highlighted in scale, complexity and semantic focus: for instance, metadata schemas focus on definitions, whereas ontologies focus on relationships between terms. Details of the developmental work on the MEG registry can be found at: www.ukoln.ac.uk/metadata/education
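The contrast between schemas and ontologies drawn in this presentation can be illustrated with a minimal sketch. It is not taken from the workshop: the element names follow Dublin Core, but the definitions are paraphrased, and the ontology terms, the relation names and the helper functions `define` and `related` are assumptions made purely for illustration.

```python
# Hypothetical illustration: a schema records what each element means,
# whereas an ontology records how terms relate to one another.

# A metadata schema as a mapping from element name to definition.
# Element names follow Dublin Core; definitions are paraphrased.
schema = {
    "title": "A name given to the resource.",
    "creator": "An entity primarily responsible for making the resource.",
    "subject": "The topic of the resource.",
}

# An ontology as (subject, relation, object) statements.
# Terms and relation names are invented for this sketch.
ontology = [
    ("thesaurus", "is_a", "knowledge organisation system"),
    ("ontology", "is_a", "knowledge organisation system"),
    ("knowledge organisation system", "used_in", "digital library"),
]


def define(element):
    """Return the definition the schema gives for an element."""
    return schema[element]


def related(term, relation):
    """Return every term linked to `term` by `relation` in the ontology."""
    return [obj for subj, rel, obj in ontology if subj == term and rel == relation]


if __name__ == "__main__":
    print(define("creator"))
    print(related("thesaurus", "is_a"))
```

A schema registry of the kind described would publish mappings like the first structure, while shared tools for ontologies would need to traverse relationships like those in the second.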

Carole Goble from the University of Manchester delivered her presentation on "Implementing ontologies in GRID environments", in which she elaborated on the GRID environment and in particular on myGRID, an EPSRC e-science pilot project in bioinformatics. The aim of the project is to develop high-level services for e-science experimental management. myGRID aims to provide an environment in which all bio services, such as databases, archives, tools and instruments, can be shared through a flexible architecture within which interoperable metadata and ontologies play a major part. All services within this context should be described in detail using the agreed and shared schemas and ontologies. The presentation also raised a host of open questions, which touched on such issues as metadata models, defining semantic types for services, and mapping between ontologies and metadata.

Douglas Tudhope from the University of Glamorgan gave the last presentation, titled "Knowledge organisation systems (KOS)". He provided a taxonomy of knowledge organisation systems, covering such categories as term lists, thesauri, subject headings, classification schemes and ontologies, together with their characteristics. The presentation also touched on issues surrounding the cross mapping, browsing and searching of thesauri and ontologies and the integration of knowledge organisation systems in digital libraries. One of the interesting issues in the presentation was the idea of a "possible KOS-based terminology server within the JISC Information Environment", raised with no reference to the HILT project (http://hilt.cdlr.strath.ac.uk/index2.html), which is investigating and developing precisely such a pilot terminologies server with the JISC environment in view. The role of RDF and XML in the representation of knowledge organisation systems was also highlighted. He discussed the concept of facet analysis and an ongoing research project at the University of Glamorgan School of Computing (www.glam.ac.uk/soc/research/hypermedia/facet_proj/index.php) on applying facet analysis theory to multi-concept terms. The presentation concluded with an emphasis on the need to consider all the factors influencing ontology-based or thesaurus-based retrieval, such as the importance of indexing, differences in the specificity and exhaustivity of indexing, and cost-benefit issues when enriching vocabularies, as well as the need to move beyond the minimal assumptions that current Web search engines make about users, queries, query structure and collections.

Following the above presentations, three breakout sessions were held to discuss the main challenges associated with building, using and evaluating ontologies. The main points raised at each session are provided after each theme:

  1. What are the barriers to sharing ontologies?

    • Knowledge models are evolving.

    • Tools for building and sharing ontologies.

    • The extent to which the ontology can deal with search terms from users.

    • Users' cultural and educational backgrounds.

    • User studies and research into user behaviour and interaction with ontologies.

    • Quality of ontologies and the way quality can be measured.

    • Lack of awareness of other communities.

  2. Software tools and shared services – how can existing tools and infrastructure be improved?

    • What are the ontologies for?
      – cross searching;
      – cross browsing;
      – query expansion.

    • What do we expect from ontologies?

    • Depends on whether it is for users or for machines.

    • What functions do we expect from ontologies?
      – disambiguation of search terms;
      – mapping;
      – improve search precision.

    • Services associated with ontologies include:
      – browsing;
      – sharing (machine to machine).

    • What questions will the user be asking?
      – hierarchical support;
      – refining queries.

  3. The process of building a community-led ontology – how can we maximise usage and relevance?

    • Cost-benefit issues, for instance the cost of having humans generate ontologies (£40 per concept).

    • Mapping between terminologies.

    • How to involve people and communities to develop ontologies.
The workshop concluded by summarising the main points of the breakout sessions. A number of future actions were broadly defined in order to provide a basis for more specific projects and initiatives in the area of building, using and evaluating ontologies.

Future actions

  • Research into user behaviour.

  • How people use ontologies.

  • How people build ontologies.

  • How machines use ontologies.

  • Practice and experiment.

  • Quality of ontologies: benchmarks and certification.

  • Emphasis on the user from the digital library community point of view.

  • Emphasis on models, languages and tools from the knowledge community.

  • Communication between communities.

The future actions highlighted at the end of the workshop reflect the fact that ontologies are defined, built and used in different ways by a broad range of communities. However, this diversity is a source of enrichment and offers fertile ground for innovation, collaboration, and sharing experience and best practices across such communities as the semantic Web, digital libraries, and information retrieval.

Notes

  1.

  2. Slides from the above presentations can be found at the following address: http://umbriel.dcs.gla.ac.uk/nesc/action/esi/contribution.cfm?Title=163

Ali Shiri (shiri@dis.strath.ac.uk) is Senior Researcher, Centre for Digital Library Research, Department of Computer and Information Sciences, University of Strathclyde, Glasgow, UK.
