Friday, 30 November 2007
Tuesday, 27 November 2007
Rob Lemmens from the International Institute for Geo-Information Science and Earth Observation talked about end-user tools. He contrasted the centralised approach of corporate/national Spatial Data Infrastructures (SDIs) with the community-driven approach of Web 2.0. SDIs are based on stricter rules for annotation, and their accuracy tends to be higher than that of Web 2.0 tools, although this is changing. Rob outlined the need for a semantic interoperability framework (a combination of ontologies, their relationships and methods for ontology-based description of information sources - data sets, services etc.) and a semantic interoperability infrastructure (comprising the framework plus the tools to maintain and use it, as well as the information sources produced within it). Rob's presentation also included a good slide outlining the characteristics of an ontology, and a demonstration of ontology visualisation (the same tool which ASSERT is using for clustering?). Rob concluded by summarising what the geospatial community can learn and take from Web 2.0, for example tagging/tag clouds, tools for building ontologies (community tagging, e.g. Google Image Labeler) and instant feedback (e.g. password strength bars when selecting a new password) - on the negative side, community-driven tagging can lead to weak semantics. Rob suggests combining the best of both the SDI and Web 2.0 worlds - mapping the SDI and Web 2.0 ontologies to create dynamic annotations of geo sources, thus improving discovery.
Ulrich Bugel from the Fraunhofer Institut IITB presented on ontology-based discovery and annotation of resources in geospatial applications. Ulrich talked about the ORCHESTRA project (http://www.eu-orchestra.org/), which aims to design and implement an open service-oriented architecture to improve interoperability in a risk management setting (e.g. how big is the risk of a forest fire in a certain region of the Pyrenees in a given season?). This question has spatial references (cross-border, cross-administration); temporal references (time series and prognostics); a thematic reference (forest fire); and a conceptual reference (what is risk?). ORCHESTRA will build a service network to address these sorts of questions. Interoperability is discussed on 3 levels - syntactic (encodings), structural (schemas, interfaces) and semantic (meaning). The project has produced the Reference Model for the ORCHESTRA Architecture (RM-OA), drawing on standards from OGC, OASIS, W3C, ISO 191xx and ISO RM-ODP; the Reference Model went through many iterations, leading to Best Practice status at OGC. The ORCHESTRA Architecture comprises a number of semantic services: an Annotation Service, which automatically generates meta-information from sources and relates it to elements of an ontology; an Ontology Access Service, enabling high-level access and queries to ontologies; a Knowledge Base Service; and a Semantic Catalogue Service.
Ian Holt from Ordnance Survey presented on geospatial semantics research at OS. OS has one of the largest geospatial databases, unsurprisingly, with 400 million features and over 2000 concepts. Benefits of semantics research: quality control and better classification; semantic web enablement; semi-automated data integration; data and product repurposing; and data mining - i.e. benefits to OS and to customers. OS has developed a topographic domain ontology which provides a framework for specifying content: www.ordnancesurvey.co.uk/ontology. It has developed ontologies for hydrology, administrative geography, and buildings and places; work is underway on addresses, settlements and land forms, with supporting modules on mereology, spatial relations and network topology. A distinction is drawn between a conceptual ontology - knowledge represented in a form understandable by people - and a computational ontology - knowledge represented in a form understandable by computers. A controlled natural language called Rabbit has been developed - structured English, compilable to OWL. OS is also part of the OWL 1.1 task force to develop a controlled natural language syntax. A project currently underway with Leeds University is developing a plug-in for Protege which allows natural language descriptions and, in the back end, will translate them into an OWL model. The first release is scheduled for December with a further release planned for March 08. Ian also talked about experimental work to semantically describe gazetteers - an RDF version (downloadable?) to represent the data and an OWL ontology to describe the concepts. This work includes administrative regions, with work underway to include cities etc. Through their work, OS has experienced some problems with RDF - e.g. it may degrade performance (they have >10 billion triples); how much is really needed? Ian described some work on semantic data integration, e.g. "find all addresses with a taxable value over £500,000 in Southampton", looking at how to merge ontologies (i.e. creating another ontology rather than interoperability between the two). Ian briefly covered some lessons learned - ontologies are never perfect and can't offer complete descriptions of any domain; automatic tools are used as far as possible. Ian also described work on linking ontologies to databases using D2RQ, which maps SPARQL queries to SQL, creating "virtual" RDF. Conclusions: domain experts need to be at the centre of the process; technology transfer is difficult - the benefits of semantics in products and applications must be clarified.
Alun Preece from Cardiff University presented on an ontology-based approach to assigning sensors to tasks. The idea is to bridge the gap between people out in the field needing to make decisions (e.g. disaster management) and the data/information produced from networks of sensors and other sources. Issues tackled: data orchestration (determine, locate and characterise the resources required); reactive source deployment (repurpose, move, redeploy resources); and push/pull data delivery. The approach is ontology-centric and involves semantic matchmaking. Work is underway on a proof of concept - the SAM (Sensor Assignment for Missions) software prototype and its integration with a sensor network. This work is funded by the US/UK to support military applications - intelligence, surveillance and reconnaissance (ISR) requirements. The work uses ontologies to specify the ISR requirements of a mission (e.g. night surveillance, intruder detection) and to specify the ISR capabilities provided by different asset types. It uses semantic reasoning to compare mission requirements and capabilities and to decide whether the requirements are satisfied. For example, if a mission requires Unmanned Aerial Vehicles (UAVs), the ontology would specify different types of UAV and the requirements of the mission (e.g. high altitude to fly above weather, endurance), and the semantic matchmaking (exact, subsuming, overlapping, disjoint) then leads to a preferred choice. The project has engaged with domain experts to get the information into the ontology and to share conceptualisations. Alun showed the Mission and Means Framework Ontology, a high-level ontology which is fleshed out with more specific concepts.
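The four matchmaking degrees mentioned above can be sketched as follows - a toy subsumption hierarchy stands in for full OWL reasoning, and the class names (HighAltitudeUAV etc.) are invented, not taken from the SAM ontologies.

```python
# Toy class hierarchy: child -> parent (invented names).
SUPER = {
    "HighAltitudeUAV": "UAV",
    "LongEnduranceUAV": "UAV",
    "UAV": "Asset",
}

def ancestors(c):
    """Return the class c together with all its superclasses."""
    out = {c}
    while c in SUPER:
        c = SUPER[c]
        out.add(c)
    return out

def match(required, provided):
    """Classify how a provided capability matches a required one."""
    if required == provided:
        return "exact"
    if required in ancestors(provided):
        return "subsuming"    # provided is a kind of what is required
    if provided in ancestors(required):
        return "subsuming"    # provided covers a broader class
    if ancestors(required) & ancestors(provided):
        return "overlapping"  # distinct, but share a common superclass
    return "disjoint"

print(match("UAV", "HighAltitudeUAV"))  # -> subsuming
```

In the full system the comparison is done by a description-logic reasoner over the mission and capability ontologies; the ranking of degrees (exact before subsuming before overlapping) is what leads to a preferred choice of asset.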
Slides from the workshop will be uploaded to http://www.nesc.ac.uk/action/esi/contribution.cfm?Title=832
Wednesday, 21 November 2007
"Most clearly among our three case studies, the area of Web services demonstrates the manner in which interoperability can stimulate large-scale innovation."
Friday, 16 November 2007
"Bill Pike (Pacific Northwest National Laboratory), in his presentation on integrating knowledge models into the scientific analysis process [...] described the challenge of trying to capture scientific knowledge as it is created, with workflow models that describe the process of discovery. In this way, the knowledge of what was discovered can be connected with the knowledge of how the discovery was made."
"If future generations of scientists are to understand the work of the present, we have to make sure they have access to the processes by which our knowledge is being formed. The big problem is that, if you include all the information about all the people, organisations, tools, resources and situations that feed into a particular piece of knowledge, the sheer quantity of data will rapidly become overwhelming. We need to find ways to filter this knowledge to create sensible structures... "
"One method for explicitly representing knowledge was presented by Alberto Canas (Institute for Human and Machine Cognition). The concept maps that he discussed are less ambiguous than natural language, but not as formal as symbolic logic. Designed to be read by humans, not machines, they have proved useful for finding holes and misconceptions in knowledge, and for understanding how an expert thinks. These maps are composed of concepts joined up by linking phrases to form propositions: the logical structure expressed in these linking phrases is what distinguishes concept maps from similar-looking, but less structured descriptions such as "mind maps". "
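The concept-map structure described in the quote - concepts joined by linking phrases to form propositions - can be represented directly as triples. A minimal sketch, with invented map content:

```python
# Each proposition is (concept, linking phrase, concept); the linking
# phrase carries the logical structure that distinguishes a concept map
# from a less structured mind map.
propositions = [
    ("concept maps", "are composed of", "concepts"),
    ("concepts", "are joined by", "linking phrases"),
    ("linking phrases", "form", "propositions"),
]

# Designed to be read by humans: each triple reads as a sentence.
sentences = [f"{a} {link} {b}" for a, link, b in propositions]
print(sentences[0])  # -> concept maps are composed of concepts
```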
"..analyst Gartner predicts that four out of five companies will have taken the SOA route by 2010...SOA involves a fundamental change to the way firms think about IT - namely, as a series of interoperable business services, rather than as discrete IT systems."
The article also quotes Nick Masterton-Jones, IT Director of Vocalink: "I think SCA is something we're going to see a lot more of in the coming three years." SCA is Service Component Architecture, "an open SOA promoted by major Java vendors to bridge the gap between people who understand the business domain and people who understand system design".
Monday, 12 November 2007
OECD Principles (2007): http://www.oecd.org/document/55/0,3343,en_2649_37417_38500791_1_1_1_37417,00.html
RIN's Stewardship of Digital Research Data (2007): http://www.rin.ac.uk/data-principles
MRC's Guidelines on data sharing: http://www.mrc.ac.uk/PolicyGuidance/EthicsAndGovernance/DataSharing/PolicyonDataSharingandPreservation/index.htm
BBSRC's Guidelines on data sharing: http://www.bbsrc.ac.uk/support/guidelines/datasharing/context.html
Plus some interesting outputs from JISC-funded projects:
- Liz Lyon's Dealing with Data report: http://www.jisc.ac.uk/whatwedo/programmes/programme_digital_repositories/project_dealing_with_data.aspx. A very comprehensive overview with a list of recommendations, now being reviewed by JISC.
- GRADE project: http://edina.ac.uk/projects/grade/Grade_reportRSSv2.pdf. Found that researchers most commonly use USB sticks and email to share small datasets. Also noted that, as well as enabling sharing/preservation, a national repository would enable the UK to contribute to European and other international initiatives.
- DISC-UK Datashare : http://www.disc-uk.org/docs/state-of-the-art-review.pdf. One interesting finding reported is from Australian colleagues who found that a single repository wasn't proving effective and they subsequently moved towards two distinct repositories: one to enable collaboration on work-in-progress and one for published outputs/datasets.
There's a lot in these links about the wider context; how things look now; barriers to data sharing (e.g. trust, IPR, time); discussion on possible solutions (e.g. social software models, reward/recognition, mandates).
Friday, 9 November 2007
- Savas Parastatidis from Microsoft talking about "the cloud" http://www.ogf.org/OGF21/materials/1031/2007.10.15%20-%20OGF%20-%20Web%202.0-Cloud%20Era%20and%20its%20Impact%20on%20how%20we%20do%20Research.pdf
- Dave de Roure talking about the JISC-funded myExperiment (VRE2) http://www.ogf.org/OGF21/materials/1030/OGF21myExperiment.ppt
And of course not forgetting the geospatial stuff...
"1.) Integrate OGC's Web Processing Service (WPS) with a range of "back-end" processing environments to enable large-scale processing. The WPS could also be used as a front-end to interface to multiple grid infrastructures, such as TeraGrid, NAREGI, EGEE, and the UK's National Grid Service. This would be an application driver for both grid and data interoperability issues.
2.) Integration of WPS with workflow management tools. OGF’s SAGA draft standard is where multiple WPS calls could be managed.
3.) Integration of OGC Federated Catalogues/Data Repositories with grid data movement tools. OGF’s GridFTP is one possibility that supports secure, third-party transfers that are useful when moving data from a repository to a remote service.
However, the real goal is not just to do science, but to greatly enhance things like operational hurricane forecasting, location-based services, and anything to do with putting data on a map. WPS is just a starting point for the collaboration. As the two organizations engage and build mutual understanding of technical requirements and approaches, many other things will be possible. "
Thursday, 8 November 2007
Key issues and lessons
- Projects should use wikis/websites to enable tracking of work through the development lifecycle
- Be prepared to adapt templates for project documentation
- As a Programme Manager, you may need more regular/frequent engagement with projects - the 6-monthly progress report is not going to be sufficient
Useful links (in no particular order)
"The LIFE Project has recently published a revised model for lifecycle costing of digital objects." The project team is looking for comments via the project blog.
More info at:
"King's College London is pleased to announce the establishment of the KCL Centre for e-Research. Based in Information Systems and Services, the Centre will lead on building an e-research environment and data management infrastructure at King's, seeking to harness the potential of IT to enhance research and teaching practice across the College. The Centre also has a remit to make a significant contribution to national, European and international agendas for e-research, and in particular to carry forward in a new context the work of the AHDS across the arts and humanities.
Planning for the new Centre began on 1st October 2007 and a major launch event is planned for Spring 2008. Further information and news about the Centre and its activities will be released over the coming months."
Wednesday, 7 November 2007
- a news item, Search and aggregators set to dominate, on the recent Outsell Information Industry Outlook report:
"Watson Healy said 2008 would be 'year of the wiki', with Web 2.0 technology replacing complex portals and knowledge management, and that 'a critical mass of information professionals would take charge of wikis, blogs or other 2.0 technologies on behalf of their organisations'."
- an item, PubMed recasts rules for open access re-use, on the new guidelines recently agreed by the UK PubMed Central Publishers Panel:
"Under the terms of the statement of principles, open access (OA) published articles can be copied and the text data mined for further research, as long as the original author is fully attributed".
Tuesday, 6 November 2007
There've been several publications from this programme of work recently:
- User needs study: How JISC could support Business and Community Engagement
- Evaluation report: JISC Services and the third stream
- Final report: Study of Customer Relationship Management issues in UK HE institutions
- Study: The use of publicly-funded infrastructure, services, and intellectual property for BCE
- Business and Community Engagement: An overview of JISC activities (PDF)
Friday, 2 November 2007
Good to see NaCTeM :-) A good overview of the current services and a run-through of their roadmap:
"NaCTeM's text mining tools and services offer numerous benefits to a wide range of users. These range from considerable reductions in time and effort for finding and linking pertinent information from large scale textual resources, to customised solutions in semantic data analysis and knowledge management. Enhancing metadata is one of the important benefits of deploying text mining services. TM is being used for subject classification, creation of taxonomies, controlled vocabularies, ontology building and Semantic Web activities. As NaCTeM enters into its second phase we are aiming for improved levels of collaboration with Semantic Grid and Digital Library initiatives and contributions to bridging the gap between the library world and the e-Science world through an improved facility for constructing metadata descriptions from textual descriptions via TM."
Other interesting snippets:
- SURFshare programme covering the research lifecycle http://www.surffoundation.nl/smartsite.dws?ch=ENG&id=5463
- a discussion on the use of Google as a repository : "Repositories, libraries and Google complement each other in helping to provide a broad range of services to information seekers. This union begins with an effective advocacy campaign to boost repository content; here it is described, stored and managed; search engines, like Google, can then locate and present items in response to a search request. Relying on Google to provide search and discovery of this hidden material misses out a valuable step, that of making it available in the first instance. That is why university libraries need Google and Google needs university libraries."
- feedback from the ECDL conference, including a workshop on a European repository ecology, featuring a neat diagram showing how presentations are disseminated after a conference using a mix of Web 2.0, repositories and journals: http://www.ariadne.ac.uk/issue53/ecdl-2007-rpt/#10