Friday 28 September 2007

Mashups

Interesting sites:

Mashup Dashboard http://www.programmableweb.com/mashups

Repositories Mashup http://maps.repository66.org/

Simile Timeline http://simile.mit.edu/timeline/

Grid and Web2.0

Article on Bill St Arnaud's blog on grid and web2.0

http://billstarnaud.blogspot.com/2007/05/why-web-20-needs-grid-computing.html

NSF call for community-based data interoperability networks

http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=502112&org=OCI&from=home

"Digital data are increasingly both the products of research and the starting point for new research and education activities. The ability to re-purpose data – to use it in innovative ways and combinations not envisioned by those who created the data – requires that it be possible to find and understand data of many types and from many sources. Interoperability (the ability of two or more systems or components to exchange information and to use the information that has been exchanged) is fundamental to meeting this requirement. This NSF crosscutting program supports community efforts to provide for broad interoperability through the development of mechanisms such as robust data and metadata conventions, ontologies, and taxonomies."

Guidance on bid writing

From JISC's eLearning Focus pages: http://www.elearning.ac.uk/features/bidwriting

Unbundling Windows

From BCS News -
"PCs sold in the European Union (EU) should not come with an operating system already installed on them, according to a new report. The publication created by the Globalisation Institute and submitted to the European commission (EC) suggests that it is not in the interests of consumers to keep selling systems that are bundled with Windows."
http://www.bcs.org/server.php?show=conWebDoc.14659

eBank report

Thanks to the UKOLN newsfeed for pointing out a new report from the eBank project - A Study of Curation and Preservation issues in the eCrystals Data Repository and proposed federation. Now in its 3rd phase, eBank is exploring preservation, curation and sustainability issues, with a view to progressing a federation of repositories in crystallography, thus ensuring data remains usable (and reusable by others).

Some of the specific issues considered in the report are:
  • Audit and certification: including a brief overview of recent work and current instruments. The recommendation is to use the lightweight version of the DRAMBORA toolkit due this month, as part of an annual self-audit cycle. It is acknowledged that short-term staff contracts and funding cycles have a detrimental effect on the organisational aspects which need to be in place to ensure sustainability. A further recommendation is to explore LOCKSS or CLOCKSS "to engage the crystallography community in the preservation of its valuable data"
  • Open Archival Information System (OAIS) standard: the report recommends that eBank develop a formal deposit, ingest, validation and dissemination policy, and that work on Representation Information look beyond the eCrystals repository to the whole crystallography domain
  • Metadata: the report recommends further exploration of provenance information, as versioning is currently the only type of information stored, and of how preservation metadata can be generated, extracted and maintained automatically

Thursday 27 September 2007

CNI news

News from the CNI points to the following interesting items this week:

CODATA Data Science journal - "Open Data for Global Science"
http://dsj.codataweb.org/special-open-data.html

CT Watch Aug 07 - "The Coming Revolution in Scholarly Communications and Cyberinfrastructure"
http://www.ctwatch.org/quarterly/

Tuesday 25 September 2007

UUK stats

Universities UK have published Patterns of higher education institutions in the UK: Seventh report which includes some useful figures. According to the report, in 2005/06, £3,120,606,000 was received by UK institutions via research grants and contracts, out of a total of £19,503,112,000.
http://bookshop.universitiesuk.ac.uk/downloads/patterns7.pdf
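
To put the two figures above in proportion, here is a minimal sketch of the share calculation. The two amounts are the ones quoted from the report; the variable and function names are just illustrative, and treating the larger figure as the overall total is an assumption based on the wording above.

```python
# Figures quoted above for 2005/06, in pounds.
research_income = 3_120_606_000   # research grants and contracts
total_figure = 19_503_112_000     # the "total" quoted above (assumed to be the overall total)


def share_percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


research_share = share_percent(research_income, total_figure)
print(f"Research grants and contracts: {research_share:.1f}% of the total")
```

So research grants and contracts come to roughly 16% of the total quoted.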

Also useful is their Research & Innovation Facts and Figures which includes income by subject area (the clear leader is clinical medicine) and trends in government expenditure on R&D
http://bookshop.universitiesuk.ac.uk/downloads/facts_research07.pdf

International Journal of Digital Earth

Just heard from the All Hands organisers about the forthcoming International Journal of Digital Earth to be launched next year by Taylor and Francis
http://www.digitalearth-isde.org/publications/journal/229.html

D-Lib articles on libraries and cyberinfrastructure

In the Sept/Oct 07 issue, both by Anna Gold...

Cyberinfrastructure, Data, and Libraries, Part 1 : A Cyberinfrastructure Primer for Librarians http://www.dlib.org/dlib/september07/gold/09gold-pt1.html

Cyberinfrastructure, Data, and Libraries, Part 2 : Libraries and the Data Challenge: Roles and Actions for Libraries http://www.dlib.org/dlib/september07/gold/09gold-pt2.html

Mashups

Posted on the Geospatial Semantic Web blog yesterday, a post about Intel's Mash Maker.

I really like Gartner's Hype Cycle of Emerging Technologies quoted on the page. A quick search turned up a 2007 version, but it's not available for free :-(

Added 15/10/07: Yahoo also have a mashup service, MapMixer.

Promoting IT as a problem-solving profession

The Sept 07 issue of IT NOW (from the British Computer Society) includes a feature on the future of the IT profession. It argues that, given the importance of IT in all that we do, the profession needs to be promoted as a "problem-solving" profession right from school age. Currently, public perception of IT is skewed towards stereotypes, and this isn't helped by schools focusing on teaching applications (e.g. Word) rather than on the contribution IT can make to wider society. The example given is NHS CFH which, if it works, will change how the NHS operates.

Friday 21 September 2007

The Long Tail

Can't remember how I came across this link but I think it was in someone else's blog, referring to Chris Anderson's forthcoming new book (called "Free" I think but could be wrong!).

Anyway, Random House have included excerpts of The Long Tail on the web here.

Thursday 20 September 2007

Digital preservation - shared service across Whitehall

Also in Information World Review, a feature on the 3-year scoping exercise led by the National Archives to devise a pan-government service to take on the task of digital preservation across Whitehall.

EU-funded news aggregator

Information World Review (Sept 07) includes a review of the European Media Monitor, which "works by compiling summaries of stories from across the web, which are clustered together and ranked depending on the number of articles that have been produced for a particular topic or language" [review by Daniel Griffin].
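
The ranking idea in that quote (group stories by topic, then order topics by how many articles each has) can be sketched roughly as below. This is only an illustration of counting and ranking: the article list is invented, the topic labels are assumed to be known already, and nothing here reflects how the European Media Monitor actually does its clustering.

```python
from collections import Counter

# Hypothetical (topic, language) labels for a handful of articles.
# Invented for illustration; a real system would have to derive the
# topic clusters from the article text.
articles = [
    ("elections", "en"),
    ("floods", "en"),
    ("elections", "fr"),
    ("elections", "de"),
    ("floods", "fr"),
    ("energy", "en"),
]


def rank_topics(items):
    """Count articles per topic and return (topic, count) pairs, largest first."""
    counts = Counter(topic for topic, _language in items)
    return counts.most_common()


ranking = rank_topics(articles)
for topic, count in ranking:
    print(f"{topic}: {count} article(s)")
```

Ranking by language instead would just mean counting on the second element of each pair.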

Wednesday 19 September 2007

Recent articles



Microsoft's Office Open XML not accepted as standard

Microsoft's bid for 'open' document standard is rebuffed
Article in International Herald Tribune about Microsoft's failed attempt to get their open document format, Office Open XML, recognised as an international standard.

OS MasterMap press release

Press release
OS MasterMap goes online for universities and colleges across Britain
Tens of thousands of students, staff and researchers at universities and further education colleges across Britain have online access to the country’s most advanced digital mapping from this month....

Report from BCS KIDDM Mash-Up

On Monday, I went along to the BCS KIDDM Knowledge Mash-up. I only stayed for the morning and was a bit disappointed that the day wasn't as interactive as the title suggested. The talks in the morning were quite high level too, but it was interesting. Came across the BCS Geospatial Group for the first time.

Peter Murray has written up some of the day's presentations on his blog.

Conrad Taylor, introducing the day, covered issues around mark-up and tagging, referring to the difficulties of marking up audio/video and unstructured text; time constraints; and difficulties of subject classification.

Tony Rose talked about information retrieval and some of the innovative approaches out there:
  • semantic searching - as demonstrated by Hakia and Lexxe
  • natural language processing - as demonstrated by Powerset and Lexxe
  • disambiguation - as demonstrated by Quintura and Ask
  • assigning value to documents - as demonstrated by Google

He sees the future of search as addressing the following:
  • rich media search
  • multi/cross lingual search
  • vertical search
  • search agents
  • specialised content search
  • human UI
  • social search
  • answer engines
  • personalisation
  • mobile search

Tom Khazaba from SPSS talked about their products for text and data mining and the various applications they're used for (CRM, risk analysis, crime prevention etc). He stressed that the results of text analysis have to be fitted into business processes and mentioned briefly how Credit Suisse have achieved this. He listed the keys to success for text/data mining solutions:
  • ease of use
  • supports the whole process
  • comprehensive toolkit - i.e. features visualisation, modelling etc, so all you need is in one place
  • openness - using existing infrastructure
  • performance and scalability
  • flexible deployment

Ian Herbert, from the Health Informatics SG, talked about the recent work on SNOMED-CT and its application in NHS CFH. SNOMED-CT will allow pre-coordinate and post-coordinate searching. The main challenge has been in capturing the depth of clinical expression. Concepts have qualifiers e.g. pain has a qualifier indicating severity. There has been some work mapping to MeSH although Ian seemed to think this wasn't complete. The key challenge facing the team now is rolling out - there are few examples of its use in a real-time environment. It remains to be seen if health professionals will take well to using it during consultations - it is quite a complex system and, as Ian admits, "users want the biggest bang for their keystroke buck".
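
As a rough illustration of the pre-/post-coordination distinction: a pre-coordinated concept already carries the combined meaning in a single term, while a post-coordinated expression is built from a base concept plus qualifiers. The sketch below is an assumption-laden toy, not anything shown in the talk; the labels are plain strings made up for the example and are not real SNOMED-CT identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """A clinical expression: a focus concept plus optional qualifiers."""
    focus: str
    refinements: tuple = ()  # (attribute, value) pairs, e.g. ("severity", "severe")


# Hypothetical labels, invented for illustration only.
pre_coordinated = Expression(focus="severe pain")
post_coordinated = Expression(focus="pain", refinements=(("severity", "severe"),))
mild_pain = Expression(focus="pain", refinements=(("severity", "mild"),))

records = [pre_coordinated, post_coordinated, mild_pain]


def matches(expr: Expression, focus: str, **qualifiers: str) -> bool:
    """True if expr has the given focus and every requested qualifier value."""
    refinements = dict(expr.refinements)
    return expr.focus == focus and all(
        refinements.get(attribute) == value
        for attribute, value in qualifiers.items()
    )


# Search on the base concept plus a qualifier.
severe_pain = [r for r in records if matches(r, "pain", severity="severe")]
print(severe_pain)
```

Note that this naive search misses the pre-coordinated "severe pain" record, which hints at why searching across both styles needs some way of treating equivalent expressions as the same thing.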

Dan Rickman introduced geospatial information systems. He referred to the importance of metadata and ontologies for handling the large volumes of unstructured data. In geospatial information, there is also a temporal aspect as many applications will view an area over time. He mentioned OS' work on a Digital National Framework which has several principles:
  • capture information at the highest resolution possible
  • capture information once and use many times
  • use existing proven standards etc

Dan also mentioned issues around object-based modelling. The OS has developed TOpographical IDentifiers (TOIDs) to identify every feature in Britain. He also mentioned the Basic Land and Property Unit (BLPU) which would be used to describe, for example, a hospital (which may have many buildings, "owned" by different organisations). He also talked about neogeography which has arisen from the web2.0 explosion.

Sunday 16 September 2007