- From BBC News: HP, Intel and Yahoo collaborate on cloud computing, initially with 6 data centres working with a select group of researchers - http://news.bbc.co.uk/1/hi/technology/7531352.stm
- Microsoft demonstrates its Sphere prototype - http://blog.seattlepi.nwsource.com/microsoft/archives/144629.asp
Wednesday, 30 July 2008
Cognition launches Semantic Medline
"...enables complex health and life science material to be rapidly and efficiently discovered with greater precision and completeness using natural language processing (NLP) technology"
I tried a quick search, "exercise and depression", just to see it working. Results are mostly relevant on the first couple of pages. It does let you select the correct meaning, e.g. of depression (feeling of sadness/hopelessness), but still seems to bring up records referring to other meanings (e.g. ST segment depression) - although I guess it's impossible to avoid that. The definitions might be more useful if sourced from a medical dictionary, which they don't appear to be. It would be interesting to compare results using MeSH.
Given that my search retrieved over 7000 results, it would also be useful to have some options for narrowing the search, such as suggested additional search terms (are you interested in a particular population, e.g. postnatal?).
Monday, 28 July 2008
"The mobile web has reached a "critical mass" of users this year, according to a report by analysts Nielsen Mobile.
The US is the most tech savvy nation with nearly 40 million Americans - 16% of all US mobile users - using their handset to browse on the move.
The UK and then Italy come a close second and third in the 16 countries surveyed by the analyst firm.
'PC internet users visit more than 100 domains per month, on average,' the report said.
'By contrast, the average mobile internet user in the US visited 6.4 individual websites per month.' UK use was slightly less at 5.5 per month."
Clearly, this has implications for how to deliver content effectively ... mobile could be a good way of delivering alerts, prompts, small chunks of quality content, bitesize e-learning...
Friday, 25 July 2008
"The Open Web Foundation's goal it to provide a home for community created specs. with mentorship, resources and infrastructure. Hopefully this will help teams spend time on making the spec."
ps Thanks to Ian for pointing this out
Thursday, 24 July 2008
- Google launched Knol this week, taking on Wikipedia although it does take a different approach, making authors more visible than on Wikipedia, with more emphasis on authority and reputation. Individuals can contribute but I'm not clear how contributions are validated - it recommends contributors write a bio to establish credentials and you can set permissions for others to edit your "knol" - but essentially it seems to be up to the reader to judge based on the writer's credentials. It also lets writers select IPR options, defaulting to Creative Commons. A lot of the knols there now relate to health so I'd be interested to know more about their quality framework.
- Steve Prentice from Gartner tells the BBC that the days of interacting with your computer via your mouse are numbered
- New Scientist reports "UK to get superfast broadband by 2012" (speeds of up to 100 megabits per second)
- CILIP Gazette 11-24 July includes a feature on the latest TFPL Connect event, exploring implications of a recent CMI report on the world of work in 2018. Delegates discussed the move towards portfolio working; the role of knowledge managers; flexible working; increasing emphasis on "alliance-building", strategic planning and political skills.
- Central Office for Information releases guidelines on inclusion for public sector websites
- Interesting article reporting on James Evans' research in Science, "Great minds think (too much) alike", suggesting that access to more journal literature is actually resulting in fewer citations
- Article in Times Higher reporting on the suggestion by Bahram Bekhradnia, director of the Higher Education Policy Institute, that HEFCE's new Research Excellence Framework should be based on peer review, not solely on data metrics
Tuesday, 22 July 2008
- Government web sites - Government unsure how many there are, how expensive they are and who is using them http://www.theregister.co.uk/2008/04/29/government_websites_uncontrolled/
- Wired story on data deluge and impact on science http://www.wired.com/science/discoveries/magazine/16-07/pb_theory
- some interesting thoughts on community building from Stan Garfield http://www.communities.hp.com/online/blogs/garfield/archive/2008/06/18/building-people-to-people-networks.aspx
- from Computing 10 July, Gartner predict a move from 1% to 20% of corporate mailboxes using a "cloud-computing provisioned model" for email by 2012
- MoSCoW tool for requirements gathering - http://en.wikipedia.org/wiki/MoSCoW_Method
- Information Commissioner's Office calls for review of 10-year-old Data Protection Act
- Do you speak Geek? from BCS :-) http://www.bcs.org/server.php?show=ConWebDoc.20051
- interesting post on one of the BCS blogs about use of the internet (notably web2.0) in healthcare and the issue of quality/integrity of information posted and found both by patients and professionals - http://www.bcs.org/server.php?show=ConBlogEntry.480
- from JISC IPR newsletter, The European Commission has adopted a Recommendation on the management of intellectual property in knowledge transfer activities by universities and other public research organisations - http://www.jisc.ac.uk/whatwedo/projects/ipr/iprconsultancy/newsletter30.aspx
- New report from RIN on information handling by researchers. From the press release, "Although developing the personal, professional and career management skills of researchers is currently high on the agenda in the UK’s higher education sector, training on information seeking and information management is uncoordinated and generally not based on any systematic assessment of needs, according to a new report from the Research Information Network (RIN). A greater effort is required to ensure that training provision is more effectively coordinated and managed by agents with an interest in this agenda: libraries and other information training providers, institutional and faculty research committees within universities, central training units and research funders." Mind the Skills Gap: information-handling training for researchers (www.rin.ac.uk/training-researchinfo).
- Article by David Lewis on what libraries should be doing in the current climate to curate content - http://www.ala.org/ala/acrl/acrlpubs/crlnews/backissues2008/may08/librarybudgetsscholcomm.cfm
Friday, 18 July 2008
- Microsoft buys up Powerset, in its attempt to take on Google
- HEFCE announces 22 pilot institutions to test the new REF (http://www.timeshighereducation.co.uk/story.asp?sectioncode=26&storycode=402609)
- NHS Choices selects Capita as preferred bidder
- Google is experimenting with a Digg-like interface
- Amazon S3 experienced service outage on 20 July - one of the risks of relying on the cloud, I guess
- Encyclopaedia Britannica goes wiki
- Proquest to acquire Dialog business from Thomson Reuters
- "Information: lifeblood or pollution?" has some interesting thoughts about when information has value and when there is so much information it loses its value. Jakob Nielsen is quoted: 'Information pollution is information overload taken to the extreme. It is where it stops being a burden and becomes an impediment to your ability to get your work done.' Possible solutions are rating the integrity of information and clearer provenance.
- "International initiative licenses resources across 4 European countries" - about a deal negotiated via the Knowledge Exchange with Multi-Science, ALPSP, BioOne, ScientificWorldJournal, and Wiley-Blackwell.
- A fun way of describing the amount of data Google handles
Thursday, 17 July 2008
Session 1 - Legal and policy issues
This session followed the format of a debate, with Prof Charles Oppenheim arguing for the motion that institutions retain IPR and Mags McGinley arguing that IPR should be waived (with the disclaimer that neither presenter was necessarily representing their personal or institution's views).
Charles argued that institutional ownership encourages data sharing. Curation should be done by those with the necessary skills. Curation involves copying and can only be done effectively where the curator knows they are not infringing copyright, so the IPR needs to be owned "nearby". He also explained how publishers are developing an interest in raw data repositories and wish to own the IPR on raw as well as published data. There is a real need to discourage authors from blindly handing over the IPR on raw data. He suggested a model where the author is licensed to use and manipulate data (e.g. deposit in repository) and has the right to intervene should they feel their reputation is under threat. The main argument focused on preventing unthinking assignment of rights to commercial publishers.
Mags suggested that curation is best done when no-one asserts IPR. There may in fact be no IPR to assert, and she explained that there is often over-assertion of rights. There is in general a lot of confusion and uncertainty around IPR, which leads to poor curation - Mags suggested the only way to prevent this confusion is to waive IPR altogether. More than ever, data is now the result of collaboration relying on multiple (and often international) sources, so unravelling the rights can be very difficult - there could be many, even hundreds of, owners across many jurisdictions. Mags concluded with the argument that it is easier to share data which is unencumbered by IPR issues and cited the examples of Science Commons and CC0.
A vote at this point resulted in: 5 for the motion supporting institutional ownership; 10 against; 7 abstaining.
A lively discussion followed - here are the highlights:
- it's important to resolve IPR issues early
- NERC model - researchers own IPR and NERC licenses it (grant T&Cs)
- in order to waive your right, you have to assert it first
- curation is more than just preservation - the whole point is reuse
- funders have a greater interest in reuse than individual researchers - also have the resources to develop skills and negotiate T&Cs/contracts
- not just a question of rights but responsibilities too
- issues of long-term sustainability e.g. AHDS closure
- incentives to curate - is attribution enough?
- what is data? discussion covered a range of data including primary data collected by the researcher, derived data and published results
- are disciplines too different?
- duty to place publicly funded research in the public domain? use of embargoes?
- can we rely on researchers and institutions to curate?
- "value" of data?
- curation doesn't necessarily follow ownership - may outsource
- proposal to change EU law on reuse of publicly funded research - HE now exempt - focuses on ability to commercially exploit - HEIs may have to hand over research data??
Session 2 - Capacity and skills issues
This session looked at 4 questions:
- What are the current data management skills deficits and capacity building possibilities?
- What are the longer term requirements and implications for the research community?
- What is the value of and possibilities for accrediting data management training programmes?
- How might formal education for data management be progressed?
Points raised in discussion:
- who are we trying to train? How do we reach them? The need for training has to appear on their "radar" - best way to reach researchers is via lab, Vice-Chancellor, Head of School or funding source.
- training should be badged e.g. "NERC data management training"
- "JISC" and "DCC" less meaningful to researchers
- a need to raise awareness of the problem first
- domain specific vs generic training
- need to target postgrads and even undergrads to embed good practice early on
- need to cover entire research lifecycle in training materials
- how is info literacy delivered in institutions now? can we use this as a vehicle for raising awareness or making early steps?
- School of Chemistry in Southampton has accredited courses which postgrads must complete - these include an element of data management
- lack of a career path for "data scientists" is a problem
- employers increasingly looking for Masters graduates as they are perceived to be better at info handling
- new generation of students - have a sharing ethic (web2.0) but not necessarily a sense of structured data management
- small JISC-funded study to start soon on benefits of data management/sharing
- can we tap into records management training? a role here for InfoNet?
- can we learn from museums sector? libraries sector?
- Centre for eResearch at King's are developing a "Digital Asset Management" course, to run Autumn 09
- UK Council of Research Repositories has a resource of job descriptions
- role of data curators in knowledge transfer - amassing an evidence base for commercial exploitation
- also a need for marketing data resources
Session 3 - Technical and infrastructure issues
This session explored the following questions:
- what are the main infrastructure challenges in your area?
- who is addressing them?
- why are these bodies involved? might others do better?
- what should be prioritised over the next 5 years?
Other areas touched on included:
- the role of the academic and research library
- roles and responsibilities for data curation
- how can we anticipate which data will be useful in the future?
- What is ‘just the right amount of effort’?
- What are the selection criteria – what value might this data have in the future (who owns it, who’s going to pay for it), and how much effort and money would it take to regenerate this data (e.g. do you have the equipment and skills to replicate it)?
- not all disciplines are the same therefore one size doesn't fit all
- what should be kept? data, methodology, workflow, protocol, background info on researcher? How much context is needed?
- how much of this context metadata can be sourced directly e.g. from proposal?
- issues of ownership determine what is stored and how
- what is the purpose of retaining data - reuse or long-term storage? Should a nearline/offline storage model be used? Infrastructure for reuse may be different from that for long-term storage.
- Should we be supporting publication of open notebook science (and publishing of failed experiments)? What about reuse/sharing if there are commercial gains?
- within a research environment – can we facilitate data curation using the carrot of sharing systems? (IT systems in the lab)
- additional context beyond the metadata
- how do we help institutions understand their infrastructural needs?
- what has to happen with the various dataset systems (Fedora etc.) to help them link with the library and institutional systems?
Tuesday, 8 July 2008
"Information is both more extensive than data and many instances of it are logically stronger than data. Information is irreducible to data. [...] This makes knowledge and information synonymous. Knowledge and information collapse into each other"
"And the wise person must not only have wide appropriate knowledge, but they must act in accordance with the knowledge they have."
The article also mentions evidence, but in a different context to the "evidence-based practice" use - this is more related to knowledge (some discussion of whether this means "know-that" or "know-how") and wisdom.
Wednesday, 2 July 2008
Some interesting thoughts on how documentation should be produced - working with the customer, providing communication rather than documentation for documentation's sake, and writing to a good enough standard.
Some of the questions posed could apply to anyone writing any kind of documentation. Interestingly, they don't advise using templates as each system is different so will require different documentation - the thinking is that the template is resource-intensive to create; the template will ask for detail which isn't always relevant but people will attempt to write something; thus reviews take longer because there's so much more information to read through.