Citation
(2009), "New & noteworthy", Library Hi Tech News, Vol. 26 No. 7. https://doi.org/10.1108/lhtn.2009.23926gab.001
Publisher
Emerald Group Publishing Limited
Copyright © 2009, Emerald Group Publishing Limited
New & noteworthy
Article Type: New & noteworthy From: Library Hi Tech News, Volume 26, Issue 7.
Microsoft's Bing.com – Better Search Experience, Better Results, Better Decisions
In May 2009 Microsoft Corp. unveiled Bing, a new Decision Engine and consumer brand, providing customers with a first step in moving beyond search to help make faster, more informed decisions. Bing is specifically designed to build on the benefits of today's search engines but begins to move beyond this experience with a new approach to user experience and intuitive tools to help customers make better decisions, focusing initially on four key vertical areas: making a purchase decision, planning a trip, researching a health condition or finding a local business.
The result of this new approach is an important beginning for a new and more powerful kind of search service, which Microsoft is calling a Decision Engine, designed to empower people to gain insight and knowledge from the Web, moving more quickly to important decisions. The new service was fully deployed worldwide on Wednesday, June 3.
The explosive growth of online content has continued unabated, and Bing was developed as a tool to help people more easily navigate through the information overload that has come to characterize many of today's search experiences. Results from a custom comScore Inc. study across core search engines show that as many as 30 percent of searches are abandoned without a satisfactory result. The data also showed that approximately two-thirds of the remaining searches required a refinement or requery on the search results page.
“Today, search engines do a decent job of helping people navigate the Web and find information, but they don't do a very good job of enabling people to use the information they find,” said Steve Ballmer, Microsoft CEO. “When we set out to build Bing, we grounded ourselves in a deep understanding of how people really want to use the Web. Bing is an important first step forward in our long-term effort to deliver innovations in search that enable people to find information quickly and use the information they've found to accomplish tasks and make smart decisions.”
Based on the customer insight that 66 percent of people are using Internet search more frequently to make complex decisions, Microsoft identified three design goals to guide the development of Bing: deliver great results; deliver a more organized experience; and simplify tasks and provide insight, leading to faster, more confident decisions. The new service, built to go beyond today's search experience, includes deep innovation on core search areas including entity extraction and expansion, query intent recognition and document summarization technology as well as a new user experience model that dynamically adapts to the type of query to provide relevant and intuitive decision-making tools.
- Great search results. Relevant search results are still a top priority for people, yet Microsoft studies show that only one in four search queries delivers a satisfactory result. Bing helps identify relevant search results through features such as Best Match, where the best answer is surfaced and called out; Deep Links, allowing more insight into what resources a particular site has to offer; and Preview, a hover-over window that expands over a search result caption to provide a better sense of the related site's relevancy. Bing also includes one-click access to information through Instant Answers, designed to provide the sought-after information within the body of the search results page, minimizing the need for additional clicks.
- Organized search experience. More and more customers are regularly spending time with search engines, engaging in complex, multi-query and multi-session searches. Respondents also said an organized search experience would be twice as useful in helping find information and accomplishing tasks faster. Bing includes a number of features that organize search results, including Explore Pane, a dynamically relevant set of navigation and search tools on the left side of the page; Web Groups, which groups results in intuitive ways both on the Explore Pane and in the actual results; and Related Searches and Quick Tabs, which is essentially a table of contents for different categories of search results. Collectively, these and other features in Bing help people navigate their search results, cut through the clutter of search overload and get right down to making important decisions.
- Simplify tasks and provide insight. Microsoft's research identified shopping, travel, local business and information, and health-related research as areas in which people wanted more assistance in making key decisions. The current state of Internet search is not optimized for these tasks, but the Bing Decision Engine is optimized for these key customer scenarios. For example, while a consumer is using Bing to shop online, the Sentiment Extraction feature scours the Internet for user opinions and expert reviews to help leverage the community of customers as well as product experts in trying to make a buying decision.
Microsoft is committed to building better tools to help people find the shortest distance from their initial search query to the point of making an informed decision. Bing is an important first step toward this long-term vision and a strong indicator of Microsoft's commitment to move search technology forward for customers.
Zentity v.1 – Microsoft's Repository Platform – Is Available Now!
Microsoft has announced the public release of Zentity v1.0. As part of Microsoft's commitment to support the academic community, Microsoft Research has developed “Zentity”, a Research Output Repository Platform which provides a suite of building blocks, tools, and services to create and maintain an organization's digital library ecosystem.
Zentity provides a built-in ScholarlyWorks data model with pre-defined entities, such as Lecture, Publication, Paper, Presentation, Video, File, Person, and Tag along with basic properties for each of these and well known relationships such as Author, Cites, Version, etc. The platform also provides support to create custom entities and design custom data models using the Extensibility API.
Included is an easily extensible ASP.NET web interface, built using the custom controls in the included UI Toolkit. The interface can be customized with CSS style sheets to integrate with an organization's existing website, or the ASP.NET controls can be deployed directly into the organization's current web presence.
A Search API is included that supports Advanced Query Syntax similar to that provided by Windows Search. The installation also includes support for services such as RSS, OAI-PMH, OAI-ORE, AtomPub, and SWORD, as well as a pluggable security model for authentication and authorization that allows an administrator to secure repository content. Extensive MSDN-style documentation for each API is included to enable developers to build new services or custom applications.
The platform is based on Microsoft's technologies (SQL Server 2008 and .NET Framework version 3.5 SP1) hence taking advantage of their robustness, their quality support infrastructure, and the plethora of developer-focused tools and documentation. New applications on top of the platform can be developed using any .NET language and the Visual Studio 2008 SP1 environment. The platform focuses on the management of academic assets – such as people, books/papers, lectures, presentations, videos, workflows, datasets, and tags – as well as the semantic relationships between them. In this latest release, developers can declaratively (or at runtime) easily introduce their own asset and relationship types. Support for various formats and services such as full-text search, OAI-PMH, RSS and Atom Syndication, BibTeX import and export, SWORD, AtomPub, RDFS, and OAI-ORE are included as part of the distribution.
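Among the services listed above, OAI-PMH is the standard protocol libraries use to harvest repository metadata. As a hedged illustration of how a harvester might consume a Zentity OAI-PMH endpoint, the sketch below parses a minimal, synthetic ListRecords response; the record contents are invented, and a real response would carry full OAI-PMH headers and richer Dublin Core metadata.

```python
# Sketch: extracting titles from an OAI-PMH ListRecords response such as a
# repository like Zentity might return. The XML below is a synthetic,
# minimal example for illustration only.
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

SAMPLE_RESPONSE = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <dc xmlns="http://purl.org/dc/elements/1.1/">
          <title>A Sample Lecture</title>
          <creator>A. Author</creator>
        </dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def harvest_titles(xml_text):
    """Return the dc:title of every record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.iter(DC + "title")]

print(harvest_titles(SAMPLE_RESPONSE))  # ['A Sample Lecture']
```

In practice a harvester would fetch pages of records over HTTP and follow resumption tokens, but the parsing step has this shape.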
http://research.microsoft.com/en-us/projects/zentity/
University of Michigan and Google Expand Agreement
The University of Michigan (U-M) announced in May 2009 that it had expanded its agreement with Google Inc. to create digital copies of millions of U-M library books and journals. The amended agreement, which strengthens library preservation efforts and increases the public's access to books, is possible because of Google's pending settlement with a broad class of authors and publishers. The U-M library is the first in the nation to expand its partnership with Google. The contract amendment is an important step in ensuring that the university's vision of broad public access to its print collection becomes a reality.
The agreement opens up the U-M library's extensive collection of 8 million works to readers and students throughout the USA through free previews, the ability to buy online access to the university's collections, and subscriptions at other institutions. Through provisions in Google's pending settlement with authors and publishers and the amended U-M agreement, Google will provide a free public access terminal to every public and collegiate library in the country that chooses one, giving libraries in small towns and at large universities alike equal access to the U-M materials.
The agreement also calls for Google to contribute millions of dollars to establish up to two new research centers where scholars will be able to conduct research that would not be possible without the large number of digitized works.
The amended U-M agreement also provides for:
- Expanded opportunities for U-M and Google to give users with print disabilities immediate access to millions of books.
- Improved digital copies for preservation efforts, to protect against the inevitable deterioration of books and against loss or damage such as that experienced by New Orleans-area libraries after Hurricane Katrina.
- The creation of new opportunities for large-scale analysis of the written record.
- The expansion, with support from Google, of the collaborative effort among libraries to build a shared storehouse of digital library content called the HathiTrust.
- The ability of U-M and other participating libraries to review, and through arbitration challenge, the pricing of institutional subscriptions, to ensure Google fulfills its commitment to enable widespread adoption of these services.
Link to full press release: http://www.ns.umich.edu/htdocs/releases/story.php?id=7162
HathiTrust: http://www.hathitrust.org/
US Colleges, Universities Partner with Bookshare to Provide Accessible Textbooks
Bookshare has announced a University Partnership Program to significantly increase the availability of accessible materials and textbooks on behalf of the hundreds of thousands of US post-secondary students who have a disability that keeps them from effectively reading printed books.
The Bookshare University Partnership will foster the growth of accessible materials for all US students with qualified print disabilities through contributions of books scanned legally on college and university campuses under a copyright exemption in US copyright law (17 U.S.C. 121, often referred to as the Chafee Amendment). Under the Chafee Amendment, Bookshare membership is available to people who provide proof of a print disability, such as blindness or low vision, a reading disability, or a physical disability that makes it difficult or impossible to read standard print. Eleven US institutions now participate in the program: Arizona State University; De Anza Community College, CA; Indiana University; Michigan State University; Monterey Peninsula Community College, CA; The Ohio State University; Texas A&M University; University of California at Berkeley; University of Montana; University of Idaho; and The Hadley School for the Blind, IL.
Typically, post-secondary students must wait months after the start of a semester before getting their textbooks in a format they can read. Each year, across the country, university personnel engage in a labor intensive process at the beginning of a term to scan books or obtain digital files from publishers to provide students who have qualified print disabilities with accessible textbooks.
Adhering closely to the Chafee Amendment, Bookshare will only accept donations of books purchased and scanned for students with qualified print disabilities or given to a college or university by a publisher with express permission to share the book with groups like Bookshare. Each book scanned on campus and donated to Bookshare for distribution reduces the duplication of effort nationally, minimizing the cumulative cost of scanning books. Scanning and proofreading a book can cost $100 to $1000 depending on its complexity; a collaborative sharing program will save campuses time and money on an annual basis.
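The savings argument above is simple arithmetic: every title scanned once and shared spares every other campus the cost of rescanning it. A back-of-envelope sketch, using the per-book cost figure from the text but hypothetical campus and title counts:

```python
# Back-of-envelope illustration of the duplication savings described above.
# The $500 per-book cost sits in the $100-$1,000 range given in the text;
# the campus count and number of shared titles are hypothetical assumptions.
def scanning_cost(campuses, shared_titles, cost_per_book):
    independent = campuses * shared_titles * cost_per_book  # each campus rescans
    pooled = shared_titles * cost_per_book                  # scan once, share
    return independent, pooled

independent, pooled = scanning_cost(campuses=11, shared_titles=100,
                                    cost_per_book=500)
print(independent - pooled)  # 500000 saved across the partnership
```

Even with modest overlap between campus collections, pooling removes most of the duplicated scanning cost.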
With an increased collection of post-secondary books, many more students will find the materials they need in the Bookshare library and will benefit from a better overall educational experience. Colleges, universities, or post-secondary schools can become Bookshare organizational members, sign up their students with qualified print disabilities, and recommend that students register for individual memberships. Bookshare membership includes two free ebook reader software programs that read the text of the books aloud, READ:OutLoud from Don Johnston and Victor Reader Soft from HumanWare. Campuses with Bookshare memberships can install these applications on all computers used by students with print disabilities. Students with individual memberships can install the applications on their personal computers.
Every book downloaded is fingerprinted using Bookshare's Digital Rights Management (DRM) technology. Universities contributing books will benefit from increased protection against illegal file sharing. The Bookshare DRM technology maintains a record of each downloaded book to identify potential misuse and copyright infringement.
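The fingerprinting idea above can be sketched in a few lines: derive a per-download identifier from the member, the book, and the download time, and keep a log that lets a copy found in the wild be traced back to its source. This is a toy illustration of the general technique, not Bookshare's actual scheme; all identifiers are invented.

```python
# Toy sketch of download fingerprinting: a per-download identifier derived
# from member, book, and timestamp, recorded so that a leaked copy can be
# traced. Illustrative only; Bookshare's real DRM is not public.
import hashlib

download_log = {}  # fingerprint -> (member_id, book_id, timestamp)

def fingerprint_download(member_id, book_id, timestamp):
    token = f"{member_id}:{book_id}:{timestamp}".encode()
    fp = hashlib.sha256(token).hexdigest()[:16]
    download_log[fp] = (member_id, book_id, timestamp)
    return fp  # would be embedded in the delivered file

fp = fingerprint_download("member-42", "book-1001", "2009-05-01T12:00")
assert download_log[fp][0] == "member-42"  # a found copy identifies its source
```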
Bookshare adds over 1,000 legally scanned books per month from universities, the NIMAC (National Instructional Materials Accessibility Center), publishers, and volunteers. The collection now tops 46,000 books, including textbooks, literature, teacher-recommended reading, New York Times best sellers, newspapers, and periodicals.
BookGlutton.com Brings Social Networking Approach to On-Line Reading
A new website, BookGlutton.com, launched in January 2009, and its founders think it will change the way people read books online. Described as a cross between a book, a computer and a book group, BookGlutton has created a web-based e-book reader that lets users discuss the book from the inside. Using web 2.0 technologies BookGlutton has turned books into conversations, allowing people around the world to connect and chat about books on a page-by-page level. Travis Alber and Aaron Miller, BookGlutton's founders, say they came up with the idea because they have friends all over the world and wanted to be able to discuss books with them. “We needed to talk and look at the book, just like you would in a physical book club,” says Alber.
Using BookGlutton's Unbound Reader is simple. After choosing something to read and opening the book (all via a web browser), two panels flank the text – Talk and Mark. The Talk panel allows users to have a real-time conversation with other users, which can be filtered down to only include users inside the same chapter (this keeps users from hearing end-of-the-book discussions). The Mark panel lets users make margin notes on any paragraph, and allows other users to respond.
BookGlutton's Unbound Reader is unique in letting people in different parts of the world converse in real time about what they read, something no other e-book technology offers. Unlike device-based e-book readers (Sony Reader, Amazon Kindle) and desktop e-book applications (Microsoft Reader, Adobe Digital Editions), the Unbound Reader is built entirely on open web standards. And unlike websites that feature e-books, BookGlutton does not require a user to download or install anything in order to read books. It adopts common features of device- and desktop-based e-book readers, such as pagination, typography, bookmarking, and annotation, and puts them on the web to allow context-specific conversation about literature.
Not only do BookGlutton's social networking approach and web delivery make it the first web-based e-book “reader” and the first major exercise in social reading, they may also solve many of the problems with the device, desktop, and download e-book categories. Instead of paying $300 for hardware, users can read on computers they already own; instead of paying more than the price of a paper edition and installing proprietary software, users can read for free in their web browsers. BookGlutton is also positioned to be read on other kinds of technology: the re-flowable format lends itself to smaller or larger screens, and BookGlutton will support multiple devices in the future.
Most of the content a user can read on BookGlutton.com is public domain. There are books in English, French and Spanish, with other languages on the way. In the future, BookGlutton will offer contemporary, copyrighted content for modest fees; public domain work will continue to be free. Alber and Miller also hope that user-generated content will keep the catalog fresh and conversations frequent, particularly for writers. The upload tool lets any user upload personal work.
In May 2009, BookGlutton.com released its social reading platform as a widget; now users anywhere on the web can embed BookGlutton's books and community inside their own site. Customized versions that allow social networks and other book sites to add their own logos, skins, and catalogs to the widget are also available.
The Book Launcher widget is free to use for both individuals and institutions, and currently pulls from BookGlutton's catalog of free books. “We hope libraries, schools, and reading groups, in addition to the average reader, will take advantage of the Book Launcher widget,” says Aaron Miller, BookGlutton CTO.
BookGlutton uses the new catalog standard from the Open Publication Distribution System (http://code.google.com/p/openpub/wiki/OPDS). The “Book Launcher” widget is free to use and is easily embedded on any webpage using a free snippet of code provided by BookGlutton.com. All book detail pages will have an “embed this book” option. This is BookGlutton's second API; last year it launched the first free HTML to EPUB converter, used by ebook owners interested in converting their work into a more ebook-friendly format.
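OPDS, the catalog standard mentioned above, is built on Atom, so a catalog is an ordinary XML feed whose entries carry acquisition links. As a hedged sketch, the code below lists the titles and download links in a minimal, synthetic OPDS feed; real BookGlutton catalog entries will differ in detail.

```python
# Sketch: reading a minimal OPDS catalog feed (OPDS is Atom-based).
# The feed below is a synthetic example, not real BookGlutton data.
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

SAMPLE_FEED = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Free Books</title>
  <entry>
    <title>Pride and Prejudice</title>
    <link rel="http://opds-spec.org/acquisition" href="/books/1283.epub"/>
  </entry>
</feed>"""

def list_catalog(xml_text):
    """Return (title, acquisition href) for each entry in an OPDS feed."""
    root = ET.fromstring(xml_text)
    books = []
    for entry in root.findall(ATOM + "entry"):
        title = entry.find(ATOM + "title").text
        link = entry.find(ATOM + "link").get("href")
        books.append((title, link))
    return books

print(list_catalog(SAMPLE_FEED))  # [('Pride and Prejudice', '/books/1283.epub')]
```

Because the format is plain Atom, any feed-aware client can browse such a catalog without BookGlutton-specific code.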
Download the Book Launcher widget: http://www.bookglutton.com/api/index.html
Springer Launches Exemplar Linguistic Tool for Authors, Editors, Researchers
In May 2009 Springer Science+Business Media launched the website SpringerExemplar.com, a free linguistic tool designed to support the publishing process for authors, editors and the scientific community in general. The online service Exemplar allows researchers to quickly see how a particular term or phrase is used in peer-reviewed, published literature. It is now available and open to anyone with an Internet connection.
Springer, an internationally renowned scientific publisher, partnered with the Centre for Biomedical and Health Linguistics to develop the service. The user searches for a specific term, which retrieves a list of results showing where and how the keyword is used in published Springer content. The left column displays lists of different categories: the year the article was published, the top subject areas in which the term is used, the highest concentration of articles by country, and the name of the journal where the keyword is used most. By clicking on the keyword itself, the user is linked directly to the journal article where the word is located.
“It is our hope that the research community will take advantage of this tool to help consolidate the use of subject-specific terminology and facilitate the communication of scholarly information through standardization,” said Olaf Ernst, President, eProducts Management and Innovation at Springer. Researchers have found this tool helpful in locating and analyzing rare and emerging terminology.
The Centre for Biomedical and Health Linguistics is a non-profit organization dedicated to facilitating communication in biomedical and health education, research, clinical care, policy making, and policy implementation.
Exemplar website: http://springerexemplar.com/
OpenPublish Publishing Suite Now Available
Phase2 Technology and the Thomson Reuters Calais Initiative have debuted OpenPublish, a complete Calais-powered publishing suite for the popular open source Drupal platform. Tailored to meet the needs of today's online publishers and media providers, OpenPublish offers semantic metatagging from OpenCalais and a seamless connection to the Linked Data cloud. It taps the power of Drupal as a social publishing platform and supports everything from news coverage to Web 2.0 trends, social publishing and the increasingly in-demand topic hubs.
OpenPublish offers a solution for publishers struggling to keep up with rapidly expanding technology and user expectations. It allows editors to rethink online news presentation and use readily available tools to reach a larger audience, simultaneously publishing stories to a content management system (CMS) while integrating news and open data sources in ways not previously possible.
“OpenPublish empowers content creators to connect to the Linked Content Economy, a rapidly evolving ecosystem of enhanced and connected content that puts the Semantic Web and Linked Data to work in everyday media,” said Thomas Tague, Calais Initiative lead, Thomson Reuters. “We make it easy to enhance the value of content, improve the user experience and extend syndication to next-generation search engines, directories, social media applications and more.”
OpenPublish features automated metatagging with the Thomson Reuters Calais Web Service and helps publishers leverage suggestive tagging, geo-coding and relevancy ranking. It also enables publishers to tag entire libraries of archived content within hours for better search results, which can lead to more traffic and superior content monetization.
“Semantic technologies used with the Drupal platform help publishers enhance the user experience of their web properties by improving search rankings and making content easier to find,” said Dries Buytaert, creator of the Drupal open source project and Acquia co-founder and CTO. “We welcome the OpenPublish suite to our community and encourage its future development.”
Phase2 Technology also offers the Calais Collection, a modular integration of the Calais Web service into the Drupal platform. Now in version 3.0, the Collection has been upgraded to support Calais 4.0 and its connection to the Linked Data cloud. To learn more, visit http://drupal.org/project/opencalais.
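To make the metatagging workflow above concrete, the toy sketch below extracts candidate entities from article text and turns them into tags. The naive capitalized-phrase heuristic is purely a stand-in for illustration; the real Calais Web Service performs far richer semantic analysis and is called over HTTP rather than implemented locally.

```python
# Toy illustration of automated metatagging: pull candidate named entities
# out of article text and use them as tags. This naive heuristic only
# stands in for the Calais Web Service, which does real semantic analysis.
import re

def naive_entity_tags(text):
    # Runs of consecutive capitalized words become one candidate entity.
    candidates = re.findall(r"(?:[A-Z][a-z]+\s)*[A-Z][a-z]+", text)
    # Keep only multi-word candidates, which are likelier to be real entities.
    return sorted({c.strip() for c in candidates if len(c.split()) > 1})

article = "Thomson Reuters and Phase2 Technology debuted OpenPublish in New York."
print(naive_entity_tags(article))  # ['New York', 'Thomson Reuters']
```

In OpenPublish the returned entities would be attached to the Drupal node as taxonomy terms, which is what powers topic hubs and better search.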
Phase2 is nationally recognized for providing technology leadership on the web for non-profit, commercial, and media publishing clients using open source technology including Java and the Drupal social publishing platform. The experienced team of consultants is known for specialty in content management systems, open data integration, community applications, CRM, and custom application development. Phase2 practices its own AgileApproach, an agile development methodology that ensures the quality and time to market of its software solutions. For more information, go to www.phase2technology.com.
The Calais initiative supports the interoperability of content and advances Thomson Reuters' mission to deliver intelligent information. It leverages the company's investment in semantic technologies to offer free metadata generation services, robust developer tools, and an open standard for the generation of semantic content. It also gives users an automatic connection to the Linked Data cloud and a global metadata transport layer that helps them tap next-generation search services, directories, and social media applications. More information at: OpenCalais.com.
Open Publish Service: http://www.OpenSourceOpenMinds.com/OpenPublish
Open Publish download site: http://www.phase2technology.com/project/openpublish
Scribd Launches Scribd Store
In May 2009, Scribd announced the beta launch of Scribd Store, where anyone, both amateurs and professional authors, can upload and sell their written works to a readership of more than 60 million. The Scribd Store expands Scribd's library of free original documents to include for-purchase works, many of which are new, exclusive or hard-to-find elsewhere.
Scribd Store offers a revenue-sharing agreement that gives sellers 80 percent of revenue. Prices are set by the seller and currently range from $1 for a graphic novel to $5,000 for an in-depth China marketing research report. Sellers can also choose Scribd's automated pricing option, which generates an optimal price based on a cost-sales analysis of similar items in the Scribd Store. With Scribd Store's flexible pricing, publishers have complete control over price and packaging: sellers can offer whole documents, a single chapter, or an exact selection of pages, and can also serialize their books in installments at $1.00 per chapter.
Documents can be read on Scribd.com, downloaded to a PC, printed, or accessed through web-enabled mobile phones. The company expects to soon launch an iPhone application to give readers and buyers access to documents across multiple platforms; a mobile-optimized version of Scribd.com is already popular. The beta version of Scribd Store is open at launch to buyers and sellers in the USA, with international launches to follow.
Sellers on Scribd Store must own the digital rights to the works they wish to sell and provide detailed information about their ownership of those works in order to sell their works through Scribd Store. Sellers can also easily manage their digital rights and benefit from Scribd's existing CMS that helps prevent the upload of unauthorized works onto the site. Every document uploaded to Scribd is compared to the CMS database and duplicate documents are automatically removed from Scribd.
Scribd Store: http://www.scribd.com/store
Imprezzeo Partners with Nstein to Integrate Image and Text Searching and Mining Products
Imprezzeo, an image search software company, announced in March 2009 a partnership with Nstein Technologies, Inc., a leading supplier of digital publishing solutions, including text mining, web content management, and digital asset management. Nstein will integrate Imprezzeo's image-based search engine with its text mining and search products to allow its customers to conduct faster, more accurate image searches. With this partnership, Imprezzeo offers a software development kit (SDK), access to Imprezzeo developer resources and dedicated technical support. This allows Nstein Technologies to rapidly and seamlessly integrate and deploy Imprezzeo's Image Search technology within digital publishing networks and into broader business processes and workflows.
Imprezzeo Image Search is the first image recognition and search product to use both content-based image retrieval (CBIR) and facial recognition (FR), allowing customers to use images to search for images, rather than textual search terms. The technology generates image search results that closely match a sample image either chosen by the user from an initial set of search results that can then be refined, or from an image uploaded by the user. Imprezzeo is capable of searching millions of images in seconds.
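The CBIR side of the technology described above can be sketched with the simplest classical approach: describe each image by a color histogram and rank the collection by histogram similarity to a query image. Production systems like Imprezzeo's use far richer features (and, here, facial recognition); in this illustration the images are just lists of RGB pixels.

```python
# Minimal sketch of content-based image retrieval (CBIR): coarse color
# histograms plus histogram-intersection similarity. Real CBIR engines use
# much richer features; images here are synthetic lists of RGB tuples.
def color_histogram(pixels, bins=4):
    """Coarse RGB histogram: `bins` buckets per channel, concatenated."""
    hist = [0] * (bins * 3)
    for r, g, b in pixels:
        for ch, value in enumerate((r, g, b)):
            hist[ch * bins + min(value * bins // 256, bins - 1)] += 1
    total = 3 * len(pixels)  # each pixel contributes one count per channel
    return [h / total for h in hist]

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

reddish = [(220, 30, 40)] * 10   # query-like image
also_red = [(200, 50, 60)] * 10  # similar colors
bluish = [(20, 40, 230)] * 10    # dissimilar colors
query = color_histogram(reddish)
ranked = sorted([also_red, bluish],
                key=lambda img: -similarity(query, color_histogram(img)))
print(ranked[0] is also_red)  # True: the similarly colored image ranks first
```

Searching "millions of images in seconds", as the text claims, additionally requires indexing these feature vectors rather than scanning them linearly.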
To view a demonstration of Imprezzeo Image Search, visit http://www.imprezzeo.com/demos
Nstein Technologies: http://www.nstein.com
ACQUINE Research Project: Computer Prediction of Aesthetic Quality of Images
ACQUINE (Aesthetic Quality Inference Engine) is a machine-learning-based online system that predicts the aesthetic quality of natural color photographs. The researchers believe this is an important step at the intersection of computer science research and the arts because it shows that computers can learn about and exhibit “emotional responses” to visual stimuli as humans do. The system has been under development at Penn State since about 2005. Dr Ritendra Datta (now with the Google engineering office in Pittsburgh) was the main developer, working with Prof. Jia Li and Prof. James Z. Wang; Dr Dhiraj Joshi (now with Kodak Research Labs) contributed to an earlier prototype. The system was placed online for public use in April 2009. It is a work in progress and undergoes algorithmic changes from time to time in an effort to improve performance; a patent is pending.
Because of the limitations of the sources from which Acquine was able to gain its understanding of aesthetics, the opinions it expresses can be biased toward the group of people associated with those sources. While Acquine is arguably less biased than any individual person assessing a photo, no opinion on aesthetics is absolutely unbiased.
Acquine is designed mainly to assess the aesthetic quality of natural color professional photographs. It is NOT designed for computer graphics, artificially produced diagrams, figures, paintings, composite pictures, casual family photos, screenshots, out-of-focus shots, advertisement images, photos of industrial products, cartoons, political photos, news photos, etc. At the moment, Acquine cannot understand the great complexity of human society and should not be used to assess photos that carry significant cultural meaning.
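The learning setup behind a system like ACQUINE can be illustrated with a toy version: fit a model from low-level photographic features to human aesthetic ratings, then score unseen photos. ACQUINE's real features and model are far more sophisticated, and the feature values and ratings below are fabricated purely for illustration.

```python
# Toy illustration of learning an aesthetic-quality predictor: ordinary
# least squares from one photographic feature to human ratings. ACQUINE's
# actual features, data, and model are much richer; this data is invented.
def fit_linear(xs, ys):
    """One-feature ordinary least squares: y ~ a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return a, mean_y - a * mean_x

# Hypothetical feature (e.g. a contrast measure) and average human ratings.
contrast = [0.2, 0.4, 0.6, 0.8]
rating = [30.0, 50.0, 70.0, 90.0]
a, b = fit_linear(contrast, rating)
print(round(a * 0.5 + b))  # predicted rating for a new photo, contrast 0.5
```

The bias caveats in the text apply directly: whatever biases the training ratings carry, the fitted model reproduces.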
The National Science Foundation has funded past work related to this project. The team is currently seeking funding to significantly improve the technology. ACQUINE is freely available, and instructions on how to upload photos and use the software are available on the About ACQUINE page: http://acquine.alipr.com/about.html
ACQUINE website: http://acquine.alipr.com/
Wolfram|Alpha Computational Knowledge Engine Officially Launched
Wolfram Alpha LLC announced in May 2009 the general availability of Wolfram|Alpha, the world's first computational knowledge engine, offered for free on the web.
Wolfram|Alpha draws on scientist Stephen Wolfram's groundbreaking work on Mathematica, the world's leading technical computing software platform, and on the discoveries he published in his paradigm-shifting book, A New Kind of Science. Over 200,000 people from throughout the world have contacted the company to learn more about Wolfram|Alpha since news of the service first surfaced broadly in March.
The long-term goal of Wolfram|Alpha is to make all systematic knowledge immediately computable and accessible to everyone. Wolfram|Alpha draws on multiple terabytes of curated data and synthesizes it into entirely new combinations and presentations. The service answers questions, solves equations, cross-references data types, projects future behaviors, and more. Wolfram|Alpha's examples pages and gallery show a few of the many uses of this new technology.
“Fifty years ago,” said Stephen Wolfram, the founder and CEO of Wolfram Research, “when computers were young, people assumed that they'd be able to ask a computer any factual question, and have it compute the answer. I'm happy to say that we've successfully built a system that delivers knowledge from a simple input field, giving access to a huge system, with trillions of pieces of curated data and millions of lines of algorithms. Wolfram|Alpha signals a new paradigm for using computers and the web.”
Wolfram|Alpha is made up of four main “pillars” or components:
1. Curated Data. Wolfram|Alpha contains terabytes of factual data covering a wide range of fields. Teams of subject-matter experts and researchers collect and curate data, transforming it into computable forms that can be understood and operated on by computer algorithms.
2. Dynamic Computation. When Wolfram|Alpha receives a user query, it extracts the relevant facts from its stored computable data and then applies a collection of tens of thousands of algorithms, creating and synthesizing new relevant knowledge.
3. Intuitive Language Understanding. To allow Wolfram|Alpha to understand inputs entered in everyday language, its developers examine the ways people express ideas within fields and subject matters and continually refine algorithms that automatically recognize these patterns.
4. Computational Aesthetics. Wolfram|Alpha also represents a new approach to user-interface design. The service takes user inputs and builds a customized page of clearly and usefully presented computed knowledge.
Wolfram|Alpha has been entirely developed and deployed using Wolfram Research, Inc.'s Mathematica technology. Wolfram|Alpha contains nearly six million lines of Mathematica code, authored and maintained in Wolfram Workbench. In its launch configuration, Wolfram|Alpha is running Mathematica on about 10,000 processor cores distributed among five colocation facilities, using gridMathematica-based parallelism. And every query that comes into the system is served with webMathematica.
The Wolfram|Alpha launch process has been documented on the Wolfram|Alpha blog and on its Twitter and Facebook accounts. The site first went live for testing on Friday, May 15, 2009, and has been rigorously tested and further performance-tuned since then in preparation for its official launch.
To that end, the company aims to collect and curate all objective data; implement every known model, method, and algorithm; and make it possible to compute whatever can be computed about anything. Wolfram|Alpha builds on the achievements of science and other systematizations of knowledge to provide a single source that everyone can rely on for definitive answers to factual queries.
Wolfram Research was founded in 1987 by Stephen Wolfram, who continues to lead the company today. The company is headquartered in the United States, with offices in Europe and Asia.
BCR Wins Grant to Support Digital Preservation Workshops for Digital Collaboratives
The National Endowment for the Humanities (NEH) awarded $218,154 to the Bibliographical Center for Research (BCR) to support workshops on digital preservation developed specifically for leaders of collaborative digitization programs (CDPs). Knowledge and expertise in digital preservation, and the ability to develop and execute digital preservation programs, are critical to ensuring the long-term viability of the digital materials now being created by libraries and cultural heritage organizations. Existing digital collaboratives will benefit from these workshops, which are designed to make long-term digital preservation a core responsibility of the collaborative.
With the grant from NEH, BCR plans to assist existing digital collaboratives and their members: to develop the capacity for assuming responsibility for long-term accessibility of digital collections under their stewardship and to expand the number of collaboratives doing so; to conduct on-site digital preservation readiness assessments; and to test the efficacy of the CRL/OCLC Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC) as a planning tool in a collaborative digital environment.
BCR's Digital and Preservation Services (DPS) program provides leadership, expertise, and education through training and consulting services for libraries, cultural heritage organizations, and digital collaboratives. Formed after the CDP merged into BCR in April 2007, DPS offers collaborative and outreach services designed to support digitization efforts, as well as access to digital and teacher toolboxes and best-practice documents. During this grant, BCR will partner with PALINET (n/k/a Lyrasis) and OCLC Western.
BCR website: http://www.BCR.org
PEER Guidelines on Deposit, Assisted Deposit and Self-Archiving Now Available
PEER is a pioneering collaboration between publishers, repositories, and the research community that aims to investigate the effects of large-scale deposit of accepted manuscripts (so-called Green Open Access) on user access, author visibility, journal viability, and the broader European research environment. Supported by the EC eContentplus programme, the project will run until 2011, during which time over 50,000 European stage-2 (accepted) manuscripts from up to 300 journals will become available for archiving.
Guidelines documenting the procedures for publisher deposit and for subsequent transfer to participating PEER repositories are presented in the project's latest document following extensive consultation with both target groups within PEER.
An author helpdesk for the project is being established and will shortly be made available via the PEER website. This will serve as the major information point for EU-based authors of articles in participating PEER journals who wish to self-archive their accepted manuscripts after receiving an invitation to do so.
The report sets out a major advance in repository practice in its use of the SWORD protocol, which allows application-level deposit of material into repositories. This material is presented as a cohesive sub-report (Appendix B), which is expected to become a ready reference tool in its own right.
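For readers unfamiliar with SWORD, a deposit is an ordinary HTTP POST of a package to a repository-supplied collection URL, with the packaging format declared in request headers. The sketch below, in Python, builds the headers a SWORD 1.3 client might send; it is an illustration only (the package bytes, filename, and packaging URI used in the example are hypothetical, and PEER's actual deposit workflow is defined in the guidelines themselves).

```python
import hashlib

def build_sword_deposit_headers(package: bytes, filename: str,
                                packaging: str, dry_run: bool = False) -> dict:
    """Build HTTP headers for a SWORD 1.3 package deposit (POST).

    The repository supplies the collection URL and accepted packaging
    format; `dry_run` uses SWORD's X-No-Op header to ask the server to
    validate the deposit without storing it.
    """
    headers = {
        "Content-Type": "application/zip",                 # package is a zip
        "Content-Length": str(len(package)),
        "Content-MD5": hashlib.md5(package).hexdigest(),   # integrity check
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": packaging,                          # packaging format URI
    }
    if dry_run:
        headers["X-No-Op"] = "true"
    return headers

# Hypothetical deposit of a manuscript package (not a real PEER endpoint).
hdrs = build_sword_deposit_headers(
    b"fake-zip-bytes", "manuscript.zip",
    "http://purl.org/net/sword-types/METSDSpaceSIP", dry_run=True)
```

The actual POST would then be sent to the repository's SWORD collection URL with these headers and the package as the body.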
The Guidelines set out in the document should be read in conjunction with “D2.1 Draft report on the provision of usage data and manuscript procedures for publishers and repository managers” also available at the PEER website.
http://www.peerproject.eu/reports/
WARC File Format Published as International ISO Standard
The International Internet Preservation Consortium announced in June 2009 the publication of the WARC file format as an international standard: ISO 28500:2009, Information and documentation – WARC file format.
For many years, heritage organizations have sought the most appropriate ways to collect and keep track of World Wide Web material using web-scale tools such as web crawlers. At the same time, these organizations needed to archive very large numbers of born-digital and digitized files. One requirement was a container format that permits a single file to carry, simply and safely, a very large number of constituent data objects (of unrestricted type, including many binary types) for purposes of storage, management, and exchange. Another requirement was that the container need only minimal knowledge of the nature of the objects.
The WARC format is expected to be a standard way to structure, manage, and store billions of resources collected from the web and elsewhere. It is an extension of the ARC format [http://www.archive.org/web/researcher/ArcFileFormat.php], which has been used since 1996 to store files harvested on the web. The WARC format offers new possibilities, notably the recording of HTTP request headers, the recording of arbitrary metadata, the allocation of an identifier for every contained file, the management of duplicates and of migrated records, and the segmentation of records. WARC files are intended to store every type of digital content, whether retrieved by HTTP or another protocol. The motivation to extend the ARC format arose from the discussion and experiences of the International Internet Preservation Consortium [http://netpreserve.org/], whose core mission is to acquire, preserve, and make accessible knowledge and information from the Internet for future generations. The IIPC Standards Working Group put forward to ISO TC46/SC4/WG12 a draft presenting the WARC file format. The draft was accepted as a new Work Item by ISO in May 2005.
Over a period of four years, the ISO working group, with the Bibliothèque nationale de France [http://www.bnf.fr/] as convener, collaborated closely with IIPC experts to improve the original draft. The WG12 will continue to maintain [http://bibnum.bnf.fr/WARC/] the standard and prepare its future revision. Standardization offers a guarantee of durability and evolution for the WARC format. It will help web archiving enter the mainstream activities of heritage institutions and other sectors by fostering the development of new tools and ensuring the interoperability of collections.
Several applications are already WARC compliant, such as the Heritrix [http://crawler.archive.org/] crawler for harvesting; the WARC tools [http://code.google.com/p/warc-tools/] for data management and exchange; and the Wayback Machine [http://archive-access.sourceforge.net/projects/wayback/], NutchWAX [http://archive-access.sourceforge.net/projects/nutch/], and other search tools [http://code.google.com/p/search-tools/] for access. The international recognition of the WARC format and its applicability to every kind of digital object will provide strong incentives to use it within and beyond the web archiving community.
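For readers curious about the format itself, a WARC file is simply a sequence of records, each consisting of a version line, named header fields, a blank line, the payload, and two trailing CRLFs. The Python sketch below serializes one minimal record to show the layout under ISO 28500; it is an illustration, not a production writer (real deployments use tools such as Heritrix or the WARC tools mentioned above).

```python
import uuid
from datetime import datetime, timezone

def make_warc_record(record_type: str, content_type: str, payload: bytes) -> bytes:
    """Serialize one minimal WARC/1.0 record.

    Layout: version line, named header fields, a blank line, the payload
    block, then two CRLFs separating this record from the next.
    """
    headers = "\r\n".join([
        "WARC/1.0",
        f"WARC-Type: {record_type}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",   # unique per record
        "WARC-Date: " + datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",              # length of payload block
    ]).encode("ascii")
    return headers + b"\r\n\r\n" + payload + b"\r\n\r\n"

# A "resource" record wrapping an arbitrary harvested object.
record = make_warc_record("resource", "text/html", b"<html>hello</html>")
```

A complete WARC file is just a concatenation of such records, optionally gzip-compressed record by record.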
General information about the IIPC can be found at: http://netpreserve.org
WARC Standard specifications: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=44717
Zotero 2.0 Released
An extension to the popular open-source web browser Firefox, the Zotero research tool includes the best parts of older reference manager software (like EndNote) – the ability to store author, title, and publication fields and to export that information as formatted references – and the best parts of modern software and web applications (like iTunes and del.icio.us), such as the ability to interact, tag, and search in advanced ways. After an extensive development and testing period and the addition of even more features to make academic research easier, more collaborative, and ready for the future, Zotero 2.0 was released in May 2009. New features include:
1. Syncing:
- automatic synchronization of collections among multiple computers – for example, sync your PC at work with your Mac laptop and your Linux desktop at home;
- free automatic backup of your library data on Zotero's servers;
- automatic synchronization of your attachment files to a WebDAV server (e.g. iDisk, Jungle Disk, or university-provided web storage).
2. People:
- Zotero users get a personal page with a short biography and the ability to list their discipline and interests, create an online CV (simple to export to other sites), and grant access to their libraries;
- easily find others in one's discipline or researchers with similar interests;
- follow other scholars – and be followed in return.
3. Groups:
- create and join public and private groups on any topic;
- access in real time new research materials from your groups on the web or in the Zotero interface;
- easily move materials from a group stream into your personal library.
4. Even more functionality that makes your research easier:
- automatic detection of PDF metadata (e.g. author, title);
- automatic detection of and support for proxy servers;
- a trash can with restore-item functionality so you do not accidentally lose important materials;
- rich-text notes;
- a new style manager allowing you to add and delete CSLs and legacy style formats.
Zotero 2.0 was created with generous funding from the Andrew W. Mellon Foundation.
Additionally, in June 2009, various sources reported that the lawsuit brought by Thomson Reuters against George Mason University over the Zotero product had been dismissed. http://arstechnica.com/web/news/2009/06/thomson-reuters-suit-against-zotero-software-dismissed.ars
Zotero information and download: http://www.zotero.org/
Updated Version of Kete Software Released
Kete 1.2 is now available and, according to the developers, massively improves on Kete 1.1. Kete is open source software that can be used to create online collaboration spaces for a community. It has been called a “relational wiki” and “a mashup between content management and knowledge management”.
To obtain the code, see the downloads page for details or browse the code online at http://github.com/kete/kete/. An in-depth list of features and resolved issues, including guides on their use, can be found at http://kete.net.nz/documentation/topics/show/260-kete-12-features
Some highlights of the new features include:
- Smarter extended fields – Kete now supports richer ways of customizing content forms and item display.
- Baskets for everyone – site administrators can elect to allow users to create their own baskets for their stuff. They can simplify basket forms to be quick to fill out and prevent users from changing things they should not.
- Use images to put faces to names – users can designate their uploaded images as portraits of themselves. They can also use their existing gravatar (global avatar) if they like.
- More fun with images and uploaded files – thumbnails of images related to topics appear in the latest-topics display, result pages, and RSS feeds to better illustrate what an item is about. Support for hosting podcasts. Harvesting of embedded metadata from uploaded files, such as a geotag of where an image was taken.
- Privacy feature refinements – new and refined tools make it easier to take advantage of Kete's privacy controls. Great for things like intranets and extranets.
- The beginnings of cross-site integration – basket administrators may now display entries from RSS feeds on basket homepages. This can be used for showing off the latest books in the online catalog, for example.
The Installation Guide at http://kete.net.nz/documentation/topics/show/114-installation has been updated for Kete 1.2. For existing users of Kete, upgrade steps are outlined at Upgrading to Kete 1.2 Release.
Major work on Kete 1.2 was funded by Auckland City Libraries, Te Reo o Taranaki, Katipo, Horizons Regional Council, and HLT. Kete is a Ruby on Rails application, and it uses Zebra from Index Data (http://www.indexdata.com) as the basis for its search and browsing functionality.
Kete website: http://kete.net.nz/
Downloads: http://kete.net.nz/site/topics/show/25-downloads
Twithority Delivers Authority-Based Search Results from Twitter
Tsavo Media (www.tsavo.com), a new company focused on the delivery and monetization of niche content to digital consumers, in March 2009 formally introduced Twithority, “Twitter Search by Authority,” a new search engine for the popular microblogging service Twitter.
Twithority returns Twitter search results rapidly, looking back through as many as 1,000 results. It sequences results both by rank (highest-ranking users first, drawn from the top 10,000 Twitter users) and by time (most recent tweets first). Twitter is a free service that lets users keep in touch with one another through the exchange of quick, 140-character answers to one simple question: what are you doing?
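Twithority's actual ranking method is proprietary, but the ordering described above (authoritative authors first, restricted to the top 10,000 users, with the newest tweets first within an author rank) amounts to a simple two-key sort. The Python sketch below illustrates the idea; all field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Tweet:
    text: str
    author_rank: int   # 1 = most authoritative user (hypothetical field)
    posted_at: float   # Unix timestamp of the tweet

def order_by_authority(tweets, top_n=10_000):
    """Keep only tweets from the top N users, then sort by author rank
    (ascending) and, within a rank, by recency (newest first)."""
    ranked = [t for t in tweets if t.author_rank <= top_n]
    return sorted(ranked, key=lambda t: (t.author_rank, -t.posted_at))

tweets = [
    Tweet("old tweet, top user", author_rank=5, posted_at=100.0),
    Tweet("new tweet, top user", author_rank=5, posted_at=200.0),
    Tweet("tweet, lesser-known user", author_rank=9_000, posted_at=300.0),
    Tweet("tweet, outside top 10k", author_rank=20_000, posted_at=400.0),
]
ordered = order_by_authority(tweets)
```

The tweet from the user outside the top 10,000 is dropped, and the two tweets from the rank-5 user come first, newest first.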
Tsavo will integrate Twithority into its new Daymix network of original, dynamic consumer content sites – manolith.com, twirlit.com, kidglue.com, and nibbledish.com. Twithority will serve as an informal metric for Tsavo's content sites, along the lines of Google's zeitgeist feature.
“Given the enormous number of conversations on Twitter, how do you highlight what is most important, what the authorities and influencers are saying?” said Mike Jones, CEO and founder, Tsavo Media. “That's where Twithority comes in. We're filtering results in a meaningful way and working with manageable numbers, so Twitter can be of even greater value. In a broader sense, Twithority reflects Tsavo's entire approach to content. Throughout all of our properties, we're looking at ways to get content filtered to deliver the best providers, the best information, the best user experience possible – quickly and efficiently.”
Bibliotheca and iTeam Announce Strategic Partnership
Bibliotheca Inc., open RFID solutions partner of choice for libraries worldwide, has announced a strategic partnership with iTeam Resources Inc., a library industry leader in PC and print management solutions. This partnership will result in the seamless integration of iTeam PC and print management software products with Bibliotheca's renowned BiblioChip® self-service software interface, which already integrates with a wide range of integrated library systems.
As a result of this integration, library patrons can use a single library self-check station interface to check out library materials, reserve time on library PCs, and pay for printing charges, library fines, and other library charges via credit and debit cards, cash, and online and stored-value cards. The BiblioChip-iTeam products featuring integrated capabilities will be available for delivery June 1, 2009. Bibliotheca and iTeam made this joint announcement at the 2009 annual conference of the Texas Library Association, held in Houston, March 31-April 3.
iTeam Resources Inc., with headquarters in Orlando, Fla., and operations throughout the United States, Europe, South America, and Asia, offers products for print cost recovery, computer reservation and time management, wireless printing, e-commerce, self-service cost recovery equipment, library/identification card systems, library combo and smart cards.
Bibliotheca Inc. delivers a complete line of RFID solutions that optimize library circulation workflow, staff productivity, and customer service for more than 350 libraries worldwide.