rhinoceros
Archon
Google to scan 50 million books in 10 years!
« on: 2004-12-14 17:21:47 »
Google is undertaking a huge project: to scan and digitize 50 million titles from 5 leading research libraries (Stanford, Michigan, the NY Public Library, Harvard, and Oxford) over the next 10 years. They have already started, using their own scanning technology.
Google Print (Beta) http://print.google.com/
Google's mission is to organize the world's information and make it universally accessible and useful. Since a lot of the world's information isn't yet online, we're helping to get it there. Google Print puts the content of books where you can find it most easily – right in Google search results. <snip>
Google to digitize millions of books http://www.siliconvalley.com/mld/siliconvalley/10412659.htm
Google is launching an ambitious effort to make digital copies of some of the world's largest university library collections and will incorporate the texts into its vast Web index, apparently the largest project of its kind ever attempted.
As envisioned, almost anyone with a computer could instantly tap into enormous academic libraries -- some with texts dating back centuries.
Stanford, Harvard and Oxford universities, as well as the University of Michigan and the New York Public Library, are participating in the program, which could span years and involve scanning and indexing well more than 10 million books and periodicals. <snip>
Google is using its own, secret scanning and digitizing technology that it says will not harm older, delicate books. At the University of Michigan, the company has already equipped a special room with scanners and has been processing thousands of books a week since June.
Books will roll into Google's Web search index as they are scanned and digitized.
The full text of all publications will be scanned. But how much of each publication is accessible will depend on copyright restrictions.
Books that are in the public domain will probably have their full text available through the search engine. For works that are protected by copyright -- the majority -- Google will show either bibliographic information or snippets of text that appear around a Google user's search term.
When possible, in the search results, Google will point users to libraries where they can access the publications, or merchants online where they can purchase copies.
At Stanford, the company will copy 2 million books as part of a pilot program, University Librarian Michael A. Keller said. Harvard's pilot program will begin with 40,000 randomly selected books from the university's vast collection of 15 million titles, some of which date back four centuries. <snip>
At the University of Michigan, where Google co-founder Larry Page received his bachelor of science degree in engineering, the project is more ambitious. Google will digitize about 7 million titles, a process that could take six years. Google and the university have been working on the project for about two years. <snip>
Michigan http://www.freep.com/money/tech/mwend14e_20041214.htm
"Going as fast as we can with the traditional means of doing this, it would take us about 1,600 years to do all 7 million volumes," he said. "Google will do it in six years. <snip> If we were to do this job ourselves, it would probably cost us $600 million. That's just the human cost of preparing the material for scanning, packing it up and sending it out to vendors and then quality-control checking of the results. This is easily a billion-dollar effort. I can't imagine there's anything out there on this scale. Nothing has been conceived on this scale. It's access to a research collection that we never would have dared imagine possible. Anyone with an Internet connection now has access to a vast research library." <snip>
Oxford http://www.admin.ox.ac.uk/po/041214a.shtml
NY Times: Google Is Adding Major Libraries to Its Database http://www.nytimes.com/2004/12/14/technology/14google.html