Google bringing search to historical manuscripts

History buffs can search George Washington's manuscripts online today for terms like "revolution," but only thanks to the tireless workers who transcribed the handwritten documents into digital form.

Soon, many other handwritten historical documents could become searchable by the public, and with considerably less effort, if a research project funded by Google Inc. and carried out by three universities works out as planned.

The project, announced by Dublin City University (DCU) on Thursday, started on a whim. DCU professor Alan Smeaton has been working on technology that can recognize objects that appear in videos. His technology can detect an object, like a car or an airplane, in a video frame, then extract that image and compare it against a database of images, either to identify the object or to make it searchable.

Smeaton and his colleagues decided to find out whether their shape-matching technology could also be used to identify words, so they tried it out on the archive of former U.S. President George Washington, which consists of 304,000 digital images and is available on the Library of Congress Web site. It worked well, Smeaton said.
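The article does not describe how the DCU shape matching works internally. As a rough illustration of the general idea of matching words by shape rather than by recognized letters, the sketch below uses a technique from the word-spotting literature: each word image is reduced to a column-by-column count of ink pixels, and two profiles are compared with dynamic time warping so that a stretched or squeezed copy of the same word still scores as close. The tiny binary images and all function names are hypothetical and chosen only for illustration.

```python
def column_profile(image):
    """Count ink pixels (1s) in each column of a binary word image."""
    height = len(image)
    width = len(image[0])
    return [sum(image[r][c] for r in range(height)) for c in range(width)]


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def rank_matches(query, candidates):
    """Return candidate names ordered from most to least similar shape."""
    query_profile = column_profile(query)
    scored = [(dtw_distance(query_profile, column_profile(img)), name)
              for name, img in candidates.items()]
    return [name for _, name in sorted(scored)]


# Toy 3-row "word images": 1 = ink, 0 = paper (made-up data).
query = [[1, 0, 1],
         [1, 1, 1],
         [1, 0, 1]]
candidates = {
    "same_word_wider": [[1, 0, 0, 1],
                        [1, 1, 1, 1],
                        [1, 0, 0, 1]],
    "different_word": [[1, 1, 1],
                       [0, 1, 0],
                       [0, 1, 0]],
}

print(rank_matches(query, candidates))  # the wider copy of the same shape ranks first
```

A real system would work on scanned page images and would first need to segment them into individual words; that step is left out here.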

Smeaton chose George Washington's archive because it includes handwritten documents that have already been transcribed. That meant he could compare the results from his technology with those from the existing transcription-based search system.
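The article does not say how the two sets of results were compared. One conventional way to score such a comparison, offered here only as an assumption, is to treat the transcription-based search results as the reference and compute precision and recall for the shape-matching results. The page identifiers below are invented for the example.

```python
def precision_recall(retrieved, relevant):
    """Score a result set against a reference set of relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical pages returned for a query such as "revolution".
transcript_hits = {"page_012", "page_045", "page_101", "page_230"}  # reference
shape_hits = {"page_012", "page_045", "page_230", "page_377"}       # under test

precision, recall = precision_recall(shape_hits, transcript_hits)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.75 recall=0.75
```

Precision here is the share of shape-matching results that the transcripts confirm, and recall is the share of transcript matches that shape matching also found.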

He had been talking to people he knows who work at Google in Dublin about the video matching technology, and happened to mention the George Washington manuscript trial. "They were interested so we did some more experiments and showed them the results and they decided to fund a project," he said.

Smeaton wouldn't say how much funding Google has committed, but said it will cover a year's worth of work by three or four researchers at DCU, as well as the same number of researchers each at the University at Buffalo and the University of Massachusetts at Amherst.

The goal of the project is to demonstrate that the technique is workable and scalable, Smeaton said. If it is, Google can decide whether to adopt the technology. The researchers are not obliged to make the technology available only to Google, however, Smeaton said. They plan to publish their findings as scientific research.

Ironically, it's easier to apply the technology to some manuscripts that are much older than Washington's. DCU is also involved in a project with the Dublin Institute for Advanced Studies, which is digitizing manuscripts written in Irish, the oldest of which dates back to the twelfth century. It is actually much easier to develop a search mechanism for those documents, which were beautifully and ornately produced by monks, Smeaton said. "The monks were laboriously toiling over this and using great consistency across entire manuscripts," he said. "George Washington wouldn't be."

Google has also been scanning books from large libraries in an effort to make their contents searchable. That project, Google Book Search, has come under fire from some authors who are unhappy that Google is including books still protected by copyright without expressly obtaining the authors' permission. Using the new shape-matching technology to make handwritten manuscripts searchable is unlikely to draw similar criticism, since the documents are historical and wouldn't be protected by copyright.
