Thursday, July 15, 2010

Googland


[G] Translating Wikipedia

Posted: 15 Jul 2010 02:02 AM PDT

Official Google Blog: Translating Wikipedia

(Cross-posted from the Google Translate Blog)

We believe that translation is key to our mission of making information useful to everyone. For example, Wikipedia is a phenomenal source of knowledge, especially for speakers of common languages such as English, German and French where there are hundreds of thousands—or millions—of articles available. For many smaller languages, however, Wikipedia doesn't yet have anywhere near the same amount of content available.

To help Wikipedia become more helpful to speakers of smaller languages, we're working with volunteers, translators and Wikipedians across India, the Middle East and Africa to translate more than 16 million words for Wikipedia into Arabic, Gujarati, Hindi, Kannada, Swahili, Tamil and Telugu. We began these efforts in 2008, starting with translating Wikipedia articles into Hindi, a language spoken by tens of millions of Internet users. At that time the Hindi Wikipedia had only 3.4 million words across 21,000 articles—while in contrast, the English Wikipedia had 1.3 billion words across 2.5 million articles.

We selected the Wikipedia articles using a couple of different sets of criteria. First, we used Google search data to determine the most popular English Wikipedia articles read in India. Using Google Trends, we found the articles that were consistently read over time—and not just temporarily popular. Finally we used Translator Toolkit to translate articles that either did not exist or were placeholder articles or "stubs" in Hindi Wikipedia. In three months, we used a combination of human and machine translation tools to translate 600,000 words from more than 100 articles in English Wikipedia, growing Hindi Wikipedia by almost 20 percent. We've since repeated this process for other languages, to bring our total number of words translated to 16 million.
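As a quick sanity check on the "almost 20 percent" figure, the arithmetic can be reproduced from the numbers quoted in this post. This is only a sketch: it assumes the 3.4 million word count is the right baseline for the three-month period, and the variable names are illustrative.

```python
# Figures quoted in the post; the variable names are illustrative only.
hindi_words_at_start = 3_400_000   # Hindi Wikipedia size when the effort began
translated_words = 600_000         # words translated in the first three months

# Relative growth, assuming the 3.4 million figure is the baseline.
growth = translated_words / hindi_words_at_start
print(f"Growth: {growth:.1%}")     # prints "Growth: 17.6%"
```

Roughly 17.6 percent is consistent with "almost 20 percent" under that assumption.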

We're off to a good start but, as you can see in the graph below, we have a lot more work to do to bring the information in Wikipedia to people worldwide:

Number of non-stub Wikipedia articles by Internet users, normalized (English = 1)

We've also found that many Internet users have used our tools to translate more than 100 million words of Wikipedia content into various languages worldwide. If you speak another language, we hope you'll join us in bringing Wikipedia content to other languages and cultures with Translator Toolkit.

We presented these results last Saturday, July 10, at Wikimania 2010 in Gdańsk, Poland. We look forward to continuing to support the creation of the world's largest encyclopedia and we can't wait to work with Wikipedians and volunteers to create more content worldwide.

Posted by Michael Galvez, Product Manager
URL: http://googleblog.blogspot.com/2010/07/translating-wikipedia.html

[G] Focused on Creativity and Innovation - Imagination Group goes Google

Posted: 15 Jul 2010 01:57 AM PDT

Official Google Enterprise Blog: Focused on Creativity and Innovation - Imagination Group goes Google

Editor's note: Continuing our "Going Google Everywhere" series, we've invited Matt Ballantine, CIO of Imagination Group, to share his experience. Imagination Group is a global communications agency whose work with world-famous brands spans all aspects of integrated, experiential and digital marketing. Imagination is an independent agency with 12 offices around the world and the full complement of specialists in-house, from brand consultants to architects, advertising specialists to interior designers, retail specialists and event producers to direct marketers and digital experts. Imagination's clients include Aston Martin, Guinness, oneworld Alliance, Disney, Ford, Johnson & Johnson, Goldman Sachs, Shell and Samsung. Learn more about other organizations that have gone Google on our community map.

Throughout my career I've been bemused by how, despite best intentions, most IT projects have failed to deliver any real depth of business change. Technology issues inevitably crop up through the lifetime of the project, and the first contingencies to be cut are in the plans for communication, training and business change.

We've probably all seen it - server issues, network issues, compatibility of operating systems, patching, software release bugs... the list goes on. And all the while, the business engagement work gets squeezed (if it was ever planned in depth in the first place).

The Cloud is offering an opportunity for IT departments to fundamentally change their approach to delivering services into organisations. In the two years that I've been leading IT transformation at global communications agency Imagination, I've been describing a vision where our IT team is here to help the business exploit the technology we procure, and where we leave most of the deep technical work to experts at our partners. In-house expertise in how Imagination uses its technology to become more collaborative, more global and more creative is of real value. Understanding how to patch together operating systems on servers just isn't.

As of April 2010, Imagination has Gone Google. My team moved 600 user accounts and 2TB of legacy email data spread across 14 locations in nine countries. We worked with partners to help manage the transition, and my in-house team, led by project manager Sue Chick, were able to complete the technical migration work with a minimum of fuss and effort. In turn, this meant that we could focus on helping the business start to exploit new possibilities.

Only last week, I received an excited email from our Creative Director in Sydney, Australia, who had just watched the final rehearsal of a product launch being run for a client in Hong Kong via video chat at his desk. We're only just starting to see how our teams can take the tools that we have made available to them to change how we work for the better. The Imagination IT team is now aligned to help those processes happen.



Posted by Dave Armstrong, Google Apps Team
URL: http://googleenterprise.blogspot.com/2010/07/focused-on-creativity-and-innovation_14.html

[G] BlueSpace and Google Earth Enterprise: taking visualization to the skies

Posted: 15 Jul 2010 01:57 AM PDT

Official Google Enterprise Blog: BlueSpace and Google Earth Enterprise: taking visualization to the skies

Editor's Note: Justin Marston is the CEO of BlueSpace, an enterprise software company focused on the defense and intelligence communities. BlueSpace has built a next generation command and control application using their security middleware and Google Earth Enterprise. They are currently showcasing it as part of the Coalition Warrior Interoperability Demonstration (an international "war game" exercise) with support from a government agency. The BlueSpace app showcases just how far you can take Google Earth Enterprise as a visualization environment.

Geospatial visualization of multiple streams of data has been critical to the defense and intelligence communities for a long time. Whether it's showing aircraft flying around, soldiers taking a hill or different types of intelligence – seeing it on a map has been key to understanding a conflict.

In the Second World War, the Allies used maps with little models to show units, and moved them with poles to update their locations. With modern radar and GPS systems, things are a bit more sophisticated, but much of the mapping functionality has lagged behind. Many of the currently deployed command and control (C2) systems use flat, two-color vector maps with triangles showing units.

Visualization of AWACS plane in Google Earth

BlueSpace and AWACS
Before BlueSpace engaged them, AWACS was already actively working with 3D visualization. AWACS is the US Air Force Airborne Warning and Control System: a forward deployed radar platform (the planes with big spinning discs on top). The vision of the AWACS program has been to move away from a black screen with green triangles on it, and move towards a more visually rich C2 environment for operators that can show the terrain in which they are working.

How has BlueSpace helped? Well, we have focused on two problems – high quality, real-time visualizations and creating a Unified Operating Picture.

High Quality, Real-time Visualizations
The first problem is creating a much more "real" view of the battle theater, with 3D models moving around in real-time based on input data feeds giving latitude and longitude references for units. Our design goal was to create something more like a real-time video game using Google Earth's richness of graphics and capabilities.

BlueSpace is demonstrating its Multi-Level Security Command and Control (MLS C2) application at five different locations for the Coalition Warrior Interoperability Demonstration (CWID), a joint exercise between the US, UK, Canada, Australia and NATO (among others) to help find and prove technologies and systems that can help better orchestrate coalition warfare. For the exercise, BlueSpace worked with its partners to model around 100 units including aircraft, ground units and boats, and these units move around in real time based on data feeds being fed to the application.

You can take a look at some of the interface, captured from Google Earth in this unclassified video: http://www.bluespace.com/mlsc2.html

A Unified Operating Picture
Wars used to be fought by relatively small numbers of allies, with each nation focused on a particular theater. As warfare has evolved over the last two decades, the reach of aircraft, missiles, satellites etc. has blurred the lines between the different services and often between nations.

MLS C2 User Interface using Google Earth Enterprise for geospatial visualization of ground, air and sea units

Right now, the NATO configuration of the AWACS planes can have up to 14 different screens on each AWACS aircraft – one for the US aircraft, one for the British, one for the Canadian, one for the German, etc. So when something new comes up on radar, operators may have to look at up to 14 screens to figure out what is going on.

BlueSpace has taken these separate pictures and consolidated them into a single Unified Operating Picture (UOP) that spans all the different networks, providing one Google Earth environment, with all the units in that environment, no matter which nation or service they serve. This means an operator on an AWACS plane only has to look at one screen to see what is happening – a vast improvement.

Google Earth's extensive capabilities allow an operator to fully utilize this unified operating picture to see terrain, roads, etc. in their relation to the plotted units. In addition, Google Earth's full camera controls provide the viewing flexibility necessary to interact with those units.

BlueSpace and Google
We see a great future for Google Earth Enterprise in our C2 system. Being able to see the helicopter, visually recognize its type immediately and see which mountains are next to it when the pilot calls in, "I'm taking fire from the ridge on the left" makes a big difference in a real fight. Doing all of that across many different security domains in a Unified Operating Picture that spans multiple networks – that's a game changing capability.

Posted by Natasha Wyatt, Google Earth Enterprise team
URL: http://googleenterprise.blogspot.com/2010/07/bluespace-and-google-earth-enterprise.html

[G] New keyword targeting feature rolling out globally

Posted: 14 Jul 2010 11:20 PM PDT

Inside AdWords: New keyword targeting feature rolling out globally

After a successful open beta test in the UK and Canada, we're pleased to announce that the broad match modifier is now rolling out globally in most languages*. To recap the original broad match modifier beta launch announcement:
The broad match modifier is a new AdWords targeting feature that lets you create keywords which have greater reach than phrase match and more control than broad match. Adding modified broad match keywords to your campaign can help you get more clicks and conversions at an attractive ROI, especially if you mainly use exact and phrase match keywords today.

To implement the modifier, just put a plus symbol (+) directly in front of one or more words** in a broad match keyword. Each word preceded by a + has to appear in your potential customer's search exactly or as a close variant. Close variants include misspellings, singular/plural forms, abbreviations and acronyms, and stemmings (like "floor" and "flooring"). Synonyms (like "quick" and "fast") and related searches (like "flowers" and "tulips") aren't considered close variants.

The graphic below illustrates the relative reach of different keyword match type strategies.


Be sure there are no spaces between the + and modified words, but do leave spaces between words. Correct usage: +formal +shoes. Incorrect usage: +formal+shoes.
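The spacing rule and the "each + word must appear" rule can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical and not part of AdWords, AdWords Editor or the AdWords API, and close variants (misspellings, plurals, stemmings) are deliberately not modeled, so the check is stricter than real matching.

```python
import re

# Hypothetical helpers for illustration; not part of any AdWords tool or API.
# A token is a word with an optional leading '+', and no '+' anywhere else.
_TOKEN = re.compile(r"^\+?[^\s+]+$")

def is_well_formed(keyword: str) -> bool:
    """Check the spacing rules: '+' directly before a word, spaces between words."""
    tokens = keyword.split()
    return bool(tokens) and all(_TOKEN.match(t) for t in tokens)

def required_words(keyword: str) -> list:
    """Return the words that carry the '+' modifier, lowercased."""
    return [t[1:].lower() for t in keyword.split() if t.startswith("+")]

def query_has_required_words(keyword: str, query: str) -> bool:
    """Simplified check: every '+' word must appear verbatim in the query.

    Real matching also accepts close variants; those are not modeled here.
    """
    query_words = set(query.lower().split())
    return all(word in query_words for word in required_words(keyword))

print(is_well_formed("+formal +shoes"))   # True
print(is_well_formed("+formal+shoes"))    # False
print(query_has_required_words("+formal +shoes", "black formal shoes"))  # True
print(query_has_required_words("+formal +shoes", "formal dress"))        # False
```

Words without a + are left unchecked here because they keep ordinary broad match behavior, which this sketch does not attempt to reproduce.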

Here's what one major UK retail company said about their experience using the feature:

We're always interested in ways to increase our volumes while keeping our CPA down. As a result, we've added broad match modified keywords to several campaigns where previously we only had phrase and exact match keywords. After a few weeks of testing, we're pleased to see these campaigns showed significant increases in conversion and volume, whilst keeping the CPA down. Therefore, we will be looking to scale our use of modified broad match keywords in all our campaigns to take full advantage of these great results.

If you mainly use broad match keywords in your account, you should know that switching your existing broad match keywords to modified broad match will likely lead to a significant decline in your click and conversion volumes and will not directly improve Quality Score. To maintain volume, keep existing broad match keywords active, add new modified broad match keywords, and adjust bids to achieve your target ROI based on the results you see.

You can begin using the feature by logging in to your AdWords account, or through the AdWords Editor or the AdWords API. For more details, guidelines on usage, and answers to common questions, check out the original blog post and the AdWords help center.

Posted by Dan Friedman, Inside AdWords crew

*Except Chinese, Japanese, Thai, Arabic and Hebrew languages, which are coming soon. We'll update this post when the feature becomes available in those languages.
URL: http://adwords.blogspot.com/2010/07/new-keyword-targeting-feature-rolling.html

[G] Google Books goes Dutch

Posted: 14 Jul 2010 07:26 PM PDT

Inside Google Books: Google Books goes Dutch

Posted by Philippe Colombet, Strategic Partnership Development Manager

In recent months, I've got to know a group of people in The Hague who are working on an ambitious project to make the rich fabric of Dutch cultural and political history as widely accessible as possible - via the Internet.

That team is from the National Library of the Netherlands, the Koninklijke Bibliotheek (KB), and as of today, we'll be working in partnership to add to the library's own extensive digitisation efforts. We'll be scanning more than 160,000 of its public domain books, and making this collection available globally via Google Books. The library will receive copies of the scans so that they can also be viewed via the library's website. And significantly for Europe, the library also plans to make the digitised works available via Europeana, Europe's cultural portal.

The books we'll be scanning constitute nearly the library's entire collection of out-of-copyright books, written during the 18th and 19th centuries. The collection covers a tumultuous period of Dutch history, which saw the establishment of the country's constitution and its parliamentary democracy. Anyone interested in Dutch history will be able to access and view a fascinating range of works by prominent Dutch thinkers, statesmen, poets and academics and gain new insights into the development of the Netherlands as a nation state.

This is the third agreement we've announced in Europe this year, following our projects with the Italian Ministry of Cultural Heritage and the Austrian National Library. The Dutch national library is already well underway with its own ambitious scanning programme, which will eventually see all of its Dutch books, newspapers and periodicals from 1470 onwards being made available online. By any measure, this is a huge task, requiring significant resources, and we're pleased to be able to help the library accelerate towards its goal of making all Dutch books accessible anywhere in the world, at the click of a mouse.

It's exciting to note just how many libraries and cultural ministries are now looking to preserve and improve access to their collections by bringing them online. Much of humanity's cultural, historical, scientific and religious knowledge, collected and curated over centuries, sits in Europe's libraries, and it's great to see that we are all striving towards the same goal of improving access to knowledge for all.

Google and other technology companies have an important role to play in achieving this goal, and we hope that by partnering with major European cultural institutions such as the Dutch national library, we will be able to accelerate the rapid growth of Europe's digital library.

(Cross-posted from the European Public Policy Blog)
URL: http://booksearch.blogspot.com/2010/07/google-books-goes-dutch.html

[G] Our commitment to the digital humanities

Posted: 14 Jul 2010 03:01 PM PDT

Official Google Research Blog: Our commitment to the digital humanities

Posted by Jon Orwant, Engineering Manager for Google Books, Magazines and Patents

(Cross-posted from the Official Google Blog)

It can't have been very long after people started writing that they started to organize and comment on what was written. Look at the 10th century Venetus A manuscript, which contains scholia written fifteen centuries earlier about texts written five centuries before that. Almost since computers were invented, people have envisioned using them to expose the interconnections of the world's knowledge. That vision is finally becoming real with the flowering of the web, but in a notably limited way: very little of the world's culture predating the web is accessible online. Much of that information is available only in printed books.

A wide range of digitization efforts have been pursued with increasing success over the past decade. We're proud of our own Google Books digitization effort, having scanned over 12 million books in more than 400 languages, comprising over five billion pages and two trillion words. But digitization is just the starting point: it will take a vast amount of work by scholars and computer scientists to analyze these digitized texts. In particular, humanities scholars are starting to apply quantitative research techniques for answering questions that require examining thousands or millions of books. This style of research complements the methods of many contemporary humanities scholars, who have individually achieved great insights through in-depth reading and painstaking analysis of dozens or hundreds of texts. We believe both approaches have merit, and that each is good for answering different types of questions.

Here are a few examples of inquiries that benefit from a computational approach. Shouldn't we be able to characterize Victorian society by quantifying shifts in vocabulary—not just of a few leading writers, but of every book written during the era? Shouldn't it be easy to locate electronic copies of the English and Latin editions of Hobbes' Leviathan, compare them and annotate the differences? Shouldn't a Spanish reader be able to locate every Spanish translation of "The Iliad"? Shouldn't there be an electronic dictionary and grammar for the Yao language?

We think so. Funding agencies have been supporting this field of research, known as the digital humanities, for years. In particular, the National Endowment for the Humanities has taken a leadership role, having established an Office of Digital Humanities in 2007. NEH chairman Jim Leach says: "In the modern world, access to knowledge is becoming as central to advancing equal opportunity as access to the ballot box has proven to be the key to advancing political rights. Few revolutions in human history can match the democratizing consequences of the development of the web and the accompanying advancement of digital technologies to tap this accumulation of human knowledge."

Likewise, we'd like to see the field blossom and take advantage of resources such as Google Books that are becoming increasingly available. We're pleased to announce that Google has committed nearly a million dollars to support digital humanities research over the next two years.

Google's Digital Humanities Research Awards will support 12 university research groups with unrestricted grants for one year, with the possibility of renewal for an additional year. The recipients will receive some access to Google tools, technologies and expertise. Over the next year, we'll provide selected subsets of the Google Books corpus—scans, text and derived data such as word histograms—to both the researchers and the rest of the world as laws permit. (Our collection of ancient Greek and Latin books is a taste of corpora to come.)
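For readers unfamiliar with the term, a word histogram is simply a count of how often each word occurs in a text. The sketch below shows one minimal way to compute it in Python. The sample sentence is invented for illustration, the function name is hypothetical, and the post does not say what format the Google Books derived data will actually take.

```python
import re
from collections import Counter

def word_histogram(text: str) -> Counter:
    """Count case-insensitive word occurrences (ASCII letters and apostrophes only)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

# Invented sample text, not taken from any corpus.
sample = "The library scans the books, and the books go online."
hist = word_histogram(sample)
print(hist.most_common(2))   # [('the', 3), ('books', 2)]
```

Comparing such counts across books from different decades is one simple way to start quantifying shifts in vocabulary, although real corpora need more careful tokenization than this regular expression provides.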

We've given awards to 12 projects led by 23 researchers at 15 universities:
  • Steven Abney and Terry Szymanski, University of Michigan. Automatic Identification and Extraction of Structured Linguistic Passages in Texts.
  • Elton Barker, The Open University, Eric C. Kansa, University of California-Berkeley, Leif Isaksen, University of Southampton, United Kingdom. Google Ancient Places (GAP): Discovering historic geographical entities in the Google Books corpus.
  • Dan Cohen and Fred Gibbs, George Mason University. Reframing the Victorians.
  • Gregory R. Crane, Tufts University. Classics in Google Books.
  • Miles Efron, Graduate School of Library and Information Science, University of Illinois. Meeting the Challenge of Language Change in Text Retrieval with Machine Translation Techniques.
  • Brian Geiger, University of California-Riverside, Benjamin Pauley, Eastern Connecticut State University. Early Modern Books Metadata in Google Books.
  • David Mimno and David Blei, Princeton University. The Open Encyclopedia of Classical Sites.
  • Alfonso Moreno, Magdalen College, University of Oxford. Bibliotheca Academica Translationum: link to Google Books.
  • Todd Presner, David Shepard, Chris Johanson, James Lee, University of California-Los Angeles. Hypercities Geo-Scribe.
  • Amelia del Rosario Sanz-Cabrerizo and José Luis Sierra-Rodríguez, Universidad Complutense de Madrid. Collaborative Annotation of Digitalized Literary Texts.
  • Andrew Stauffer, University of Virginia. JUXTA Collation Tool for the Web.
  • Timothy R. Tangherlini, University of California-Los Angeles, Peter Leonard, University of Washington. Northern Insights: Tools & Techniques for Automated Literary Analysis, Based on the Scandinavian Corpus in Google Books.
We have selected these proposals in part because the resulting techniques, tools and data will be broadly useful: they'll help entire communities of scholars, not just the applicants. We look forward to working with them, and hope that over time the field of digital humanities will fulfill its promise of transforming the ways in which we understand human culture.
URL: http://googleresearch.blogspot.com/2010/07/our-commitment-to-digital-humanities.html

[G] Read the DoubleClick Ad Exchange White Paper: The Value of Dynamic Allocation

Posted: 14 Jul 2010 02:42 PM PDT

DoubleClick Publisher Blog: Read the DoubleClick Ad Exchange White Paper: The Value of Dynamic Allocation

We're often asked to quantify the incremental value DoubleClick Ad Exchange can provide compared with publishers' existing yield management techniques. According to proprietary research conducted in the first half of 2010, the combined effects of auction pressure and Dynamic Allocation in DoubleClick Ad Exchange resulted in an average CPM lift of 136% compared with fixed, upfront, pre-negotiated sales of non-guaranteed inventory.

In a new white paper, we take a step back to explain how publishers are managing yield across their pool of non-guaranteed inventory today, and what steps they can take to create efficiencies and boost overall revenue. Key elements of the white paper include:
  • How publishers segment and sell ad inventory. How manual optimization processes often fail to capture all available revenue opportunities.
  • Dynamic Allocation explained. What it is, how it works, and what it means for publishers' bottom lines.
  • Auction pricing mechanics. Real-time pricing's core advantages over the use of historical CPMs for non-guaranteed ad space.
  • A brief look forward. The potential for DoubleClick Ad Exchange and its ecosystem of publishers, technology providers, advertisers and agencies.
Download the white paper.

Posted by Campbell Foster, Product Marketing Manager
URL: http://doubleclickpublishers.blogspot.com/2010/07/read-doubleclick-ad-exchange-white.html

[G] Google Maps can now send destinations directly to more than 20 car brands worldwide

Posted: 14 Jul 2010 01:39 PM PDT

Google LatLong: Google Maps can now send destinations directly to more than 20 car brands worldwide

When we started the "Send-To-Car" service on Google Maps more than three years ago with BMW, only a few car makers offered connected services to their drivers. The industry has come a long way since then. Several car manufacturers have made industry-changing commitments to bringing connectivity to the majority, if not the entirety, of their car line.

We see more and more cars with connected navigation and entertainment systems leaving the assembly line and the trend is here to stay. That's fantastic news for both drivers and the automotive industry.

The Google Maps Send-To-Car service has grown, and many car manufacturers have joined over time. Just recently, we announced Audi's connected car navigation system which includes Send-To-Car, and the Google Automotive team is thrilled to announce that we have extended the partner base of our Google Maps Send-To-Car service further to include Ford and GM.

As of today, drivers of Ford, Lincoln and Mercury vehicles in the US enabled with Ford SYNC can now send business listings or addresses found on Google Maps directly to their cars.


(Photo credit: Ford)

In addition, millions of OnStar equipped GM vehicles can now make use of this innovative service. Great news if you own a Buick, Cadillac, Chevrolet, GMC, Hummer, Pontiac, Saab or Saturn! Check out this GM video to see the service in action:



Drivers can then use their car maker's turn-by-turn navigation system to be guided to their selected destination. With today's additions, drivers can send destinations from Google Maps directly to their connected vehicles in 19 countries and across more than 20 different brands.

In the US alone, Send-To-Car is now available on more than 15 car brands and we hope to see even more partners join us soon.


We think this is a great convenience for drivers: prepare your route at your desk, send the destination to your car, and safely enjoy your ride, with hands on the wheel and eyes on the road.

We also like to think that in the age of green driving, not having to print paper directions anymore is a great start to a green trip!

You can find more information on Send-To-Car on the Google Maps help center.

Drive safely!




Posted by Markus Mühlbauer, Engineering and Product Manager
URL: http://google-latlong.blogspot.com/2010/07/google-maps-can-now-send-destinations.html

[G] Announcing the AdSense in Your City program

Posted: 14 Jul 2010 11:22 AM PDT

Inside AdSense: Announcing the AdSense in Your City program

This summer and fall, the AdSense team is coming to visit you! In an effort to work more closely with our publishers, we've launched the AdSense in Your City program. As part of the program, members of the AdSense team will be traveling to five cities this summer to hear directly from you, as well as to share best practices, top optimization tips, and new products.

Last month, we kicked off our first AdSense in Your City event in Mountain View, California. Sixty publishers came to Google to learn how to make more money with AdSense, to meet members of the AdSense team, and to get to know each other.

Today we're heading to Santa Monica, and later this summer we'll be visiting Chicago, New York, and Boston. While attendance is very limited due to space constraints, we have a few more spots in some cities. If you'd like to request an invitation to an event, please sign up here. Though invitations will be sent on a first-come, first-served basis, we'll do our best to include as many of you as possible. We'll also be sure to make sessions available online early this fall.

Throughout the summer, look for updates on the blog from the AdSense team who will be traveling to these cities. We'll also be tweeting live from the events (follow us at http://twitter.com/AdSense) and posting videos and publisher interviews.

And this is just the beginning. We'd like to expand this program to be able to travel to more cities around the United States and to meet with more of you face to face. Are you a publisher in Austin? Seattle? Orlando? Leave us a comment and let us know if we should bring AdSense in Your City to your city next!

Posted by Talia Brodecki - Product Marketing Manager
URL: http://adsense.blogspot.com/2010/07/announcing-adsense-in-your-city-program.html

[G] YouTube Summer School, Session 1: Matter & Motion

Posted: 14 Jul 2010 11:02 AM PDT

YouTube Blog: YouTube Summer School, Session 1: Matter & Motion

School's out for summer, but around here we're (nerd alert) still pretty excited about numbers, facts and learning in general. And it looks like we're not alone. More than half a million people are now subscribed to YouTube EDU channels, and since October 2009 we've seen a 77% jump in channels and an 89% jump in videos on the educational platform.

While summer often provides a welcome respite for students, we know that some are looking for ways to keep their brains active over the break. If you want to keep those math formulas fresh and foreign languages top-of-mind, this could be the stuff that gives you a leg up on the non-YouTubers in your class come the fall. Since we want nothing more than to help you rule the world, we are creating playlists, by topic, of the videos you might want to watch to get ahead of the curve. Each week, we'll post a new playlist to the blog and the YouTube channel. Think of it as a mini virtual summer school, but without the research papers and early-morning start times.

First up, class, we're focusing on physics, the study of matter and its motion through space and time. This playlist has everything from Einstein's general theory of relativity to the physics of football:

Next week, we'll feature must-see videos about art. And throughout our little summer school, do let us know if or how EDU has made an impact on your life, whether in school or achieving your dreams overall. Please leave a comment below (but note: comments are moderated due to spam). We'd love to hear about your experiences in online education.

Class dismissed!

Mandy Albanese, Communications Associate, recently watched "Bicycle Wheel Gyroscope."


URL: http://feedproxy.google.com/~r/youtube/PKJx/~3/HKaPByWKQsQ/youtube-summer-school-session-1-matter.html

[G] Google Books goes Dutch

Posted: 14 Jul 2010 09:10 AM PDT

Official Google Blog: Google Books goes Dutch

In recent months, I've got to know a group of people in The Hague who are working on an ambitious project to make the rich fabric of Dutch cultural and political history as widely accessible as possible - via the Internet.

That team is from the National Library of the Netherlands, the Koninklijke Bibliotheek (KB), and as of today, we'll be working in partnership to add to the library's own extensive digitisation efforts. We'll be scanning more than 160,000 of its public domain books, and making this collection available globally via Google Books. The library will receive copies of the scans so that they can also be viewed via the library's website. And significantly for Europe, the library also plans to make the digitised works available via Europeana, Europe's cultural portal.

The books we'll be scanning constitute nearly the library's entire collection of out-of-copyright books, written during the 18th and 19th centuries. The collection covers a tumultuous period of Dutch history, which saw the establishment of the country's constitution and its parliamentary democracy. Anyone interested in Dutch history will be able to access and view a fascinating range of works by prominent Dutch thinkers, statesmen, poets and academics and gain new insights into the development of the Netherlands as a nation state.

This is the third agreement we've announced in Europe this year, following our projects with the Italian Ministry of Cultural Heritage and the Austrian National Library. The Dutch national library is already well underway with its own ambitious scanning programme, which will eventually see all of its Dutch books, newspapers and periodicals from 1470 onwards being made available online. By any measure, this is a huge task, requiring significant resources, and we're pleased to be able to help the library accelerate towards its goal of making all Dutch books accessible anywhere in the world, at the click of a mouse.

It's exciting to note just how many libraries and cultural ministries are now looking to preserve and improve access to their collections by bringing them online. Much of humanity's cultural, historical, scientific and religious knowledge, collected and curated over centuries, sits in Europe's libraries, and it's great to see that we are all striving towards the same goal of improving access to knowledge for all.

Google and other technology companies have an important role to play in achieving this goal, and we hope that by partnering with major European cultural institutions such as the Dutch national library, we will be able to accelerate the rapid growth of Europe's digital library.

Cross-posted from our European policy blog.

Posted by Philippe Colombet, Strategic Partnership Development Manager
URL: http://googleblog.blogspot.com/2010/07/google-books-goes-dutch.html

[G] Our commitment to the digital humanities

Posted: 14 Jul 2010 08:25 AM PDT

Official Google Blog: Our commitment to the digital humanities

(Cross-posted on the Google Research Blog)

It can't have been very long after people started writing that they started to organize and comment on what was written. Look at the 10th-century Venetus A manuscript, which contains scholia written fifteen centuries earlier about texts written five centuries before that. Almost since computers were invented, people have envisioned using them to expose the interconnections of the world's knowledge. That vision is finally becoming real with the flowering of the web, but in a notably limited way: very little of the world's culture predating the web is accessible online. Much of that information is available only in printed books.

A wide range of digitization efforts have been pursued with increasing success over the past decade. We're proud of our own Google Books digitization effort, having scanned over 12 million books in more than 400 languages, comprising over five billion pages and two trillion words. But digitization is just the starting point: it will take a vast amount of work by scholars and computer scientists to analyze these digitized texts. In particular, humanities scholars are starting to apply quantitative research techniques for answering questions that require examining thousands or millions of books. This style of research complements the methods of many contemporary humanities scholars, who have individually achieved great insights through in-depth reading and painstaking analysis of dozens or hundreds of texts. We believe both approaches have merit, and that each is good for answering different types of questions.

Here are a few examples of inquiries that benefit from a computational approach. Shouldn't we be able to characterize Victorian society by quantifying shifts in vocabulary—not just of a few leading writers, but of every book written during the era? Shouldn't it be easy to locate electronic copies of the English and Latin editions of Hobbes' Leviathan, compare them and annotate the differences? Shouldn't a Spanish reader be able to locate every Spanish translation of "The Iliad"? Shouldn't there be an electronic dictionary and grammar for the Yao language?

We think so. Funding agencies have been supporting this field of research, known as the digital humanities, for years. In particular, the National Endowment for the Humanities has taken a leadership role, having established an Office of Digital Humanities in 2007. NEH chairman Jim Leach says: "In the modern world, access to knowledge is becoming as central to advancing equal opportunity as access to the ballot box has proven to be the key to advancing political rights. Few revolutions in human history can match the democratizing consequences of the development of the web and the accompanying advancement of digital technologies to tap this accumulation of human knowledge."

Likewise, we'd like to see the field blossom and take advantage of resources such as Google Books that are becoming increasingly available. We're pleased to announce that Google has committed nearly a million dollars to support digital humanities research over the next two years.

Google's Digital Humanities Research Awards will support 12 university research groups with unrestricted grants for one year, with the possibility of renewal for an additional year. The recipients will receive some access to Google tools, technologies and expertise. Over the next year, we'll provide selected subsets of the Google Books corpus—scans, text and derived data such as word histograms—to both the researchers and the rest of the world as laws permit. (Our collection of ancient Greek and Latin books is a taste of corpora to come.)

We've given awards to 12 projects led by 23 researchers at 15 universities:
  • Steven Abney and Terry Szymanski, University of Michigan. Automatic Identification and Extraction of Structured Linguistic Passages in Texts.
  • Elton Barker, The Open University, Eric C. Kansa, University of California-Berkeley, Leif Isaksen, University of Southampton, United Kingdom. Google Ancient Places (GAP): Discovering historic geographical entities in the Google Books corpus.
  • Dan Cohen and Fred Gibbs, George Mason University. Reframing the Victorians.
  • Gregory R. Crane, Tufts University. Classics in Google Books.
  • Miles Efron, Graduate School of Library and Information Science, University of Illinois. Meeting the Challenge of Language Change in Text Retrieval with Machine Translation Techniques.
  • Brian Geiger, University of California-Riverside, Benjamin Pauley, Eastern Connecticut State University. Early Modern Books Metadata in Google Books.
  • David Mimno and David Blei, Princeton University. The Open Encyclopedia of Classical Sites.
  • Alfonso Moreno, Magdalen College, University of Oxford. Bibliotheca Academica Translationum: link to Google Books.
  • Todd Presner, David Shepard, Chris Johanson, James Lee, University of California-Los Angeles. Hypercities Geo-Scribe.
  • Amelia del Rosario Sanz-Cabrerizo and José Luis Sierra-Rodríguez, Universidad Complutense de Madrid. Collaborative Annotation of Digitalized Literary Texts.
  • Andrew Stauffer, University of Virginia. JUXTA Collation Tool for the Web.
  • Timothy R. Tangherlini, University of California-Los Angeles, Peter Leonard, University of Washington. Northern Insights: Tools & Techniques for Automated Literary Analysis, Based on the Scandinavian Corpus in Google Books.
We have selected these proposals in part because the resulting techniques, tools and data will be broadly useful: they'll help entire communities of scholars, not just the applicants. We look forward to working with them, and hope that over time the field of digital humanities will fulfill its promise of transforming the ways in which we understand human culture.

Posted by Jon Orwant, Engineering Manager for Google Books, Magazines and Patents
URL: http://googleblog.blogspot.com/2010/07/our-commitment-to-digital-humanities.html
