Sunday, July 11, 2010

The Demographics of Web Searching: A Report Complete with 16 Footnotes and 27 References

Your life on the web is an open book.

Link to July 11 Slashdot post.

Excerpt: adaviel sends a link to work out of Yahoo Research indicating that demographics can help Web searches; e.g. a woman searching for "wagner" probably wants the 19th-century German composer, while for men in the US "wagner" is a paint sprayer.

Link to abstract and report.

Abstract: How does the web search behavior of "rich" and "poor" people differ? Do men and women tend to click on different results for the same query? What are some queries almost exclusively issued by African Americans? These are some of the questions we address in this study. Our research combines three data sources: the query log of a major US-based web search engine, profile information provided by 28 million of its users (birth year, gender and zip code), and US-census information including detailed demographic information aggregated at the level of ZIP code. Through this combination we can annotate each query with, e.g., the average per-capita income in the ZIP code it originated from. Though conceptually simple, this combination immediately creates a powerful demographic profiling tool. The main contributions of this work are the following. First, we provide a demographic description of a large sample of search engine users in the US and show that it agrees well with the distribution of the US population. Second, we describe how different segments of the population differ in their search behavior, e.g. with respect to the diversity of formulated queries or with respect to the clicked URLs. Third, we explore applications of our methodology to improve web search and, in particular, to help issuing query reformulations. These results enable the creation of a powerful tool for improved user modeling in practice, with many applications including improving web search and advertising. For instance, advertisements for "family vacations" could be adapted to the (expected) income of the person issuing the query, or search suggestions shown to users could be adapted to items that are more interesting given their particular characteristics.
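
For anyone curious what "annotating each query" with ZIP-level census data might look like, here is a rough sketch in Python. It is not the researchers' code: the user ids, ZIP codes, income figures, and function names are all made up for illustration, and the real study surely worked at a very different scale. The idea is simply to look up each query's user profile, then look up the census record for that user's ZIP code, and attach both to the query.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data; none of these values come from the report.
user_profiles = {
    "u1": {"birth_year": 1975, "gender": "F", "zip": "53562"},
    "u2": {"birth_year": 1988, "gender": "M", "zip": "10001"},
    "u3": {"birth_year": 1962, "gender": "M", "zip": "53562"},
}
census_by_zip = {
    "53562": {"per_capita_income": 40000},
    "10001": {"per_capita_income": 60000},
}
query_log = [
    ("u1", "wagner"),
    ("u2", "wagner"),
    ("u3", "wagner"),
    ("u2", "family vacations"),
]


def annotate(log, profiles, census):
    """Attach profile and ZIP-level census fields to each logged query."""
    rows = []
    for user_id, query in log:
        profile = profiles.get(user_id)
        if profile is None:
            continue  # no profile on file, so the query cannot be annotated
        area = census.get(profile["zip"])
        if area is None:
            continue  # ZIP code missing from the census table
        rows.append({
            "query": query,
            "gender": profile["gender"],
            "income": area["per_capita_income"],
        })
    return rows


def mean_income_by_query(rows):
    """Average the ZIP-level per-capita income over all issuers of a query."""
    incomes = defaultdict(list)
    for row in rows:
        incomes[row["query"]].append(row["income"])
    return {query: mean(values) for query, values in incomes.items()}


annotated = annotate(query_log, user_profiles, census_by_zip)
income_by_query = mean_income_by_query(annotated)
```

Note that the income attached to a query is an average for the ZIP code, not the searcher's own income, which is why the abstract speaks of "(expected) income."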
