I spent some time today reading through the WWW2009 paper “Mapping the World’s Photos” by David Crandall et al. The paper reports on an analysis of a large number (35 million) of photographs extracted from Flickr, including their metadata. The interesting point of the paper is that the authors combine various analysis tools: they analyse the users’ tags, the geolocation in the photos’ metadata, timing information of a series of photos from the same user, and image processing analysis of the photos’ content. The combination of many different types of information leads to a better clustering of the photo data: photos can be organized in terms of location (either on a large scale, i.e., on the level of, say, a city, or on a much smaller scale, i.e., on the level of a landmark like the Eiffel Tower). This clustering can be done without a priori knowledge of the image contents themselves.
There is the technology/algorithmic side of the paper that I cannot really comment on, as I am not familiar enough with the clustering algorithms they used. However, at least for me, the more interesting aspect of the paper is the “social” one. As the authors say:
As researchers discovered a decade ago with large-scale collections of Web pages, studying the connective structure of a corpus at a global level exposes a fascinating picture of what the world is paying attention to. In the case of global photo collections, it means that we can discover, through collective behavior, what people consider to be the most significant landmarks both in the world and within specific cities […]; which cities are most photographed […] which cities have the highest and lowest proportions of attention-drawing landmarks […]; which views of these landmarks are the most characteristic […]; and how people move through cities and regions as they visit different locations within them […]. These resulting views of the data add to an emerging theme in which planetary-scale datasets provide insight into different kinds of human activity — in this case those based on images[…].
And this, of course, is really fascinating. But… it can also be dangerous if not done with care, because it is way too easy to jump to false conclusions. Indeed, the study of such a corpus cannot and should not be done, at least in my view, without a careful consideration of social, cultural, and economic issues. (This is of course no criticism of the authors at all, who concentrated on the algorithmic aspect only and did great work at that!)
Let me take one example from the paper: the clustering algorithm produces a table “showing the most photographed places on Earth ranked by number of distinct photographers”. The first 15 cities on the list are: New York, London, San Francisco, Paris, Los Angeles, Chicago, Washington, Seattle, Rome, Amsterdam, Boston, Barcelona, San Diego, Berlin, and Las Vegas. Nine cities from the US, six from Western Europe. None from Canada, Asia, Africa, Australia, Latin America… Fascinating (and highly photogenic!) cities like Kyoto, Beijing, Rio de Janeiro, or Istanbul are missing. This is not the fault of the authors: this is what this particular data set, i.e., Flickr, gives you. However, can we, should we say that the World is not paying attention to these cities? I do not think so. To really draw conclusions, one would have to look at the demography of Flickr users, at economic issues, at whether different communities use Flickr or some other photo site elsewhere in the World… The lack of Japanese cities in the list (knowing that Japanese people take tons of pictures everywhere they go!) seems to indicate that their attitude towards social sites like Flickr might be different from what we are used to in “the West”. People going to Cairo may not have the same type of sophisticated cameras and easy Internet access to produce Flickr-quality pictures. And there may be many other aspects that I do not even think of at this moment…
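For what it is worth, the regional skew in that top-15 list is easy to tally. The snippet below is only a small sketch: the city names come from the list quoted above, while the region labels and the helper name `tally_by_region` are my own assumptions, not anything taken from the paper.

```python
from collections import Counter

# Top 15 cities as quoted from the paper's table.
# The region assigned to each city is my own labeling (an assumption).
TOP_CITIES = {
    "New York": "US",
    "London": "Western Europe",
    "San Francisco": "US",
    "Paris": "Western Europe",
    "Los Angeles": "US",
    "Chicago": "US",
    "Washington": "US",
    "Seattle": "US",
    "Rome": "Western Europe",
    "Amsterdam": "Western Europe",
    "Boston": "US",
    "Barcelona": "Western Europe",
    "San Diego": "US",
    "Berlin": "Western Europe",
    "Las Vegas": "US",
}


def tally_by_region(cities):
    """Count how many of the listed cities fall into each region."""
    return Counter(cities.values())


if __name__ == "__main__":
    for region, count in tally_by_region(TOP_CITIES).most_common():
        print(f"{region}: {count}")
```

Of course, such a tally only describes what is in the Flickr data; it says nothing about why the other regions are absent, which is exactly the question that needs more than counting.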
This is indeed an exciting line of research. But we, computer scientists, should be modest enough to realize that drawing social conclusions from such data requires us to work with experts in other disciplines. We could then come up with defensible conclusions that would be interesting to explore and exploit. In other words, the future, in this respect, lies in interdisciplinary work.
- Crandall, David, Backstrom, Lars, Huttenlocher, Daniel and Kleinberg, Jon (2009) ‘Mapping the World’s Photos’, In Maarek, Y. and Nejdl, W. (eds.), Proceedings of the 18th International Conference on World Wide Web, Madrid, Spain, ACM Press, pp. 761-770. Available online.