When you think back on all the cities you’ve traveled to, what images come to mind? Do you think about people you met or the people you traveled with? The music you heard, food you ate, art you saw or souvenirs you bought? As a lover of architecture both personally and professionally (I’m also an editor at The Architect’s Newspaper), my mind automatically flips through a slideshow of building facades, city streets and the unifying architectural details I saw repeatedly over the course of my trip. If this is how you remember cities, too, it turns out you and I aren’t the only ones.
Recently, a group of researchers from Carnegie Mellon and the Laboratoire d’Informatique de l’École Normale Supérieure in Paris (also known as the Computer Science department at ENS) developed software that mines visual data from Google Street View images of cities and identifies the most prominent characteristics found in each place. From a bank of 40,000 images, the software worked overnight with 150 processors to detect more than 250 million defining characteristics of twelve major cities, including Barcelona, London, New York and Paris.
The program works by comparing all the images and isolating the individual details that differentiate images of one city from another. After some heavy-duty computing, the software defined Paris by images of gilded ironwork on balconies, balustrades and cornices over doors and windows, as well as the city’s signature street signs. It was a little trickier to visually define US cities, which are younger and stylistically less unified than older cities in Europe or Asia, but the software came up with bay windows for San Francisco and fire escapes for New York.
You can learn more from a quick how-to video on the software or test your street smarts by taking the Paris-NonParis test, in which you look at 100 images and take a stab at which city you think they’re from (hint: only 50 of them are from Paris).
According to the researchers: “Given a large repository of geotagged imagery, we seek to automatically find visual elements, e.g. windows, balconies, and street signs, that are most distinctive for a certain geo-spatial area, for example the city of Paris. This is a tremendously difficult task as the visual features distinguishing architectural elements of different places can be very subtle. In addition, we face a hard search problem: given all possible patches in all images, which of them are both frequently occurring and geographically informative? To address these issues, we propose to use a discriminative clustering approach able to take into account the weak geographic supervision. We show that geographically representative image elements can be discovered automatically from Google Street View imagery in a discriminative manner. We demonstrate that these elements are visually interpretable and perceptually geo-informative. The discovered visual elements can also support a variety of computational geography tasks, such as mapping architectural correspondences and influences within and across cities, finding representative elements at different geo-spatial scales, and geographically-informed image retrieval.”
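The core idea of “geographically informative” can be sketched in a few lines. This is a toy illustration, not the researchers’ actual pipeline (which uses image-patch descriptors and an iterative discriminative clustering procedure over millions of patches): here each patch is just a made-up 2-D feature vector, and a candidate patch is scored by how many of its nearest neighbors come from Paris rather than elsewhere. A patch whose neighbors are overwhelmingly Parisian, and which occurs often, is the kind of element the software surfaces.

```python
from math import dist  # Euclidean distance, Python 3.8+

def geo_informativeness(candidate, patches, labels, k=3):
    """Score a candidate patch descriptor by the fraction of its k
    nearest neighbors (among all known patches) that carry the target
    'paris' label. High scores mean the visual element is distinctive
    of Paris rather than generic to all cities."""
    ranked = sorted(range(len(patches)),
                    key=lambda i: dist(candidate, patches[i]))
    top_k = ranked[:k]
    return sum(1 for i in top_k if labels[i] == "paris") / k

# Hypothetical 2-D "descriptors" standing in for real image features.
paris_patches = [(0.10, 0.10), (0.20, 0.15), (0.15, 0.20)]
other_patches = [(0.90, 0.80), (0.85, 0.90), (0.70, 0.75)]
patches = paris_patches + other_patches
labels = ["paris"] * 3 + ["elsewhere"] * 3

# A candidate near the Paris cluster scores 1.0 (all neighbors Parisian).
score = geo_informativeness((0.12, 0.12), patches, labels)
```

In the real system the search runs the other way too: candidates that retrieve a mix of cities are discarded, leaving only elements that are both frequent and discriminative, such as the gilded balcony ironwork described above.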
This article was originally published on Core77.