
Google http://maps.google.com/maps/geo Query With Non-English Characters

I'm writing a Python parser (using urllib2) for addresses that contain non-English characters. The goal is to find the coordinates of every address. When I open this url in Firefox: http

Solution 1:

I can reproduce this behavior, and at first I was dumbfounded as to why it's happening. Closer inspection of the HTTP requests with Wireshark showed that the requests sent by Firefox (not surprisingly) contain a few more HTTP headers than the ones sent by urllib2.
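To see the script's side of that comparison without a packet capture, you can ask urllib which headers it adds on its own. This is only a rough sketch, and it uses Python 3's urllib.request rather than the urllib2 module from the question. It shows the opener-level defaults only; the HTTP handler adds a few connection-related headers such as Host when the request is actually sent.

```python
import urllib.request

# Headers that a default opener attaches to every request it sends.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)
print(default_headers)

# Check whether any of those defaults is an Accept-Language header.
has_accept_language = any(
    key.lower() == 'accept-language' for key in default_headers
)
print(has_accept_language)
```

Only a User-agent entry is set by default, so any language preference has to be added explicitly.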

In the end it turned out it's the Accept-Language header that makes the difference. You only get the correct result if

  • an Accept-Language header is set
  • and it has a non-English language listed first (the priorities don't seem to matter)

So, for example, this Accept-Language header works:

headers = {'Accept-Language': 'de-ch,en'}

To summarize, your code works for me when modified like this:

# -*- coding: utf-8 -*-
import urllib2

psc = '10000'
name = 'Malešice'

# Percent-encode the query so the non-ASCII bytes are safe to put in the URL.
url = 'http://maps.google.com/maps/geo?q=%s&output=csv' % urllib2.quote('Czech Republic %s %s' % (psc, name))

# A non-English language listed first is what makes the lookup succeed.
headers = {'Accept-Language': 'de-ch,en'}

req = urllib2.Request(url, None, headers)
response = urllib2.urlopen(req)
data = response.read()

print 'Parsed url %s, result %s\n' % (url, data)
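
For readers on Python 3, where urllib2 was split into urllib.request and urllib.parse, a rough equivalent might look like the sketch below. The helper name build_geocode_request and its parameters are my own invention for illustration, not part of any library. The block only builds the request and does not send it, so it says nothing about how the endpoint responds.

```python
import urllib.parse
import urllib.request


def build_geocode_request(postcode, name, accept_language='de-ch,en'):
    """Build (but do not send) a request for the geocoding URL used above.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    query = 'Czech Republic %s %s' % (postcode, name)
    # quote() percent-encodes the UTF-8 bytes of non-ASCII characters.
    url = ('http://maps.google.com/maps/geo?q=%s&output=csv'
           % urllib.parse.quote(query))
    # A non-English language is listed first, as described in the answer.
    headers = {'Accept-Language': accept_language}
    return urllib.request.Request(url, None, headers)


req = build_geocode_request('10000', 'Malešice')
print(req.full_url)
# Request stores header names capitalized as 'Accept-language'.
print(req.get_header('Accept-language'))
```

To send the request, pass req to urllib.request.urlopen() and call read() on the response, just as the urllib2 version does.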

Note: In my opinion, this is a bug in Google's geocoding API. The Accept-Language header indicates what languages the user agent prefers the content in, but it shouldn't have any effect on how the request is interpreted.
