Alternative names to specify language #1
Comments
Hi! Sorry for the late response. This is a good point. The current source just lists the alternative names. I'll see if I can find a source for the alternative place names with the language they belong to and update the script.
I have written a script that may solve this. It does several things; for example, it drops the administrative area name of each location unless two or more cities in the same country share the same name. Most importantly, it creates a join between these two datasets: global_cities_url = 'http://download.geonames.org/export/dump/allCountries.zip' and then, given the languages as a comma-separated string, it gets the names for all the languages needed. Feel free to use this as an example.
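The rule for administrative area names described above could be sketched roughly as follows. This is not the script from the comment, only a minimal illustration; the function name `label_cities` and the dictionary keys `name`, `admin` and `country` are hypothetical, and the sample rows are made up for the example.

```python
from collections import Counter


def label_cities(cities):
    """Return display labels, keeping the admin area only when it is needed.

    `cities` is a list of dicts with the hypothetical keys
    "name", "admin" and "country".
    """
    # Count how many cities share the same (country, name) pair.
    counts = Counter((c["country"], c["name"]) for c in cities)
    labels = []
    for c in cities:
        if counts[(c["country"], c["name"])] >= 2:
            # Ambiguous within the country: keep the admin area.
            labels.append(f'{c["name"]}, {c["admin"]}, {c["country"]}')
        else:
            # Unique within the country: drop the admin area.
            labels.append(f'{c["name"]}, {c["country"]}')
    return labels


sample = [
    {"name": "Springfield", "admin": "Illinois", "country": "US"},
    {"name": "Springfield", "admin": "Missouri", "country": "US"},
    {"name": "Paris", "admin": "Ile-de-France", "country": "FR"},
]
labels = label_cities(sample)
```

Counting once up front keeps the pass linear in the number of cities, which matters for a dump as large as allCountries.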
Ah nice, that's cool. I'm glad you figured something out. I found the alternateNamesV2 dataset as well and have managed to integrate it into the script, but now I'm trying to optimise it somehow, as 18 million lines is a lot to process...
Yeah, the alternate names dataset is huge, and loading it into RAM alone takes around a minute or two. There's another problem: many places have a few alternate names, so the script also needs to take that into account. I only needed one alternate name per language, so my script selects only one.
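One way to avoid loading the whole file into RAM is to stream it line by line and keep a single name per place and language. The sketch below is not either script from this thread. It assumes the tab-separated alternateNamesV2 column order (alternateNameId, geonameid, isolanguage, alternate name, isPreferredName, ...), which should be checked against the GeoNames readme. The function name `pick_names` is hypothetical and the sample rows are invented.

```python
from typing import Dict, Iterable


def pick_names(lines: Iterable[str], languages: str) -> Dict[int, Dict[str, str]]:
    """Stream tab-separated rows and keep one name per (place, language).

    `languages` is a comma-separated string such as "en,de".
    A row flagged as preferred replaces a non-preferred one;
    otherwise the first name seen wins.
    """
    wanted = {code.strip() for code in languages.split(",") if code.strip()}
    chosen: Dict[int, Dict[str, str]] = {}
    preferred = set()  # (geonameid, language) pairs already filled by a preferred name
    for line in lines:
        row = line.rstrip("\n").split("\t")
        if len(row) < 4:
            continue  # skip malformed rows
        geoname_id, lang, name = int(row[1]), row[2], row[3]
        if lang not in wanted:
            continue
        is_preferred = len(row) > 4 and row[4] == "1"
        key = (geoname_id, lang)
        names = chosen.setdefault(geoname_id, {})
        if lang not in names or (is_preferred and key not in preferred):
            names[lang] = name
            if is_preferred:
                preferred.add(key)
    return chosen


# Invented rows in the assumed column order.
rows = [
    "10\t1\ten\tMunich\t\t\t\t\t\t",
    "11\t1\tde\tMünchen\t1\t\t\t\t\t",
    "12\t1\ten\tMunchen\t\t\t\t\t\t",
    "13\t1\tit\tMonaco di Baviera",
    "14\t2\ten\tCologne Town\t",
    "15\t2\ten\tCologne\t1",
]
result = pick_names(rows, "en,de")
```

Because `lines` can be any iterable, an open file handle can be passed directly, so only the selected names are held in memory. The result also records which language each name belongs to.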
Hi.
The script returns a reliable dataset and would be useful for my project.
Would it be possible to reformat the dataset so that it specifies which language each alternative name is in? So instead of this:
To return something like
Because without this I don't know how to use these alternative names.