nominatim: Wikipedia importance import step fails #45

Closed
dekzz opened this issue Mar 28, 2023 · 1 comment

Comments

@dekzz
Contributor

dekzz commented Mar 28, 2023

Hello,

When the Wikipedia import is enabled (with the default URL, wikipediaUrl: https://nominatim.org/data/wikimedia-importance.sql.gz), it fails with the following error:

  Importing wikipedia importance data
  Traceback (most recent call last):
    File "/usr/local/bin/nominatim", line 14, in <module>
      exit(cli.nominatim(module_dir='/usr/local/lib/nominatim/module',
    File "/usr/local/lib/nominatim/lib-python/nominatim/cli.py", line 264, in nominatim
      return parser.run(**kwargs)
    File "/usr/local/lib/nominatim/lib-python/nominatim/cli.py", line 126, in run
      return args.command.run(args)
    File "/usr/local/lib/nominatim/lib-python/nominatim/clicmd/setup.py", line 101, in run
      if refresh.import_wikipedia_articles(args.config.get_libpq_dsn(),
    File "/usr/local/lib/nominatim/lib-python/nominatim/tools/refresh.py", line 144, in import_wikipedia_articles
      execute_file(dsn, datafile, ignore_errors=ignore_errors,
    File "/usr/local/lib/nominatim/lib-python/nominatim/db/utils.py", line 62, in execute_file
      remain = _pipe_to_proc(proc, fdesc)
    File "/usr/local/lib/nominatim/lib-python/nominatim/db/utils.py", line 25, in _pipe_to_proc
      chunk = fdesc.read(2048)
    File "/usr/lib/python3.10/gzip.py", line 301, in read
      return self._buffer.read(size)
    File "/usr/lib/python3.10/_compression.py", line 68, in readinto
      data = self.read(len(byte_view))
    File "/usr/lib/python3.10/gzip.py", line 488, in read
      if not self._read_gzip_header():
    File "/usr/lib/python3.10/gzip.py", line 436, in _read_gzip_header
      raise BadGzipFile('Not a gzipped file (%r)' % magic)
  gzip.BadGzipFile: Not a gzipped file (b'<h')

After some research I found that this is caused by curl being blocked, as described here, combined with the failed curl response being stored as the wiki file. The b'<h' in the error suggests the stored file is an HTML page rather than gzip data.
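To illustrate the failure mode described above, here is a minimal sketch of a check that could be run on the downloaded file before handing it to the import. It relies only on the fact that every gzip stream begins with the bytes 0x1f 0x8b; the function name `looks_like_gzip` and the file names are made up for this example and are not part of Nominatim.

```python
import gzip
import tempfile
from pathlib import Path

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes.

    Hypothetical helper for illustration; not a Nominatim function.
    """
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        good = Path(tmp) / "good.sql.gz"
        bad = Path(tmp) / "bad.sql.gz"
        # A real gzip payload versus an HTML error page saved under a .gz name.
        good.write_bytes(gzip.compress(b"SELECT 1;"))
        bad.write_bytes(b"<html>403 Forbidden</html>")
        print(good.name, looks_like_gzip(good))
        print(bad.name, looks_like_gzip(bad))
```

A check like this would turn the confusing BadGzipFile traceback into an early, explicit "download failed" message, though it does not fix the blocked request itself.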

I'll create a PR that sets a user agent on curl so that the data can be fetched from the Nominatim server.
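The actual fix sets the user agent on the curl call. As a rough Python equivalent of the same idea, the sketch below sends an explicit User-Agent header with the download request. The `USER_AGENT` string and the helper names `build_request` and `download` are assumptions made for this example; they are not taken from the PR.

```python
import shutil
import urllib.request

# Hypothetical identifier for illustration; use one that describes your deployment.
USER_AGENT = "nominatim-import-example/0.1"


def build_request(url, user_agent=USER_AGENT):
    """Build a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, dest, user_agent=USER_AGENT):
    """Stream `url` to the local file `dest`.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so a blocked
    request fails here instead of an error page being saved as `dest`.
    """
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=60) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
```

Calling `download("https://nominatim.org/data/wikimedia-importance.sql.gz", "wikimedia-importance.sql.gz")` would fetch the default file from the issue. Whether a given user agent is accepted is up to the server's policy.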

@dekzz
Contributor Author

dekzz commented Mar 30, 2023

Merged in commit bb88549.

@dekzz dekzz closed this as completed Mar 30, 2023