HTTP Error 522 not caught with max_tries? #668

Open
kongomongo opened this issue Aug 30, 2021 · 2 comments

Comments

@kongomongo
Contributor

kongomongo commented Aug 30, 2021

Hi there,

I thought max_tries was more or less a catch-all for any error: if urlwatch runs every 5 minutes and max_tries is 12, then any error that disappears within 60 minutes should never be reported, whatever its cause.

Or so I thought.
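
Spelled out, the expectation is simple arithmetic. This is only a sketch of my assumption (one try per scheduled run); the variable names are made up for illustration and are not urlwatch settings, apart from max_tries itself:

```python
from datetime import timedelta

# Values taken from the description above.
check_interval = timedelta(minutes=5)  # how often cron runs urlwatch
max_tries = 12                         # per-job setting in urls.yaml

# Assumption: each scheduled run counts as exactly one try, so an error
# would have to persist for this long before being reported.
expected_quiet_window = check_interval * max_tries
print(expected_quiet_window)  # 1:00:00
```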

Can you explain this?

urlwatch -v --test-filter 1
2021-08-30 21:51:14,647 cli INFO: turning on verbose logging mode
2021-08-30 21:51:14,705 minidb DEBUG: PRAGMA table_info(CacheEntry)
2021-08-30 21:51:17,751 main INFO: Using /root/.config/urlwatch/urls.yaml as URLs file
2021-08-30 21:51:17,751 main INFO: Using /root/.config/urlwatch/hooks.py for hooks
2021-08-30 21:51:17,752 main INFO: Using /root/.cache/urlwatch/cache.db as cache database
2021-08-30 21:51:17,752 util INFO: Registering <class 'hooks.AllKeyShopTop'> as akstop
2021-08-30 21:51:17,752 util INFO: Registering <class 'hooks.RegexSubUpper'> as re.sub.upper
2021-08-30 21:51:17,850 main INFO: Found 25 jobs
2021-08-30 21:51:17,850 handler INFO: Processing: <url url='https://xxx.yy/forum/register.php?' ignore_cached=True headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36 OPR/71.0.3770.287'} ignore_http_error_codes=522 name='xxx.yy' filter=[{'element-by-tag': 'body'}, {'html2text': {'method': 'lynx'}}, {'re.sub': {'pattern': '(?i)(ist jetzt )(..:..)( Uhr)', 'repl': '\\1XX:XX\\3'}}, {'re.sub': {'pattern': '(?i)(Es ist: )(..-..-...., ..:..)', 'repl': '\\1XX-XX-XXXX, XX:XX'}}, {'re.sub': {'pattern': '(?m)(\\.php\\?+s=)[a-f0-9]{8,}([^a-f0-9])', 'repl': '\\1PX_IGNORED\\2'}}, 'strip'] max_tries=12 treat_new_as_changed=True>
2021-08-30 21:51:17,850 minidb DEBUG: SELECT data, timestamp, tries, etag FROM CacheEntry WHERE guid = ? ORDER BY timestamp DESC, tries DESC LIMIT ? ['1879761a4956f0fd90d855d1c05d8b35abff8cee', 1]
2021-08-30 21:51:17,975 connectionpool DEBUG: Starting new HTTPS connection (1): xxx.yy:443
2021-08-30 21:51:48,876 connectionpool DEBUG: https://xxx.yy:443 "GET /forum/register.php HTTP/1.1" 522 None
Traceback (most recent call last):
  File "/usr/local/bin/urlwatch", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/urlwatch/cli.py", line 112, in main
    urlwatch_command.run()
  File "/usr/local/lib/python3.7/dist-packages/urlwatch/command.py", line 408, in run
    self.handle_actions()
  File "/usr/local/lib/python3.7/dist-packages/urlwatch/command.py", line 210, in handle_actions
    sys.exit(self.test_filter(self.urlwatch_config.test_filter))
  File "/usr/local/lib/python3.7/dist-packages/urlwatch/command.py", line 138, in test_filter
    raise job_state.exception
  File "/usr/local/lib/python3.7/dist-packages/urlwatch/handler.py", line 113, in process
    data = self.job.retrieve(self)
  File "/usr/local/lib/python3.7/dist-packages/urlwatch/jobs.py", line 292, in retrieve
    response.raise_for_status()
  File "/usr/local/lib/python3.7/dist-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 522 Server Error:  for url: https://xxx.yy/forum/register.php

Am I doing it wrong?

@kongomongo
Contributor Author

Even adding ignore_http_error_codes: 522 does not help.

@thp
Owner

thp commented Nov 7, 2021

--test-filter does not take max_tries into account. Try running urlwatch with --verbose in your cron job and check the output. It should be quite verbose regarding max_tries ("Using max_tries of ...", "Error while executing...", "This was try ... of ...", "We are not at ... tries", ...).
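
To illustrate the idea, here is a toy model of a retry counter that carries over between scheduled runs. It is a simplified sketch, not urlwatch's actual code: ToyJobState, record_run and simulate are hypothetical names, and the reset-on-success and "report once tries reach max_tries" rules are assumptions. A one-off --test-filter call has no such run-to-run bookkeeping and simply re-raises the exception, as the traceback above shows.

```python
from dataclasses import dataclass


@dataclass
class ToyJobState:
    """Hypothetical stand-in for a job's cached state; not urlwatch's class."""
    max_tries: int
    tries: int = 0  # consecutive failed runs so far

    def record_run(self, failed: bool) -> bool:
        """Record one scheduled run; return True if an error should be reported."""
        if not failed:
            self.tries = 0  # assumption: a successful fetch resets the counter
            return False
        self.tries += 1
        # Stay quiet until the failure count reaches max_tries.
        return self.tries >= self.max_tries


def simulate(outcomes, max_tries):
    """Return the 1-based run numbers at which an error would be reported."""
    state = ToyJobState(max_tries=max_tries)
    return [n for n, failed in enumerate(outcomes, start=1)
            if state.record_run(failed)]


# Eleven failing runs, then a recovery: nothing is reported.
print(simulate([True] * 11 + [False], max_tries=12))  # []
# Twelve failing runs in a row: the twelfth is reported.
print(simulate([True] * 12, max_tries=12))            # [12]
```

Under these assumptions, a 522 that clears up before the twelfth consecutive cron run would stay silent, which is the behaviour to look for in the --verbose output.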
