-
Notifications
You must be signed in to change notification settings - Fork 660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes 414 request-uri too long
for import_articles_by_title
by using a POST request .2
#6210
Fixes 414 request-uri too long
for import_articles_by_title
by using a POST request .2
#6210
Conversation
…ing a POST request to send the query parameters in the `request.body` instead of a GET request: - Adds `kn.wikipedia.org` to list of stubbed wiki validations - Adds 3 API post methods, `api_post`, `do_post` and `too_many_requests` for creating and sending the post requests - Refactors `import_articles_by_title` since response is now a hash - Adds a test that confirms `414 request-uri too long` no longer occurs
….` to `results = response&.`: - Prevents: 'NameError: uninitialized constant ArticleImporter::TYPICAL_ERRORS' when `ApiErrorHandling` sets the error level - Prevents: 'NoMethodError: undefined method `dig' for nil:NilClass results = response.dig('query', 'pages') ^^^^' when response is nil ( for example when methods that call `import_articles_by_title` are mocked, the api_post response becomes `nil`)
@@ -60,4 +66,44 @@ def import_articles_from_title_query(results) | |||
end | |||
Article.import articles, on_duplicate_key_update: [:title, :namespace] | |||
end | |||
|
|||
############### | |||
# API methods # |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all about interacting with the MediaWiki API, independently of ArticleImporter, so it probably should not be part of this class. It should be either in WikiApi or in a new WikiPostApi.
|
||
def do_post(api_url, params) | ||
uri = URI(api_url) | ||
https = Net::HTTP.new(uri.host, uri.port) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test whether you can do a post request with the mediawiki_api gem?
It is not well-documented, but it looks like it will support it as a parameter: https://github.com/wikimedia/mediawiki-ruby-api/blob/master/lib/mediawiki_api/client.rb#L157
I think that means you can add http_method: :post
as a query parameter, and it will make it as a POST request instead of a GET. If that works, maybe just adding the extra parameter will make it work without re-implementing any API methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was the first way I tried and it did not work so I used this way instead. However, your comment prompted me to retry and turns out I simply implemented it wrongly the first time. I was able to get it to work by simply adding the extra parameter (although I had to create a separate meta
method in wiki_api
since it kept error-ing). I created a new PR to that effect: #6212
@@ -67,5 +68,50 @@ | |||
expect(Article.find_by(title: 'Istanbul').mw_page_id).to eq(3391396) | |||
end | |||
end | |||
|
|||
it 'returns the correct number of titles without raising error 414' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a great test. It should include a comment explaining in a little more detail what the test is doing and why the list of titles is what it is.
414 request-uri too long
for import_articles_by_title
by using a POST request414 request-uri too long
for import_articles_by_title
by using a POST request .2
What this PR does
Fixes
414 request-uri too long
forimport_articles_by_title
by using a POST request to send the query parameters in therequest.body
instead of a GET request:kn.wikipedia.org
to list of stubbed wiki validationsapi_post
,do_post
andtoo_many_requests
for creating and sending the post requestsimport_articles_by_title
since response is now a hash414 request-uri too long
no longer occurs and a new vcr-cassettesome_article_titles.yml
Adds
TYPICAL_ERRORS
array and safe navigation operator&.
toresults = response&.dig('query', 'pages')
to prevent:NameError: uninitialized constant ArticleImporter::TYPICAL_ERRORS
whenApiErrorHandling
sets the error levelNoMethodError: undefined method dig for nil:NilClass results = response.dig('query', 'pages')
when response is nil ( for example when methods that callimport_articles_by_title
are mocked, the api_post response becomesnil
)The idea of
api_post
anddo_post
was gotten from a similar implementation in replica.rb and I tried to follow the flow of wiki_api.rb too.Screenshots
Before:
data:image/s3,"s3://crabby-images/4b3df/4b3dfc833f1eea8db208575556520d133c7d5d69" alt="image"
After:
data:image/s3,"s3://crabby-images/72efc/72efc64e029775e654a65655d7f89934cae78dbc" alt="image"
data:image/s3,"s3://crabby-images/3e338/3e3381b8e5425a73202da2272afd897d3df96a90" alt="image"