-
Notifications
You must be signed in to change notification settings - Fork 316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Tumblr import #548
Open
dmdeller
wants to merge
2
commits into
jekyll:master
Choose a base branch
from
dmdeller:fix-tumblr-import
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Fix Tumblr import #548
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe these two changes output the files in different places than the current code does. Perhaps we should check this with a test to ensure a proper Jekyll site is output. I believe the goal was to write all the posts into the _posts directory so that they can be paginated and have all the goodies that come with documents in collections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When run with the flag
--rewrite_urls true
, the Tumblr importer outputs files in two different places:_posts
for the full text blog articlespost
for the redirects. These are simple files that do not contain any text of the blog posts, but instead only contain an HTML meta tag which causes the browser to redirect to the new, canonical blog post URL. The purpose of this is to avoid breaking URLs in the style that Tumblr used to use, prior to the blog being migrated to Jekyll. Tumblr-style URLs look likehttps://blog.domain.example/post/1234567890/blog-post-title-slug
. So, these need to be redirected and this is how it is done.When run without the flag
--rewrite_urls true
, only the former file location is used, and redirects are not generated.The changes in my PR only apply to the latter file location. The former location still continues to generate as normal, and is not affected by my changes (that is handled by different code in a different method). It is true that the latter code does not output to the standard
_posts
location, but as you can see, there is a reason for this.My PR does not change this overall behavior, it is the same as before; it was already being written to both
_posts
andpost
. However, the existing code generates incorrect HTML, as I wrote earlier, and my PR fixes that. My PR does not change the directory structure or file locations.My apologies for not writing a unit test for this. I ran into some difficulty with the development dependencies, because I don’t have MySQL, PostgreSQL, etc. installed (they are not necessary for the Tumblr importer) and I didn’t have time to look into that. As a result I was unable to get the unit tests running locally.
Let me know what you think about my above explanation, and if you would prefer, I can try to look more into getting some test coverage for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my read of the code,
tumblr_url
is the redirect page, andjekyll_url
(the line commented on) is the "new" post URL from_posts
. In the diff I see, both are modified, one tourl_with_slug
(redirect page), and one stops using Jekyll::Document. The thing I'm considering here is the "hard-coded" newjekyll_url
you have here. I think it's possible to import into a Jekyll site path that has a_config.yaml
file which could alter the document url, right? Perhaps it's an edge case, but it's possible.In any event, if it's best to follow this path, then we shouldn't check for
Jekyll::Post
and instead use the samejekyll_url
for both the old versions of Jekyll (with Post) and the new versions of Jekyll (with Document).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about the test dependencies. One thing I'd try is to comment out the dev dependencies you don't need, and to run
bundle exec ruby test/test_tumblr_importer.rb
directly to run just the one file. That way,mysql
& friends won't berequire
'd. These tests aren't particularly well-kept, so apologies for that! If you're open to it, you might also try a GitHub Codespace so you don't need to modify your own dev environment. We don't have a devcontainer with all the dependencies setup at this time, but hopefully asudo apt-get install mysql-client
etc could do the trick. I think each gem's repo has installation instructions for Linux (with dependencies) if you go this route. Ruby with C extensions is definitely not the developer's paradise!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's right. In this case,
jekyll_url
is being used to generate the content of the HTML meta tag which redirects the user from the old URL to the new URL.tumblr_url
contains the same value as it did before; the change is a result of the fact that I renamed:slug
to:url_with_slug
, which also contains the same value as it did before.:slug
was a confusing name for it because it actually already contained a full URL, hence why I renamed it. Furthermore, I also needed a hash key with just the slug (not the entire URL), and the only logical name for that was:slug
, which was already in use... (see lines 175-176 on the right side of the diff to see what I mean)Very good point. I was worried there might be something like that. Do you have any suggestion for how to do it properly? I'm afraid I'm not familiar enough with this API.
I was hesitant to change that part of the branch because I didn't have an old version of Jekyll to test with. But I can do this if that's what you recommend.