Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refine autotagger query #68

Merged
merged 4 commits into from
Oct 22, 2024
Merged

Conversation

whatphilipcodes
Copy link

Simple suggestion to make the autotagger use the track title if no album title is present in the import.
The default behavior remains unchanged but in my testing this made the results more accurate if very sparse metadata is used.

@coveralls
Copy link
Collaborator

coveralls commented Sep 20, 2024

Pull Request Test Coverage Report for Build 11106940604

Details

  • 0 of 3 (0.0%) changed or added relevant lines in 1 file are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.6%) to 90.557%

Changes Missing Coverage Covered Lines Changed/Added Lines %
beetsplug/bandcamp/init.py 0 3 0.0%
Files with Coverage Reduction New Missed Lines %
beetsplug/bandcamp/metaguru.py 2 96.08%
Totals Coverage Status
Change from base Build 10925838927: -0.6%
Covered Lines: 1016
Relevant Lines: 1116

💛 - Coveralls

pyproject.toml Outdated Show resolved Hide resolved
beetsplug/bandcamp/__init__.py Show resolved Hide resolved
@whatphilipcodes
Copy link
Author

I found that using both a combination of artist and track as query further improves accuracy.
Also please let me know if you think that it makes sense to rewrite this feature as optionally configurable.

snejus
snejus previously approved these changes Oct 1, 2024
@snejus snejus dismissed their stale review October 1, 2024 16:50

Accidental approval!

@snejus
Copy link
Owner

snejus commented Oct 1, 2024

I've just tested it myself and I completely see why it's required!

As far as I could see, we have an issue with Bandcamp search being too narrow-minded, when we specify the item_type:

If we search for item_type=a (album) with a track title, like 999999999 - LOVE 4 RAVE, the search returns albums with names matching LOVE 4 RAVE. Same issue the other way around, if we search for item_type=t with an album name, for example.

On the other hand, search on MusicBrainz and Discogs is more flexible: you can search albums using one of the track titles and they will most likely find the album you're looking for, it's just that it will most likely have a low confidence score, as you mentioned.

Given your change, I see how using track title when album is not present improves the search! I reckon we can improve it even further: as I mentioned above, Bandcamp takes the item_type parameter very seriously, and it will only return tracks when we ask for tracks (t). Same applies to albums.

What if we don't specify this parameter at all?

@whatphilipcodes
Copy link
Author

Yup that's exactly what I did in the latest change 3886d7b (the type in the search dict is just left as an emtpy string). I also included the artist in the query string as well because for some reason that also worked better. It simply replicates what happens when you use the beetcamp cli with a plain text query which at least for me worked a lot better.

@whatphilipcodes
Copy link
Author

rebased to include #69 and logged changes. Should be good to go.

@snejus
Copy link
Owner

snejus commented Oct 11, 2024

GitHub is confused about the rebase - probably because you need to update the main branch in your fork.

I checked the diff locally and I can see that all dependencies are updated. You want to only update beets in this case, using poetry lock as I mentioned earlier.

@whatphilipcodes
Copy link
Author

whatphilipcodes commented Oct 16, 2024

Sorry I took a while there was an issue with jellyfish not installing correctly with the newest version of poetry (1.8.4).
I once again rebased on the main branch of your repo. I think these are still some leftovers from the first rebase I butchered.
If anything is still acting up please let me know.

@snejus
Copy link
Owner

snejus commented Oct 17, 2024

Still the same issue.

I would suggest either interactive rebase or starting a clean branch and cherry-picking the relevant commits, see below:

git checkout main
# create clean temporary branch
git checkout -b fx-autotagger-query-temp
# cherry-pick the relevant commits
git cherry-pick ':/handle empty album str'
git cherry-pick ':/add: further improve query'
git cherry-pick ':/add: log changes'
# validate the changes are as expected
git diff main
# reset this branch to your new clean branch
git checkout fx-autotagger-query
git reset --hard fx-autotagger-query-temp
# force push it
git push --force
# clean up
git branch --delete fx-autotagger-query-temp

@whatphilipcodes
Copy link
Author

whatphilipcodes commented Oct 22, 2024

Thank you for posting the steps, that really helped!
I chose to cherry-pick and made sure in the diff that only CHANGELOG.md and beetsplug/bandcamp/__init__.py are affected (comparing against main in this repo). I also tested one more time and everything seems to be in order.

@whatphilipcodes
Copy link
Author

With the changes you described there was a merge conflict in CHANGELOG.md as in the meantime there was a release. However after resolving this I saw that the resulting merge also affected the version number in the poetry config and a lot of changes in the log. Is it supposed to be like that or did I mess something up again?

@snejus
Copy link
Owner

snejus commented Oct 22, 2024

With the changes you described there was a merge conflict in CHANGELOG.md as in the meantime there was a release.

It's all good now!! Sorry for the release in the meantime, I realised you will have to deal with one more merge conflict 😅 .

However after resolving this I saw that the resulting merge also affected the version number in the poetry config and a lot of changes in the log. Is it supposed to be like that or did I mess something up again?

That is expected - often these kind of diffs are scary, but as long as you focus on resolving the conflict only, you will end up fine.

Ultimately what you care about is the diff you can see on GitHub
image
, otherwise a simple git diff main check from your branch :-) just gonna quickly test it myself and them we're good to go.

Copy link
Owner

@snejus snejus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works great, thank you!

@snejus snejus merged commit bec5570 into snejus:main Oct 22, 2024
17 checks passed
@whatphilipcodes
Copy link
Author

Ah that's a relief!
Thank you so much for your patience and your insights, I really had fun learning! :)

snejus pushed a commit that referenced this pull request Oct 29, 2024
Just found that my alterations to the search (#68) break if the artist
is empty (bandcamp search can't handle ' - title' syntax in search).
This fixes the issue. I did not include a changelog entry as this
feature has not been released yet and is already present in the log.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants