Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implimented arXivId Parsing for PDF with arXivId #12335

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

ar-rana
Copy link
Contributor

@ar-rana ar-rana commented Dec 24, 2024

Used the 'parse' method in ArXivIdentifier to get arXivId and added a testcase for the same using the link give in the issue.
Closes #12000
Closes https://github.com/koppor/jabref/issues/47".

Mandatory checks

  • I own the copyright of the code submitted and I licence it under the MIT license
  • Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code currently does not meet JabRef's code guidelines.
We use Checkstyle to identify issues.
Please carefully follow the setup guide for the codestyle.
Afterwards, please run checkstyle locally and fix the issues.

In case of issues with the import order, double check that you activated Auto Import.
You can trigger fixing imports by pressing Ctrl+Alt+O to trigger Optimize Imports.

@ar-rana ar-rana marked this pull request as draft December 24, 2024 15:59
@ar-rana
Copy link
Contributor Author

ar-rana commented Dec 24, 2024

@koppor could you please review this PR and suggest changes where I am wrong.

@Siedlerchr
Copy link
Member

Thanks for your contribution. Please fix the check style issues first

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code currently does not meet JabRef's code guidelines.
We use OpenRewrite to ensure "modern" Java coding practices.
The issues found can be automatically fixed.
Please execute the gradle task rewriteRun, check the results, commit, and push.

You can check the detailed error output by navigating to your pull request, selecting the tab "Checks", section "Tests" (on the left), subsection "OpenRewrite".

@ar-rana
Copy link
Contributor Author

ar-rana commented Dec 25, 2024

Hello maintainers, could you please review this PR, I have fixed the previous issues.
The extra changes that are being reflected are because I merged the latest changes from upstream, please ignore them they will not be there in the actual PR.

@ar-rana ar-rana marked this pull request as ready for review December 25, 2024 13:20
@Siedlerchr
Copy link
Member

Codewise looks good to me. You have accidentally commited the csl styles submodules, see https://devdocs.jabref.org/code-howtos/faq.html#the-problem for a solution

@Siedlerchr Siedlerchr added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Dec 25, 2024
@InAnYan
Copy link
Collaborator

InAnYan commented Dec 25, 2024

Hmm, I tried to import these papers:

https://arxiv.org/abs/2406.14319
https://arxiv.org/abs/2412.06769

But JabRef didn't import it properly, and arXiv was null all the time

@ar-rana
Copy link
Contributor Author

ar-rana commented Dec 27, 2024

I have changed the getArXivId method to this in my local branch:

private String getArXivId(String arXivId) {
        if (arXivId == null) {
            arXivId = ArXivIdentifier.parse(curString).map(ArXivIdentifier::asString).orElse(null);
            if (arXivId != null) {
                if (curString.length() > arXivId.length() + 7) {
                    curString = curString.substring(arXivId.length() + 7);
                    extractYear();
                }
                return arXivId;
            }
        }
        return arXivId;
    }

and modified the ArXivIdentifier.parse method a little by altering the identifier to this String identifier = value.split(" ")[0]; at line 41.

this fixes the null arXiv problem that @InAnYan was encontering when externally importing the papers, but for some reason it still does not import the titles correctly except for the paper that was mentioned in the issue.

Screenshot 2024-12-27 195014

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JUnit tests are failing. In the area "Some checks were not successful", locate "Tests / Unit tests (pull_request)" and click on "Details". This brings you to the test output.

You can then run these tests in IntelliJ to reproduce the failing tests locally. We offer a quick test running howto in the section Final build system checks in our setup guide.

@InAnYan
Copy link
Collaborator

InAnYan commented Dec 28, 2024

Hi, ar-rana! Thanks for working on this PR!

As holidays come soon, maintainers could be too busy these weeks, so don't worry if we give feedback too late

@@ -609,11 +609,15 @@ private String getDoi(String doi) {

private String getArXivId(String arXivId) {
if (arXivId == null) {
arXivId = ArXivIdentifier.parse(curString).map(ArXivIdentifier::asString).orElse(null);
String arXiv = curString.split(" ")[0];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could lead to a null pointer if there is no whitespace and you access the index

Copy link
Contributor Author

@ar-rana ar-rana Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Siedlerchr, I tested this method with empty/non-empty strings with/without whitespaces and it would only give a null pointer if the curString is null which does not seem to be the case here. So should I add a change here or leave it, as getDoi also work the same way

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Importing a PDF with arXiv Id should fetch the arXiV information using the arXivFetcher
3 participants