If an author name begins with a multibyte UTF8 character, the name is not converted to an initial #146

Open
3 tasks done
jnugent opened this issue Dec 5, 2022 · 2 comments

jnugent commented Dec 5, 2022

Please follow the general troubleshooting steps first:

  • I read the README and followed the instructions.
  • I am sure that the used CSL metadata follows the CSL schema.
  • I use a valid CSL stylesheet

Bug reports:

We first received a report of this behaviour from one of our hosted OJS installations. A Norwegian journal that had authors with the first name Åse reported that generated citations kept the whole name instead of abbreviating it to the initial Å. I have since reproduced this in other OJS installations. I also cloned the citeproc-php repository, changed one of the author names in the sample JSON to Åse, and ran the example/index.php script; the problem occurs there as well.

While investigating, I discovered that the file ./src/Util/StringHelper.php contains a method called initializeBySpaceOrHyphen that appears to split the first letter off of a string. If a string like Åse is passed in, the first letter is correctly split off, but StringHelper::isLatinString then returns true here (it probably should not?). The subsequent call to ctype_upper fails, which causes the method to return the entire string again. I was able to get this to work correctly by temporarily negating the StringHelper::isLatinString test in the if statement.
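To illustrate the suspected cause, here is a minimal standalone sketch. It is not the library's code. The function name initialOf and its parameters are hypothetical and exist only for this example. It assumes the source file and the input strings are UTF-8 encoded. The ctype functions inspect a string byte by byte, so a two-byte character such as Å is not recognised as uppercase, whereas a Unicode-aware check with the PCRE property \p{Lu} is.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper for illustration only; NOT part of citeproc-php.
 * Returns the initial of a given-name part followed by $sign when the part
 * starts with an uppercase letter, otherwise returns the part unchanged.
 */
function initialOf(string $givenPart, string $sign = '.'): string
{
    // The /u modifier makes PCRE read the subject as UTF-8, and \p{Lu}
    // matches one uppercase letter of any script, including "Å".
    if (preg_match('/^\p{Lu}/u', $givenPart, $match) === 1) {
        return $match[0] . $sign;
    }

    // No leading uppercase letter (e.g. a particle such as "van"): keep as-is.
    return $givenPart;
}

// Byte-wise check: "Å" is two bytes in UTF-8, and neither byte counts as an
// uppercase letter in the default C locale, so this dumps bool(false) there.
var_dump(ctype_upper('Å'));

echo initialOf('Åse'), "\n";  // Å.
echo initialOf('John'), "\n"; // J.
echo initialOf('van'), "\n";  // van
```

Whether a check like this belongs inside initializeBySpaceOrHyphen or inside isLatinString is a question for the maintainers; the sketch only shows that a Unicode-aware uppercase test handles Åse where ctype_upper does not.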

Used CSL stylesheet:

apa.csl

Used CSL metadata

Please replace these lines with your used metadata, for instance:

[
    {
        "author": [
            {
                "family": "Anderson",
                "given": "John"
            },
            {
                "family": "Brown",
                "given": "Åse"
            }
        ],
        "id": "ITEM-2",
        "type": "book",
        "title": "Two authors writing a book"
    }
]

ronste commented Dec 20, 2022

Hi,

here's another example where this bug kicks in: https://www.cgt-journal.org/index.php/cgt/article/view/11

@RewindLife

Our journal has the same issue: authors' first names that start with a character with diacritics are not abbreviated.
Can this issue be assigned a bug status? Any idea how quickly this can be fixed? Thank you in advance.

[Screenshot: citation abbreviation issue]

glorieux-f added a commit to glorieux-f/citeproc-php that referenced this issue Aug 3, 2023