Option to do the address parsing for a specific country #172

Open
mansoor-sajjad opened this issue Jan 10, 2023 · 0 comments
Labels
enhancement New feature or request

The pelias-parser classifiers take all the configured tokens for all the configured countries and apply them to the given address string, irrespective of which country the address belongs to.

For example, the CompoundStreetClassifier reads in the tokens for the following language codes:

libpostal.load(this.suffixes, ['de', 'nl', 'sv', 'nb'], 'concatenated_suffixes_separable.txt')
libpostal.load(this.suffixes, ['de', 'nl', 'nb'], 'concatenated_suffixes_inseparable.txt')

The problem is that if we define a street type for German (de) that is not a valid street type for Norwegian (nb), the classifier will still try to classify Norwegian addresses with the German street types, as well as with the street types defined for the other countries.

For example, we want to add land and lien as valid street types for Norway, but doing so affects addresses in other countries. The following unit test for a French address fails once 'lien' is added as a street type for Norway, presumably because 'Julien' ends in 'lien':

address FRA: Rue de l'Empereur Julien Paris
      expected: |-
        [ { street: 'Rue de l\'Empereur Julien' }, { locality: 'Paris' } ]
      actual: |-
        [ { street: 'Rue de l\'Empereur' }, { locality: 'Julien' } ]
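
The effect can be reproduced with a minimal sketch of pooled suffix matching. This is not the pelias-parser implementation: the suffix lists, `loadSuffixes` and `looksLikeCompoundStreet` are hypothetical names invented for illustration, and the real classifier and solver do more than a plain suffix check.

```javascript
// Hypothetical suffix lists keyed by language code.
// These are illustrative values, not the real dictionary contents.
const SUFFIXES = {
  de: ['strasse', 'weg'],
  nl: ['straat', 'laan'],
  nb: ['gata', 'veien', 'lien'] // 'lien' is the proposed Norwegian addition
};

// Pool the suffixes of several languages into one set, the way a
// classifier that loads several languages at once would.
function loadSuffixes(langs) {
  const pooled = new Set();
  for (const lang of langs) {
    for (const suffix of SUFFIXES[lang] || []) {
      pooled.add(suffix);
    }
  }
  return pooled;
}

// A token looks like a compound street name if it ends with any pooled
// suffix and has at least one character in front of that suffix.
function looksLikeCompoundStreet(token, suffixes) {
  const word = token.toLowerCase();
  for (const suffix of suffixes) {
    if (word.length > suffix.length && word.endsWith(suffix)) {
      return true;
    }
  }
  return false;
}

// With the Norwegian suffixes pooled in, the French name matches.
console.log(looksLikeCompoundStreet('Julien', loadSuffixes(['de', 'nl', 'nb']))); // true

// Without them, it does not.
console.log(looksLikeCompoundStreet('Julien', loadSuffixes(['de', 'nl']))); // false
```

Because the pooled set carries no record of which language a suffix came from, the match on 'Julien' cannot be rejected on the grounds that the address is French.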

Applying every token works best when the scope is the whole planet and we don't know in advance which country an address belongs to. In our case, however, we know that our addresses are Norwegian, so it would be nice if the classifier could optionally be configured to use only the Norwegian street types and not those of the other countries.

Solution

The obvious solution is to make the countries the parser works on configurable.
We can use the options parameter in parser/Parser.js to pass in the configuration.
In the case of pelias-api, we can add the configuration option to pelias-config, which pelias-api can then pass on to pelias-parser.
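
As a rough sketch of what such an option could look like, the example below filters the default language list by an allow-list passed in through an options object. Everything here is an assumption made for illustration: the `languages` option name, the `CompoundStreetClassifierSketch` class and the suffix values do not exist in pelias-parser, and the real change would have to thread the option from Parser.js down to each classifier.

```javascript
// Default language codes, mirroring the list quoted in this issue.
const DEFAULT_LANGUAGES = ['de', 'nl', 'sv', 'nb'];

// Hypothetical suffix data standing in for the dictionary files.
const SUFFIXES_BY_LANG = {
  de: ['strasse', 'weg'],
  nl: ['straat', 'laan'],
  sv: ['gatan', 'vägen'],
  nb: ['gata', 'veien']
};

class CompoundStreetClassifierSketch {
  // options.languages is a hypothetical allow-list of language codes.
  // When it is missing or empty, all default languages are loaded,
  // which keeps the current "whole planet" behaviour.
  constructor(options = {}) {
    const allowed = options.languages;
    const langs = Array.isArray(allowed) && allowed.length > 0
      ? DEFAULT_LANGUAGES.filter((lang) => allowed.includes(lang))
      : DEFAULT_LANGUAGES;
    this.suffixes = new Set(langs.flatMap((lang) => SUFFIXES_BY_LANG[lang] || []));
  }

  // True if the token ends with one of the loaded suffixes and has at
  // least one character in front of it.
  matches(token) {
    const word = token.toLowerCase();
    for (const suffix of this.suffixes) {
      if (word.length > suffix.length && word.endsWith(suffix)) {
        return true;
      }
    }
    return false;
  }
}

// Default: every language is loaded.
const globalClassifier = new CompoundStreetClassifierSketch();
console.log(globalClassifier.matches('Hauptstrasse')); // true

// Restricted to Norwegian: German suffixes are no longer applied.
const norwegianOnly = new CompoundStreetClassifierSketch({ languages: ['nb'] });
console.log(norwegianOnly.matches('Hauptstrasse')); // false
console.log(norwegianOnly.matches('Storgata')); // true
```

Falling back to the full list when no option is given would leave existing deployments unaffected, so the restriction stays opt-in.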
