Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

which APIs need to be public/private #30

Open
faassen opened this issue Jan 10, 2025 · 6 comments
Open

which APIs need to be public/private #30

faassen opened this issue Jan 10, 2025 · 6 comments

Comments

@faassen
Copy link
Collaborator

faassen commented Jan 10, 2025

While I was working on API documentation I ran into some APIs that are currently public but I think should be private or sealed. There are technical challenges in this, I'll go into that in the end. Another solution is to simply eliminate some of these traits

Here's a list:

in suffixarray:

  • IndexWithSA - this allows the index to construct the location for locate queries. But it's really internal

  • PartialArray - is used in get_sa (see previous) and in Debug and size methods internally. The only implementation is SuffixOrderSampledArray so functionality could be merged into it.

  • ArraySampler trait - could be sealed as it's really only used internally to support NullSampler and SuffixOrderSampler.

in character:

  • Right now Character is exposed. I think it should be but I think it could be sealed as we don't expect anyone to implement these externally? Or perhaps we do - make a custom enum into a character, for instance?

in iter:

  • BackwardIterableIndex and ForwardIterableIndex are currently public traits which are exposed, but they really expose implementation details. Could we make them entirely private? If not, we could seal them entirely, hiding their methods and making it impossible to implement them.

  • We may be able to hide BackwardIterator and ForwardIterator too if we make the methods return an impl Iterator instead.

In search

  • BackwardSearchIndex is automatically implemented for everything that has a BackwardIterableIndex. So maybe we can simply make search_backward a default method on BackwardIterableIndex?

The technical challenges have to do with how some of these details are used; if I make them pub(crate) for instance I get complaints that I'm making something pub that depends on something more private. I think some refactoring is required to make it possible to hide these traits. I haven't delved into this yet but we may want sealed traits. This looks like a useful overview of this pattern.

My goals are more clear than how we accomplish them: expose a compact API surface to end users, and preferably make the traits internal only. Since I don't think many people use this trait, I think we are free to break compatibility without consequences at this stage.

So do you agree with this general goal? If so, I will try to find some time to work on it, or we can collaborate - I feel confident I can figure this out myself, unlike some of the other issues.

@ajalab
Copy link
Owner

ajalab commented Jan 11, 2025

My goals are more clear than how we accomplish them: expose a compact API surface to end users, and preferably make the traits internal only.

Yeah I definitely agree on the principle.

I think we are free to break compatibility without consequences at this stage.

Yes, fortunately :) We may discuss the details in PRs later, but please feel free to update the interfaces as you like.

@faassen
Copy link
Collaborator Author

faassen commented Jan 14, 2025

Taking your comment from the merged PR #32 so we can discuss it more here:

Now we have SearchIndex, which generalizes FMIndex and RLFMIndex. Maybe we can turn BackwardIterableIndex and ForwardIterableIndex to pub (crate) later?

I tried doing that but it's not that simple: Since SearchIndex expands BackwardIterableIndex that needs to be public too, otherwise Rust gives a warning.

I tried quite a few modifications to make that go away, but it's pretty fundamental. I think the way forward may have to be the following:

  • change FMIndex to FMIndexImpl, and RLFMIndex to RFLFMIndexImpl.

  • change SearchIndex trait that derives from BackwardIterableIndex to SearchIndexInternal.

  • create a new public SearchIndex trait that does not depend on BackwardIterableIndex or any other trait.

  • create new FMIndex and RLFMIndex that wrap the impl versions and implement new SearchIndex trait for both.

After this I think we can hide the internal traits as they're for implementation purposes only.

There are still potential issues I haven't figured out yet: the HasPosition trait for instance.

I really hope there's a simpler way to do accomplish this but I haven't figured it out yet! I hope you have some ideas.

@faassen faassen mentioned this issue Jan 14, 2025
@ajalab
Copy link
Owner

ajalab commented Jan 18, 2025

For simplicity, I think we can merge BackwardIterableIndex and ForwardIterableIndex.

  • We won't usually implement either one of them, but always implement both.
  • They have common type parameter T and methods like len.

I guess we can consider the merged trait as the internal trait you mentioned (SearchIndexInternal). Then,

  • FMIndex and RLFMIndex implement the internal trait SearchIndexInternal, which offers low-level APIs like lf_map and fl_map.
  • SearchIndex will be a struct that has SearchIndexInternal instance as a private field, and provides user-friendly APIs like search.

Regarding names, another idea would be renaming FMIndex to something like DefaultFMIndex, and rename SearchIndexInternal to FMIndex to avoid the use of term "internal".

@faassen
Copy link
Collaborator Author

faassen commented Jan 20, 2025

I'll play with this a little, see where we end up. I am not sure "Default" is the right naming, but I'll experiment a little and see whether we can come up with an obvious next step.

@ajalab
Copy link
Owner

ajalab commented Jan 21, 2025

Thanks! I also made a draft that implements the idea #36.

@faassen
Copy link
Collaborator Author

faassen commented Jan 28, 2025

I think we've made a lot of progress here! There's one remaining question whether we want to promote FMIndexBackend as a public trait or want to extract a new trait SearchIndex.

The simplest step is to make public search, iter_forward and iter_backward visible to the public; currently the documentation is hidden but if you want to write code that works with this trait, then we need to document them. Another step would be to extract these methods into a SearchIndex trait and let FMIndexBackend extend this trait.

We can't hide FMIndexBackend from the public API without much more work, but if we leave this as a trait with all its methods hidden it won't be too confusing either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants