Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: make all classes in the common module hashable #39

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

MxMartin
Copy link

@MxMartin MxMartin commented Mar 8, 2023

  • hashability and comparability is useful in many usecases, e.g. when using results as dict key
  • hashability could be achieved by setting frozen = True, but that would be an obstacle to backwards compatibility and makes it harder to understand the code, especially for people not familiar with pydantic
  • implementation of eq is verbose to make it explicit, could use hash(self)==hash(other) as well
  • hash and comparison of Part classes excludes id field as it generally has no real-world meaning and can be derived e.g. from an objects index in the parent route instance.

While working on the code I also did some refactoring to reduce duplication, i.e. I introduced an abstract base class for the parts which contains the fields that are common to all Part classes

- hashability and comparability is useful in many usecases, e.g. when using results as dict key
- hashability could be achieved by setting frozen = True, but that would be an obstacle to backwards compatibility and makes it harder to understand the code, especially for people not familiar with pydantic
- implementation of __eq__ is verbose to make it explicit, could use hash(self)==hash(other) as well
- hash and comparison of Part classes as it generally has no real-world meaning and can be derived e.g. from an objects index in the parent route instance.
@danielnaumau
Copy link
Contributor

danielnaumau commented Mar 9, 2023

Everything else looks good for me but if we implement hash and eq for basic classes, we'll need to implement for all response classes too

@MxMartin
Copy link
Author

MxMartin commented Mar 9, 2023

Everything else looks good for me but if we implement hash and eq for basic classes, we'll need to implement for all response classes too

Absolutely agree with you, and probably for a few others as well. I suggest keeping the PR here open until we step by step added hashability to more classes.

I also added it to ArrivalSearch and DepartureSearch as I had a use case where I wanted to use them as dict key and the corresponsing response as values and I assume that's also a common use case. To make that work the components in the transportation DTOs module had to become hashable as well.

@danielnaumau
Copy link
Contributor

danielnaumau commented Mar 9, 2023

Everything else looks good for me but if we implement hash and eq for basic classes, we'll need to implement for all response classes too

Absolutely agree with you, and probably for a few others as well. I suggest keeping the PR here open until we step by step added hashability to more classes.

I also added it to ArrivalSearch and DepartureSearch as I had a use case where I wanted to use them as dict key and the corresponsing response as values and I assume that's also a common use case. To make that work the components in the transportation DTOs module had to become hashable as well.

But where and how did you want to use them? I was planning to make them private and use only inside sdk to map requests

@danielnaumau danielnaumau marked this pull request as draft March 9, 2023 10:06
@MxMartin
Copy link
Author

MxMartin commented Mar 9, 2023

Everything else looks good for me but if we implement hash and eq for basic classes, we'll need to implement for all response classes too

Absolutely agree with you, and probably for a few others as well. I suggest keeping the PR here open until we step by step added hashability to more classes.
I also added it to ArrivalSearch and DepartureSearch as I had a use case where I wanted to use them as dict key and the corresponsing response as values and I assume that's also a common use case. To make that work the components in the transportation DTOs module had to become hashable as well.

But where and how did you want to use them? I was planning to make them private and use only inside sdk to map requests

That leads to an interesting point: I think that the use cases for the API can vary a lot so making the SDK as adaptable as possible is a useful thing. What I do is I don't use the original TravelTimeSdk class but subclass it to implement own functionality.

The concrete use case was to track which searches go through without an error as any DepartureSearch where only one location causes an error will raise an error. I have therefore started modifying the entire SDK quite a bit (beyond what I pushed here to include in the official SDK) to be able to return not only results but much more detailed info on what the API returns. (I'm thinking about publishing as well but it's a bit of an effort to separate proprietary code from stuff that can be shared ;) )

Other use cases may be recreating searches/api request depending on the initial response. E.g. you want to find connections from A to B for each 10 minute time window in an hour, but instead of making 6 requests which will all return nothing if there are no connections at all it might make sense to first make a request with range_width=3600 and only if that is successfull make a more granular search. That means having to take apart a DepartureSearch and make subsequent queries with just a subset of origin/destination combinations contained in the original request and the was I do development I find it handy to keep a dict that is something like dict[SearchItem, Response | list[Response]], where SearchItem is a single origin/destination combination. Currently I do that by splitting each DepartureSearch into multiple instances of a DepartureSearch with only one item in departure_search_ids. Not ideal, but that way I can still pass these DepartureSearch instances directly to SDK methods without having to convert them first.

But that's just my usecase. Others may have different use cases.
I'd suggest to try and make the SDK as flexible as possible by allowing for all very useful schemas to be reused in own code.
For default use cases the TravelTimeSdk.time_filter() etc. methods are fine, but I guess a lot of use cases will want to make modifcations for one reason or another.

One of the issues I face when subclassing TravelTimeSdk is the __* naming of e.g. the __header method. I'd suggest using simple _* which is less tricky to handle and still shows to any developer that its a private method/attribute, i.e. "use at own risk".

@MxMartin
Copy link
Author

MxMartin commented Mar 9, 2023

One more comment:
For me the SDK has two main uses:

  1. automated parsing of API responses into useful objects (especially useful for the deeply nested responses, like the route property of the TimeFilter API)
  2. Pydantic models that reflect what the API accepts so I don't have to check the API documentation all the time
  3. methods to efficiently bundle and unbundle searches into requests

However in the end I need to make own calculations with the response, so when I receive route I might want to calculate the min time between public transport parts. I could use a function for that that accepts a Route instance or a list of Part instances, but that's not good coding practice in my opinion. Instead I'd prefer to have an min_change_time method on the Route class itself. I could do that by creating my own MyRoute clsas but that means I have to convert all Route instances which requires conversion and that's not good for performance.
I could monkeypatch the SDK's Route class but that's ugly as well.
The best option would be to be able to add some sort of config to the SDK class and save subclasses of Route there and to let the __parse_response method make use of either the default classes or whatever has been passed as custom subclasses.

@danielnaumau
Copy link
Contributor

Okey, thanks @MxMartin for you review, I'll think how to do it and add some new issue. If you need something from sdk, write it in issues, please, so we won't forget about it.

And it'd be nice if you could share with us some code samples how you use our sdk and your modifications, so we'll know how to improve it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants