-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add AuthorizedSession #22
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need to give this a shot, and update the devel
branch with this PR.
One of the things we'll need to consider is whether or not to make this a breaking change. Most people will have done something like:
import nasdaqdatalink as ndl
data = ndl.get_table(...)
If there's a reverse compatible way users can benefit from a default session, that would be preferred. However, if people need to change code and construct a session beforehand, we should consider adding in some more convenience to do that work for them, while still providing the 'new' way of letting users define the session and any other changes they require to the ApiConfig object.
test/test_connection.py
Outdated
@@ -59,8 +60,8 @@ def test_non_data_link_error(self, request_method): | |||
httpretty.register_uri(getattr(httpretty, request_method), | |||
"https://data.nasdaq.com/api/v3/databases", | |||
body=json.dumps( | |||
{'foobar': | |||
{'code': 'blah', 'message': 'something went wrong'}}), status=500) | |||
{'foobar': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised we weren't hit with a lint warning about indentation / formatting.
I'll need to review the lint settings in our CICD.
@@ -17,6 +17,18 @@ class ApiConfig: | |||
retry_status_codes = [429] + list(range(500, 512)) | |||
verify_ssl = True | |||
|
|||
def read_key(self, filename=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is adding read_key as a method to ApiConfig necessary? I'm not sure it's ever used. If users define their own session and pass in ApiConfig object, how should the api key poking be reconciled? It appears this ApiConfig.read_key() only considers file-based API Keys, and is never called anywhere -- am I missing something?
__init__
will call read_key() and kick off how to read the user's key from Env or a file, preferring ENV vars first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function should be called read_key_from_file
. I was thinking it would be nice to allow each ApiConfig instances to read keys from different files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, but even so it still duplicates the namespace method, which I'd like to avoid having multiple methods that do the same thing. Ideally the behaviour for finding a key is to:
- check if the env var exists, and return what it is so we can set it;
- otherwise, check if a file (user defined or default) exists and read from it: if a value is provided set it;
- we couldn't detect an api key so we must fail
I didn't catch any logic that attempted to call ApiConfig.read_key() or what you propose should be ApiConfig.read_key_from_file() -- so I'm curious whether that's the correct level to provide those helper methods.
ApiConfig is just being used as a dataclass anyway, and at some point in the future we should convert it to one. We can have helper methods in it, for sure, but we need to make sure that it should be responsible for doing those tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was think something like this:
api_config1 = ApiConfig()
# api_config1 will have whatever is inside the ApiConfig class
ndl.get('table', api_config=api_config1)
api_config2 = ApiConfig()
api_config2.read_key_from_file('test2.txt')
ndl.get('table', api_config=api_config2)
Basically, i was trying to avoid using ApiConfig as a mutable storage of global state, but I understand it does introduce additional complexity to the users by enforcing a context creation beforehand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, we'd need to go through some trouble to make ApiConfig immutable. I'm okay with the mutability of ApiConfig, since we should provide a consistent means of allowing users to set their api key if they need more configuration options.
ApiConfig() on create should do heavy lifting of finding API Key in env var or in a file, using defaults.
We should provide convenience methods to allow callers to if they wish to override defaults. We should encourage people to simply use the defaults.
We can modify __init__
to no longer call read_key() as this must being taken care of for callers: if one does not define api_config kw-arg then use our default object state.
We could move the helper methods into ApiConfig instead and avoid duplicating definitions. I would prefer we have consistent logic in one place, rather than maintain multiple definitions of the same logic.
I believe this PR should replace #16. |
Mind rebasing and force-pushing this branch instead? I'd like to avoid including the merge commit in this series. To keep a cleaner history when we do merge you have two options: Technically a third option is to squash it all of this series into one commit, and improve the individual git log message. The majority of the commit log entries don't explain why you're making each change. What I'd prefer to see is atomic commits with a commit message explaining why a the change is being made. It doesn't need to be an essay. For examples of good commit messages, look no further than the linux kernel and git itself. I can review these options with you next week, and get this merged in as I'd like to include this change in the next release. It appears Eric has already bumped the version and change log, which I try to reserve for the commit tag only. Will change with you both and I suspect we'll put out two versions as I'd rather keep the git tag on the same commit that bumped the version and changelog. when cutting a release, I'd prefer to modify those two files separately in a commit and tag it. |
5e2534d
to
feea24e
Compare
feea24e
to
e87eee2
Compare
The ApiConfig in this context is a bit of a nuisance, especially where read_key() is concerned. I'd like to avoid duplicating the logic. The module method read_key() should be read as part of the ApiConfig constructor so that users don't need to explicitly call it. We introduced read_key() in At the end of the day, ApiConfig could be a dataclass that can initialize itself on create, and mutated later. Since we no longer need to support python < 3.7 we can use typing instead of checking isinstance(). I'd rather users know they set the wrong thing at runtime, than we passively set a default for them and they end up having a hard time figuring out why their state is different than what they expect. This would only be true if they wanted to modify the base domain or some other property, but accidentally passed in some wrong object. When AuthorizedSession instances are created, you allow callers to provide the api_config or we'll generate one for you: this is great. However, we still need to perform read_key() to set the state appropriately. I don't see when that's happening, so I expect that the default object won't have tried to figure out the API Key for convenience. Since the class attributes end up being used / referenced instead, I think this subtly gets missed. I made a pass at doing some refactoring of the api_config.py from a module to a class with a lot of behaviour / helper methods instead. However, it appears I've broken get_config_from_kwargs() method you introduced; I'll need to fix that. |
Ah, I should have paid closer attention to how you're using I've made this business a bit more messy. Let me think about it a bit more. I'll roll this change out of this series and keep it to myself for now. |
1066b01
to
e87eee2
Compare
@@ -17,6 +17,18 @@ class ApiConfig: | |||
retry_status_codes = [429] + list(range(500, 512)) | |||
verify_ssl = True | |||
|
|||
def read_key(self, filename=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only invoked in the tests?
if proxies is not None: | ||
self._auth_session.proxies.update(proxies) | ||
|
||
def get(self, dataset, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outside of this tickets scope, but I would like to see us passing connection details (session, api_config) as explicit arguments, and leave kwargs to params we should send to the API call. This would allow us to avoid all the kwarg popping down the line
Code seems 👍 to me. Even with this code being backwards compatible with the old way of making requests (1 call per session), I feel like we should release it as a major. This way people are more aware about authorized sessions change, and how it can make their use of the package more performant |
I agree. While not a breaking change, it would require people to modify their scripts to take advantage of this feature. We should add some coding examples and perhaps a future PR can address it. While I'm happy to see the README updated, the authorizedsession() accepts an api_config keyword argument, and we should provide the context for why that's useful. Despite the majority of this SDK relying too heavily on global state of ApiConfig's class attributes, we can highlight when it might be useful to not use the defaults. Include an example directory with the use-cases in mind would be helpful. |
This pr adds a new way to set ApiConfig and making requests.
Previously, we use ApiConfig as a global singleton to set api_key and configuration, users cannot use different configurations within a single python executable. In this pr, we will allow users to pass an instance of ApiConfig into request api.
We introduced a new interface in this pr which allow users to making multiple requests within a single session to remove the overhead of creating session every times.