Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type-hinting proposal #3

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,9 @@
# Debate around Fedora Copr development

Just a list of unordered, random, ad-hoc files related to Copr development and brainstorming (at least for now).
Just a list of unordered, random, ad-hoc files related to Copr
development and brainstorming (at least for now).


## Topics

- [Type Hinting](./type-hinting/README.md)
169 changes: 169 additions & 0 deletions type-hinting/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,169 @@
# Type Hinting

Author: [@FrostyX](https://github.com/frostyx)

This is not a case against type hinting (static types), this is a case
against over-using them.

## Good motivations

- Type hints nicely complement docstrings. Docstrings are great for
describing the purpose of a function (or class, file) but they suck
for describing their arguments and return value types because they
can be untrue. Type hints nicely complement this.
- Type hints can reveal some errors


## Bad motivations

### A bulletproof solution

Static types are not a bulletproof solution for revealing type-related
bugs. IIRC some study found that ~80% of type-related errors are
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in our case, the use of typing would be only the documentation part rather than the checking part. I don't know if mypy is able to check only the part of the code we are currently modifying (and typing) and ignore the rest of the file. I think if we decide to use mypy directly, it will contain tons of errors since Copr wasn't written with thinking of types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found darker project which should be able to check the code incrementally - it should only format and check these parts which are currently changed in git commits. See https://dev.to/akaihola/improving-python-code-incrementally-3f7a

It should contains tool such tools as black, isort, pylint, flake8, mypy and codecov
Maybe it would be worth to investigate what this tool can do and if we want to use some of the tools it offers.

caused by `None` and Mypy won't save us there - it can find many
issues, but I can still introduce tracebacks or unexpected bugs that
Mypy doesn't warn about:

Produces traceback:

```python
def hello(user: Optional[User]) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it depends... if you are using mypy without --strict option the yes - mypy does not warn you about it, because it obviously can't know without context what type the getattr returns... and because it does't have --strict option, then it decides to loosen the rules and ignores it (there is a ton of things which mypy ignores without the --strict option, otherwise the suicide rate of python devs would be high :D).

On the other hand using mypy --strict is IMHO overkill even for startups... it's so pedantic...
But just to show it, mypy with strict option will fail with:

  • error: Returning Any from function declared to return "str"

because it just don't know what the getattr will return so it correctly assumes that the return type will be "Any".

return "Hello " + getattr(user, "name")

print(hello(None))
```

Returns `Hello None`:

```python
def hello(user: Optional[User]) -> str:
return "Hello {0}".format(user)

print(hello(None))
```

Produces traceback:

```python
def hello(user: dict) -> str:
return "Hello {0}".format(user["name"])

print(hello({}))
```

### Performance

Type hinting in python
[doesn't bring any performance improvements][mypy-performance],
it is used merely for linting.


## Disadvantages

### One does not simply know the type

_Insert the Boromir meme_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:D


Duck typing is IMHO one of the best Python features ...
Normally, we are not required to know and say exactly what type is
expected but rather how it is supposed to behave.

I may intend to write a function that filters a list somehow:

```python
def odd(numbers: list):
return [x for x in numbers if x%2 == 0]
```

And expect the following usage

```python
print(odd([1, 2, 3, 4]))
```

Later, somebody want to use

```python
print(odd(range(1, 5)))
print(odd(reversed([1, 2, 3, 4])))
```

and Mypy complains.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duck typing is a great advantage of python indeed and python static typing (hopefully) is trying not to kill it.

You are telling here that you are expecting list type which is explicit type. If you want to benefit from duck typing, you need to use a type which tells mypy how it's supposed to behave. In this case, you need to tell that the number is some kind of parameter that can be iterated.
For this case, typing offers an Iterable type. So this following code will pass even with --strict option:

def odd(numbers: Iterable[int]) -> list[int]:
    return [x for x in numbers if x % 2 == 0]

Of course this is a trivial example and in reality you often need more complex types and structures that tells mypy how it is supposed to behave. But this is already quite an advanced topic. In the vast majority of more complex cases an Union will suffice. Of course, sometimes even that may not be enough. In these cases, I think it's perfectly fine to type the part with Any, which says that the variable/function... can be anything. Sure you can create complex types which will be correct but this is overkill... it makes sense for startups with experienced mypy guys... or for masochist :D



### Too much clutter

Consider the following code (simplified version from `copr-common`)

```python
SLEEP_INCREMENT_TIME = 1

def send_request_repeatedly(url, method=None, data=None, timeout=3):
sleep = SLEEP_INCREMENT_TIME
start = time.time()
stop = start + timeout

i = 0
while True:
i += 1
if time.time() > stop:
return

# Sending the request, and failing
print("Attempt #{0} to {1}: {2}".format(i, method or "GET", url))

time.sleep(sleep)
sleep += SLEEP_INCREMENT_TIME

send_request_repeatedly("http://example.foo")
```

Now the same code with 100% adherence to static typing

```python
SLEEP_INCREMENT_TIME: int = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy (even with strict options) can in most cases infer the type of a variable and if it can't infer the type, it will complain.

I have a feeling that mypy without strict options doesn't force anyone to type variables, even if it can't infer their type (at least in 99% of cases)


def send_request_repeatedly(url: str, method: Optional[str] = None,
data: Optional[dict] = None, timeout: int = 3,
) -> None:
sleep: int = SLEEP_INCREMENT_TIME
start: float = time.time()
stop: float = start + timeout

i: int = 0
while True:
i += 1
if time.time() > stop:
return

# Sending the request, and failing
print("Attempt #{0} to {1}: {2}".format(i, method or "GET", url))

time.sleep(sleep)
sleep += SLEEP_INCREMENT_TIME

send_request_repeatedly("http://example.foo")
```

I would argue that the code readability is **much** worse.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. It looks horrible. But fortunately it is not necessary to type this much (e.g. local variables)

second reason is the formatting style. This code formatting style is pretty but without typing... once there is a lot of parameters in function, the function definition looks terrible. There are some python code formatters which tries to look OK with typing. The one I am using is black. Black is a really strict formater which formats everything for you (so it goes way further than PEP).
There are some cases where black has decided to format the code in a very unreadable way and I disagree with some decisions on how to format what, because it sometimes reduces readability. But the reason I use it is that I absolutely don't have to think about formatting the code, because black does it for me :D

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the formatted code without redundant types will look like this:

SLEEP_INCREMENT_TIME = 1


def send_request_repeatedly(
    url: str,
    method: Optional[str] = None,
    data: Optional[dict[str, Any]] = None,
    timeout: int = 3,
) -> None:
    sleep = SLEEP_INCREMENT_TIME
    start = time.time()
    stop = start + timeout
    i = 0
    while True:
        i += 1
        if time.time() > stop:
            return
        # Sending the request, and failing
        print("Attempt #{0} to {1}: {2}".format(i, method or "GET", url))
        time.sleep(sleep)
        sleep += SLEEP_INCREMENT_TIME


send_request_repeatedly("http://example.foo")

which IMHO is not that bad (considering that this will pass with the --strict option)



## My proposal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with these points below except about the private functions/methods... I think it is handy to see the types even inside the logic of project so one can see with what parameters they are dealing with.


Let's use static types only when it makes sense.

- Optionally for public functions, public methods, and class
constructors, but not require them

And avoid using them for:

- Private functions and private methods (starting with and underscore)
- Local variables
- Inside tests
- When types are already defined by something else (`argparse`,
`click`, SQLAlchemy models, etc)




[mypy-performance]: https://mypy.readthedocs.io/en/stable/faq.html#will-static-typing-make-my-programs-run-faster