Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make astroid typable possibily with breaking change to permit most performance increase possibility #2060

Open
Pierre-Sassoulas opened this issue Mar 15, 2023 · 3 comments

Comments

@Pierre-Sassoulas
Copy link
Member

astroid is very dynamic and very hard to type. A lot of effort went into typing it, and mypy is still not activated (#1287). It prevents automated error detection in pylint and it prevent most performance possibility that always start with proper typing, be it mypyc, a partial rewrite in a rust crate (#2014) or cython.

I'm opening this discussion so we can discuss the breaking changes we need to do to make it easier to type astroid in the future.

@nickdrozd
Copy link
Contributor

Many node types have fields that are marked as optional when really they are required. These fields are assumed to be non-optional throughout the Astroid code, and that's a major source of type errors detected by Mypy.

For example, consider the Assert node. It has these fields:

        self.test: NodeNG | None = None
        """The test that passes or fails the assertion."""

        self.fail: NodeNG | None = None  # can be None
        """The message shown when the assertion fails."""

An assert statement without a test is not valid syntax, so the test field needs to be required. On the other hand, the fail message really is optional, so the fail field really is optional.

A symptom of the problem is the frequent appearance of # can be None comments. These show that the existing type annotations are not reliable.

Now, the reason AFAICT that these fields are all marked as optional is that the type annotations are declared in __init__ but the values for those fields are not available until postinit is called. There is an easy solution to this: declare the types outside of __init__. This has the happy side effect of making it possible to get rid of __init__ definitions altogether for many nodes, since all they do besides declaring types is calling super().__init__.

@nickdrozd
Copy link
Contributor

#2061 is a concrete example of the kind of change I'm talking about.

@AWhetter
Copy link
Contributor

Now, the reason AFAICT that these fields are all marked as optional is that the type annotations are declared in __init__ but the values for those fields are not available until postinit is called.

This is exactly why. When I was first defining the types of everything all the way back when I added docstrings to everything, I had to make everything optional because of how __init__ and postinit worked.

There is an easy solution to this: declare the types outside of __init__. This has the happy side effect of making it possible to get rid of __init__ definitions altogether for many nodes, since all they do besides declaring types is calling super().__init__.

We'd risk the potential for some of the attributes to be accidentally left unset by the postinit if we took this approach. If we think that's fine, then this seems like the easier solution.

Personally I'd want to be looking to eliminate the postinit methods where possible. For many node types we create a node and then immediately call the postinit. This wouldn't be possible for nodes where parents need to store mandatory references to the children, and children need to store mandatory references to the parents.
For those nod types, we could annotate those fields as mandatory and deal with mypy complaining about them being unset internally. It wouldn't allow us to use cython, but it would make the type annotations be correct for users at least. We could minimise this being an issue if, for example, we always created child nodes first so that only the parent attribute is temporarily left unset.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants