-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: class docstring overrides default description from __schema__ #239
Conversation
@antonagestam I figured I just open the pull request and use the CI pipeline to run tests, could you please approve that? :) First I wanted to see if nothing breaks.
and then mark it as ready. |
src/phantom/schema.py
Outdated
Note that the docstring attached to a phantom type class, if present, | ||
will override any description that is defined in `__schema__()`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the behavior we want to introduce. We should still allow users to override with an explicit value for "description"
in __schema__
. A really easy way to achieve that should be to move the implementation from __modify_schema__
to __schema__
where you can literally just do {..., "description": cls.__doc__}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I don't think that it will have the intended effect. Consider:
class FancyScore(float, Closed, low=0, high=10):
"""My fancy description"""
Now if I call schema()
on a model using it, as I understand, the chain would go:
FancyScore.__modify_schema__()
(which isSchemaField.__modify_schema__
)- uses
FancyScore.__schema__()
- which inherits
Closed.__schema__()
<- provides a default description overriding what follows below - which uses
super().__schema__()
(possibly through a few superclasses) - which uses
SchemaField.__schema__()
<- where you propose to add the docstring
(if FancyScore would also define __schema__
, it would not change that general logic and problem)
If you think that __schema__
should have priority over the docstring, then a different approach is needed - in that case we need to recognize where the description
came from - was it overridden in the same class inside __schema__
, or was it inherited from a base class? If it was inherited, the docstring of the current class should provide the default.
Without changing the general machinery, this would only be achievable ad-hoc by modifying all __schema__
methods. Actually, not sure how this could be done most elegantly without sandwiching some other method into the cascade or introducing metaclass magic (like: check __schema__
of the base class and the new class, compare description
- if it changed/was not present before, then it was overriden. If it was not overridden, use docstring)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for the detailed description of the problem, I didn't have all the pieces in my head. I'll have to tinker with it a bit to see if I can come up with something. Otherwise I'll probably be reluctant to introduce this feature, unfortunately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, no problem! I think the Meta class approach could work, in principle.
When a new class is subclassed:
- no docstring -> all good
- docstring, no defined schema method -> generate schema method with docstring as description
- docstring, defines schema method -> call both inherited and new method to see if description (if any) is "new", if not there or not new, wrap method to add the docstring to the result.
That way it would have the desired behavior, but it's up to you to decide if this is an acceptable solution. The last case is "ugly", but still it's a check needed only once during class creation, and it could (I actually expect it to) be a rare case that both things are present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the long wait here, I have now looked closer and your assessment is probably correct, however, that makes me not want to introduce this feature as-is.
I think an alternative, if you think it's worth pursuing, would be introduce a class parameter. This is what I think the spec should be:
class Base(str, Phantom):
@classmethod
def __schema__(cls) -> Schema:
return super().__schema__() | {"description": "base description"}
class A(Base):
"""docstring"""
@classmethod
def __schema__(cls) -> Schema:
return super().__schema__() | {"description": "explicit override"}
class B(Base, docstring_description=True):
"""docstring"""
@classmethod
def __schema__(cls) -> Schema:
return super().__schema__() | {"description": "explicit override"}
class C(Base):
"""docstring"""
class Base2(str, Phantom, docstring_description=True):
"""base docstring"""
class D(Base2):
"""docstring"""
class DataModel(pydantic.BaseModel):
a: A
b: B
c: C
d: D
assert DataModel.schema()["properties"]["a"]["description"] == "explicit override"
assert DataModel.schema()["properties"]["b"]["description"] == "docstring"
assert DataModel.schema()["properties"]["c"]["description"] == "base description"
assert DataModel.schema()["properties"]["d"]["description"] == "base docstring"
What do you think? use_docstring
might be a better name for the parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So your modification (over what I proposed) is: instead of introducing the complicated checks and comparisons, to ask the user to opt-in explicitly, which would make the class docstring override any inherited or provided schema description, right?
I would be fine with that, especially if it's called use_docstring
- it would still reduce a lot of boilerplate :)
I updated the PR with the flag-based version. After playing around a bit I decided that when from __future__ import annotations
from phantom import Phantom
from phantom.schema import Schema
import pydantic
class Base(str, Phantom, predicate=lambda _: True):
@classmethod
def __schema__(cls) -> Schema:
return {
**super().__schema__(), # type: ignore
"description": "base description"
}
class A(Base):
"""docstring a"""
@classmethod
def __schema__(cls) -> Schema:
return {
**super().__schema__(), # type: ignore
"description": "explicit override a"
}
class B(Base, use_docstring=True):
"""docstring b"""
@classmethod
def __schema__(cls) -> Schema:
return {
**super().__schema__(), # type: ignore
"description": "explicit override b"
}
class C(B):
"""docstring c"""
@classmethod
def __schema__(cls) -> Schema:
return {
**super().__schema__(), # type: ignore
"description": "explicit override c"
}
class D(C, use_docstring=False):
"""docstring d"""
@classmethod
def __schema__(cls) -> Schema:
return {
**super().__schema__(), # type: ignore
"description": "explicit override d"
}
class DataModel(pydantic.BaseModel):
a: A
b: B
c: C
d: D
x = DataModel.schema()["properties"]
for k in ["a", "b", "c", "d"]:
print(x[k]["description"])
assert x["a"]["description"] == "explicit override a"
assert x["b"]["description"] == "docstring b"
assert x["c"]["description"] == "docstring c"
assert x["d"]["description"] == "explicit override d" What do you think? |
@apirogov Initial reaction is it feels unintuitive for this to be inherited, ie I think C shouldn't use its docstring. I'll give this some further thought and come back with a proper review as soon as I find the time. |
In my case, there could be many types defined in that way, so a way to set this flag to "infect" all subclasses or toggle this globally would be ideal. But I agree that all unintuitive effects seem to be connected to expectations and possible behaviors wrt. to inheritance. Maybe you are right to not like it, I am not sure myself. In any case this would be just a minor change to disable. Let me know if you have some better alternative idea, or I can also remove the inheritance bit :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few comments that need to be addressed, and the test suite is currently failing, please make sure it's passing and I can take another round of reviewing this.
In addition to that I think we need to add tests for this feature, that will also help me in understanding how your implementation works.
With the regards to the chosen solution, I still feel like the spec I proposed makes more sense. Perhaps we should discuss in more detail what you found unintuitive about that?
if kwargs: | ||
raise RuntimeError("Unknown phantom type argument(s): {kwargs}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: I don't think we should add this check, any arguments falling through will be handled by the super call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I personally added it after wasting 10 minutes trying to understand why my keyword argument was not taken and it turned out that I had a typo - because the default error by the super call is absolutely not helpful and says (.... takes no keword arguments...) needlessly confusing me.
As the immediate super class does not have an init_subclass
method, I decided to put it where it is to prevent passing unknown keywords to the default implementaion and getting that error.
I personally believe that such a check improves ergonomy over relying on some method upstream choke on it, but this is my personal preference and opinion. If you strongly disagree, I can remove it.
src/phantom/schema.py
Outdated
if cls.__use_docstring__: | ||
if ds := desc_from_docstring(cls): | ||
field_schema["description"] = ds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Can we move this to __schema__
? I'd like to avoid adding logic here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we can't. If the docstring is to take precedence over what overrides of __schema__
do, it must happen at the call site of __schema__
, i.e. __modify_schema__
, which is what we already discussed in the initial version I pushed.
Even worse, you still might want to change other fields and thus define __schema__
to set eg another title
or add examples
, but the description should come from the docstring, patched over it. So it must be outside.
In your example C, I might still want docstring c
as description but add other things in __schema__
. If I put this into the default implementation, this whole thing cannot work. Basically see my call stack explanation above.
Concerning your proposal, I simply did not see how intended behavior with inheritance was supposed to be and opted for it. I will remove it. At first I found your example D unintuitive, but if the intent is to pass use_docstring on each subclass, now I see your intuition of it. Will try to adapt it accordingly. |
I pushed new changes. I made the code work to conform to your interface but there are some problems with how you wish your example D should work vs how the behavior in E turns out and is hard to fix. I'd suggest to change the behavior of D (assertion E shows what I expect, but currently it's violated) from __future__ import annotations
from phantom import Phantom
from phantom.schema import Schema
import pydantic
class Base(str, Phantom, predicate=lambda _: True):
@classmethod
def __schema__(cls) -> Schema:
return {
**super().__schema__(), # type: ignore
"description": "base description"
}
class A(Base):
"""docstring"""
@classmethod
def __schema__(cls) -> Schema:
return {
**super().__schema__(), # type: ignore
"description": "explicit override"
}
class B(Base, use_docstring=True):
"""docstring"""
@classmethod
def __schema__(cls) -> Schema:
return {
**super().__schema__(), # type: ignore
"description": "explicit override"
}
class C(Base):
"""docstring"""
class Base2(str, Phantom, use_docstring=True, predicate=lambda _: True):
"""base docstring"""
class D(Base2):
"""docstring"""
class Base3(str, Phantom, use_docstring=True, predicate=lambda _: True):
"""base docstring"""
@classmethod
def __schema__(cls) -> Schema:
return {
**super().__schema__(), # type: ignore
"description": "base description"
}
class E(Base3):
"""docstring"""
class DataModel(pydantic.BaseModel):
a: A
b: B
c: C
d: D
e: E
x = DataModel.schema()["properties"]
assert x["a"]["description"] == "explicit override"
assert x["b"]["description"] == "docstring"
assert x["c"]["description"] == "base description"
assert x["d"]["description"] == "base docstring"
# Problem:
assert x["e"]["description"] == "base docstring" # would be my intuition
# but making sure that D works as you say makes it not possible for E to work correctly
# there is, again, no way to check whether the description was overridden in E
# or in __schema__ of a parent class. I think instead of trying ugly things to make E work, rather I would change the behavior so that
then it would be consistent and the fixup method could be simplified / removed. If we agree on the implementation and behavior details I would add tests and fix the pipelines and remove dead code/comments and rebase and squash everything for merge (so you don't have to point it out). By the way, the |
I'll close this for now until there's a more clear idea about how to move forward with this. |
fixes #238
I do it in
__modify_schema__
, so that the docstring, if present, is applied last and not first (which would be the case when just adding it in__schema__
)