-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Coverage measurement of Cython code in a meson build #6186
Comments
The reason for this difference is that Cython is being run from the build directory e.g.: $ cython -3 src/stuff/thing.pyx -o out.c
$ grep thing.pyx out.c
"src/stuff/thing.pyx",
...
$ cd tmp/
$ cython -3 ../src/stuff/thing.pyx -o out.c
$ grep thing.pyx out.c
"thing.pyx",
... I don't see an immediate |
I don't think that cython's coverage plugin is going to be able to figure out where the files are in a meson build. Instead though what would be useful is if we could just pass in something like a json file that says where the files are. There are several steps to making the coverage report work:
The way that this works with setuptools in my demo is:
In the spin meson setup the directory structure (for CWD) at the time the coverage plugin runs is like: $ tree src build/src build-install/
src
└── stuff
├── __init__.py
├── meson.build
├── test_thing.py
└── thing.pyx
build/src
└── stuff
├── thing.cpython-311-x86_64-linux-gnu.so
└── thing.cpython-311-x86_64-linux-gnu.so.p
├── meson-generated_src_stuff_thing.pyx.c.o
└── src
└── stuff
├── thing.pyx.c
└── thing.pyx.c.dep
build-install/
└── usr
└── lib
└── python3.11
└── site-packages
└── stuff
├── __init__.py
├── __pycache__
│ ├── __init__.cpython-311.pyc
│ ├── test_thing.cpython-311.pyc
│ └── test_thing.cpython-311-pytest-8.2.0.pyc
├── test_thing.py
└── thing.cpython-311-x86_64-linux-gnu.so The current working directory here is project root and I don't think it is reasonable to extend the heuristics currently used in the plugin to the extent that they would be able to locate Stripping this all back all that is really wanted for the plugin to work is:
It is not hard though for me to generate all of the information that the plugin needs in the form of e.g. a json file like: {
# map from __pyx_f[0] to c file path
"c_files": {
"src/stuff/thing.pyx":
"build/src/stuff/thing.cpython-311-x86_64-linux-gnu.so.p/src/stuff/thing.pyx.c"
}
# maps from paths in foo.c etc to source .pyx etc files
"source_files": {
"build/src/stuff/thing.cpython-311-x86_64-linux-gnu.so.p/src/stuff/thing.pyx.c": {
"stuff/thing.pyx": "src/stuff/thing.pyx"
}
}
} With spin/meson we can easily generate this information because we know where we put the $ cat build/src/stuff/thing.cpython-311-x86_64-linux-gnu.so.p/src/stuff/thing.pyx.c.dep
src/stuff/thing.cpython-311-x86_64-linux-gnu.so.p/src/stuff/thing.pyx.c: \
../src/stuff/thing.pyx It would not be hard then to either add a custom target in the project's meson build that builds this json file or for it to be something that What I would need from Cython to make this work is just two things:
I can implement this and send a PR but I'm just wondering if this sounds reasonable and if anyone has any suggestions for what would be the best interface for these last two points. |
This part seems like a problem in the For a
So this is the reverse problem of the above - but harder indeed, because the |
What happens is that
Then cython internally strips off the current working directory but only if the file is inside CWD: cython/Cython/Compiler/Scanning.py Lines 198 to 200 in 45db483
Given that On the other hand though cython knows how to generate paths relative to sys.path i.e. relative to /* "stuff/thing.pyx":1
* cdef class Thing: # <<<<<<<<<<<<<<
* def method(self):
* return 2
*/ These are the paths that are parsed from the c file and used to locate its source I think it would make sense for {
# map from __pyx_f[0] to c file path
"c_files": {
"stuff/thing.pyx":
"build/src/stuff/thing.cpython-311-x86_64-linux-gnu.so.p/src/stuff/thing.pyx.c"
}
# maps from paths in foo.c etc to source .pyx etc files
"source_files": {
"build/src/stuff/thing.cpython-311-x86_64-linux-gnu.so.p/src/stuff/thing.pyx.c": {
"stuff/thing.pyx": "src/stuff/thing.pyx"
}
}
} Note that if we weren't using the src-layout then |
Another option would be for spin to have its own Cython coverage plugin that handles all the path finding. Then the only thing that needs changing in Cython is setting |
In a more complicated situation with static const char *__pyx_f[] = {
"src/flint/types/fmpz.pyx",
"src/flint/utils/conversion.pxd",
"src/flint/types/fmpz.pxd",
"src/flint/utils/typecheck.pxd",
"src/flint/flint_base/flint_base.pxd",
"type.pxd",
}; The tracing events are sent out with any of these filenames and Cython tries to guess which .c file is involved. The pxd files involved are included in many .c files though so there is no straight-forward map from e.g. With the out of tree meson build that becomes: static const char *__pyx_f[] = {
"fmpz.pyx",
"conversion.pxd",
"fmpz.pxd",
"flint_base.pxd",
"type.pxd",
}; This is where the cython/Cython/Compiler/ModuleNode.py Lines 969 to 971 in 45db483
This is possibly where absolute paths become relative for what is shown in the annotated comments: cython/Cython/Compiler/Scanning.py Lines 227 to 232 in 45db483
That's relative to CWD but possibly cythonize changes the working directory: cython/Cython/Build/Cythonize.py Lines 104 to 117 in 45db483
In fact that is one option for making cython get the paths correct:
$ cython --help
...
-w WORKING_PATH, --working WORKING_PATH
Sets the working directory for Cython (the directory modules are searched
from) |
Here is where the relative to sys.path paths are generated: cython/Cython/Compiler/Main.py Lines 484 to 499 in 45db483
|
Nice detective work!
That relative path from The other option may be to tell |
Using
If
Cython has to be able to figure out what is the base directory from which Python/Cython imports are relative so that it can resolve We also need to be able to collect multiple references to It is the
None of these assumptions works with meson. In fact even without meson I think that the path to record should be Then it is just necessary at coverage collection time to understand that the path is not necessarily relative to |
It seems that coverage itself assumes that the path is relative to CWD before calling the plugin. The path needs to be |
A workaround for this is to add a
|
The reason for this difference is that Cython is being run from the build directory
Is there a reason why Cython cannot run from the main project directory here? That would seem the usual setup to me.
|
I don't know enough about the internals of meson/ninja to give a good answer here. Generally the idea in the design is that the build directory can be anywhere and all commands are run from the build directory. When you run meson it generates a dependency graph of commands to be run and then ninja runs them all from the same directory. It is of course possible to make a wrapper that chdir's before running Turning the question around: should the I could understand This already works with cython/Cython/Compiler/Scanning.py Lines 194 to 200 in 45db483
cython/Cython/Compiler/ModuleNode.py Lines 969 to 971 in 45db483
Here basename(file_path) is basically never the right thing to do. I can understand the desire to avoid using an absolute path but basename throws away all directory information making two files in different locations with the same name indistinguishable.
This change would make it fall back to using the import-relative path ( diff --git a/Cython/Compiler/ModuleNode.py b/Cython/Compiler/ModuleNode.py
index 67511fc..f30aab6 100644
--- a/Cython/Compiler/ModuleNode.py
+++ b/Cython/Compiler/ModuleNode.py
@@ -968,7 +968,7 @@ class ModuleNode(Nodes.Node, Nodes.BlockNode):
for source_desc in code.globalstate.filename_list:
file_path = source_desc.get_filenametable_entry()
if isabs(file_path):
- file_path = basename(file_path) # never include absolute paths
+ file_path = source_desc.get_description() # never include absolute paths
escaped_filename = file_path.replace("\\", "\\\\").replace('"', r'\"')
escaped_filename = as_encoded_filename(escaped_filename)
code.putln('%s,' % escaped_filename.as_c_string_literal()) With that one change I could make everything else work outside of the Cython codebase by extending Cython's coverage plugin with one that knows where meson puts the C files. |
This is basically it, and it's not specific to only Meson. CMake is pretty similar, building a project with CMake may look like:
I don't know enough about Bazel & co to say for sure, but I suspect they may behave similarly. If Cython makes an implicit assumption about being run from the project root, that is in general not correct and should be avoided. It stems from a "
+1 relative paths are better especially for paths that end up in the final extensions modules (absolute paths are a hindrance to achieving reproducible builds).
Nice! |
relative paths are better especially for paths that end up in the final extensions modules
… or in the distributed sources. One reason why Cython tries to avoid absolute paths in the C code is to prevent leaking information about the build computer. That can easily be the machine of a random developer.
Also, the C files should generally not depend on the machine (or build system) that created it. Using relative paths from a specific build directory would contradict that goal.
Admittedly, this is difficult to achieve completely because we also store the dependencies in the C file header, which might include system header files.
I understand that the current way is not optimal, but I doubt that there's the one correct way to do it.
|
Thinking about what is "optimal" is hard but in my experience tooling that does not make guesses and does not use implicit assumptions is easier to maintain from the inside and easier to understand and use from the outside. From that perspective I would say that the first thing to do is make it so that all needed information can be provided explicitly and then secondly that defaults are clearly defined, documented and adhered to. From there though I would resist the temptation to guess if the defaults are incorrect in a given situation: the user is better served by an informative error message and should provide the correct information explicitly. A lot of complexity in the coverage plugin comes from trying to guess information that could have easily been supplied by the user if any mechanism were provided for them to do so. It is difficult to change anything now without breaking someone's workflow though. Once guessing is already implemented as part of the documented way of doing things then people will depend on it and will expect the guessing to be extended indefinitely to cover their different scenarios any time it fails. For the
Then all paths recorded in output are relative to I have suggested changing the |
Currently cython implicitly guesses that src is the --base-dir for some purposes but uses CWD for others.
It doesn't guess here, it follows the package directory structure up to the root package. Whether that's inside of a src, source, sources or whatever directory, or right in the project root, does not matter. What matters is the fully qualified module name.
It really only looks at the CWD and the main package directory.
That said, I'll consider using relative paths for files in foreign directories. One issue is where to stop. There isn't really a difference between
../src/pkg/module.py
and
../../../../../usr/lib/python3.10/site-packages/somepkg/module.pxd
The letter seems best represented with an absolute path. Anything in the middle is more or less unclear.
|
I suppose guess is the wrong word for how A directory does not need
Yes and this means if CWD is not a suitable location then relative paths have to be relative to the package directory. That is what the diff I suggested above does: if the main
then we get a file static const char *__pyx_f[] = {
"stuff/thing.pyx",
}; These relative paths would be found by coverage when running at project root and by default coverage would treat that as meaning
and then coverage will find At that point Cython's coverage plugin would not know how to find
Then spin's coverage plugin can find and parse all |
Describe your issue
There is a related
spin
issue: scientific-python/spin#182.I have created a repo to demonstrate how a
spin/meson
build of a Cython project looks and what is needed to get coverage of Cython code working. The README explains the problems:https://github.com/oscarbenjamin/cython_coverage_demo
There are two related things that prevent Cython's coverage plugin from working in a meson-based project:
The paths recorded in the generated C files in a meson build look like
When doing an inplace build with setuptools they instead look like:
I'm not sure if this is just something that should be changed in
spin/meson
when doing a coverage-enabled build or something. Possibly this is just the way thatcython
is invoked. In any case Cython's coverage plugin breaks if the path to the source file is not included here.The other problem is that Cython's coverage plugin assumes that all files are in-tree like:
This is not how meson ever lays things out. Instead in a
spin/meson
build:.c
or.so
files)..c
) are in thebuild
directory..so
and.py
) are assembled in thebuild-install
directory.It is the
build-install
directory that is actually used for running tests and generates the coverage data.When running tests with a coverage-enabled Cython build Cython's coverage plugin fails to find the files that are needed and then coverage reports warnings like:
I can get coverage working if I patch Cython like this:
I also need to patch the generated C files:
That Cython patch is not really suitable for a PR and so I am wondering what is the proper way to fix it. From my perspective it would be fine if there was a way to explicitly tell the coverage plugin what files are where rather than depending on any guesswork.
The text was updated successfully, but these errors were encountered: