Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce basic linking with the Clang driver. #3922

Merged
merged 1 commit into from Apr 30, 2024

Conversation

chandlerc
Copy link
Contributor

This adds the most rudimentary support for linking in the Carbon driver by calling out to the Clang driver and letting it do the linking. This is a super minimal version to start with, just enough to be somewhat useful and to exercise calling into Clang for this.

Also adds a Clang runner to help run the Clang driver. This is a bit more complete, although it sill needs some significant work. For example, it will need some significant enhancements to be able to compile C++ code, etc.

One thing that will likely change here is to better manage streams and work better as a library. For now, this is a bit limited because Clang and LLVM don't provide the necessary support for this.

Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good!

toolchain/driver/driver.h Outdated Show resolved Hide resolved
// Allocate one chunk of storage for the actual C-strings and a vector of
// pointers into the storage.
llvm::OwningArrayRef<char> cstr_arg_storage(total_size);
llvm::SmallVector<const char*, 64> cstr_args;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't 64 going to be the default value? Why do you specify it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the pointers, I wouldn't expect the small size to be 64 pointers? I thought it was roughly a cache line (8 pointers)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, 64 bytes in CalculateSmallVectorDefaultInlinedElements.

Still though, why do you specify 64 here? Is there some method you're using to choose the number of arguments? Maybe a comment could be added to explain the source?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to add a comment above about avoiding allocations more of the time, in large part to explain using a custom number here.

But not sure I can give a better or more specific number at this point.

Is that OK? I was trying to address previous comments about small vector sizes, but not sure it's actually working....

toolchain/driver/clang_runner.cpp Outdated Show resolved Hide resolved
toolchain/driver/clang_runner.cpp Outdated Show resolved Hide resolved
toolchain/driver/clang_runner.h Outdated Show resolved Hide resolved
toolchain/driver/clang_runner.h Outdated Show resolved Hide resolved
Comment on lines +93 to +100
std::filesystem::path test_file =
test_tmpdir / llvm::formatv("{0}_{1}_{2}", test_info->test_suite_name(),
test_info->name(), name_suffix)
.str();
// Make debugging a bit easier by cleaning up any files from previous runs.
// This is only necessary when not run in Bazel's test environment.
std::filesystem::remove(test_file);
CARBON_CHECK(!std::filesystem::exists(test_file));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had you considered using gtest's teardown to remove created files, rather than deleting name collisions this way? You might then also be able to create a simpler filename, rather than the complex formula.

Note this may also be masking a duplicate "foo.o" mistake below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this masked the error, and happy to consider how to avoid this.

However, I don't quite see how using teardown would help with the complex temporary formula... I added this in another test and I thought this was needed in part because of test sharding and such.... But maybe I'm mis-remembering.

And when debugging, it was specifically useful to support either TEST_TMPDIR or the system temporary directory, and to have the file be left around after runs.

So I could switch this to the teardown, but I think it'd be somewhat of a wash in terms of code complexity and as I think the complexity above would still be needed, and would trade off debugging a specific failure for masking the typo.... I'm not really sure that's a benefit?

But also not hugely attached to the particular solution I have here, so maybe you see something that is more all-around better?

Copy link
Contributor

@jonmeow jonmeow Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I don't quite see how using teardown would help with the complex temporary formula... I added this in another test and I thought this was needed in part because of test sharding and such.... But maybe I'm mis-remembering.

I've never heard of this issue with sharding: each shard gets its own tmpdir.

That said, you might consider something like llvm::sys::fs::createUniqueDirectory(std::filesystem::temp_directory_path() / test_info->test_suite_name(), result_path) [either off the TEST_TMPDIR path, or using test_tmpdir as the prefix instead], then you shouldn't need to remove files and would still have files available after execution.

https://llvm.org/doxygen/namespacellvm_1_1sys_1_1fs.html#abf677459f5137cfb49f6d81b27ea54a1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did think about that, but it seemed to be even more complexity to be managing directories in addition to files...

Anyways, happy to revisit this if there is a better pattern, or maybe both this and the other place where we have similar technique can be simplified if I misunderstood the issue or there is a better way to avoid it.

toolchain/driver/clang_runner_test.cpp Outdated Show resolved Hide resolved
toolchain/driver/driver.cpp Show resolved Hide resolved
@jonmeow
Copy link
Contributor

jonmeow commented Apr 29, 2024

Is it worth updating the sieve compile to use this? Or is that more a separate PR?

toolchain/driver/clang_runner.h Outdated Show resolved Hide resolved
toolchain/driver/clang_runner_test.cpp Outdated Show resolved Hide resolved
toolchain/driver/driver.cpp Outdated Show resolved Hide resolved
This adds the most rudimentary support for linking in the Carbon driver
by calling out to the Clang driver and letting it do the linking. This
is a super minimal version to start with, just enough to be somewhat
useful and to exercise calling into Clang for this.

Also adds a Clang runner to help run the Clang driver. This is a bit
more complete, although it sill needs some significant work. For
example, it will need some significant enhancements to be able to
compile C++ code, etc.

One thing that will likely change here is to better manage streams and
work better as a library. For now, this is a bit limited because Clang
and LLVM don't provide the necessary support for this.
Copy link
Contributor Author

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(think I got all the comments, PTAL)

toolchain/driver/driver.h Outdated Show resolved Hide resolved
toolchain/driver/driver.cpp Show resolved Hide resolved
toolchain/driver/clang_runner_test.cpp Outdated Show resolved Hide resolved
toolchain/driver/clang_runner.cpp Outdated Show resolved Hide resolved
// Allocate one chunk of storage for the actual C-strings and a vector of
// pointers into the storage.
llvm::OwningArrayRef<char> cstr_arg_storage(total_size);
llvm::SmallVector<const char*, 64> cstr_args;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the pointers, I wouldn't expect the small size to be 64 pointers? I thought it was roughly a cache line (8 pointers)?

toolchain/driver/clang_runner.cpp Outdated Show resolved Hide resolved
Comment on lines +93 to +100
std::filesystem::path test_file =
test_tmpdir / llvm::formatv("{0}_{1}_{2}", test_info->test_suite_name(),
test_info->name(), name_suffix)
.str();
// Make debugging a bit easier by cleaning up any files from previous runs.
// This is only necessary when not run in Bazel's test environment.
std::filesystem::remove(test_file);
CARBON_CHECK(!std::filesystem::exists(test_file));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this masked the error, and happy to consider how to avoid this.

However, I don't quite see how using teardown would help with the complex temporary formula... I added this in another test and I thought this was needed in part because of test sharding and such.... But maybe I'm mis-remembering.

And when debugging, it was specifically useful to support either TEST_TMPDIR or the system temporary directory, and to have the file be left around after runs.

So I could switch this to the teardown, but I think it'd be somewhat of a wash in terms of code complexity and as I think the complexity above would still be needed, and would trade off debugging a specific failure for masking the typo.... I'm not really sure that's a benefit?

But also not hugely attached to the particular solution I have here, so maybe you see something that is more all-around better?

toolchain/driver/clang_runner_test.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, just not the minor thing with verbose().

Also, note in tests:

FAIL: //toolchain/driver:clang_runner_test (see /private/var/tmp/_bazel_runner/e591f63ed099023de1f206992dfce127/execroot/_main/bazel-out/darwin_arm64-opt/testlogs/toolchain/driver/clang_runner_test/test.log)
==================== Test output for //toolchain/driver:clang_runner_test:
INFO: From Testing //toolchain/driver:clang_runner_test:
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from ClangRunnerTest
[ RUN      ] ClangRunnerTest.Version
================================================================================

There seems to be something weird going on there, in opt mode. I'm assuming any fix would be small though.

// Allocate one chunk of storage for the actual C-strings and a vector of
// pointers into the storage.
llvm::OwningArrayRef<char> cstr_arg_storage(total_size);
llvm::SmallVector<const char*, 64> cstr_args;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, 64 bytes in CalculateSmallVectorDefaultInlinedElements.

Still though, why do you specify 64 here? Is there some method you're using to choose the number of arguments? Maybe a comment could be added to explain the source?

toolchain/driver/clang_runner.cpp Outdated Show resolved Hide resolved
Comment on lines +93 to +100
std::filesystem::path test_file =
test_tmpdir / llvm::formatv("{0}_{1}_{2}", test_info->test_suite_name(),
test_info->name(), name_suffix)
.str();
// Make debugging a bit easier by cleaning up any files from previous runs.
// This is only necessary when not run in Bazel's test environment.
std::filesystem::remove(test_file);
CARBON_CHECK(!std::filesystem::exists(test_file));
Copy link
Contributor

@jonmeow jonmeow Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I don't quite see how using teardown would help with the complex temporary formula... I added this in another test and I thought this was needed in part because of test sharding and such.... But maybe I'm mis-remembering.

I've never heard of this issue with sharding: each shard gets its own tmpdir.

That said, you might consider something like llvm::sys::fs::createUniqueDirectory(std::filesystem::temp_directory_path() / test_info->test_suite_name(), result_path) [either off the TEST_TMPDIR path, or using test_tmpdir as the prefix instead], then you shouldn't need to remove files and would still have files available after execution.

https://llvm.org/doxygen/namespacellvm_1_1sys_1_1fs.html#abf677459f5137cfb49f6d81b27ea54a1

@chandlerc
Copy link
Contributor Author

(merging as discussed, this seems in a good enough state for the initial cut)

@chandlerc chandlerc added this pull request to the merge queue Apr 30, 2024
Merged via the queue into carbon-language:trunk with commit 36d8d29 Apr 30, 2024
7 checks passed
@chandlerc chandlerc deleted the link0 branch April 30, 2024 18:22
chandlerc added a commit to chandlerc/carbon-lang that referenced this pull request May 2, 2024
This adds the most rudimentary support for linking in the Carbon driver
by calling out to the Clang driver and letting it do the linking. This
is a super minimal version to start with, just enough to be somewhat
useful and to exercise calling into Clang for this.

Also adds a Clang runner to help run the Clang driver. This is a bit
more complete, although it sill needs some significant work. For
example, it will need some significant enhancements to be able to
compile C++ code, etc.

One thing that will likely change here is to better manage streams and
work better as a library. For now, this is a bit limited because Clang
and LLVM don't provide the necessary support for this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants