Introduce basic linking with the Clang driver. #3922

Merged
merged 1 commit into from
Apr 30, 2024
36 changes: 36 additions & 0 deletions toolchain/driver/BUILD
@@ -13,13 +13,49 @@ filegroup(
    data = glob(["testdata/**/*.carbon"]),
)

cc_library(
    name = "clang_runner",
    srcs = ["clang_runner.cpp"],
    hdrs = ["clang_runner.h"],
    deps = [
        "//common:command_line",
        "//common:ostream",
        "//common:vlog",
        "@llvm-project//clang:basic",
        "@llvm-project//clang:driver",
        "@llvm-project//clang:frontend",
        "@llvm-project//llvm:Core",
        "@llvm-project//llvm:Support",
        "@llvm-project//llvm:TargetParser",
    ],
)

cc_test(
    name = "clang_runner_test",
    size = "small",
    srcs = ["clang_runner_test.cpp"],
    deps = [
        ":clang_runner",
        "//common:all_llvm_targets",
        "//common:check",
        "//common:ostream",
        "//testing/base:gtest_main",
        "//testing/base:test_raw_ostream",
        "@googletest//:gtest",
        "@llvm-project//llvm:Object",
        "@llvm-project//llvm:Support",
        "@llvm-project//llvm:TargetParser",
    ],
)

cc_library(
    name = "driver",
    srcs = ["driver.cpp"],
    hdrs = ["driver.h"],
    data = ["//core:prelude"],
    textual_hdrs = ["flags.def"],
    deps = [
        ":clang_runner",
        "//common:command_line",
        "//common:vlog",
        "//toolchain/base:value_store",
152 changes: 152 additions & 0 deletions toolchain/driver/clang_runner.cpp
@@ -0,0 +1,152 @@
// Part of the Carbon Language project, under the Apache License v2.0 with LLVM
// Exceptions. See /LICENSE for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

#include "toolchain/driver/clang_runner.h"

#include <algorithm>
#include <array>
#include <cstring>
#include <memory>
#include <numeric>
#include <optional>

#include "clang/Basic/Diagnostic.h"
#include "clang/Basic/DiagnosticOptions.h"
#include "clang/Driver/Compilation.h"
#include "clang/Driver/Driver.h"
#include "clang/Frontend/CompilerInvocation.h"
#include "clang/Frontend/TextDiagnosticPrinter.h"
#include "common/command_line.h"
#include "common/vlog.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Path.h"
#include "llvm/Support/Program.h"
#include "llvm/Support/VirtualFileSystem.h"
#include "llvm/TargetParser/Host.h"

namespace Carbon {

static auto GetExecutablePath(llvm::StringRef exe_name) -> std::string {
  // If the `exe_name` isn't already a valid path, look it up.
  if (!llvm::sys::fs::exists(exe_name)) {
    if (llvm::ErrorOr<std::string> path_result =
            llvm::sys::findProgramByName(exe_name)) {
      return *path_result;
    }
  }

  return exe_name.str();
}

ClangRunner::ClangRunner(llvm::StringRef exe_name, llvm::StringRef target,
                         llvm::raw_ostream* vlog_stream)
    : exe_name_(exe_name),
      exe_path_(GetExecutablePath(exe_name)),
      target_(target),
      vlog_stream_(vlog_stream),
      diagnostic_ids_(new clang::DiagnosticIDs()) {}

auto ClangRunner::Run(llvm::ArrayRef<llvm::StringRef> args) -> bool {
  // TODO: Maybe handle response file expansion similar to the Clang CLI?

  // If we have a verbose logging stream, and that stream is the same as
  // `llvm::errs`, then add the `-v` flag so that the driver also prints verbose
  // information.
  bool inject_v_arg = vlog_stream_ == &llvm::errs();
  std::array<llvm::StringRef, 1> v_arg_storage;
  llvm::ArrayRef<llvm::StringRef> maybe_v_arg;
  if (inject_v_arg) {
    v_arg_storage[0] = "-v";
    maybe_v_arg = v_arg_storage;
  }

  CARBON_CHECK(!args.empty());
  CARBON_VLOG() << "Running Clang driver with arguments: \n";

  // Render the arguments into null-terminated C-strings for use by the Clang
  // driver. Command lines can get quite long in build systems so this tries to
  // minimize the memory allocation overhead.
  std::array<llvm::StringRef, 1> exe_arg = {exe_name_};
  auto args_range =
      llvm::concat<const llvm::StringRef>(exe_arg, maybe_v_arg, args);
  int total_size = 0;
  for (llvm::StringRef arg : args_range) {
    // Accumulate both the string size and a null terminator byte.
    total_size += arg.size() + 1;
  }

  // Allocate one chunk of storage for the actual C-strings and a vector of
  // pointers into the storage.
  llvm::OwningArrayRef<char> cstr_arg_storage(total_size);
  llvm::SmallVector<const char*, 64> cstr_args;
Contributor

Isn't 64 going to be the default value? Why do you specify it here?

Contributor Author

These are the pointers, I wouldn't expect the small size to be 64 pointers? I thought it was roughly a cache line (8 pointers)?

Contributor

Ah, I see, 64 bytes in CalculateSmallVectorDefaultInlinedElements.

Still though, why do you specify 64 here? Is there some method you're using to choose the number of arguments? Maybe a comment could be added to explain the source?

Contributor Author

I tried to add a comment above about avoiding allocations more of the time, in large part to explain using a custom number here.

But not sure I can give a better or more specific number at this point.

Is that OK? I was trying to address previous comments about small vector sizes, but not sure it's actually working....
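
For readers following this thread, the rule being discussed (pick a default inline capacity so the whole vector object stays near 64 bytes) can be sketched in standalone C++. This is only an approximation of the idea behind `CalculateSmallVectorDefaultInlinedElements`: `SketchVectorHeader` and `SketchDefaultInlinedElements` are hypothetical names, and the header layout is an assumption rather than LLVM's actual definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the fixed part of a small vector: a data pointer
// plus 32-bit size and capacity fields. This layout is an assumption made for
// illustration, not a copy of LLVM's internals.
struct SketchVectorHeader {
  void* begin;
  std::uint32_t size;
  std::uint32_t capacity;
};

// Picks an inline element count so that the header plus the inline storage
// stays within `kPreferredBytes`, while always allowing at least one element.
template <typename T>
constexpr auto SketchDefaultInlinedElements() -> std::size_t {
  constexpr std::size_t kPreferredBytes = 64;
  static_assert(sizeof(SketchVectorHeader) < kPreferredBytes,
                "header must leave room for inline storage");
  constexpr std::size_t inline_bytes =
      kPreferredBytes - sizeof(SketchVectorHeader);
  constexpr std::size_t fit = inline_bytes / sizeof(T);
  return fit == 0 ? 1 : fit;
}
```

Under this assumed layout on a typical 64-bit target, the header takes 16 bytes, so `SketchDefaultInlinedElements<const char*>()` yields 6 rather than 8 or 64. That is the gap an explicit inline size such as 64 is meant to cover for long command lines.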

  cstr_args.reserve(args.size() + inject_v_arg + 1);
  for (ssize_t i = 0; llvm::StringRef arg : args_range) {
    cstr_args.push_back(&cstr_arg_storage[i]);
    memcpy(&cstr_arg_storage[i], arg.data(), arg.size());
    i += arg.size();
    cstr_arg_storage[i] = '\0';
    ++i;
  }
  for (const char* cstr_arg : llvm::ArrayRef(cstr_args).drop_front()) {
    CARBON_VLOG() << "    '" << cstr_arg << "'\n";
  }

  CARBON_VLOG() << "Preparing Clang driver...\n";

  // Create the diagnostic options and parse arguments controlling them out of
  // our arguments.
  llvm::IntrusiveRefCntPtr<clang::DiagnosticOptions> diagnostic_options =
      clang::CreateAndPopulateDiagOpts(cstr_args);

  // TODO: We don't yet support serializing diagnostics the way the actual
  // `clang` command line does. Unclear if we need to or not, but it would need
  // a bit more logic here to set up chained consumers.
  clang::TextDiagnosticPrinter diagnostic_client(llvm::errs(),
                                                 diagnostic_options.get());

  clang::DiagnosticsEngine diagnostics(
      diagnostic_ids_, diagnostic_options.get(), &diagnostic_client,
      /*ShouldOwnClient=*/false);
  clang::ProcessWarningOptions(diagnostics, *diagnostic_options);

  clang::driver::Driver driver(exe_path_, target_, diagnostics);

  // TODO: Directly run in-process rather than using a subprocess. This is both
  // more efficient and makes debugging (much) easier. Needs code like:
  // driver.CC1Main = [](llvm::SmallVectorImpl<const char*>& argv) {};
  std::unique_ptr<clang::driver::Compilation> compilation(
      driver.BuildCompilation(cstr_args));
  CARBON_CHECK(compilation) << "Should always successfully allocate!";
  if (compilation->containsError()) {
    // These should have been diagnosed by the driver.
    return false;
  }

  CARBON_VLOG() << "Running Clang driver...\n";

  llvm::SmallVector<std::pair<int, const clang::driver::Command*>>
      failing_commands;
  int result = driver.ExecuteCompilation(*compilation, failing_commands);

  // Finish diagnosing any failures before we verbosely log the source of those
  // failures.
  diagnostic_client.finish();

  CARBON_VLOG() << "Execution result code: " << result << "\n";
  for (const auto& [command_result, failing_command] : failing_commands) {
    CARBON_VLOG() << "Failing command '" << failing_command->getExecutable()
                  << "' with code '" << command_result << "' was:\n";
    if (vlog_stream_) {
      failing_command->Print(*vlog_stream_, "\n\n", /*Quote=*/true);
    }
  }

  // Return whether the command was executed successfully.
  return result == 0 && failing_commands.empty();
}

} // namespace Carbon
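
The argument rendering in `ClangRunner::Run` (one contiguous buffer of null-terminated strings plus a vector of pointers into it) can be tried in isolation with only the standard library. The sketch below is a simplified illustration, not the PR's code: `PackedArgs` and `PackArgs` are hypothetical names, and `std::vector` stands in for `llvm::OwningArrayRef` and `llvm::SmallVector`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>
#include <vector>

// Hypothetical holder for the packed arguments. Copying it would leave
// `cstrs` pointing into the original `storage`, so treat it as move-only.
struct PackedArgs {
  std::vector<char> storage;       // All argument bytes, each followed by '\0'.
  std::vector<const char*> cstrs;  // One pointer per argument into `storage`.
};

auto PackArgs(const std::vector<std::string_view>& args) -> PackedArgs {
  // First pass: total bytes needed, including one terminator per argument.
  std::size_t total_size = 0;
  for (std::string_view arg : args) {
    total_size += arg.size() + 1;
  }

  PackedArgs packed;
  // Size the storage once so the pointers recorded below stay valid.
  packed.storage.resize(total_size);
  packed.cstrs.reserve(args.size());

  // Second pass: copy each argument and record where it starts.
  std::size_t i = 0;
  for (std::string_view arg : args) {
    packed.cstrs.push_back(packed.storage.data() + i);
    if (!arg.empty()) {
      std::memcpy(packed.storage.data() + i, arg.data(), arg.size());
    }
    i += arg.size();
    packed.storage[i] = '\0';
    ++i;
  }
  return packed;
}
```

Calling `PackArgs({"clang", "-v", "main.o"})` gives three pointers backed by a single allocation for the character data. This is the allocation pattern the comments in `Run` describe for long build-system command lines.
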
60 changes: 60 additions & 0 deletions toolchain/driver/clang_runner.h
@@ -0,0 +1,60 @@
// Part of the Carbon Language project, under the Apache License v2.0 with LLVM
// Exceptions. See /LICENSE for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

#ifndef CARBON_TOOLCHAIN_DRIVER_CLANG_RUNNER_H_
#define CARBON_TOOLCHAIN_DRIVER_CLANG_RUNNER_H_

#include "clang/Basic/DiagnosticIDs.h"
#include "common/ostream.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/StringRef.h"

namespace Carbon {

// Runs Clang in a similar fashion to invoking it with the provided arguments on
// the command line. We use a textual command line interface to allow easily
// incorporating custom command line flags from user invocations that we don't
// parse, but will pass transparently along to Clang itself.
//
// This doesn't literally use a subprocess to invoke Clang; it instead tries to
// directly use the Clang command line driver library. We also work to simplify
// how that driver operates and invoke it in an opinionated way to get the best
// behavior for our expected use cases in the Carbon driver:
//
// - Minimize canonicalization of file names to try to preserve the paths as
//   users type them.
// - Minimize the use of subprocess invocations which are expensive on some
//   operating systems. To the extent possible, we try to directly invoke the
//   Clang logic within this process.
// - Provide programmatic API to control defaults of Clang. For example, causing
//   verbose output.
//
// Note that this makes the current process behave like running Clang -- it uses
// standard output and standard error, and otherwise can only read and write
// files based on their names described in the arguments. It doesn't provide any
// higher-level abstraction such as streams for inputs or outputs.
class ClangRunner {
 public:
  // Build a Clang runner that uses the provided `exe_name` and `target`.
  //
  // If a `vlog_stream` is provided, the runner writes verbose logging to it.
  // When that stream is `llvm::errs()`, Clang itself also runs verbosely.
  ClangRunner(llvm::StringRef exe_name, llvm::StringRef target,
              llvm::raw_ostream* vlog_stream = nullptr);

  // Run Clang with the provided arguments.
  auto Run(llvm::ArrayRef<llvm::StringRef> args) -> bool;

 private:
  llvm::StringRef exe_name_;
  std::string exe_path_;
  llvm::StringRef target_;
  llvm::raw_ostream* vlog_stream_;

  llvm::IntrusiveRefCntPtr<clang::DiagnosticIDs> diagnostic_ids_;
};

} // namespace Carbon

#endif // CARBON_TOOLCHAIN_DRIVER_CLANG_RUNNER_H_
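
The way the runner turns its logging stream into a Clang `-v` flag can be sketched without LLVM. In the sketch below, `BuildDriverArgs` is a hypothetical helper and `std::cerr` stands in for `llvm::errs()`. It only mirrors the ordering that `Run` appears to use (executable name, optional `-v`, then the caller's arguments) and is not the class's actual implementation.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#include <string>
#include <vector>

// Builds the argument list handed to the driver: the executable name first,
// then an optional `-v`, then the caller's arguments unchanged. The `-v` flag
// is injected only when the logging stream is the process's standard error.
auto BuildDriverArgs(const std::string& exe_name,
                     const std::vector<std::string>& user_args,
                     const std::ostream* vlog_stream)
    -> std::vector<std::string> {
  const bool inject_v_arg = vlog_stream == &std::cerr;

  std::vector<std::string> result;
  result.reserve(user_args.size() + (inject_v_arg ? 1 : 0) + 1);
  result.push_back(exe_name);
  if (inject_v_arg) {
    result.push_back("-v");
  }
  result.insert(result.end(), user_args.begin(), user_args.end());
  return result;
}
```

Passing `nullptr` or any stream other than `std::cerr` leaves the arguments untouched. This matches the header's description of verbose output as something the caller opts into programmatically, without editing the command line.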