feat(inter-analysis): integrate dsa info into callsite information #54

Open
wants to merge 16 commits into
base: dev

LinerSu
Contributor

@LinerSu LinerSu commented Jan 18, 2023

This PR adds a method that groups the region variables passed or returned between caller and callee, based on the DSA intrinsics.
The grouping is stored in the callsite_info class so that later abstract operations on calls can use it.
For instance, from caller to callee:

  • The caller's input parameter list: (a, b, V1, V2, V3, V4, V5)
  • The callee's input parameter list: (c, d, V7, V8, V9, V10, V11)
  • Intrinsics in crabIR:
crab_intrinsic(regions_from_memory_object, V7, V8, V9);
crab_intrinsic(regions_from_memory_object, V10, V11);

The information to group region variables is {{V1, V2, V3}, {V4, V5}} for the caller's inputs and {{V7, V8, V9}, {V10, V11}} for the callee's inputs.
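The grouping in this example can be sketched as follows. This is only an illustration under stated assumptions, not the code in the PR: variables are plain strings instead of crab's variable_t, each intrinsic is given as its list of operands, and the helper name group_by_position is hypothetical. Actuals and formals are assumed to correspond by position.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using var_t = std::string; // stand-in for crab's variable_t (assumption)
using groups_t = std::vector<std::vector<var_t>>;

// Hypothetical helper. Each entry of `intrinsics` holds the operands of one
// regions_from_memory_object intrinsic found in the callee.
// Returns {groups over the caller's actuals, groups over the callee's formals}.
std::pair<groups_t, groups_t>
group_by_position(const std::vector<var_t> &actuals,
                  const std::vector<var_t> &formals,
                  const groups_t &intrinsics) {
  // Record the position of every formal in the callee's parameter list.
  std::unordered_map<var_t, std::size_t> pos;
  for (std::size_t i = 0; i < formals.size(); ++i)
    pos.emplace(formals[i], i);

  groups_t actual_cls, formal_cls;
  for (const auto &operands : intrinsics) {
    std::vector<var_t> a, f;
    for (const auto &v : operands) {
      auto it = pos.find(v);
      if (it == pos.end() || it->second >= actuals.size())
        continue; // operand is not a parameter, or has no matching actual
      f.push_back(v);
      a.push_back(actuals[it->second]); // actual at the same position
    }
    if (!f.empty()) {
      formal_cls.push_back(std::move(f));
      actual_cls.push_back(std::move(a));
    }
  }
  return {std::move(actual_cls), std::move(formal_cls)};
}
```

Called with the parameter lists and the two intrinsics above, the first component is the caller grouping and the second is the callee grouping.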

domains::callsite_info<variable_t> callsite {cs.get_func_name(), cs.get_args(), cs.get_lhs(),
fdecl.get_inputs(), fdecl.get_outputs()};
std::vector<std::vector<variable_t>> formal_cls, actual_cls;
group_regions_by_dsa_intrinsics(callee_cfg, cs.get_args(),
Contributor

This should be computed only once. We should compute it before the whole analysis starts, cache it and then just use it at each call. It should be cached in m_ctx similar to get_widening_set().
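One possible shape for such a cache is sketched below. The class name region_groups_cache and the string-based types are hypothetical, and this variant fills the cache lazily on first use; a pass that runs before the analysis starts could populate the same map eagerly. How it would hang off m_ctx is not shown.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

using groups_t = std::vector<std::vector<std::string>>;

// Hypothetical cache keyed by callee name (assumption: names are unique).
class region_groups_cache {
  std::unordered_map<std::string, groups_t> m_cache;

public:
  // Returns the groups for `callee`; `compute` runs at most once per callee.
  const groups_t &get(const std::string &callee,
                      const std::function<groups_t()> &compute) {
    auto it = m_cache.find(callee);
    if (it == m_cache.end())
      it = m_cache.emplace(callee, compute()).first;
    return it->second;
  }
};
```

Each call site then pays only for a hash lookup after the first query for a given callee.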

intrinsic_stmts.reserve(entry_bb.size());
for (auto const &stmt : entry_bb) {
if (stmt.is_intrinsic()) {
auto intrinsic_stmt = dynamic_cast<const intrinsic_statement_t &>(stmt);
Contributor

This should be static_cast, since we know the cast cannot fail and we don't want to rely on RTTI.
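A minimal sketch of the suggested pattern, using toy statement classes rather than crab's real hierarchy (the types and the function intrinsic_name are hypothetical):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy stand-ins for the statement hierarchy (assumption, not crab's types).
struct statement {
  virtual ~statement() = default;
  virtual bool is_intrinsic() const { return false; }
};

struct intrinsic_statement : statement {
  std::string name;
  explicit intrinsic_statement(std::string n) : name(std::move(n)) {}
  bool is_intrinsic() const override { return true; }
};

// Returns the intrinsic's name, or an empty string for other statements.
std::string intrinsic_name(const statement &stmt) {
  if (!stmt.is_intrinsic())
    return "";
  // is_intrinsic() already established the dynamic type, so the downcast
  // cannot fail and needs no RTTI check.
  const auto &intr = static_cast<const intrinsic_statement &>(stmt);
  return intr.name;
}
```

Binding the result with `const auto &` also avoids the copy that plain `auto` would make of the cast result.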

@@ -1032,6 +1032,104 @@ class top_down_inter_transformer final
// crab::CrabStats::count("Interprocedural.num_calling_contexts");
}

/// @brief group regions variables in parameter lists based on target's dsa
Contributor

What do source and target mean? Is it source = caller and target = callee? If yes, please use caller/callee.

// is the same as the output list in the dsa intrinsic
// The worst case of the following loop is O(m * n) where m is the number of
// parameters and n is the number of dsa intrinsic statements
unsigned int index = 0, param_lst_sz = src_params.size();
Contributor

If this is executed just once (before the analysis starts), performance might not be an issue, but the current algorithm is far from ideal. As you mention, for each argument we go over all intrinsics. This is completely unnecessary. I would process the intrinsics only once and put them in some sort of union-find. Then, for each argument in the caller you get the corresponding argument in the callee, and you can query the union-find to see all the elements in its equivalence class. Note that you don't need a super fancy union-find implementation. You don't even need to write a separate union-find class. Just have an internal map from a variable to its parent and do standard path compression when you do the queries.
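A minimal sketch of that idea, assuming variables can be represented as strings; the names var_union_find and add_intrinsic are hypothetical. It is wrapped in a small class only to keep the example self-contained; the same parent map and find routine could live directly inside the transformer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Map-based union-find over variable names with path compression.
class var_union_find {
  std::unordered_map<std::string, std::string> m_parent;

public:
  // Returns the representative of v; unseen variables become singletons.
  std::string find(const std::string &v) {
    auto it = m_parent.find(v);
    if (it == m_parent.end()) {
      m_parent.emplace(v, v);
      return v;
    }
    if (it->second == v)
      return v;
    std::string parent = it->second; // copy before recursing
    std::string root = find(parent);
    m_parent[v] = root; // path compression
    return root;
  }

  void unite(const std::string &a, const std::string &b) {
    std::string ra = find(a), rb = find(b);
    if (ra != rb)
      m_parent[ra] = rb;
  }

  bool same(const std::string &a, const std::string &b) {
    return find(a) == find(b);
  }
};

// Processes one intrinsic: all of its operands end up in the same class.
void add_intrinsic(var_union_find &uf, const std::vector<std::string> &ops) {
  for (std::size_t i = 1; i < ops.size(); ++i)
    uf.unite(ops[0], ops[i]);
}
```

Each intrinsic is visited once; afterwards, deciding whether two callee parameters belong to the same group is a pair of find queries instead of a scan over all intrinsics.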

gretadolcetti and others added 7 commits April 3, 2023 22:24
The decoupled domain allows crab to use two different domains during
the two different phases of the analysis.  In the descending
(narrowing) phase it is possible to use an abstract domain which is
more precise than the domain used in the ascending (widening) phase.

The decoupled domains require two new abstract operations
is_asc_phase() and set_phase() in order to manage ascending and
descending phases. By default, the domains start in the *descending*
(i.e., more precise) phase. The interleaved fixpoint iterator has also
been modified to notify the underlying domains when there is a change
of (ascending or descending) phase.

Implemented by Greta Dolcetti and Enea Zaffanella.
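The phase switch described in this commit message might look roughly like the sketch below. Only the operation names is_asc_phase() and set_phase() come from the message; their signatures, the class decoupled_sketch, the toy domains, and the apply helper are assumptions, and the sketch does not model how state is transferred between the two domains.

```cpp
#include <cassert>

// Toy stand-ins for a cheap and a more precise abstract domain.
struct cheap_dom { int ops = 0; };
struct precise_dom { int ops = 0; };

template <typename AscDom, typename DescDom>
class decoupled_sketch {
  AscDom m_asc;          // used during the ascending (widening) phase
  DescDom m_desc;        // used during the descending (narrowing) phase
  bool m_is_asc = false; // start in the descending, more precise phase

public:
  bool is_asc_phase() const { return m_is_asc; }
  // Assumed to be called by the fixpoint iterator on a phase change.
  void set_phase(bool ascending) { m_is_asc = ascending; }

  // Forwards an operation to whichever domain is currently active.
  template <typename Op> void apply(Op op) {
    if (m_is_asc)
      op(m_asc);
    else
      op(m_desc);
  }

  const AscDom &asc() const { return m_asc; }
  const DescDom &desc() const { return m_desc; }
};
```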

We minimize string allocation and copies.  Also, we add some macros
that allow each domain to turn on or off statistics collection which
adds a counter and a timer per operation. If stats collection is off
then no string allocation or string copy should take place.

For instance, the region domain enables stats by adding

#define REGION_DOMAIN_SCOPED_STATS(NAME) CRAB_DOMAIN_SCOPED_STATS(NAME, 1)

And the interval domain disables stats by adding

#define INTERVALS_DOMAIN_SCOPED_STATS(NAME) CRAB_DOMAIN_SCOPED_STATS(NAME, 0)

The macro CRAB_DOMAIN_SCOPED_STATS is defined in
crab/domains/abstract_domain_macros.def

In spite of all optimizations, the most efficient thing is to disable
stats collection. By default, only the fixpoint solver and the region
domain turn on stats. This is okay because stats gathering is a
feature intended only for developers. If needed then the developer can
turn on more stats timers and counters and recompile the code.
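The per-domain switch can be illustrated with the sketch below. It is not the definition of CRAB_DOMAIN_SCOPED_STATS; every name in it is hypothetical, and it keeps only a counter (no timer). The point is that a compile-time flag selects a guard type, and the disabled guard only receives a `const char *`, so no std::string is built.

```cpp
#include <cassert>
#include <map>
#include <string>

// Global counter table, used only by the enabled guard.
inline std::map<std::string, unsigned> &stats_counters() {
  static std::map<std::string, unsigned> counters;
  return counters;
}

// Disabled variant: does nothing and never builds a std::string.
template <bool Enabled> struct scoped_stat {
  explicit scoped_stat(const char *) {}
};

// Enabled variant: bumps the counter for the named operation.
template <> struct scoped_stat<true> {
  explicit scoped_stat(const char *name) { ++stats_counters()[name]; }
};

#define SKETCH_SCOPED_STATS(NAME, FLAG)                                        \
  scoped_stat<(FLAG) != 0> sketch_stat_guard_(NAME)

// Per-domain switches, mirroring the two #define lines quoted above.
#define REGION_SKETCH_STATS(NAME) SKETCH_SCOPED_STATS(NAME, 1)
#define INTERVAL_SKETCH_STATS(NAME) SKETCH_SCOPED_STATS(NAME, 0)

void region_op() { REGION_SKETCH_STATS("region.join"); }
void interval_op() { INTERVAL_SKETCH_STATS("interval.join"); }
```

Flipping a domain's flag from 0 to 1 and recompiling is all it takes to start counting its operations.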
caballa and others added 7 commits April 14, 2023 16:32