Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Object domain #44

Draft
wants to merge 69 commits into
base: dev
Choose a base branch
from
Draft

New Object domain #44

wants to merge 69 commits into from

Conversation

LinerSu
Copy link
Contributor

@LinerSu LinerSu commented Feb 14, 2022

Introduce a new object domain based on DSA node and cache operation.

@LinerSu LinerSu force-pushed the object-domain branch 2 times, most recently from 197d6be to d07a649 Compare February 22, 2022 05:22
error_if_not_rgn(x);

if (!is_bottom()) {
auto it = m_flds_rep_map.find(x);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be cleaned up. there are functions for this functionality

} else {
// read from odi map
// weak read
auto obj_dom = m_odi_map[rep];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is better.

Here, and everywhere, you should be very careful with copy-assignments. I believe that

auto obj_dom = ...

is very expensive and should be

auto &obj_dom = ...

However, you sometimes re-assign obj_dom, so you can't use a reference all the time, but should only copy when necessary.

Copy link

@agurfinkel agurfinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments

Copy link

@agurfinkel agurfinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some memory related issues. the rest is good.

does this run on our benchmarks? How does it compare with vstte?

@@ -414,6 +414,11 @@ template <class Dom> class abstract_domain_api:
dom.write(o);
return o;
}

friend bool operator==(const Dom& a1, const Dom& a2) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong place. Better to refactor patricia_tree. I've sent request to @caballa

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is equality of an abstract domain. what do you want to refactor in patricia_tree?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore this equality. It is wrong. Forget about it.

Next thread. Patricia tree data structure requires various properties of Value type. In particular, it assumes that operator==() for Value is defined. It should not. This is not good C++ design. Instead, it should be parameterized by another parameter EqualsTo = std::equals_to<Value>

If it depends on other standard properties of Value, they should be exposed in the same way.
If you want, I can do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I saw now your comments in slack

return it->second;
}

void get_rgn_vec(variable_t rep, variable_vector_t &rgn_vec) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename get_obj_flds, flag as potential efficiency issue

}

// The odi must exist for each region after instrisic calls
variable_t rep = get_odi(rgn);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I missed a function name problem. There are two separate functions:

get_odi -- returns an object descriptor given object id
get_oid -- returns an object id, given a field.

This should be get_oid, and rep should be called oid.

It would be better to create a new type oid_t for object id

// At this point, the requirement for 2 is satisfied

// The odi must exist for each region after instrisic calls
variable_t rep = get_odi(rgn);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_oid

m_base_dom -= res; // forget res means res == top
// the requirment 3.1 is satisfied
} else {
field_abstract_domain_t *obj_dom_ref_no_const =

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caballa please suggest const-safe way to get an interval

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two issues with returning an interval. First, the underlying domain might not keep any interval representation. That's why the returned interval is by value. Second, the method interval_t operator[](...) is not marked as const because some domains (e.g., octagons) needs to apply normalization before producing the interval. I could have a second const version that cannot normalize.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A C++ convention is to provide a method called at() that is same as operator[] but const.

The domain should not expose its internal normalization through const. It should hide it using mutable.
At the very least, you want to provide a const at method, that, for example, does const cast under the hood.

auto obj_dom_ptr = m_odi_map.find(rep);
field_abstract_domain_t *res_obj_dom_ptr;
if (obj_dom_ptr) { // found in odi map, create new one by copying old one
res_obj_dom_ptr = new field_abstract_domain_t((*obj_dom_ptr));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never ever ever allocate anything on the heap, if you ever need to use new, ask.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the rest of the code is ok, just rewrite the code so that the allocation is on the stack

}
}

/********************** Boolean operations **********************/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stopped at this point

@LinerSu
Copy link
Contributor Author

LinerSu commented Feb 23, 2022

@agurfinkel About how object domain compared with VSTTE, I walked through this example.

entry:
  region_init(V_i:region(int));
  region_init(V_x:region(int));
  region_init(V_y:region(int));
  i := make_ref(V_i:region(int),4,as_0);
  x := make_ref(V_x:region(int),4,as_1);
  y := make_ref(V_y:region(int),4,as_2);
  store_to_ref(V_i:region(int),i:ref,0:int32);
  store_to_ref(V_x:region(int),x:ref,1:int32);
  store_to_ref(V_y:region(int),y:ref,0:int32);
  goto bb1;
bb1:
  goto bb1_t,bb1_f;
bb1_t:
  *i:int32 := load_from_ref(V_i:region(int),i:ref);
  assume(*i <= 99);
  goto bb2;
bb2:
  *x:int32 := load_from_ref(V_x:region(int),x:ref);
  *y:int32 := load_from_ref(V_y:region(int),y:ref);
  *x = *x+*y;
  store_to_ref(V_x:region(int),x:ref,*x:int32);
  *y:int32 := load_from_ref(V_y:region(int),y:ref);
  *y = *y+1;
  store_to_ref(V_y:region(int),y:ref,*y:int32);
  *i:int32 := load_from_ref(V_i:region(int),i:ref);
  *i = *i+1;
  store_to_ref(V_i:region(int),i:ref,*i:int32);
  goto bb2_end;
bb2_end:
  goto bb1;
bb1_f:
  *i:int32 := load_from_ref(V_i:region(int),i:ref);
  assume(-*i <= -100);
  goto bb3;
bb3:
  *x:int32 := load_from_ref(V_x:region(int),x:ref);
  *y:int32 := load_from_ref(V_y:region(int),y:ref);
  goto ret;
ret:
  assert(-*x+*y <= 0);

I use splitDBM for region domain and object domain, the VSTTE can prove assertion in ret block. The object domain does not.
The invariants for object domain are:

entry={}
bb1=Base = {i-objA<=0, x-objA<=4, y-objA<=8, objA-i<=0, x-i<=4, y-i<=8, objA-x<=-4, i-x<=-4, y-x<=4, x-y<=-4, objA-y<=-8, i-y<=-8},
Object( V_i V_x V_y ) = {V_i -> [0, +oo], V_x -> [1, +oo], V_y -> [0, +oo]}

bb1_f=Base = {i-objA<=0, x-objA<=4, y-objA<=8, objA-i<=0, x-i<=4, y-i<=8, objA-x<=-4, i-x<=-4, y-x<=4, x-y<=-4, objA-y<=-8, i-y<=-8},
Object( V_i V_x V_y ) = {V_i -> [0, +oo], V_x -> [1, +oo], V_y -> [0, +oo]}

bb3=Base = {*i -> [100, +oo], i-objA<=0, x-objA<=4, y-objA<=8, objA-i<=0, x-i<=4, y-i<=8, objA-x<=-4, i-x<=-4, y-x<=4, x-y<=-4, objA-y<=-8, i-y<=-8},
Object( V_i V_x V_y ) = {V_i -> [0, +oo], V_x -> [1, +oo], V_y -> [0, +oo]}

ret=Base = {*i -> [100, +oo], *x -> [1, +oo], *y -> [0, +oo], i-objA<=0, x-objA<=4, y-objA<=8, objA-i<=0, x-i<=4, y-i<=8, objA-x<=-4, i-x<=-4, y-x<=4, x-y<=-4, objA-y<=-8, i-y<=-8},
Object( V_i V_x V_y ) = {V_i -> [0, +oo], V_x -> [1, +oo], V_y -> [0, +oo]}

bb1_t=Base = {i-objA<=0, x-objA<=4, y-objA<=8, objA-i<=0, x-i<=4, y-i<=8, objA-x<=-4, i-x<=-4, y-x<=4, x-y<=-4, objA-y<=-8, i-y<=-8},
Object( V_i V_x V_y ) = {V_i -> [0, +oo], V_x -> [1, +oo], V_y -> [0, +oo]}

bb2=Base = {*i -> [0, 99], i-objA<=0, x-objA<=4, y-objA<=8, objA-i<=0, x-i<=4, y-i<=8, objA-x<=-4, i-x<=-4, y-x<=4, x-y<=-4, objA-y<=-8, i-y<=-8},
Object( V_i V_x V_y ) = {V_i -> [0, +oo], V_x -> [1, +oo], V_y -> [0, +oo]}

Here, we lost V_i == V_x in fields domain or *i == *x in base domain. One reason I think is we lost V_i == *i and V_x == *x because the load_ref or store_ref is assigning intervals instead of keeping equality in a domain.

@agurfinkel
Copy link

@LinerSu looks cool. At this point, I only care about performance. Precision is expected to be very low.

@LinerSu LinerSu force-pushed the object-domain branch 3 times, most recently from 700fdc7 to d47ec8c Compare February 24, 2022 20:24
- change global object-id map (locate) function to local
- add efficient join / meet operations on odi map
A symbolic equality domain is an abstract domain which represents equalities
used in analyses for allocation-site abstraction or domain reduction.
In short, giving two program variables are known to hold equal values in concrete semantics
if the analyzer determines that the variables hold equal symbolic term.
A special seperate domain for odi map. Especially, it covers
different cases during domain operations such as join or meet.
object.reduction_level=FULL_REDUCTION to perform reduction before
each transfer function;
object.reduction_level=REDUCTION_BEFORE_CHECK to perform
reduction only before assertion;
object.reduction_level=NO_REDUCTION for no reduction.
- `object_domain`: avoid deep copy during updating odi values;
support new statistics; refactor cache operations.
- `odi_map_domain`: perform cache flushing before join / meet
each odi value; refactor cache operations.
- `symbolic equality domain`: supports new join operation in
quadratic running time
@igcontreras
Copy link

It seems that you are adding new commits by @caballa and others to master, can you update so that we only see the changes of your commits in the review?

@LinerSu LinerSu changed the base branch from master to dev April 21, 2023 19:53
Copy link

@igcontreras igcontreras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass over equality domain

}
};

template <class InputIt1, class InputIt2>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

document this function, what does it do? why is it needed?


template <class InputIt1, class InputIt2>
bool set_is_absolute_complement(InputIt1 first1, InputIt1 last1,
InputIt2 first2, InputIt2 last2) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use begin/end names if they are iterators

bool set_is_absolute_complement(InputIt1 first1, InputIt1 last1,
InputIt2 first2, InputIt2 last2) {
while (first1 != last1 && first2 != last2) {
if (*first1 < *first2)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is assumed that the elements in the set are ordered, this should be documented


template <class Domain> class equivalence_class {
private:
std::size_t m_size;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to use size_t for this field? How big do we expect equivalence classes to be?

return SYMBVAR(term::term_op_val_generator_t::get_next_val());
}

template <class Domain> class equivalence_class {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If equivalence_class is not going to be used outside of this file, you can hide it and make it a struct. Then you don't need to make getters and setters

/// @param elements a vector of elements required to keep
/// @note After project operation, the state only keeps equivalence classes
/// for element in vector elements
void project(const std::vector<element_t> &elements) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this maintaining the sizes of the classes?

break;
}
}
element_t new_rep = keep_rep ? rep : s[0];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create new representatives? Can't you just forget what is not in elements?

CRAB_ERROR(domain_name(),
"::rename with input vectors of different sizes");
}
for (unsigned i = 0, size = old_elements.size(); i < size; ++i) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use iterators, they are more efficient

++it;
}
if (parent_old_v) {
// insert a key-value pair for new_v

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to replace a key instead of remove and insert?

for (auto &kv : equiv_classes) {
const element_t &rep = kv.first;
const element_set_t &s = kv.second;
if (s.size() == 1) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not what we talked about. If our equality domains are spread out, the value of the class matters even if the class is of size 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you using this domain? Should this be removed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants