Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PS-9121 Innodb fails to update spatial index #5277

Open
wants to merge 1 commit into
base: 8.0
Choose a base branch
from

Conversation

VarunNagaraju
Copy link
Contributor

@VarunNagaraju VarunNagaraju commented Apr 2, 2024

https://perconadev.atlassian.net/browse/PS-9121

When a spatial index is used on a geomtery column which
contains values with more precision, the comparison
of minimum bounding rectangle(MBR) for the geometric values
can result in unexpected errors because of trunation/rounding
of data.

Since the underlying data type used for representing the
coordinates of MBR is double, it supports upto 15 digits after
the decimal place and any data after after those digits is subject
to rounding and truncated. The comparison uses math::equals() method
in Boost GIS which does approximate epsilon comparison for doubles.

So, another MBR comparison method is provided which does exact
comparison thereby improving the precision of comparison.

Comment on lines 88 to 91
long double epsilon = 1e-20;
long double min_corner_dist = boost::geometry::distance(box1.min_corner(), box2.min_corner());
long double max_corner_dist = boost::geometry::distance(box1.max_corner(), box2.max_corner());
if(fabsl(min_corner_dist) > epsilon || fabsl(max_corner_dist) > epsilon)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just giving a reference to the original code from boost::geometry 1.77.0

Suggested change
long double epsilon = 1e-20;
long double min_corner_dist = boost::geometry::distance(box1.min_corner(), box2.min_corner());
long double max_corner_dist = boost::geometry::distance(box1.max_corner(), box2.max_corner());
if(fabsl(min_corner_dist) > epsilon || fabsl(max_corner_dist) > epsilon)
if (!geometry::math::equals(get<min_corner, Dimension>(box1), get<min_corner, Dimension>(box2))
|| !geometry::math::equals(get<max_corner, Dimension>(box1), get<max_corner, Dimension>(box2)))

Copy link
Contributor

@dlenev dlenev Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Varun!

I think simply making epsilon used in comparison smaller, like you do (it seems that old code uses DBL_EPSILON for this purpose) won't work for all cases. For example, I get the same issue using the following test:

create table a (
id int primary key,
a geometry NOT NULL,
SPATIAL KEY  (a)
)ENGINE=InnoDB;

insert into a values (1,point(0.9e-25,0));
update a set a =        point(0.1e-24,  0);
delete from a where id = 1;
insert into a values (1,point(0.9e-25,0));

Which will be considered equal by your test...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion to solve this problem we should look into direction of making the MBR comparison in secondary index update case consistent/compatible with the way comparison is done during R-tree look up or perhaps studying if it is necessary in UPDATE at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, the following addition to MBR comparison in update seems to fix both cases:

diff --git a/storage/innobase/row/row0upd.cc b/storage/innobase/row/row0upd.cc
index 35306b9cbcd..bcdf41318ae 100644
--- a/storage/innobase/row/row0upd.cc
+++ b/storage/innobase/row/row0upd.cc
@@ -1580,7 +1580,8 @@ bool row_upd_changes_ord_field_binary_func(dict_index_t *index,
         mem_heap_free(temp_heap);
       }
 
-      if (!mbr_equal_cmp(index->rtr_srs.get(), old_mbr, new_mbr)) {
+      if (!mbr_equal_cmp(index->rtr_srs.get(), old_mbr, new_mbr) ||
+          !mbr_within_cmp(index->rtr_srs.get(), new_mbr, old_mbr)) {
         return (true);
       } else {
         continue;

So perhaps it is worth looking into this direction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Dmitry,

I think simply making epsilon used in comparison smaller, like you do (it seems that old code uses DBL_EPSILON for this purpose) won't work for all cases.

I agree that reducing the epislon value to 1e-20 would just improve precision up to 20 digits after the decimal point.

In my opinion to solve this problem we should look into direction of making the MBR comparison in secondary index update case consistent/compatible with the way comparison is done during R-tree look up or perhaps studying if it is necessary in UPDATE at all.

I'm not sure what would be the performance hit if the secondary index is updated without comparison in UPDATE statements. But, the idea of having comparisons consistent with the R-tree look up ways sounds good to me.

Although, with the suggested patch there could be cases where the new_mbr values are smaller(spatially speaking) than the old_mbr in which the server crashes again with secondary index not being updated.

For e.g:- Consider the following testcase where new_mbr is covered by the old_mbr

CREATE TABLE gis_polygon(
    id INTEGER NOT NULL PRIMARY KEY,
    shape POLYGON NOT NULL SRID 0,
    SPATIAL KEY IND(shape)
) ENGINE=INNODB;

--echo Polygon
INSERT INTO gis_polygon VALUES
(1, ST_PolyFromText('POLYGON((0.1681156881122417 47.54646568683867,0.03144500032067299 47.88496876305664,0.8493347636841432 47.31022644042969,0.0318659991025925 47.310115814208984,0.1681156881122417 47.54646568683867))'));

UPDATE gis_polygon SET
shape = ST_PolyFromText('POLYGON((0.168115688112241675 47.54646568683867,0.031445000320673 47.88496876305664,0.8493347636841432 47.31022644042969,0.03186599910259247 47.310115814208984,0.168115688112241675 47.54646568683867))')
WHERE id = 1;

DELETE FROM gis_polygon WHERE id = 1;

INSERT INTO gis_polygon VALUES
(1, ST_PolyFromText('POLYGON((0.1681156881122417 47.54646568683867,0.03144500032067299 47.88496876305664,0.8493347636841432 47.31022644042969,0.0318659991025925 47.310115814208984,0.1681156881122417 47.54646568683867))'));

DROP TABLE gis_polygon;

So, to mitigate such cases how about we apply the same comparison again but checking if old_mbr is covered by new_mbr

diff --git a/storage/innobase/row/row0upd.cc b/storage/innobase/row/row0upd.cc
index 35306b9cbcd..091f9927b65 100644
--- a/storage/innobase/row/row0upd.cc
+++ b/storage/innobase/row/row0upd.cc
@@ -1580,7 +1580,9 @@ bool row_upd_changes_ord_field_binary_func(dict_index_t *index,
         mem_heap_free(temp_heap);
       }
 
-      if (!mbr_equal_cmp(index->rtr_srs.get(), old_mbr, new_mbr)) {
+      if (!mbr_equal_cmp(index->rtr_srs.get(), old_mbr, new_mbr) ||
+          !mbr_within_cmp(index->rtr_srs.get(), new_mbr, old_mbr) ||
+          !mbr_within_cmp(index->rtr_srs.get(), old_mbr, new_mbr)) {
         return (true);
       } else {
         continue;

If we do go this way, it raises a question whether all the mbr_equal_cmp comparisons require invoking mbr_within_cmp methods as well additionally. It becomes necessary to support queries like the following

SELECT MBREquals(
    ST_PolyFromText('POLYGON((-0.1681156881122417 47.54646568683867,-0.031445000320673 47.88496876305664,-0.8493347636841432 47.31022644042969,-0.0318659991025925 47.310115814208984,-0.1681156881122417 47.54646568683867))'),
    ST_PolyFromText('POLYGON((-0.168115688112241675 47.54646568683867,-0.03144500032067299 47.88496876305664,-0.849334763684143234 47.31022644042969,-0.03186599910259247 47.310115814208984,-0.168115688112241675 47.54646568683867))')
) as Equal;

Copy link
Contributor

@dlenev dlenev Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Varun!

Interesting and valid point:

Although, with the suggested patch there could be cases where the new_mbr values are smaller(spatially speaking) than the old_mbr in which the server crashes again with secondary index not being updated.

I have been digging in related code and discussing it with @satya-bodapati and we came to conclusion that rather than adding mbr_within_cmp() to mbr_equal_cmp() check in this place it makes sense to replace approximate comparison implemented by mbr_equal_cmp() with exact comparison:

diff --git a/sql/gis/rtree_support.h b/sql/gis/rtree_support.h
index 3c241b32e0b..12ad1b9aa29 100644
--- a/sql/gis/rtree_support.h
+++ b/sql/gis/rtree_support.h
@@ -91,6 +91,11 @@ bool mbr_contain_cmp(const dd::Spatial_reference_system *srs, rtr_mbr_t *a,
 bool mbr_equal_cmp(const dd::Spatial_reference_system *srs, rtr_mbr_t *a,
                    rtr_mbr_t *b);
 
+inline bool mbr_equal_precise_cmp(rtr_mbr_t *a, rtr_mbr_t *b) {
+  return a->xmin == b->xmin && a->xmax == b->xmax &&
+         a->ymin == b->ymin && a->ymax == b->ymax;
+}
+
 /// Checks if two MBRs intersect each other
 ///
 /// @param[in] srs Spatial reference system.
diff --git a/storage/innobase/row/row0upd.cc b/storage/innobase/row/row0upd.cc
index 35306b9cbcd..4db5a27b964 100644
--- a/storage/innobase/row/row0upd.cc
+++ b/storage/innobase/row/row0upd.cc
@@ -1580,7 +1580,7 @@ bool row_upd_changes_ord_field_binary_func(dict_index_t *index,
         mem_heap_free(temp_heap);
       }
 
-      if (!mbr_equal_cmp(index->rtr_srs.get(), old_mbr, new_mbr)) {
+      if (!mbr_equal_precise_cmp(old_mbr, new_mbr)) {
         return (true);
       } else {
         continue;

This is how the comparison was done in this place before support for geographical (non-planar) spatial data was added to the code. It is actually also equivalent to !mbr_within_cmp(...,old_mbr, new_mbr) || !mbr_within_cmp(...,new_mbr, old_mbr) condition that you are suggesting :)
Finally, this is consistent with how InnoDB R-tree code compares MBRs in the cmp_gis_field()/cmp_geometry_field() calls (there is also rtree_key_cmp() call which seem to use mbr_equal_cmp(), but it actually never called with the parameter necessary to trigger mbr_equal_cmp() execution!).

If we do go this way, it raises a question whether all the mbr_equal_cmp comparisons require invoking mbr_within_cmp methods as well additionally.

Good question, I have been thinking about this and digging in code as well...

I think we should try to look at mbr_equal_cmp() usage and think if they are related to the above code/can cause similar assertion failures and crashes. For example, at least one usage of mbr_equal_cmp() in row0sel.cc seems to be related to the above code (as it was added to fix CHECK TABLE issue similarly to the above check in row0upd.cc). Otherwise, I think we are better to produce minimal patch, submit it as contribution to Upstream and see what they think about it/how they will fix the problem.

https://perconadev.atlassian.net/browse/PS-9121

When a spatial index is used on a geomtery column which
contains values with more precision, the comparison
of minimum bounding rectangle(MBR) for the geometric values
can result in unexpected errors because of trunation/rounding
of data.

Since the underlying data type used for representing the
coordinates of MBR is double, it supports upto 15 digits after
the decimal place and any data after after those digits is subject
to rounding and truncated. The comparison uses math::equals() method
in Boost GIS which does approximate epsilon comparison for doubles.

So, another MBR comparison method is provided which does exact
comparison thereby improving the precision of comparison.
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

///
/// @retval true The two MBRs are equal.
/// @retval false The two MBRs aren't equal.
inline bool mbr_equal_precise_cmp(rtr_mbr_t *a, rtr_mbr_t *b) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-identifier-length ⚠️
parameter name a is too short, expected at least 2 characters

///
/// @retval true The two MBRs are equal.
/// @retval false The two MBRs aren't equal.
inline bool mbr_equal_precise_cmp(rtr_mbr_t *a, rtr_mbr_t *b) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-identifier-length ⚠️
parameter name b is too short, expected at least 2 characters

Copy link
Contributor

@dlenev dlenev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Varun!

Great work!
I have a few suggestions about the new patch version though:

First of all, I think that we should start the commit comment
with a description of the problem from the users point of view.
This can be useful for Documentation Team when as a source
for Change Log entries and for later readers of commit history.

We can write something like:

Errors about InnoDB unable "to purge non-delete-marked record in index"
occurred in error log when geometry column with a spatial index on it
was updated in a such a way that MBRs for old value and new value were
close enough (for example, when difference was smaller than 1.0e-16
for values below 1).
Moreover server might abort due to assertion failure if later the
updated row was deleted and then its old version was re-inserted again.

Secondly, I think we need to change the commit message to provide
more detailed and bit more correct description of the problem.
Your current comment focuses on the double rounding and truncation
while, IMO, the real issue is discrepancy between approximate
MBR comparison in one place of InnoDB code and exact comparison
in other places.

For example we can say:

The problem occurs due to combination of several factors:

  • MySQL uses double floating-point type to store coordinates in
    Geometry values and do related calculations.
  • Update code in InnoDB tries to avoid updating spatial index value
    if MBR of new value doesn't differ from the old one. To compare
    MBRs for old and new values it uses mbr_equal_cmp() function,
    which nowadays uses Boost GIS to do this.
    The latter uses Boost math::equals() method for this comparison
    which does approximate epsilon comparison for doubles.
  • Other code in InnoDB, specifically purge code uses exact comparison
    of MBRs for lookups in spatial index.

As result of the above discrepancy we get problem with purge described
above (purge does not associate old MBR in spatial index with new
row version/value and tries to delete along with old row version)
and later assertion failures.

Note that back when the above optimization to update code was
introduced mbr_equal_cmp() was doing exact MBR comparison, so the
problem didn't exist. This has been changed when support for
geographic coordinate systems were added.

Then we can finish by saying something like:

We solve this problem by providing another MBR comparison method
which does exact comparison and using it in the Update code
(and thus effectively revert to old, pre-bug behavior as result).

@@ -0,0 +1,73 @@
# **************************************************
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a good idea to mention issue/bug number in description of the test case.
E.g. we can say:

PS-9121 : InnoDB updates the primary index but not the spatial index, crashing MySQL.

Test if MBR comparison used for InnoDB secondary index updates works well for values
which are close to each other.

@@ -91,6 +91,21 @@ bool mbr_contain_cmp(const dd::Spatial_reference_system *srs, rtr_mbr_t *a,
bool mbr_equal_cmp(const dd::Spatial_reference_system *srs, rtr_mbr_t *a,
rtr_mbr_t *b);

/// Checks if two MBRs are equal more precisely
///
/// For both MBRs, the coordinates of the MBR's minimum corners must be smaller
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it is worth to add asserts enforcing this similarly to those we have in mbr_equal_cmp() ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants