Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix concurrency issues in Performance Stats code #4910

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

adamretter
Copy link
Member

Cleanup the code and fix concurrency issues in the Performance Stats code

@adamretter adamretter added the bug issue confirmed as bug label May 9, 2023
@adamretter adamretter added this to the eXist-7.0.0 milestone May 9, 2023
@adamretter adamretter requested review from dizzzz and reinhapa May 9, 2023 16:08
%test:assertXPath("$result//stats:index[@type = 'range'][@optimization = 2]")
%test:assertXPath("$result//stats:index[@type eq 'range'][@optimization-level eq 'Optimized']")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will code that expects the performance stats to return <stats:index optimization="2"> need to be updated to read <stats:index optimization-level="Optimized"/> etc.? Monex, for example reads @optimization here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joewiz yes, indeed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmmm ok. then we need to have a plan to update monex here as well, whilst keeping that app compatible with older exist versions?

@dizzzz
Copy link
Member

dizzzz commented May 9, 2023

Codewise it looks OK to me. Better JMX support?

Copy link
Member

@dizzzz dizzzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build fails :-(

@adamretter adamretter force-pushed the refactor/cleanup-performance-stats branch from a405132 to 783cbdd Compare May 10, 2023 08:23
@line-o
Copy link
Member

line-o commented May 10, 2023

Please update the description of this PR to include information about the breaking change(s) that it introduces.

@line-o line-o self-requested a review May 10, 2023 10:48
Copy link
Member

@line-o line-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR introduces breaking changes and fails to communicate that clearly in the commit messages.
Please extract the breaking changes into a single commit so that we can discuss them separately.

@adamretter
Copy link
Member Author

Please extract the breaking changes into a single commit so that we can discuss them separately.

This is not something we have ever done before! So why do you believe that this PR is "special" in some way to any others that target an upcoming major release?

@line-o
Copy link
Member

line-o commented May 10, 2023

As the changes are not necessary to fix concurrency issues they should not be part of this PR.

Copy link
Member

@dizzzz dizzzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, so some additional updates need to be put in place in apps to keep things compatible.

@adamretter
Copy link
Member Author

adamretter commented May 11, 2023

As the changes are not necessary to fix concurrency issues they should not be part of this PR.

When has that ever been a requirement of PRs against a major version? It might be sensible to put them in separate commits perhaps, but separate PRs is unnecessary!

@adamretter adamretter force-pushed the refactor/cleanup-performance-stats branch 4 times, most recently from b138c00 to eebf365 Compare May 16, 2023 17:58
@line-o
Copy link
Member

line-o commented May 16, 2023

@adamretter In yesterdays community call I was convinced that speaking labels are favourable over magical integer constants. We also found that the attribute name returned by the stats package should stay @optimization and not be changed to @optimization-level as it is both shorter and also better to understand. This is especially true for the case where no index is found and thus no optimisation can be used.
What are your thoughts on this?

@adamretter
Copy link
Member Author

adamretter commented May 17, 2023

@Optimization and not be changed to @optimization-level

I intentionally changed this because it was misleading. It does not describe the optimization that was performed, which is what it previously implied. Rather it describes the "level of optimization" that was able to be applied - and this is how it was described in the Java code. So I corrected it so that it actually reflects what it is.

@adamretter adamretter force-pushed the refactor/cleanup-performance-stats branch from eebf365 to c769150 Compare May 18, 2023 09:21
@adamretter
Copy link
Member Author

@reinhapa @dizzzz Is this one now ready to go?

@dizzzz
Copy link
Member

dizzzz commented May 18, 2023

during the teleconference we talked about the naming:

enum IndexOptimizationLevel {
NONE,
BASIC,
OPTIMIZED;
}

OptimizationLevel= OPTIMIZED does not reflect i kind of "amount" of optimisation was applied. "FULL" would do that much better, or "OPTIMAL". Maybe it is a kind language thingy ?

@adamretter
Copy link
Member Author

adamretter commented May 18, 2023

@dizzzz I agree that optimisation-level="optimized" is terrible!
The difficultly here is that both BASIC and OPTIMIZED mean that the index is being used, the difference is that with BASIC it executes two steps: analyze and prepare. Whereas with OPTIMIZED it is able to only execute analyze as during that phase of the query execution it has all the dependencies it needs to perform the index query at that time.

I think we need better words than BASIC and OPTIMIZED but I have no idea how to name them to describe that.

This is not a good suggestion, but the only one that comes to mind: TWO_STEPS and ONE_STEP...

Perhaps @joewiz has an idea for phrasing here?

@joewiz
Copy link
Member

joewiz commented May 25, 2023

How about if we expressed this information using 2 boolean attributes instead of 1 string attribute:

  1. index-present=yes|no ("yes" when index can be found; "no" when not)
  2. fully-optimized=yes|no ("yes" when all dependencies present; "no" when further preparation is needed or when index is not present)

So there are still 3 possibilities, but they're expressed explicitly through the combination of these two boolean attributes:

  1. No index: index-present=no + fully-optimized=no
  2. Basic: index-present=yes + fully-optimized=no
  3. Full: index-present=yes + fully-optimized=yes

@line-o
Copy link
Member

line-o commented May 26, 2023

@joewiz I like your proposed approach but would like to shorten the attribute names.

<stats:index used="yes|no" optimized="yes|no" />

level label @used @optimized
0 No Index no no
1 Basic yes no
2 Full yes yes

@adamretter
Copy link
Member Author

@joewiz My thoughts:

index-present=yes|no ("yes" when index can be found; "no" when not)

That seems fine.

fully-optimized

I don't think this is a good term. There are "degrees" of optimisation. There are also optimisations that can be applied without an index, and IMHO we should not close the door of being able to report on them in future.

@adamretter
Copy link
Member Author

@line-o My thoughts:

optimized="yes|no"

This has the same problem as before. This does not express the degrees of optimisation possible.

@adamretter adamretter closed this May 26, 2023
@adamretter adamretter reopened this May 26, 2023
@line-o
Copy link
Member

line-o commented May 26, 2023

@adamretter there is only one level of optimisation possible and only if an index is found/used.

@adamretter
Copy link
Member Author

there is only one level of optimisation possible and only if an index is found/used.

I am afraid that is not correct.

@sonarcloud
Copy link

sonarcloud bot commented May 26, 2023

SonarCloud Quality Gate failed.    Quality Gate failed

Bug B 1 Bug
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 10 Code Smells

71.7% 71.7% Coverage
0.0% 0.0% Duplication

@reinhapa
Copy link
Member

there is only one level of optimisation possible and only if an index is found/used.

I am afraid that is not correct.

@adamretter @line-o I would suggest to discuss this on with all commenters on a community meeting if possible as it seems to me easier to exchange the intends and possible approaches....

@joewiz
Copy link
Member

joewiz commented Jul 24, 2023

As discussed on the Community Call today, we seem agreed on the idea of returning two boolean attributes and that one should be called index-present (or used), and we are still seeking an acceptable name for the attribute I suggested, fully-optimized. We noted that these attributes are children of the <stats:index> element, and therefore that the attributes only describe index-related performance statistics - and not any other "optimisations that can be applied without an index". Perhaps this simplifies the criteria for these attributes' names? Does anyone have any suggestions for a good name?

@adamretter
Copy link
Member Author

adamretter commented Jul 25, 2023

As far as I am aware from studying the code it is not possible to present a result like index-present=yes|no at the position in the code where the index is queried There is really only 1 big Lucene Index behind the scenes that may or may not contain things you are looking for. The "Index definitions" in eXist-db are only really used for describing what to index, they are not entirely used for selection at index query time.

The Lucene Index in eXist-db in fact does not actually know if there is an index or not for the particular expression it is looking up, instead it queries Lucene based on just the element or attribute name (ignoring fields and facets for now), which often returns more results than it should, later in other expressions these results are filtered out so that hopefully you end up with the correct results... although I am finding plenty of cases lately (not related to facets) which were inspired by #3207 whereby eXist-db returns far too many results

@line-o
Copy link
Member

line-o commented Jul 26, 2023

If the index use is not really consistent, then it is misleading to even present that data under <stats:index>.
How about renaming the element then to stats:optimization to free it from the notion it is strictly tied to indexes?

<stats:optimization level="[none|basic|full]">

@adamretter adamretter force-pushed the refactor/cleanup-performance-stats branch from c769150 to 1ee9ba7 Compare October 14, 2023 12:23
@adamretter
Copy link
Member Author

adamretter commented Oct 14, 2023

to even present that data under stats:index.
How about renaming the element then to stats:optimization

I am not sure that makes sense as these optimizations are index guided.

As this has been open now for 6 months, and is a clear improvement on the status quo, I suggest we just merge this and move on; rather than splitting hairs over naming! It can always be improved in the future as can everything else in eXist-db.

@dizzzz @reinhapa Can you discuss together, and consider merging this please?

@sonarcloud
Copy link

sonarcloud bot commented Oct 14, 2023

SonarCloud Quality Gate failed.    Quality Gate failed

Bug B 1 Bug
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 10 Code Smells

72.4% 72.4% Coverage
0.0% 0.0% Duplication

idea Catch issues before they fail your Quality Gate with our IDE extension sonarlint SonarLint

@dizzzz
Copy link
Member

dizzzz commented Oct 30, 2023

Does this require an money update? if not, I am happy.

@adamretter
Copy link
Member Author

adamretter commented Oct 30, 2023

Does this require an money update?

@dizzzz I am happy to receive any money!

@dizzzz
Copy link
Member

dizzzz commented Oct 30, 2023

I ment.... Monex :-P

@reinhapa reinhapa self-requested a review November 20, 2023 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug issue confirmed as bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants