-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve engine events to expose internal workflow #3881
base: master
Are you sure you want to change the base?
Improve engine events to expose internal workflow #3881
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few things to address:
- To have a better overview we should also emit information of the exceptions (unHide), similarly to how it is being done for the network matches.
- how about we have one new event, lets call it
match
that would fire for both network and cosmetic filters. It's params should look alike for both types. (don't remove old event, they important for backwards compatibility)
I see it like this:
const context = { tabId: 1, frameId: 2 };
engine.getCosmeticsFilters(oldParams, withMetadata, context);
engine.match(request, withMetadata, context);
engine.on('match', (context, result) => {
console.log(context, result.match, result.exception, result.metadata);
});
Notice withMetadata
for getCosmeticsFilters
this is another capability to remember about. The result for the cosmetic filters should include the metadata in the same way the network result does.
@remusao I'd like to get a feedback about the way passing "context" to matching functions. It might help a lot for extension developers often used to involve @chrmod This is actually an off-topic but if we involve Passing an external context can solve many complex situations but I'd like to go simple as possible. For example, we could decorate the passed |
console.log('extended-rule-matched', rule, context); | ||
}); | ||
|
||
blocker.on('style-rule-matched', (rule, context) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could make it clearer in the naming that this is a hostname-based march of a cosmetic rule. Also, that it does not mean it marched in the page, just that it was selected by the engine as a candidate to be injected (I think it’s important to make this clear to avoid confusion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we need that many new events. How about we introduce one filter-matched
which would emit:
- a filter, cosmetic or network
- a function name used to acquire the filter, like
match
,getHtmlFilters
,getCosmeticsFilters
- function arguments, which can include additional info like the
context: any
. For network filters it is most important to get the request.
Additionally, would like to set a naming convention for the public interfaces for the filters. Sometimes we call those things rules
(for instance in context of Cosmetic bucket), but on the interface level I would strony suggest we always call both network and cosmetic "rule" with a name of filters
.
This will be a contrast to the DNR, where after conversion the filters become the rules.
In the last changes, the events were narrowed down to two: Also, the commit addresses the followings:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add filter-matched
for network filters as well? that would be in engine.match
and engine.matchAll
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like while we develop this feature it started to change direction. All good, but it also got a bit complex. Lets try to simplify a bit before moving forward.
To recap the goal: We want to introduce new events that would allow us to implement a logger that is able to list matched filters and exceptions in a context of a given tab.
@@ -1035,6 +1195,8 @@ export default class FilterEngine extends EventEmitter< | |||
|
|||
// Emit events if we found a match | |||
if (result.exception !== undefined) { | |||
this.emit('filter-matched', result.exception, context); | |||
|
|||
this.emit('request-whitelisted', request, result); | |||
} else if (result.redirect !== undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we emit redirects as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is treated as a normal filter (result.filter
) before being registered as result.redirect
. The event should be triggered right after the filter match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let me test it locally before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add unit tests for each of the event with cases where they should/shouldn't trigger?
Also, please do run the benchmark to validate that there isn't extra overhead with the additional data-structures we are creating on each call to engine matching functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks to @remusao comments I've realised that we should populate the reference
in all the helper libraries, that is adblocker-webextension
, adblocker-puppeteer
, and adblocker-playwrights
. If we wont, customers of those libraries wont be able to distinguish the source of events from getCosmeticsFilters
so for examples in browsers, those events wont be useful.
d9c6b86
to
ba99f8a
Compare
Benchmarking resultsseia-soto:emit-cosmetic-filter-matches Details
master Details
|
Base: Avg serialization time (100 samples): 190.044 μs (+ 59.886 μs) Total requests: 233,347 Number of samples: 94,294 Number of samples: 139,053 Number of samples: 233,347 |
Updated benchmarking. Skipping: serialization, list parsing, and initialization time - as these are not affected by this PR For matched filters (94k samples), the maximum matching time increased a lot: from 1,770.5 μs to 4,453.25 μs. This can be seen as a big deal but if we look at the average: from 6.62 μs to 6.906 μs - , it shows that the processing time grows by following the linear graph. For exception + unmatched filters (139k samples), the maximum matching time decreased: from 5,513.208 μs to 3,402.334 μs. This is because this PR dropped maximum 3 comparisons of if statements for exceptions if there's no listeners. For the overall result (94k+139k samples), see the following:
Full logs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets try to figure out if we can fix the performance degradation
After the optimisation, the performance overhead is in minimal considering the CPU time difference coming from list parsing time. For the matching process: min -0.041 μs, max -12.501 μs, avg +0.016 μs Full logs
|
|
||
// Additional context given from user | ||
matchType: FilterType.COSMETIC; | ||
reference: any; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already commented on the naming of reference
in another thread. Would it also make sense to type it with a generic type parameter T
instead of any
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've discussed that and IMO a generic type would influence all signatures by putting a strong emphasis on itself, even if no listeners are in use.
What I mean, new users of for example getCosmeticsFilters
, should not think about such T
.
url, | ||
domain, | ||
hostname, | ||
|
||
classes, | ||
hrefs, | ||
ids, | ||
|
||
allowGenericHides, | ||
allowSpecificHides, | ||
|
||
getBaseRules, | ||
getInjectionRules, | ||
getExtendedRules, | ||
getRulesFromDOM, | ||
getRulesFromHostname, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. Would it make sense to reuse part of the object we already create in the arguments for the function call above instead of duplicating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like this?
const context: CosmeticFilterMatchingContext = {
...arguments[0],
isFilterExcluded: undefined,
filterType: FilterType.COSMETIC,
};
Co-authored-by: Remi <[email protected]>
Co-authored-by: remi <[email protected]>
Co-authored-by: Remi <[email protected]>
|
||
export type CosmeticFilterMatchingContext = CosmeticFilterMatchingContextBase & | ||
Partial< | ||
Omit<Parameters<FilterEngine['getCosmeticsFilters']>[0], 'callerContext'> & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not needed
this.emit('request-allowed', request, result); | ||
if (this.hasListeners()) { | ||
if (result.exception !== undefined) { | ||
this.emit('filter-matched', result.exception, context); // Emit if an exception matched |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we have individual hasListeners
checks for each event type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The computation cost is higher in that case (requires checks for more than 4 events in a row). The listener check is also happening in individual function.
fixes #3876