Optimize secondary task recollection #2800

Draft
wants to merge 4 commits into
base: dev

ABrain7710
Contributor

Description

Overall

  • Optimized pull_request_files, pull_request_commits, and pull_request_reviews when they are recollected

Implementation Details

  • When building the list of repos to collect, include a full_collection boolean flag with each repo so that start_data_collection knows what to pass
  • In AugurTaskRoutine.start_data_collection, iterate through the tuples and pass the repo_git and full_collection flag to each phase
  • Add a full_collection flag to every phase's method arguments. This was necessary because AugurTaskRoutine.start_data_collection passes the same arguments to every phase
  • Define database methods to get the date secondary data was last collected (get_secondary_data_last_collected) and to get PRs that have been updated since a given date
  • Updated pull_request_files to get PR numbers only for updated PRs when the full_collection flag is false
  • Updated pull_request_commits to get PR URLs only for updated PRs when the full_collection flag is false
  • Updated pull_request_reviews to get PR numbers only for updated PRs when the full_collection flag is false
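The flow described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `PullRequest`, `select_prs_to_collect`, and the standalone `start_data_collection` function are hypothetical stand-ins, and falling back to a full collection when no last-collected date exists is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional, Tuple


@dataclass
class PullRequest:
    """Hypothetical stand-in for a stored pull request row."""
    number: int
    updated_at: datetime


def select_prs_to_collect(
    prs: List[PullRequest],
    full_collection: bool,
    last_collected: Optional[datetime],
) -> List[int]:
    """Return the PR numbers a secondary task should process."""
    # Assumption: a full collection, or a repo with no recorded
    # collection date, processes every PR.
    if full_collection or last_collected is None:
        return [pr.number for pr in prs]
    # A recollection only processes PRs updated since the last run.
    return [pr.number for pr in prs if pr.updated_at > last_collected]


def start_data_collection(
    repos: List[Tuple[str, bool]],
    phases: List[Callable[[str, bool], str]],
) -> List[str]:
    """Call every phase with the same (repo_git, full_collection) arguments."""
    results = []
    for repo_git, full_collection in repos:
        for phase in phases:
            # Every phase receives both arguments, even if it ignores the flag.
            results.append(phase(repo_git, full_collection))
    return results
```

Because every phase is called with the same two arguments, phases that do not need the flag still have to accept it, which is what triggers the unused-argument warnings reported further down.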

Notes for Reviewers
I have not tested this yet. I will change it from a draft PR to a regular PR once it is tested.

Signed commits

  • Yes, I signed my commits.

@ABrain7710 ABrain7710 changed the base branch from main to dev May 17, 2024 02:46
Signed-off-by: Andrew Brain <[email protected]>
@@ -1,7 +1,7 @@
import sqlalchemy as s

[pylint] reported by reviewdog 🐶
C0114: Missing module docstring (missing-module-docstring)

@@ -1,7 +1,7 @@
from celery import chain

[pylint] reported by reviewdog 🐶
C0114: Missing module docstring (missing-module-docstring)

@@ -1,7 +1,7 @@
from celery import chain
import logging

-def machine_learning_phase(repo_git):
+def machine_learning_phase(repo_git, full_collection):
from augur.tasks.data_analysis.clustering_worker.tasks import clustering_task

[pylint] reported by reviewdog 🐶
C0415: Import outside toplevel (augur.tasks.data_analysis.clustering_worker.tasks.clustering_task) (import-outside-toplevel)

@@ -1,7 +1,7 @@
from celery import chain
import logging

-def machine_learning_phase(repo_git):
+def machine_learning_phase(repo_git, full_collection):
from augur.tasks.data_analysis.clustering_worker.tasks import clustering_task
from augur.tasks.data_analysis.discourse_analysis.tasks import discourse_analysis_task

[pylint] reported by reviewdog 🐶
C0415: Import outside toplevel (augur.tasks.data_analysis.discourse_analysis.tasks.discourse_analysis_task) (import-outside-toplevel)

@@ -1,7 +1,7 @@
from celery import chain
import logging

-def machine_learning_phase(repo_git):
+def machine_learning_phase(repo_git, full_collection):
from augur.tasks.data_analysis.clustering_worker.tasks import clustering_task
from augur.tasks.data_analysis.discourse_analysis.tasks import discourse_analysis_task
from augur.tasks.data_analysis.insight_worker.tasks import insight_task

[pylint] reported by reviewdog 🐶
C0415: Import outside toplevel (augur.tasks.data_analysis.insight_worker.tasks.insight_task) (import-outside-toplevel)

@@ -166,7 +166,7 @@ def build_primary_repo_collect_request(session,enabled_phase_names, days_until_c
primary_gitlab_enabled_phases.append(primary_repo_collect_phase_gitlab)

#task success is scheduled no matter what the config says.
-def core_task_success_util_gen(repo_git):
+def core_task_success_util_gen(repo_git, full_collection):

[pylint] reported by reviewdog 🐶
W0613: Unused argument 'full_collection' (unused-argument)
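One possible way to handle this warning while keeping the uniform phase signature is an inline pylint disable comment. The sketch below is an assumption about how it could look, not what the PR does; the function body is a hypothetical stand-in that returns a tuple instead of a Celery signature so that it runs without Celery installed.

```python
def core_task_success_util_gen(repo_git, full_collection):  # pylint: disable=unused-argument
    """Hypothetical stand-in for a phase that ignores full_collection.

    The flag stays in the signature because every phase is called with
    the same (repo_git, full_collection) arguments.
    """
    # The real function would build a task signature here; a plain
    # tuple is used only to keep this sketch self-contained.
    return ("core_task_success_util", repo_git)
```

The same comment would apply to the other `*_util_gen` functions flagged below, since they accept the flag for the same reason.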

@@ -186,7 +186,7 @@ def build_secondary_repo_collect_request(session,enabled_phase_names, days_until

secondary_enabled_phases.append(secondary_repo_collect_phase)

-def secondary_task_success_util_gen(repo_git):
+def secondary_task_success_util_gen(repo_git, full_collection):

[pylint] reported by reviewdog 🐶
W0613: Unused argument 'full_collection' (unused-argument)

@@ -202,12 +202,12 @@ def build_facade_repo_collect_request(session,enabled_phase_names, days_until_co

facade_enabled_phases.append(facade_phase)

-def facade_task_success_util_gen(repo_git):
+def facade_task_success_util_gen(repo_git, full_collection):

[pylint] reported by reviewdog 🐶
W0613: Unused argument 'full_collection' (unused-argument)

return facade_task_success_util.si(repo_git)

facade_enabled_phases.append(facade_task_success_util_gen)

-def facade_task_update_weight_util_gen(repo_git):
+def facade_task_update_weight_util_gen(repo_git, full_collection):

[pylint] reported by reviewdog 🐶
W0613: Unused argument 'full_collection' (unused-argument)

@@ -222,7 +222,7 @@ def build_ml_repo_collect_request(session,enabled_phase_names, days_until_collec

ml_enabled_phases.append(machine_learning_phase)

-def ml_task_success_util_gen(repo_git):
+def ml_task_success_util_gen(repo_git, full_collection):

[pylint] reported by reviewdog 🐶
W0613: Unused argument 'full_collection' (unused-argument)
