General PR slow CI #30540

Merged
merged 2 commits into from
Apr 30, 2024
@@ -4,6 +4,11 @@ on:
pull_request:
paths:
- "src/transformers/models/*/modeling_*.py"
- "tests/models/*/test_*.py"
Collaborator Author

If nothing changed in those paths, the CI won't be triggered at all
(even if the commit message specifies some models)

Collaborator

Let's keep this for now. I'm not sure whether we want it. It's good for the automatic model selection, but I can see it being annoying for the run-slow case: I might update e.g. modeling_attn_mask_utils.py and want to be able to easily run the slow tests on a subset of important models.

Collaborator Author

Yes, we will see 🚀 !
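
To make the trade-off in this thread concrete, here is a minimal Python sketch of how the two `paths` patterns behave. It assumes the documented GitHub Actions glob rule that `*` matches any character except `/`. The helper names `glob_to_regex` and `triggers_workflow` are hypothetical and are not part of the PR.

```python
import re
from typing import List

# The two path filters from the workflow's `pull_request.paths` section.
PATH_FILTERS = [
    "src/transformers/models/*/modeling_*.py",
    "tests/models/*/test_*.py",
]


def glob_to_regex(pattern: str) -> "re.Pattern":
    """Translate a simple glob into a regex, assuming `*` never crosses a `/`."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("[^/]*".join(parts))


def triggers_workflow(changed_files: List[str]) -> bool:
    """Return True if at least one changed file matches one of the path filters."""
    regexes = [glob_to_regex(p) for p in PATH_FILTERS]
    return any(r.fullmatch(f) for r in regexes for f in changed_files)


# A change to a model's modeling file matches the first filter.
print(triggers_workflow(["src/transformers/models/bert/modeling_bert.py"]))
# A change to a shared utility outside `models/` matches neither filter,
# which is the run-slow case raised in the comment above.
print(triggers_workflow(["src/transformers/modeling_attn_mask_utils.py"]))
```

Under these assumptions the first call prints `True` and the second prints `False`, so a PR that only touches `modeling_attn_mask_utils.py` would not start the workflow even with a `[run-slow]` commit message.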


concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
Collaborator Author

To cancel the CI runs triggered by previous commits of the same PR
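
As a rough illustration of why this cancels stale runs, the sketch below mimics how the `group` expression evaluates. It assumes the usual behaviour that `github.head_ref` is only set for pull request events, so `||` falls back to the unique `github.run_id` otherwise. The function `concurrency_group`, the workflow name `slow-ci`, the branch name, and the run IDs are all hypothetical sample values.

```python
def concurrency_group(workflow: str, head_ref: str, run_id: int) -> str:
    """Mimic `${{ github.workflow }}-${{ github.head_ref || github.run_id }}`."""
    # An empty `head_ref` is falsy, so `or` plays the role of `||` here.
    return f"{workflow}-{head_ref or run_id}"


# Two pushes to the same PR branch land in the same group, so with
# `cancel-in-progress: true` the older run would be cancelled.
first = concurrency_group("slow-ci", "my-feature-branch", 101)
second = concurrency_group("slow-ci", "my-feature-branch", 102)
print(first == second)

# Without a head ref, each run gets its own group and nothing is cancelled.
third = concurrency_group("slow-ci", "", 103)
fourth = concurrency_group("slow-ci", "", 104)
print(third == fourth)
```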


env:
HF_HOME: /mnt/cache
@@ -20,31 +25,46 @@ env:
CUDA_VISIBLE_DEVICES: 0,1

jobs:
check_for_new_model:
find_models_to_run:
runs-on: ubuntu-22.04
name: Check if a PR is a new model PR
name: Find models to run slow tests
# Triggered only if the required label `run-slow` is added
if: ${{ contains(github.event.pull_request.labels.*.name, 'run-slow') }}
outputs:
new_model: ${{ steps.check_new_model.outputs.new_model }}
models: ${{ steps.models_to_run.outputs.models }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: ${{ github.event.pull_request.head.sha }}

- name: Get commit message
run: |
echo "commit_message=$(git show -s --format=%s)" >> $GITHUB_ENV

- name: Check if there is a new model
id: check_new_model
- name: Get models to run slow tests
run: |
echo "${{ env.commit_message }}"
python -m pip install GitPython
echo "new_model=$(python utils/check_if_new_model_added.py | tail -n 1)" >> $GITHUB_OUTPUT
python utils/pr_slow_ci_models.py --commit_message "${{ env.commit_message }}" | tee output.txt
Collaborator Author

The (renamed) script now gives the models to run (either a new model or the ones from the commit message).

echo "models=$(tail -n 1 output.txt)" >> $GITHUB_ENV

- name: Models to run slow tests
id: models_to_run
run: |
echo "${{ env.models }}"
echo "models=${{ env.models }}" >> $GITHUB_OUTPUT

run_models_gpu:
name: Run all tests for the new model
# Triggered if it is a new model PR and the required label is added
if: ${{ needs.check_for_new_model.outputs.new_model != '' && contains(github.event.pull_request.labels.*.name, 'single-model-run-slow') }}
needs: check_for_new_model
name: Run all tests for the model
# Triggered only if `find_models_to_run` is triggered (label `run-slow` is added) which gives the models to run
# (either a new model PR or via a commit message)
if: ${{ needs.find_models_to_run.outputs.models != '[]' }}
needs: find_models_to_run
strategy:
fail-fast: false
matrix:
folders: ["${{ needs.check_for_new_model.outputs.new_model }}"]
folders: ${{ fromJson(needs.find_models_to_run.outputs.models) }}
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', nvidia-gpu, t4, ci]
container:
57 changes: 53 additions & 4 deletions utils/check_if_new_model_added.py → utils/pr_slow_ci_models.py
@@ -13,15 +13,20 @@
# limitations under the License.

"""
This script is used to get the directory of the modeling file that is added in a pull request (i.e. a new model PR).
This script is used to get the models for which to run slow CI.

A new model added in a pull request will be included, as well as models specified in a commit message with a prefix
`[run-slow]`, `[run_slow]` or `[run slow]`. For example, the commit message `[run_slow]bert, gpt2` will give `bert` and
`gpt2`.

Usage:

```bash
python utils/check_if_new_model_added.py
python utils/pr_slow_ci_models.py
```
"""

import argparse
import re
from pathlib import Path
from typing import List
@@ -82,7 +87,7 @@ def get_new_python_files() -> List[str]:
return get_new_python_files_between_commits(repo.head.commit, branching_commits)


if __name__ == "__main__":
def get_new_model():
new_files = get_new_python_files()
reg = re.compile(r"src/transformers/(models/.*)/modeling_.*\.py")

@@ -93,4 +98,48 @@ def get_new_python_files() -> List[str]:
new_model = find_new_model[0]
# It's unlikely we have 2 new modeling files in a pull request.
break
print(new_model)
return new_model


def parse_commit_message(commit_message: str) -> str:
    """
    Parses the commit message to find the models specified in it to run slow CI.

    Args:
        commit_message (`str`): The commit message of the current commit.

    Returns:
        `str`: The substring in `commit_message` after `[run-slow]`, `[run_slow]` or `[run slow]`. If no such prefix is
        found, the empty string is returned.
    """
    if commit_message is None:
        return ""

    command_search = re.search(r"\[([^\]]*)\](.*)", commit_message)
    if command_search is None:
        return ""

    command = command_search.groups()[0]
    command = command.lower().replace("-", " ").replace("_", " ")
    run_slow = command == "run slow"
    if run_slow:
        models = command_search.groups()[1].strip()
        return models
    else:
        return ""


def get_models(commit_message: str):
    models = parse_commit_message(commit_message)
    return [f"models/{x}" for x in models.replace(",", " ").split()]


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--commit_message", type=str, default="", help="The commit message.")
    args = parser.parse_args()

    new_model = get_new_model()
    specified_models = get_models(args.commit_message)
    models = ([] if new_model == "" else [new_model]) + specified_models
    print(models)
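
A few sample commit messages help show what ends up in the printed list. The sketch below restates `parse_commit_message` and `get_models` from the diff in condensed form so that it runs on its own. The sample messages are made up for illustration, and the new-model detection via `get_new_model()` is left out because it needs a git checkout.

```python
import re
from typing import List


def parse_commit_message(commit_message: str) -> str:
    """Return the text after the first bracketed tag if that tag is a run-slow tag, else ""."""
    if commit_message is None:
        return ""
    command_search = re.search(r"\[([^\]]*)\](.*)", commit_message)
    if command_search is None:
        return ""
    command, rest = command_search.groups()
    # `[run-slow]`, `[run_slow]` and `[run slow]` all normalize to "run slow".
    command = command.lower().replace("-", " ").replace("_", " ")
    return rest.strip() if command == "run slow" else ""


def get_models(commit_message: str) -> List[str]:
    """Split the model names on commas or whitespace and prefix each with `models/`."""
    models = parse_commit_message(commit_message)
    return [f"models/{x}" for x in models.replace(",", " ").split()]


# Hypothetical commit messages, for illustration only.
samples = [
    "[run_slow]bert, gpt2",
    "[run-slow] t5",
    "Fix a typo",
    "[WIP][run-slow] bert",
]
for message in samples:
    print(repr(message), "->", get_models(message))
```

The first two messages yield `['models/bert', 'models/gpt2']` and `['models/t5']`, and the last two yield `[]`. Only the first bracketed tag in the message is inspected, so a message such as `[WIP][run-slow] bert` selects no models.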