Add a helper script to copy files from cluster #50

Open
wants to merge 6 commits into
base: main
16 changes: 2 additions & 14 deletions README.md
@@ -80,20 +80,8 @@ It is related to https://github.com/nbiederbeck/lst-agn-analysis/issues/26
If you have run snakemake on the cluster, you can create the plots and tex files locally (using your own matplotlibrc for example).
We separate the calculation of metrics and the plotting to make sure you can finetune plots later on without needing to
run the expensive steps on the local machine. The tables for that are saved as either `fits.gz` or `h5`.

For the data-selection plots, you need to download `build/dl1-datacheck-masked.h5`, e.g.:

```
mkdir -p build
scp <host>:<path-to>/lst-data-selection/build/dl1-datachecks-masked.h5 build/
```

Afterwards:
We provide a helper script for that, which uses `rsync`. Check how to use it:

```
make -f local.mk
./copy-from-cluster.py --help
```

DEV-TODO: Do the same for the other plots (https://github.com/nbiederbeck/lst-agn-analysis/issues/29)
If you do some `cp **/*.fits.gz` shenanigans, beware that the dl3 files are saved with
the extension `fits.gz` as well.
71 changes: 71 additions & 0 deletions copy-from-cluster.py
@@ -0,0 +1,71 @@
#!/usr/bin/env python3
from argparse import ArgumentParser
from subprocess import run

parser = ArgumentParser(
    description="Copy built data files from the cluster to create plots locally.",
)
parser.add_argument(
    "--hostname",
    help="The hostname as configured in your ~/.ssh/config, e.g. `cp01`.",
    required=True,
)
parser.add_argument(
    "--remote-path",
    help="Path to your directory on the cluster, "
    "e.g. `/fefs/aswg/workspace/<username>/lst-agn-analysis/build`. "
    "Can be absolute (starting with `/`); otherwise it is relative to your home directory. "
    "Get the absolute path of any directory with "
    "`realpath <directory>` on the cluster.",
    required=True,
)
parser.add_argument(
    "--exclude",
    help="Additional patterns to exclude from copying, as a comma-separated string. "
    "When you use globs (`*`), remember to wrap the value in single quotes, "
    "e.g. `--exclude='*.pdf,*.png'`.",
    default=None,
    type=str,
)
parser.add_argument(
    "--rsync-args",
    help="Custom command line arguments for rsync. Check `man rsync` for info. "
    "Try `--rsync-args='-nv'` for a verbose dry-run.",
    default="",
    type=str,
)
args = parser.parse_args()

exclude_patterns = [
    # run files
    "dl1_*.h5",
    "dl2_*.h5",
    "dl3_*.fits.gz",
    # DL4 Datasets
    "phaobs_*.fits",
    "dl4/*/datasets.fits.gz",
    # log files
    "*.log",
    "logs/*",
    # models
    "models/model*",
    # plots
    "*.pdf",
]


def main():
    rsync = "rsync -auh --info=progress2 "
    rsync += args.rsync_args + " "
    for pat in exclude_patterns:
        rsync += f"--exclude='{pat}' "
    if args.exclude is not None:
        for pat in args.exclude.split(","):
            rsync += f"--exclude='{pat}' "
    cmd = f"{rsync} '{args.hostname}:{args.remote_path.rstrip('/')}' ."
    print(cmd)
    run(cmd, shell=True, capture_output=False, check=True)


if __name__ == "__main__":
    main()
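
Review note: the script assembles one shell string and runs it with `shell=True`, so it relies on hand-written single quotes around each pattern. A possible alternative, shown here only as a sketch and not as part of this PR, is to build the command as an argument list so that no local shell quoting is needed. The helper name `build_rsync_command` and the shortened `DEFAULT_EXCLUDES` list are hypothetical and exist only for this illustration.

```python
import shlex

# Shortened, illustrative subset of the script's exclude patterns (assumption).
DEFAULT_EXCLUDES = ["dl1_*.h5", "*.log", "*.pdf"]


def build_rsync_command(hostname, remote_path, exclude=None, rsync_args=""):
    """Return the rsync invocation as a list of arguments (hypothetical helper)."""
    cmd = ["rsync", "-auh", "--info=progress2"]
    # shlex.split turns a string such as "-n -v" into ["-n", "-v"].
    cmd += shlex.split(rsync_args)
    patterns = list(DEFAULT_EXCLUDES)
    if exclude:
        # Skip empty entries, e.g. from a trailing comma.
        patterns += [pat for pat in exclude.split(",") if pat]
    # Each pattern is its own argument, so the local shell never expands globs.
    cmd += [f"--exclude={pat}" for pat in patterns]
    cmd.append(f"{hostname}:{remote_path.rstrip('/')}")
    cmd.append(".")
    return cmd


if __name__ == "__main__":
    cmd = build_rsync_command(
        "cp01", "lst-agn-analysis/build/", exclude="*.png", rsync_args="-nv"
    )
    # Print a copy-pasteable form of the command for inspection.
    print(shlex.join(cmd))
    # To execute it without a shell: subprocess.run(cmd, check=True)
```

With a list, `subprocess.run(cmd, check=True)` can be called without `shell=True`, and `shlex.join` still gives a printable command for the log line.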
18 changes: 0 additions & 18 deletions lapalma.sh

This file was deleted.