
Pipeline execution on CONP datasets #63

Open
glatard opened this issue Jun 13, 2019 · 13 comments
@glatard (Contributor) commented Jun 13, 2019

We should streamline the processing of CONP datasets with CONP pipelines, possibly by reviving https://github.com/CONP-PCNO/conp-pipeline

@paiva (Contributor) commented Jun 19, 2019

I agree with this suggestion. How would you like to proceed?

@glatard (Contributor, Author) commented Jun 28, 2019

This has two aspects:

  1. Following a discussion with @shots47s: from the portal, the frontend should be developed to launch a specific pipeline on a specific dataset through CBRAIN's REST API. The new CBRAIN GUI will soon provide widgets to facilitate this.
  2. From the command line, this is already possible using Boutiques+DataLad. I don't think we need to add anything specific there.
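The command-line route mentioned in point 2 boils down to two commands: fetch the dataset with DataLad, then launch the pipeline with Boutiques' `bosh` CLI. A minimal sketch of those commands, built as argument lists (the dataset URL, descriptor, and invocation file names below are hypothetical examples, not real CONP artifacts):

```python
# Sketch of the Boutiques + DataLad command-line route: install the dataset
# with DataLad, then launch the pipeline with Boutiques' `bosh` CLI.
# The dataset URL, descriptor, and invocation file below are hypothetical.

def datalad_install_cmd(dataset_url, target_dir):
    """Command that clones a dataset (and its subdatasets) with DataLad."""
    return ["datalad", "install", "-r", "-s", dataset_url, target_dir]

def bosh_launch_cmd(descriptor, invocation):
    """Command that runs a Boutiques descriptor with the given invocation."""
    return ["bosh", "exec", "launch", descriptor, invocation]

install = datalad_install_cmd("https://github.com/CONP-PCNO/example-dataset",
                              "example-dataset")
launch = bosh_launch_cmd("descriptor.json", "invocation.json")
print(" ".join(install))
print(" ".join(launch))
```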

@shots47s commented Jul 3, 2019

I think we should do the CBRAIN execution in two stages:

  1. As a first pass, redirect to the CBRAIN portal (i.e., send CBRAIN the information about which dataset and pipeline to execute) and let users run it from there.
  2. Then move to providing modular UI components from our new interface and run the jobs through the CBRAIN API. It may not even be necessary to code an explicit connection to the API, because the React components will have it baked in.

@cmadjar (Collaborator) commented May 14, 2020

@glatard should this issue be closed?

@cmadjar (Collaborator) commented May 14, 2020

Actually, I will close it. Feel free to reopen it if you think there is still work to do on this issue.

@cmadjar cmadjar closed this as completed May 14, 2020
@glatard (Contributor, Author) commented May 14, 2020

On the CBRAIN front, it would be useful to have a tighter integration than just redirecting to the login page. We should check with the CBRAIN team whether point 2 in @shots47s's list above is doable.

@natacha-beck (Contributor) commented:

We should discuss the new interface in the coming weeks; I will bring point 2 to that discussion.

@cmadjar cmadjar reopened this May 14, 2020
@cmadjar cmadjar changed the title Pipeline execution on CONP datasets Pipeline CBRAIN execution on CONP datasets May 20, 2020
@cmadjar cmadjar added this to Other issues on the repo in Fall 2020 roadmap Sep 23, 2020
@cmadjar cmadjar moved this from Other issues on the repo to Deadline: end of December in Fall 2020 roadmap Sep 30, 2020
@cmadjar (Collaborator) commented Sep 30, 2020

Discussed briefly at the CONP dev call of September 30th, 2020.

We will focus on this issue and split it into smaller tasks at the next CONP dev call (October 7th).

@glatard should we invite people from the CBRAIN team to the next CONP dev call to discuss the plan? If so, who should be invited?

@glatard glatard changed the title Pipeline CBRAIN execution on CONP datasets Pipeline execution on CONP datasets Oct 7, 2020
@glatard (Contributor, Author) commented Oct 7, 2020

Here are a few possible actions regarding this issue, organized in the four Goals summarized below. All goals can be worked on in parallel, except Goal 3, which depends on Goals 1 and 2.

(Screenshot, 2020-10-07: overview diagram of the four goals)

Goal 1: Run CONP pipelines in CBRAIN

Tasks

  1. Make sure that all CONP pipelines that are available in CBRAIN appear as such in the CONP portal.
  2. When a user clicks the CBRAIN button on a CONP pipeline, redirect to the pipeline launch page instead of the generic CBRAIN login page.

How

Point 2 most likely requires storing a CBRAIN tool config id for each pipeline, preferably in a config file also available on GitHub for easier updates. This design would also solve point 1, since a pipeline can be assumed to be installed in CBRAIN if and only if it has a valid tool config id. When registering config ids, one should make sure that they match the exact same pipeline (Boutiques descriptor) as the one registered in CONP.
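The lookup logic this design implies is simple: the portal reads a JSON mapping and treats any pipeline with a valid id as installed in CBRAIN. A minimal sketch, with made-up pipeline names and ids:

```python
import json

# Hypothetical portal config mapping pipeline names to CBRAIN tool config ids;
# in the design above, this JSON would live in the portal's GitHub repo.
# The names and ids below are made-up examples.
CONFIG_JSON = """
{
  "fsl_bet": 721,
  "recon-all": 1093,
  "new-pipeline-not-yet-in-cbrain": null
}
"""

TOOL_CONFIG_IDS = json.loads(CONFIG_JSON)

def is_installed_in_cbrain(pipeline_name):
    """A pipeline counts as installed in CBRAIN iff it has a valid tool config id."""
    config_id = TOOL_CONFIG_IDS.get(pipeline_name)
    return isinstance(config_id, int) and config_id > 0

print(is_installed_in_cbrain("fsl_bet"))                         # True
print(is_installed_in_cbrain("new-pipeline-not-yet-in-cbrain"))  # False
```

Pipelines absent from the file, or mapped to `null`, simply fall back to "not installed", so the same file answers point 1 (which pipelines get the CBRAIN button) with no extra bookkeeping.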

Who

CONP developers (@cmadjar, @mandana-mazaheri), liaise with @natacha-beck to get tool config ids.

Goal 2: Process CONP datasets in CBRAIN

Tasks

  • Create a single CBRAIN data provider for the whole CONP dataset and access individual datasets through it, or create a CBRAIN data provider for each CONP dataset.
  • Store a CBRAIN data provider id for each dataset.
  • In each dataset page, add a link to redirect to the CBRAIN dataset page in the CBRAIN portal.

How

The ideal solution would be to use CBRAIN's DataLad data provider. Otherwise, install and download the datasets on a server (suggestion: Beluga, to facilitate processing) and register this location as a regular CBRAIN data provider. Make sure that simple pipelines (e.g., Diagnostics) can be run on the files. In any case, new datasets should be registered automatically (either by creating a new data provider or by adding new files to an existing one).

The CBRAIN data provider id should be stored using a mechanism similar to the one used to store CBRAIN tool config ids (see previous point). Suggestion: JSON file available in the portal config on GitHub.
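Following that suggestion, both mappings could live in one JSON file in the portal config on GitHub. A possible shape (all names and ids below are hypothetical, not real CBRAIN identifiers):

```json
{
  "cbrain_tool_config_ids": {
    "fsl_bet": 721,
    "recon-all": 1093
  },
  "cbrain_data_provider_ids": {
    "example-dataset-1": 45,
    "example-dataset-2": 46
  }
}
```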

Who

This is on the CBRAIN roadmap. We need to make sure that the CBRAIN DataLad provider works as expected. Liaise with CONP developers for DataLad expertise.

Notes

Something specific has to be done for datasets that require authentication. The CBRAIN team will manually configure permissions.

Goal 3: Process CONP datasets in CBRAIN using CONP pipelines

Tasks

  • In the CONP portal, create an interface to select a pipeline from a dataset, and/or to select a dataset from a pipeline
  • From this interface, redirect to a pre-populated CBRAIN launch form

How

Needs discussion; it might be a bit tricky, as fine-grained file selection within the dataset might be necessary.
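One possible shape for the "pre-populated launch form" redirect is to pass the stored ids as query parameters. The base URL and parameter names below are assumptions for illustration, not a documented CBRAIN API; the real parameters would have to be agreed on with the CBRAIN team:

```python
from urllib.parse import urlencode

def cbrain_launch_url(tool_config_id, data_provider_id,
                      base="https://portal.cbrain.mcgill.ca/tasks/new"):
    """Build a pre-populated CBRAIN launch URL from the stored ids.

    The query parameter names (and the launch-form path) are hypothetical.
    """
    query = urlencode({"tool_config_id": tool_config_id,
                       "data_provider_id": data_provider_id})
    return f"{base}?{query}"

print(cbrain_launch_url(721, 45))
# https://portal.cbrain.mcgill.ca/tasks/new?tool_config_id=721&data_provider_id=45
```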

Who

CONP portal developers: @liamocn, @xlecours

Goal 4: Analytics on pipeline execution

Task

  • Create a dashboard of CONP pipeline executions on CONP datasets. This dashboard would track executions done both inside and outside of CBRAIN.

How

  • Regularly upload Boutiques provenance from CBRAIN and any other execution platform.
  • Pull Boutiques provenance records and present them in graphs.
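The aggregation step could start as simply as counting pulled provenance records per pipeline. The record shape below is a guess at the relevant fields; the actual Boutiques provenance schema should be checked against Boutiques itself before building the dashboard:

```python
from collections import Counter

# Made-up provenance records: the real Boutiques records carry more fields,
# and their exact schema is an assumption here.
records = [
    {"pipeline": "fsl_bet", "dataset": "example-dataset-1", "platform": "cbrain"},
    {"pipeline": "fsl_bet", "dataset": "example-dataset-1", "platform": "beluga"},
    {"pipeline": "recon-all", "dataset": "example-dataset-2", "platform": "cbrain"},
]

def executions_per_pipeline(records):
    """Count executions per pipeline, e.g. to feed a bar chart in the dashboard."""
    return Counter(r["pipeline"] for r in records)

print(executions_per_pipeline(records))
```

Grouping by the `platform` field instead would give the in-CBRAIN vs. outside-CBRAIN split the dashboard is meant to track.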

Who

@mandana-mazaheri for the provenance dashboard, liaise with @nbeck for provenance upload from CBRAIN.

@cmadjar (Collaborator) commented Oct 7, 2020

@cmadjar cmadjar added this to Deadline: end of December in Winter 2021 Dec 11, 2020
@cmadjar cmadjar moved this from Leftovers from the fall to Deadline: end of February in Winter 2021 Jan 13, 2021
@cmadjar cmadjar removed this from Deadline: end of February in Winter 2021 Jan 13, 2021
@cmadjar cmadjar added this to Deadline: end of May in Spring 2021 Apr 15, 2021
@cmadjar cmadjar moved this from Deadline: end of May to Leftover issues from winter roadmap in Spring 2021 Apr 15, 2021
@cmadjar cmadjar moved this from Leftover issues from winter roadmap to Other issues on repos in Spring 2021 Apr 15, 2021
@cmadjar cmadjar removed this from Other issues on repos in Spring 2021 Apr 15, 2021
@cmadjar cmadjar closed this as completed Apr 30, 2021
Data Portal Developments automation moved this from To do to Done Apr 30, 2021
@cmadjar (Collaborator) commented Apr 30, 2021

Oops, closed the wrong issue.

@cmadjar cmadjar reopened this Apr 30, 2021
Data Portal Developments automation moved this from Done to In progress Apr 30, 2021
@github-actions bot commented:

This issue is stale because it has been open 5 months with no activity. Remove stale label or comment or this will be closed in 3 months.

@github-actions github-actions bot added the Stale label Sep 28, 2021
@github-actions bot commented:

This issue was closed because it has been stalled for 3 months with no activity.

Data Portal Developments automation moved this from In progress to Done Dec 27, 2021
@cmadjar cmadjar reopened this Jan 4, 2022
Data Portal Developments automation moved this from Done to In progress Jan 4, 2022
@cmadjar cmadjar removed the Stale label Jan 4, 2022