Implement MD workflows #1672

tomdemeyere · 2024-02-07T22:44:58Z

Summary of Changes

Closes #1577.

Before going further a little consultation:

MD users use workflows all the time, just for the NVE example in this commit there is pretty much a thousand ways you can run that; doing a first NVT to relax to the desired temperature, or a first NPT to relax to the desired density, etc...

For example, should we have one NVT workflow with a parameter "thermostat", or one workflow per thermostat?
If you want things to be in recipe.common, atoms will have to be sent in with a calculator already attached, is that what you want?
From what I understand you want to keep things simple (not something like the code here), users then build their workflows to get what they want.
In any case, for now doing these kinds of workflows is difficult, because it is generally hard to transfer results between ASE calculators, which will always happen between runs, this is critical since MD needs momenta and forces to ensure proper continuation. (This is a problem I would like fixed above since it also causes problems when restarting optimization etc)

Checklist

I have read the "Guidelines" section of the contributing guide. Don't lie! 😉
My PR is on a custom branch and is not named main.
I have used black, isort, and ruff as described in the style guide.
I have added relevant, comprehensive unit tests.

buildbot-princeton · 2024-02-07T22:45:01Z

Can one of the admins verify this patch?

Andrew-S-Rosen · 2024-02-07T22:51:12Z

@tomdemeyere: Happy to respond to each question but before I do, I just wanted to check that this is the full commit you wanted to share --- just the args/kwargs, right?

(In short, I really would love to have MD flows and am willing to work together to figure out how to get this done the best way possible)

codecov · 2024-02-07T22:54:27Z

Codecov Report

Attention: Patch coverage is 98.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 98.99%. Comparing base (10ee9b0) to head (65bc767).

Files	Patch %	Lines
src/quacc/runners/ase.py	95.91%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1672      +/-   ##
==========================================
- Coverage   99.02%   98.99%   -0.04%     
==========================================
  Files          82       84       +2     
  Lines        3387     3481      +94     
==========================================
+ Hits         3354     3446      +92     
- Misses         33       35       +2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

tomdemeyere · 2024-02-07T23:27:01Z

@tomdemeyere: Happy to respond to each question but before I do, I just wanted to check that this is the full commit you wanted to share --- just the args/kwargs, right?

(In short, I really would love to have MD flows and am willing to work together to figure out how to get this done the best way possible)

Yes, before doing more I wanted to make sure we are on the same page.

I guess my main question here is: how to make these flows common to all codes? Should they call code specific MD jobs? Or, should a calculator be attached to the Atoms object being sent to the MD flow?

Once I know this we can work on details

Andrew-S-Rosen · 2024-02-08T00:45:07Z

I guess my main question here is: how to make these flows common to all codes? Should they call code specific MD jobs? Or, should a calculator be attached to the Atoms object being sent to the MD flow?

@tomdemeyere: Good question.

Personally, I think what would likely make the most sense is to make a run_md function in quacc.runners.ase that would take an Atoms object with an attached calculator just like the other methods in there. You likely were aware of that already.

From there, we get to the recipes, and my answer is: "it depends." If it is just a simple @job (i.e. not a @flow), then this is no different than a relaxation job --- one would just be calling run_md instead of run_calc or run_ase_opt in a given code's module (e.g. quacc.recipes.emt). Since there wouldn't be much logic beyond defining default arguments and calling run_md, this is fine and doesn't introduce much duplication. I would treat the individual MD jobs in the same way as a relaxation --- we have a runner, which is called in the various recipes where it is appropriate. It does mean that there would be multiple MD modules, but we would need that anyway because every code will have different parameters (defaults) that are needed to properly run an MD calculation.

If we are talking about a @flow where there are multiple Jobs being stitched together, and we anticipate this pattern (i.e. the workflow DAG) to be calculator-agnostic, then we'll put that in common. This is the philosophy behind the existing common recipes. But we would still need individual Jobs for each code. In general, users aren't expected to call the common workflow directly. Aside from usability aspects, there are some other nuances about (de)serialization needed by workflow engines for why I say this, but I won't bore you with those details here.

To get started, I would suggest making quacc.runners.ase.run_md and then a simple demo for a cheap-to-run calculator, like EMT or LJ. That will likely be illustrative and help guide the design further.

I should also note that we can always refactor later if there is something we can refactor. No need to prematurely optimize.

For example, should we have one NVT workflow with a parameter "thermostat", or one workflow per thermostat?

If the workflow logic is staying the same regardless of thermostat, then I would suggest one job/workflow where thermostat would be a keyword argument with a sensible default.

If you want things to be in recipe.common, atoms will have to be sent in with a calculator already attached, is that what you want?

See comment above. If we are talking about a @job, then that would call run_md and be specific for the code since we need to set some sensible default parameters. If it's a @flow pattern, that can go in common but would not require passing in Atoms objects with attached calculators --- it would require passing in one or more Jobs.

From what I understand you want to keep things simple (not something like the code here), users then build their workflows to get what they want.

Hard for me to tell from the commit you shared here, but I am not opposed to complexity. In fact, we should not rely on the users to build something complex. If the @flow pattern is "trivial", like a relax then an MD run, it is debatable if we should make that its own @flow or not (honestly, it can go either way). But if it's more complex, for sure we should.

In any case, for now doing these kinds of workflows is difficult, because it is generally hard to transfer results between ASE calculators, which will always happen between runs, this is critical since MD needs momenta and forces to ensure proper continuation. (This is a problem I would like fixed above since it also causes problems when restarting optimization etc)

This is a good point. However, I think it is solvable. Let's circle back to this when the time is right.

tomdemeyere · 2024-02-08T15:54:41Z

@Andrew-S-Rosen Thanks for all these details. I will come back to it at some point

tomdemeyere · 2024-02-14T08:15:23Z

@Andrew-S-Rosen I think this is now a good point to have your input

Andrew-S-Rosen

Thank you for kicking this off, @tomdemeyere! Really appreciate the contribution!

This is looking good. I have some comments that are mostly related to some minor restructuring.

Starting with this isolated EMT example is very helpful for me in understanding the process and figuring out what should be refactored, so I thank you for doing that.

src/quacc/recipes/emt/core.py

Andrew-S-Rosen · 2024-02-14T19:46:30Z

src/quacc/recipes/emt/core.py

+
+ if temperature:
+ MaxwellBoltzmannDistribution(
+ atoms, temperature_K=temperature, **initial_temperature_params


With the fixcm and fixrot as dedicated keyword arguments, the **kwargs here can be more clearly described as simply being the **maxwell_boltzmann_distribution_kwargs. That is a mouthful (although nobody is calling the name directly). Could try something like **maxwell_boltzmann_kwargs. Either way, the point is that both the docstring and name would be much clearer this way. At a glance, it's not clear what an **initial_temperature_params is.

src/quacc/recipes/emt/core.py

src/quacc/runners/ase.py

src/quacc/schemas/_aliases/ase.py

src/quacc/schemas/ase.py

src/quacc/recipes/emt/core.py

tomdemeyere · 2024-02-15T08:39:30Z

Thank you for kicking this off, @tomdemeyere! Really appreciate the contribution!

This is looking good. I have some comments that are mostly related to some minor restructuring.

Starting with this isolated EMT example is very helpful for me in understanding the process and figuring out what should be refactored, so I thank you for doing that.

I see that your comment is different than the one I received by email from git. But yes, originally my plan was to have these "_base" functions. Either per calculator or in common like you previously said.

After discussing that along with other details I was then planning to introduce recipes for the other ensembles (NVT, NPT).

Didn't see your comments originally (github mobile) will work on it at some point

Andrew-S-Rosen · 2024-02-15T11:41:30Z

I see that your comment is different than the one I received by email from git. But yes, originally my plan was to have these "_base" functions. Either per calculator or in common like you previously said.

Apologies about the edits! I was trying to sort out where to place them and was going back-and-forth so decided to just save it for a future discussion to avoid confusion. But you got the right idea anyway.

After discussing that along with other details I was then planning to introduce recipes for the other ensembles (NVT, NPT).

That would be great!

Didn't see your comments originally (github mobile) will work on it at some point

Thank you!

src/quacc/recipes/emt/core.py

tomdemeyere · 2024-02-15T22:25:29Z

Few comments:

Without run_kwargs there is no way to pass things to dyn.run()
There is still a check_convergence, although it does not make much sense for MD. By definition you reached the max number of steps, but there is a case where it makes sense:

We are currently working on a error-free quacc as part of another project. We make sure that quacc always reaches the point of writing to the database: it writes the errors that occurred, along with the available results, if any (the errors might happen at some point during the MD, previous results should be returned, actually in some cases, errors might even be part of your journey (although this might not be what Quacc was originally made for)). Currently if one is running a bunch of simulations, if errors occur, there will be a bunch of directories with no easy way to figure things out.

Because you said you might be interested in error handling I am leaving that there.

I changed how temperature/time etc... is written to results, feel free to tell me what you think

Andrew-S-Rosen · 2024-02-15T22:52:38Z

Without run_kwargs there is no way to pass things to dyn.run()

The fmax and steps are already passed in quacc.runners.ase.run_opt, and .run() doesn't take any other keyword arguments, so we don't need it.

There is still a check_convergence, although it does not make much sense for MD. By definition you reached the max number of steps, but there is a case where it makes sense:

I agree that a check_convergence doesn't make much sense for this job type. We should remove that logic entirely.

We are currently working on a error-free quacc as part of another project. We make sure that quacc always reaches the point of writing to the database: it writes the errors that occurred, along with the available results,

One could catch the typical RuntimeError errors when ASE is called and try to give the user some flexibility via the global settings about how to handle things with the CHECK_CONVERGENCE and/or some other global parameter. But of course, feel free to continue doing what you're doing!

Currently if one is running a bunch of simulations, if errors occur, there will be a bunch of directories with no easy way to figure things out.

Indeed. This is a limitation of Parsl/Dask, which is really best thought of a task manager. Other true workflow tools like Prefect or Covalent or FireWorks have a separate database for the task metadata and would highlight when/where the job fails. But since Parsl does not have a task database of its own, there is no mechanism to easily store that info. The recommended approach would be to launch your calculations in a for loop and log any errors from the Parsl AppFuture. But of course, your mechanism is fine as well.

I changed how temperature/time etc... is written to results, feel free to tell me what you think

I like that a lot more!

tomdemeyere · 2024-05-23T11:48:53Z

Some news on this? Just to know if it is waiting on https://gitlab.com/ase/ase/-/merge_requests/3310

Andrew-S-Rosen

@tomdemeyere: Thank you for your enormous amount of patience with this PR. I didn't see your comment from 3 weeks ago, so my apologies about not responding and for keeping you in the dark about this.

The reason I had been reluctant to merge this was that there was a fair bit of refactoring I wanted to do to remove code duplication in the run_md method and also in the schema. This was dependent on some other PRs I had in the pipeline. I was finally able to devote time to do so properly, and now it is much cleaner. 👍 I didn't change any functionality (to the best of my knowledge).

At this point, I am quite pleased with this. Whenever you have a moment, however, I have a few remaining questions outlined below. I am happy to take over of any (small) changes that result from my questions --- but I wanted to get your professional input before attempting to proceed since I am not an MD expert. Finally, before I merge, if you have a moment to look at a similar attempt recently implemented in matcalc, it'd be nice to know if we are missing anything particularly useful that's in there. For instance, there is something about upper triangular cells that is entirely foreign to me.

Andrew-S-Rosen · 2024-06-15T06:19:59Z

src/quacc/recipes/emt/md.py

+ Additionally, Quacc uses the keywords `fix_com` and `fix_rot` instead of
+ `fixcm` and `fixrot` to fix the center of mass and rotation, respectively.


If this is just a matter of renaming and nothing else, let's just stick with the ASE ones. If you approve, let me know and I can revert the change here.

Andrew-S-Rosen · 2024-06-15T06:20:46Z

src/quacc/recipes/emt/md.py

+ As of March 2024, ASE still accept deprecated keywords such as:
+
+ - temperature instead of temperature_K
+ - pressure instead of pressure_au
+ - compressibility instead of compressibility_au
+ - dt instead of timestep
+
+ All of these keywords will be accepted by Quacc, but it is recommended to use the
+ new keywords to avoid confusion. Everything will always be converted to the Quacc
+ units mentioned above.


We probably don't need to mention this since ASE itself will raise a deprecation warning if they are used, right? Ultimately, it's on the user to use the current ASE names. Or maybe I'm misunderstanding the context.

This can probably be deleted yes

Andrew-S-Rosen · 2024-06-15T06:22:25Z

src/quacc/runners/ase.py

+ if "temperature" in dynamics_kwargs or "temp" in dynamics_kwargs:
+ LOGGER.warning(
+ "In quacc `temperature`, `temp` and `temperature_K` are"
+ " equivalent and will be interpreted in Kelvin."
+ )
+ dynamics_kwargs["temperature_K"] = dynamics_kwargs.pop(
+ "temperature", None
+ ) or dynamics_kwargs.pop("temp", None)
+
+ if "pressure" in dynamics_kwargs:
+ LOGGER.warning(
+ "In quacc `pressure` and `pressure_au` are equivalent and will be"
+ " interpreted in GPA."
+ )
+ dynamics_kwargs["pressure_au"] = dynamics_kwargs.pop("pressure")
+


Do we need all these checks, or can we just rely on the fact that the user should be using the up-to-date names and if they don't, ASE will warn them or a crash will happen.

The thing is that, parameters are a very big mess currently, for example for NVT Berendsen we have:

temperature: float The desired temperature, in Kelvin. temperature_K: float Alias for temperature

And for NPT (Nose-Hoover) :

temperature: float (deprecated) The desired temperature in eV. temperature_K: float The desired temperature in K.

Sometimes temperature is in eV, sometimes in K, you don't know, surprise!

I know it's a lot of time to ask from you but your support on https://gitlab.com/ase/ase/-/merge_requests/3391 (probably not everything in this PR is good to take) would be more than welcome.

If/when this goes, I would rewrite the entire MD doc and we can finally move on with X-years old deprecated keywords and have user-friendliness and consistency.

Ooof, yeah that is a mess! Thanks for the reminder.

Let's see how the discussion plays out and then I am happy to proceed how you suggest. I am happy to review on ASE's side if/when we get a blessing from Ask on certain components.

…rkflow

tomdemeyere · 2024-06-15T19:55:36Z

@tomdemeyere: Thank you for your enormous amount of patience with this PR. I didn't see your comment from 3 weeks ago, so my apologies about not responding and for keeping you in the dark about this.

The reason I had been reluctant to merge this was that there was a fair bit of refactoring I wanted to do to remove code duplication in the run_md method and also in the schema. This was dependent on some other PRs I had in the pipeline. I was finally able to devote time to do so properly, and now it is much cleaner. 👍 I didn't change any functionality (to the best of my knowledge).

At this point, I am quite pleased with this. Whenever you have a moment, however, I have a few remaining questions outlined below. I am happy to take over of any (small) changes that result from my questions --- but I wanted to get your professional input before attempting to proceed since I am not an MD expert. Finally, before I merge, if you have a moment to look at a similar attempt recently implemented in matcalc, it'd be nice to know if we are missing anything particularly useful that's in there. For instance, there is something about upper triangular cells that is entirely foreign to me.

I had a look at the code you sent me and I am afraid I have never had issues with the upper vs lower triangular matrix (mainly because I don't often run NPT) but from what I understand this is taken care by a PR in ASE (https://gitlab.com/ase/ase/-/merge_requests/3277) by just modifying the NPT class without the need of modifying the atoms object like in matcalc.

For more context: left is my slab before, right is the slab after being passed in matcalc.

To refresh the current challenges in this PR:

The lack of proper restart, which breaks the purpose of "MD workflows"
Choose a convention for the units being used and make sure that it is clear and user-friendly

For the first point, I am afraid this is not for today :/. For the later I tried to make things as clean as I could but of course feel free to discard these suggestions and do as you like.

Andrew-S-Rosen · 2024-06-15T20:12:38Z

Thanks for the reply, @tomdemeyere! Regarding the triangular stuff, thanks for linking me to the MR. If this is just an ASE bug/inconvenience, we will take care of that upstream. I'll keep tabs on the MR.

As for this PR, not to worry about restarts. I don't think it is a dealbreaker, and I disagree that it defeats the purpose of a workflow. After all, one might want to do a non-MD job afterwards (e.g. a DFT relaxation) which would be fully supported. It's just that MD-->MD might be tricky. I wouldn't let that hold up this PR. Let me know if/when your upstream ASE MR though needs any attention from me.

The second point is really the only one I feel is necessary to make sure we get right. I trust your opinion on this and am content with leaving it as-is.

So, I think the plan (unless you disagree) should be:

I'll monitor the conversation at https://gitlab.com/ase/ase/-/merge_requests/3391 to see if there's anything productive that might turn up in the short term.
Assuming the above answer is no, I'll go ahead and merge this PR pretty much as-is sometime next week.

Thoughts?

Andrew-S-Rosen · 2024-06-15T20:16:33Z

Oh, but also I still had the question about fixcm and fixrot. Did we need to rename them for some reason? Was that also an internal consistency thing?

tomdemeyere · 2024-06-15T20:38:09Z

Oh, but also I still had the question about fixcm and fixrot. Did we need to rename them for some reason? Was that also an internal consistency thing?

Most of the time you will find fixcm but occasionally you might find fix_com (mostly internally) I prefer fix_com so I used that (it is less obscure in my opinion) but I don't feel strongly about it, change it as you want.

Thoughts?

Yes! I think that's a great plan

initial

b23a93f

Andrew-S-Rosen added the New Recipe label Feb 9, 2024

tomdemeyere added 5 commits February 11, 2024 21:14

fix

3e332dc

Merge remote-tracking branch 'upstream/main' into md_workflow

5cbeaea

working

c736ac3

working

6c36678

working

42dcb82

better tests

f806b57

Andrew-S-Rosen reviewed Feb 14, 2024

View reviewed changes

Merge branch 'main' into md_workflow

97d1511

Andrew-S-Rosen reviewed Feb 15, 2024

View reviewed changes

src/quacc/recipes/emt/core.py Outdated Show resolved Hide resolved

tomdemeyere added 4 commits February 15, 2024 22:25

suggestions

1fc6914

Merge remote-tracking branch 'origin/md_workflow' into md_workflow

6a2f634

remove comment

cfd1de7

oops

07f6943

Andrew-S-Rosen added 4 commits February 15, 2024 15:22

Merge branch 'main' into md_workflow

c744af9

Fix inconsistent fix_com and fix_rot naming

dcace7e

Fix hyperlink to docs

8a2c030

Merge branch 'main' into md_workflow

f6f8b5b

Andrew-S-Rosen and others added 16 commits June 4, 2024 17:33

Merge branch 'main' into md_workflow

25cb740

Merge branch 'main' into md_workflow

d9a67d2

pre-commit auto-fixes

548df48

Merge branch 'main' into md_workflow

1f9c741

fix

6d4af21

fix

af3f25d

fix docs

e999e60

fix

d1e7b3d

Slight refactor

31ae752

Merge branch 'main' into md_workflow

775c051

Add code comments

be90963

Merge branch 'main' into md_workflow

843cdc4

Remove unused logger

9fdb592

Refactor

164adb5

pre-commit auto-fixes

522b0ca

Fix

039f20f

Andrew-S-Rosen reviewed Jun 15, 2024

View reviewed changes

Andrew-S-Rosen added 8 commits June 14, 2024 23:56

Fix type hint/schema

1539581

Add new job to docs

2df4564

Merge branch 'main' into md_workflow

b9d63ae

Add code comment

5975892

Merge branch 'md_workflow' of github.com:tomdemeyere/quacc into md_wo…

2770a7a

…rkflow

Clarify type hints

4681d26

Fix stray del

074f58c

Fix

65bc767

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement MD workflows #1672

Implement MD workflows #1672

tomdemeyere commented Feb 7, 2024 •

edited by Andrew-S-Rosen

buildbot-princeton commented Feb 7, 2024

Andrew-S-Rosen commented Feb 7, 2024 •

edited

codecov bot commented Feb 7, 2024 •

edited

tomdemeyere commented Feb 7, 2024

Andrew-S-Rosen commented Feb 8, 2024 •

edited

tomdemeyere commented Feb 8, 2024

tomdemeyere commented Feb 14, 2024

Andrew-S-Rosen left a comment •

edited

Andrew-S-Rosen Feb 14, 2024

tomdemeyere commented Feb 15, 2024 •

edited

Andrew-S-Rosen commented Feb 15, 2024

tomdemeyere commented Feb 15, 2024 •

edited

Andrew-S-Rosen commented Feb 15, 2024 •

edited

tomdemeyere commented May 23, 2024

Andrew-S-Rosen left a comment •

edited

Andrew-S-Rosen Jun 15, 2024

tomdemeyere Jun 15, 2024

Andrew-S-Rosen Jun 15, 2024 •

edited

tomdemeyere Jun 15, 2024

Andrew-S-Rosen Jun 15, 2024

tomdemeyere Jun 15, 2024 •

edited

Andrew-S-Rosen Jun 15, 2024

tomdemeyere commented Jun 15, 2024

Andrew-S-Rosen commented Jun 15, 2024

Andrew-S-Rosen commented Jun 15, 2024 •

edited

tomdemeyere commented Jun 15, 2024 •

edited

		Additionally, Quacc uses the keywords `fix_com` and `fix_rot` instead of
		`fixcm` and `fixrot` to fix the center of mass and rotation, respectively.

Implement MD workflows #1672

Are you sure you want to change the base?

Implement MD workflows #1672

Conversation

tomdemeyere commented Feb 7, 2024 • edited by Andrew-S-Rosen

Summary of Changes

Checklist

buildbot-princeton commented Feb 7, 2024

Andrew-S-Rosen commented Feb 7, 2024 • edited

codecov bot commented Feb 7, 2024 • edited

Codecov Report

tomdemeyere commented Feb 7, 2024

Andrew-S-Rosen commented Feb 8, 2024 • edited

tomdemeyere commented Feb 8, 2024

tomdemeyere commented Feb 14, 2024

Andrew-S-Rosen left a comment • edited

Choose a reason for hiding this comment

Andrew-S-Rosen Feb 14, 2024

Choose a reason for hiding this comment

tomdemeyere commented Feb 15, 2024 • edited

Andrew-S-Rosen commented Feb 15, 2024

tomdemeyere commented Feb 15, 2024 • edited

Andrew-S-Rosen commented Feb 15, 2024 • edited

tomdemeyere commented May 23, 2024

Andrew-S-Rosen left a comment • edited

Choose a reason for hiding this comment

Andrew-S-Rosen Jun 15, 2024

Choose a reason for hiding this comment

tomdemeyere Jun 15, 2024

Choose a reason for hiding this comment

Andrew-S-Rosen Jun 15, 2024 • edited

Choose a reason for hiding this comment

tomdemeyere Jun 15, 2024

Choose a reason for hiding this comment

Andrew-S-Rosen Jun 15, 2024

Choose a reason for hiding this comment

tomdemeyere Jun 15, 2024 • edited

Choose a reason for hiding this comment

Andrew-S-Rosen Jun 15, 2024

Choose a reason for hiding this comment

tomdemeyere commented Jun 15, 2024

Andrew-S-Rosen commented Jun 15, 2024

Andrew-S-Rosen commented Jun 15, 2024 • edited

tomdemeyere commented Jun 15, 2024 • edited

tomdemeyere commented Feb 7, 2024 •

edited by Andrew-S-Rosen

Andrew-S-Rosen commented Feb 7, 2024 •

edited

codecov bot commented Feb 7, 2024 •

edited

Andrew-S-Rosen commented Feb 8, 2024 •

edited

Andrew-S-Rosen left a comment •

edited

tomdemeyere commented Feb 15, 2024 •

edited

tomdemeyere commented Feb 15, 2024 •

edited

Andrew-S-Rosen commented Feb 15, 2024 •

edited

Andrew-S-Rosen left a comment •

edited

Andrew-S-Rosen Jun 15, 2024 •

edited

tomdemeyere Jun 15, 2024 •

edited

Andrew-S-Rosen commented Jun 15, 2024 •

edited

tomdemeyere commented Jun 15, 2024 •

edited