SlogLogger emit data config #349

itrumper · 2023-05-09T00:52:43Z

I found myself wanting to be able to customize what data the SlogLogger observer was logging, so I attempted to implement this functionality via the builder pattern as shown in the doc example in the data method on SlogLogger.

To do this, I needed to add the Debug trait as a trait bound to the type Param associated types, hence why this PR touches so many files. This could cause breaking changes for users.

Since this uses a builder style pattern to define what data is logged, the default contains the same data as what was previously hard coded, so end users should not experience any breaking changes in their observer configurations. To customize the data that is logged, a new enum called StateData has been defined in the state/mod.rs file that covers all the available data you can get from the State trait. A user supplies a Vec of these enum variants and then the logger iterates through and emits them. The Debug trait was needed because I wanted to print out the Params as well, which are generics.

The motivation for doing this was to actually be able to log the gradient of a problem as well, but since not every solver has a defined gradient, I'm less sure that can be done. I will need assistance in evaluating that possibility, since it would significantly help with debugging!

Let me know what you think! Thanks for building a cool crate :)

codecov-commenter · 2023-05-11T18:40:36Z

Codecov Report

Patch coverage: 61.62% and project coverage change: -0.15 ⚠️

Comparison is base (ae98919) 93.35% compared to head (39b832c) 93.21%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #349      +/-   ##
==========================================
- Coverage   93.35%   93.21%   -0.15%     
==========================================
  Files         117      117              
  Lines       18803    18881      +78     
==========================================
+ Hits        17554    17599      +45     
- Misses       1249     1282      +33

Impacted Files	Coverage Δ
argmin/src/core/state/iterstate.rs	`99.90% <ø> (ø)`
argmin/src/core/state/linearprogramstate.rs	`98.24% <ø> (ø)`
argmin/src/core/state/populationstate.rs	`97.53% <ø> (ø)`
argmin/src/solver/conjugategradient/cg.rs	`99.16% <ø> (ø)`
...rgmin/src/solver/conjugategradient/nonlinear_cg.rs	`98.25% <ø> (ø)`
...n/src/solver/gaussnewton/gaussnewton_linesearch.rs	`92.69% <ø> (ø)`
...rgmin/src/solver/gaussnewton/gaussnewton_method.rs	`96.51% <ø> (ø)`
...gmin/src/solver/gradientdescent/steepestdescent.rs	`95.04% <ø> (ø)`
argmin/src/solver/landweber/mod.rs	`98.33% <ø> (ø)`
argmin/src/solver/linesearch/backtracking.rs	`99.05% <ø> (ø)`
... and 18 more

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

stefan-k · 2023-05-11T18:51:48Z

Thanks a lot for this work! I apologize again to you for the very late response. Unfortunately you chose the worst possible time to issue a PR this big as I'm currently extraordinarily busy ;) Therefore I'm unfortunately unable to give a detailed review right know, but I do want to give you a bit of feedback before you invest more time into this PR.

Generally, I'm very happy to see filtering implemented in the SlogLogger; however, I'm not convinced that this approach is the way to go. A few points:

Having an enum to represent the possible metrics is limiting, because each solver can define their own metrics sent to the observer interface. In order to be as flexible as possible, I think there is no way to get past string comparisons.
A Debug bound on Param shouldn't be necessary (I think). I would have to look at it in detail, but the observer itself could enforce this bound without it being used all over the place in the core code. As an example you could have a look at file checkpointing, where a Serialize bound is only required when checkpointing is actually used. I'm however not sure how well this also applies to the observer case.
I don't think state is the appropriate place to store which fields to observe. As this is observer specific, I think it should be possible to only change the observer code.

I had the need for filtering also in a new observer I am working on in #311. I chose to collect a list of strings and then only select the ones listed (see line 148 in this file). Note that this code essentially does something quite different to what you want to achieve, but I think the approach should be similar.

Regarding observing parameters/gradients and so on: I would love to see this, but maybe it would be easier to implement this in a dedicated observer? I'm not sure though, just an idea to think about :)

Thanks again for this contribution, I really appreciate it! I hope that not too much of your work needs to be changed and I hope that I will be able to give you a detailed review soon, but unfortunately I cannot promise it right now.

itrumper · 2023-05-13T03:22:17Z

Thanks for taking the time to look it over and provide feedback! That is a great help, and thanks for letting me know about your availability. I don't have expectations of a timeline, so please don't feel pressured to work on this.

In response to your points:

I see that point, we need a flexible enough interface to accommodate Solvers different parameters. However, I want to try to find a way that implements better type safety than string comparisons, but still is extensible. I don't know what that looks like right now, so I'll think about it more.
I will certainly look into the file checkpointing! Seems like you're describing a trait bound behind a feature gate?
I may have miscommunicated about this point. Which data to log isn't stored in State but rather the Observer itself. I only modified that file to add the enum that defines what "names" of data are available.

I'll check out the observer you are working on, thanks for the references!

For the solver specific data, like parameters and gradients, I do think it would take a different type of observer, as right now the Observer doesn't know which parameters are available, just what is exposed through the State trait, correct?

I'm happy to completely redo it all if a better solution is out there! You know your crate, and how pieces fit together better than I do, so I'll happily take your direction. Thanks for the discussion.

stefan-k · 2023-11-04T09:08:40Z

1. I see that point, we need a flexible enough interface to accommodate `Solver`s different parameters. However, I want to try to find a way that implements better type safety than string comparisons, but still is extensible. I don't know what that looks like right now, so I'll think about it more.

I agree that string comparisons aren't the most elegant solution, not only for type safety, but also for performance reasons.

2. I will certainly look into the file checkpointing! Seems like you're describing a trait bound behind a feature gate?

FileCheckpoint is behind the serde feature gate, but this is not what's "(de)activating" the trait bound. FileCheckpoint imposes a Serialize trait bound on the solver and the state (here), but only in case FileCheckpoint is actually used. If a user doesn't use checkpointing in their code, the state and solver do not need to be serializable.

3. I may have miscommunicated about this point. Which data to log isn't stored in `State` but rather the `Observer` itself. I only modified that file to add the `enum` that defines what "names" of data are available.

I see, sorry for the misunderstanding!

For the solver specific data, like parameters and gradients, I do think it would take a different type of observer, as right now the Observer doesn't know which parameters are available, just what is exposed through the State trait, correct?

That's a good point! The State trait does expose the parameters via get_param and get_best_param (here). Gradients are not accessible via the State trait, but it should be possible to create a dedicated GradientObserver which only works with IterState states. Then you will have access to all fields of the IterState and it is easy to turn the observer on and off for debugging.

itrumper added 8 commits May 8, 2023 17:11

Add Debug trait to param type

9e52fc3

make StateData pub accessible

47c9516

add Debug trait to example solver impl

4468934

enable customized logs via StateData enum

0f1bb9a

remove old emits

e28c485

add doc comments to enum variants

270038b

fix clippy warns

d6037e4

fix book doc example impl Debug for solver

39b832c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SlogLogger emit data config #349

SlogLogger emit data config #349

itrumper commented May 9, 2023

codecov-commenter commented May 11, 2023

stefan-k commented May 11, 2023

itrumper commented May 13, 2023

stefan-k commented Nov 4, 2023

SlogLogger emit data config #349

Are you sure you want to change the base?

SlogLogger emit data config #349

Conversation

itrumper commented May 9, 2023

codecov-commenter commented May 11, 2023

Codecov Report

stefan-k commented May 11, 2023

itrumper commented May 13, 2023

stefan-k commented Nov 4, 2023