Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formatter specific defaults #662

Open
alexfikl opened this issue Sep 7, 2023 · 11 comments · May be fixed by #711
Open

Formatter specific defaults #662

alexfikl opened this issue Sep 7, 2023 · 11 comments · May be fixed by #711

Comments

@alexfikl
Copy link
Collaborator

alexfikl commented Sep 7, 2023

I think what might be a better solution is for sensible defaults to be set for all relevant options when you enable the jinja formatter. I think I have a pretty complete setup for jinja, happy to contribute them :)

Originally posted by @jghauser in #661 (comment)

Currently the defaults in papis/defaults.py use the python formatter and setting any other formatter means the user needs to provide values for all of these. Ideally we would help them along by setting some values ourselves.

This is a bit difficult to do 100% because formatters and options can be provided by third-party plugins. Possible issues:

  • How would a third-party formatter provide defaults in its style?
  • How would third-party commands / exporters / downloaders / etc provide defaults for all formatters?
  • Would it be sufficient to just set values for plugins that are shipped with papis?
@alexfikl
Copy link
Collaborator Author

alexfikl commented Sep 7, 2023

Maybe a good compromise would be to provide values for the formatters we support and have a fallback to the python formatter for everything else. Something like

try:
	formatter.format(...)
except:
	PythonFormatter().format(...)

? The main worry there would be that it's going to be slower if we keep trying to format everything twice.

EDIT: ⚠️ This doesn't work since formatter.format won't necessarily fail with a different formatter, just give incorrect results..

@jghauser
Copy link
Member

jghauser commented Sep 7, 2023

  • How would a third-party formatter provide defaults in its style?

Could formatters set their own defaults? For instance, if the relevant options internally become dictionaries mapping formatters to values, a third-party formatter could get the defaults with get_default_settings() and then update the dictionary and set it with register_default_settings()? (Of course, we might also provide some helper function for this.) We would then need to make it clear to authors of such formatters what options they are expected to provide defaults for.

  • How would third-party commands / exporters / downloaders / etc provide defaults for all formatters?

Hmm... that's a tougher one. I guess some sort of fallback option might be useful there. It would be important for such third-party extensions to make explicit for what formatters they supply defaults. Authors should also be encouraged to provide defaults for at least python and jinja2. When using such an extension, papis would then try to use, first, whatever formatter and values that are set by the user, second, use the value for the user defined formatter if it is defined by the extension, or finally, fallback to the python formatter (using the value provided by the extension).

  • Would it be sufficient to just set values for plugins that are shipped with papis?

I think papis itself should ship values for all options and formatters that are included in papis itself. Third-party formatters should be expected to provide defaults for all options that are shipped with papis. Support for third-party 'commands / exporters / downloaders /' would be optional, and if a case is encountered for which no value exists, fallback to the python or jinja2 formatter (and warn the user).

You're right, getting this all right does seem quite complicated. I guess it would make sense to internally structure the relevant options as dictionaries, mapping formatters to values. But that would both break existing configs and be unnecessarily confusing for users. So, it would make sense to allow users to set options' values with just a string, which would then set the value for whatever formatter is set as the one to be used (or for the python formatter if nothing is set).

EDIT: ⚠️ This doesn't work since formatter.format won't necessarily fail with a different formatter, just give incorrect results..

I was just going to say that. But I don't think we need to fallback like that. If the options are dictionaries, we can just use whatever is defined and fallback if a value is missing for a given formatter.

(Some of this is probably a bit hard to parse and just me thinking out loud. Let me know if/where that's the case 😅)

@OopsYao
Copy link

OopsYao commented Sep 8, 2023

It seems like that allowing local formatter would be the simplest solution 😅, for both papis and third-part plugin extensions. As far as I am concerned, the format string provided by user (or the defaults by papis) is some kind of function/mapping that maps the doc meta info to a string. That being said, the string itself does not make sense --- it is bound with its formatter. And allowing third-part formatter to set defaults may cause inconsistent problem due to carelessness of its developer, or updates in defaults from papis. Maintaining consistent defaults between formatters (especially for third-part ones) would be a problem.

The assumption behind the logic that every formatter provides defaults for basic options is that all the formatters can do what python formatter can do at least. An extreme example would be a formatter that supports words slicing (which python formatter lacks) but unable to do character slicing (which python formatter can).

I know little about plugins/extensions, so let me know if there is any misunderstandings.

@alexfikl
Copy link
Collaborator Author

alexfikl commented Sep 8, 2023

It seems like that allowing local formatter would be the simplest solution 😅, for both papis and third-part plugin extensions.

Yeah, I like that idea too, but I don't particularly like the [FORMATTER]format_string thing because it adds restrictions on what the format string can be. Ideally we'd prescribe the formatter somehow independently of the format string itself..

Maybe we can add that to the configuration key instead of the configuration value? Something like

document-description-format = <DEFAULT_FORMAT>
document-description-format!python = <PYTHON_FORMAT>
document-description-format!jinja2 = <JINJA2_FORMAT>

(only one of these should be defined) and then have a papis.config.getformat helper that returns a (formatter, format_string) that we can then pass to papis.format.format? Is that too convoluted?

Maintaining consistent defaults between formatters (especially for third-part ones) would be a problem.

Yeah, that's a good point. The defaults change over time, so it would be quite confusing for a user to change their formatter and suddenly get new values in various places :\

@OopsYao
Copy link

OopsYao commented Sep 9, 2023

Oh I see! It is a issue about how to represent a compound structure in ini file indeed. It reminds me of TOML's dotted key. How about this

document-description-format = <FORMAT_STR>
document-description-format.formatter = <FORMATTER_NAME>

It seems to be more natural. 😃

@alexfikl
Copy link
Collaborator Author

alexfikl commented Sep 9, 2023

Oh I see! It is a issue about how to represent a compound structure in ini file indeed. It reminds me of TOML's dotted key. How about this

document-description-format = <DEFAULT_FORMAT_STR>
document-description-format.formatter = <FORMATTER_FORMAT_STRING>

It seems to be more natural. 😃

Yeah, that looks nicer!

@jghauser @alejandrogallo What do you think about a scheme like this that lets you select the formatter? It doesn't solve the original problem of providing correct defaults for each formatter, but that seems rather complicated and error prone :\

This would solve making some settings, e.g. ref-format, have more complex logic using the jinja2 formatter, while leaving the rest the same. Are there any other use cases that we should think about here?

@jghauser
Copy link
Member

document-description-format = <FORMAT_STR>
document-description-format.formatter = <FORMATTER_NAME>

I really like this syntax. And I think it makes a lot of sense to handle setting a mix and match combo of formatters that way.

I still think providing sensible config defaults would be a nice thing to have, but I can see that the benefits may not outweigh the added complexity. I guess if we note in the docs very clearly that the preferred way to get nicer formatting for e.g. the ref key is to use the suggested formatter-specific option, this should limit the amount of people who get surprised by things when setting formater = jinja2. Thank you all for the excellent input! ❤️

@alejandrogallo
Copy link
Member

this makes a lot of sense, I support this. We have to put a full example in the example configuration on the docs.

@alexfikl
Copy link
Collaborator Author

this makes a lot of sense, I support this. We have to put a full example in the example configuration on the docs.

Definitely needs a bunch of docs to make it clear how it works, but otherwise should be pretty nice!

I still think providing sensible config defaults would be a nice thing to have, but I can see that the benefits may not outweigh the added complexity

@jghauser Maybe we can have a papis.defaults.python_formatted_settings and a papis.jinja2_formatted_settings (horrible naming) that can then be used in config.py to overwrite things? That would be pretty low maintenance, since we don't need to be smart about how to apply the settings ourselves. It generally seems like a good idea to have versions at least in jinja2 for this stuff, for documentation/showcasing reasons if nothing else.

document-description-format = <FORMAT_STR>
document-description-format.formatter = <FORMATTER_NAME>

@OopsYao Do you have time to implement something like this?

@OopsYao
Copy link

OopsYao commented Sep 11, 2023

Being busy recently, but I’m very glad to implement this, and I’ll probably work on it one week later or two.

And it occurs to me that, though it is technically possible to implement something like

document-description-format = <FORMAT_STR>
document-description-format.formatter = <FORMATTER_NAME>

it is not a good data model in the sense that document-description-format is not a typical plain dictionary/object, thus there is no equivalent representation in popular configuration formats like JSON, YAML or TOML, making it difficult to support other config file formats in the future (if necessary). At least this syntax seems to be a little weird for users accustomed to JSON/YAML/TOML. So I suggest a slightly verbose version here.

# Use local formatter
document-description-format.content = <FORMAT_STR>
document-description-format.formatter = <FORMATTER_NAME>

# Use global formatter
document-description-format = <FORMAT_STR>

And we can throw an error of duplicated keys or let the later one take precedence (depending on how papis handle duplicated config keys conventionally) when both document-description-format and document-description-format.xxx appear. What do you think?

@alexfikl
Copy link
Collaborator Author

Being busy recently, but I’m very glad to implement this, and I’ll probably work on it one week later or two.

That's quite ok! Let us know if you need any help or can't get to it.

it is not a good data model in the sense that document-description-format is not a typical plain dictionary/object, thus there is no equivalent representation in popular configuration formats like JSON, YAML or TOML, making it difficult to support other config file formats in the future (if necessary). At least this syntax seems to be a little weird for users accustomed to JSON/YAML/TOML.

I'm not particularly worried about that. If we ever decide to move to another format, we'll likely have to write some converter which can take care of any issues like that. So for the moment, we'd just need something simple to implement and understand.

# Use local formatter
document-description-format.content = <FORMAT_STR>
document-description-format.formatter = <FORMATTER_NAME>

# Use global formatter
document-description-format = <FORMAT_STR>

Not a super big fan of this because you always have to remember to write both, which can get tiring if there's a dozen or so settings to modify.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants