Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quantity fields should be human-readable #862

Closed
aaronmondal opened this issue Apr 18, 2024 · 10 comments · Fixed by #891
Closed

Quantity fields should be human-readable #862

aaronmondal opened this issue Apr 18, 2024 · 10 comments · Fixed by #891
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@aaronmondal
Copy link
Contributor

Instead of this:

"eviction_policy": {
  // 50mb.
  "max_bytes": 50000000,
}

We could have this which seems more readable and is standardized in the Kubernetes Quantity format:

"eviction_policy": {
  "max_bytes": "50Mi",
}
@aaronmondal aaronmondal added enhancement New feature or request good first issue Good for newcomers labels Apr 18, 2024
@matdexir
Copy link
Contributor

Hi @aaronmondal

I was thinking of taking on this issue if still available.
Before proceeding, I was wondering: should it support both versions or only the human readable version?

@aaronmondal
Copy link
Contributor Author

Hi @matdexir feel free to take the issue ❤️

Personally, I'd be in favor of just removing the int support altogether to keep the implementation simple, (otherwise the field would have to support both strings and integers).

I believe this can be done by adding something like a Quantity config object so that we can turn something like this:

#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(deny_unknown_fields)]
pub struct SizePartitioningStore {
    /// Size to partition the data on.
    #[serde(deserialize_with = "convert_numeric_with_shellexpand")]
    pub size: u64,

    // ...
}

into this:

#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(deny_unknown_fields)]
pub struct SizePartitioningStore {
    /// Size to partition the data on.
    #[serde(deserialize_with = "convert_numeric_with_shellexpand")]
    pub size: Quantity,

    // ...
}

cc @MarcusSorealheis @allada and @adam-singer @blakehatch wdyt? I believe the part that needs to be thought out here is how we can best propagate this information to monitoring services. Under the hood a Quantity would remain a uint64, but what do we want to do front-end (i.e. config-file) wise?

@allada
Copy link
Collaborator

allada commented Apr 23, 2024

To be honest, I'd prefer to always use primitives (non-complex structs) in our config. Some day in the future there's a reasonable chance we'll need our structs to be FFI compliant, and using raw primitives is going to make things easier.

As for how to do this, I'd keep them as u64 but make a custom deserializer in serde that can convert strings that have suffixes to an integer. For example, if we say:

{
  "size1": "5kB"
  "size2": "5KiB",
  "size3": "5GiB",
  "duration1": "5s",
  "duration2": "5m",
  "duration3": "5h"
}

Would result in a struct like (respectively):

Data {
  size1: 1000,
  size2: 1024,
  size3: 1073741824,
  duration1: 5,
  duration2: 300,
  duration3: 18000
}

(obviously we have some places that take milliseconds, but lets ignore those for now, we can figure out what to do with them later).

It shouldn't be too difficult to make one that takes shell_expand and a suffix qualifier. I'd even argue we could probably convert what is already there and we wouldn't even need to create a new deserialization function for it.

@matdexir
Copy link
Contributor

Hi,
Sorry for the late response! Didn't have enough time to dive into the issue.

Here's a sample of how I think it would go: preview gist
Note: I haven't written rust in months so it may not be very idiomatic.

For this use case I would only implement deserialize since I am unsure if serialize would be needed.
My train of thought is:

enum Unit {
 KiloBytes, // KB
 Kibibit, // KiB
 ....
}
struct Quantity {
  value: u64,
  unit: Unit,
}

Then implement deserialization accordingly, although, I do not think that keeping the unit would be of much value 🤔.
What do you think?
cc: @aaronmondal, @allada

@allada
Copy link
Collaborator

allada commented Apr 26, 2024

This is pretty close to what I was thinking. There are a few minor differences, the struct's must use only FFI compatible primitives (String, Enum, u64, usize, float64, exc...).

I suggest using the following libraries that will do the conversion though. I'm pretty sure just wrapping these in a function should do much of the work:
https://docs.rs/humantime/latest/humantime/fn.parse_duration.html
https://docs.rs/byte-unit/latest/byte_unit/enum.Unit.html#method.parse_str

Thanks :-)

@aleksdmladenovic
Copy link
Contributor

I would like to contribute on this issue. @allada @MarcusSorealheis @aaronmondal

@allada
Copy link
Collaborator

allada commented Apr 27, 2024

@matdexir, @aleksdmladenovic and us had a small conversation on slack. I assume you wanted to take this one on?

@aleksdmladenovic
Copy link
Contributor

aleksdmladenovic commented Apr 27, 2024

Yes, now I am working on the issue for reducing the verbosity of ping message.
#764
I would like to work on this issue after accomplishing.

@matdexir
Copy link
Contributor

Hi @allada
Yes I wanted to take this one on. I live in a GMT+8 timezone so there may be delay in my response.

@allada
Copy link
Collaborator

allada commented Apr 27, 2024

Thanks to both. @matdexir has already put work into it, so lets give some time for them to accomplish it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants