Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate doesn't correctly validate yaml files per the spec. #270

Open
endophage opened this issue Dec 9, 2022 · 5 comments
Open

Validate doesn't correctly validate yaml files per the spec. #270

endophage opened this issue Dec 9, 2022 · 5 comments
Labels
bug Something isn't working

Comments

@endophage
Copy link

Describe the bug
The yaml spec requires that all keys in a mapping are unique (spec for reference: https://yaml.org/spec/1.2.2/). dasel validate does not catch this issue in a yaml file

To Reproduce
Steps to reproduce the behavior:

$> cat <<EOF | tee tmp.yaml
key:
        foo: "bar"
key:
        foo: "baz"
EOF

$> dasel validate tmp.yaml
pass tmp.yaml

Expected behavior
Validation should fail because of the duplicate key key at the top level.

Desktop (please complete the following information):

  • OS: MacOS
  • Version: development-v1.27.4-0.20221207223452-2761f5761361

Additional context
This would be useful for validating kubernetes yaml files as every now and again somebody unintentionally redeclares a key in a map and kubernetes doesn't error, it just takes the last assignment.

@endophage endophage added the bug Something isn't working label Dec 9, 2022
@TomWright
Copy link
Owner

The method of validation dasel uses here is just to attempt to read the YAML and fail if it cannot be parsed, so largely the same approach kubernetes is using.

We use gopkg.in/yaml.v2 to parse YAML, perhaps there's a strict mode or something to catch these cases.

@endophage
Copy link
Author

It does indeed have a strict mode it claims will catch duplicates: https://pkg.go.dev/gopkg.in/yaml.v2#UnmarshalStrict.

Happy to make a PR if that's a change you'll accept for dasel validate

@TomWright
Copy link
Owner

Yes of course. It may be good to hide it behind a --strict flag so it doesn't catch other users off-guard

@endophage
Copy link
Author

endophage commented Dec 28, 2022

I spent some time looking at this over Christmas. Before I spend more time on it there's a bit of a refactor required and I wanted to run that by you before making the changes.

In the init methods for each of your parsers, e.g. yaml here, you instantiate actual instances of the parser. Then in your New*Parser* methods here you simply return the instance. From there, because we're reading a file, we call FromBytes which only takes the raw input []byte.

So, there are 2 options to enable configurability on the parsers:

  1. Don't return the instance of the parser, return a factory that can be used to generate an instance given optional configuration.
  2. Add a Configure or similarly named method on the parsers that can accept options to configure the instance.
  3. Have FromBytes take additional options like ToBytes already does.

In this specific case, the addition of a "strict" mode feels like parser level configuration, not method level, i.e. a parser is configured once then can be used many times with that configuration. That means we're looking at options 1 or 2.

Option 2 is probably slightly less work but it's essentially a half assed factory, i.e. Option 1. Additionally if somebody was using dasel as a library rather than a CLI tool, option 2 makes it easier to introduce bugs in sibling code paths (one changes the configuration that is then used by the other that doesn't expect the config). Enforcing this type of configuration at instantiation, option 1, makes it clear what the state of the parser is.

Let me know what your thoughts are and I can take a crack at the changes.

@TomWright
Copy link
Owner

I agree that option 1 is probably the right way to go here.

  1. It allows us to instantiate a parser based on some options given to the factory method.
  2. Dasel would not longer instantiate all parsers regardless of which parser is used.

Thanks for looking into this 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants