Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What is the motivation to have hard limits on section and property [name] length? #429

Closed
goodhoko opened this issue May 15, 2020 · 17 comments

Comments

@goodhoko
Copy link

goodhoko commented May 15, 2020

The official spec states:

The maximum length of a pair key is 50 characters and the maximum length of a pair value is 255 characters. Any key or value beyond these limits shall be ignored.

and also

The maximum length of a section name is 4096 characters. All sections exceeding this limit are ignored.

Why? Why should be rules exceeding the limits ignored?


Problem

The questions are a bit suggestive. Let my explain why I believe there should not be any limits defined in the spec. First, a concrete example.

Ktlint, linting tool for Kotlin, allows its users to disable linting rules by putting disabled-rules=[rule-names] key-value pair into the .editorconfig file. When the user puts a larger amount of rules there and exceeds the rather low limit of 255 chars the pair is silently ignored.

In this case the user will probably notice the problem but has no option to workaround it while keeping the intended functionality. The spec simply can't support his/her intentions.

In some other cases the problem might not become evident as quickly because the rules are specified to be ignored. In case of ec4j (which is used in ktlint) the rules are ignored completely silently - not even a warning is rised. That's not the best thing to do.

Looking at history of editor-config-c the limits have been altered several times in the past. In case of property name, first from 50 to 500 and later from 500 back to 50 again. Altering the limits, whenever an issue is raised that they're too restrictive, is not only confusing but can also cause incompatibilities between various implementations of the specification.

To wrap it up: The limits are abitrary and had already changed several times. They efectivelly render some use cases of the .editorconfig impossible. Further, the fact that exceeding the limit results in ignoring the rule, makes it very hard for users to notice a limit was exceeded.

Proposal

I propose to simply remove the limits from the spec. I can't think of any practical motivation for them - that's also why I'm asking for it in the first place. .)

At minimum rules exceeding the limits should be rejected as syntactical error not silently ignored.

Thanks .)

@cxw42
Copy link
Member

cxw42 commented May 15, 2020

I agree that both (1) removing limits and (2) reporting invalid keys could be useful. On the flip side, it is important that EditorConfig cores or plugins not crash, even in the presence of pathological input.

What about this? ---

  • Remove mandatory maximum limits from the specification.
  • Add mandatory minimum lengths that all conforming implementations must support.
  • Permit implementations to define their own limits
  • Define a new key _parse_error that must not appear in an .editorconfig file. A core can send _parse_error=true to a plugin to indicate that there was a problem parsing the file.
  • Require that any core that drops a key send a _parse_error to the plugin. Do not require that the parse error indicate the name of the key, since the key may have been dropped for having an overlong name.
  • Encourage plugins and editors to report parse errors or errors they themselves encounter.

What say you?

@ppalaga
Copy link

ppalaga commented May 16, 2020

Thanks for bringing this up, @goodhoko

  • Remove mandatory maximum limits from the specification.

+1

  • Add mandatory minimum lengths that all conforming implementations must support.

+1, keeps the spec testable, at least the lower bound

  • Permit implementations to define their own limits

+1, permit implementations to define their own limits or no limits at all

  • Define a new key _parse_error that must not appear in an .editorconfig file. A core can send _parse_error=true to a plugin to indicate that there was a problem parsing the file.

Sounds hackish, but I cannot see a better way how to test a core with the current setup.

  • Require that any core that drops a key send a _parse_error to the plugin. Do not require that the parse error indicate the name of the key, since the key may have been dropped for having an overlong name.

Outputting Informative error messages should still be encouraged. How about _parse_error=<plaintext_error_message>?

  • Encourage plugins and editors to report parse errors or errors they themselves encounter.

+1

What should happen if there are multiple oversized lines? I would not mind requiring to output multiple _parse_error=<plaintext_error_message>

@ppalaga
Copy link

ppalaga commented May 16, 2020

Or perhaps better _parse_errors=<plaintext_error_message1>, <plaintext_error_message2>, ...
Where each message could be something like path/to/.editorconfig@<line>:<column>: <message>. WDYT?

@cxw42
Copy link
Member

cxw42 commented May 17, 2020

@ppalaga I like _parse_errors since we don't currently have any keys that appear multiple times in a section. I still think the mere presence of the key should indicate that an error happened, in case a plug-in can't handle the full-length value.

I agree with your proposed format (path@r:c). 1-based line, col numbers?

@ppalaga
Copy link

ppalaga commented May 17, 2020

1-based line, col numbers?

Yes.

And the question is of course, whether the parser may output an error list longer than its own maximum? 8-]

@cxw42
Copy link
Member

cxw42 commented May 17, 2020

:D For the record, I think _parse_errors values should not have a hard max length --- it should obey the same rules as all other keys,

@goodhoko
Copy link
Author

goodhoko commented May 18, 2020

Hello again, sorry for late response.

First, thanks for adding more context to this discussion. I'd say that we're trying to solve more than one problem at once here. I'll try to break this up into two individual problems that are in fact independent.

Length limits

This is the main issue of my original post. There're abitrary limits on what the editor config format can express. Of course, there always will be some, but the length-based limits are especially restrictive and can be easily removed without any breaking changes.

That said I think it may help to learn what was the original motivation for the limits in the first place. This initial commit is as far as I can dig. Maybe @xuhdev could explain why s/he put them in? Without understanding that, I'm afraid we might be overlooking some important details.

If it turns out there're no practical reasons to have the limits, we can (and probably should) remove them from the spec. That is the baseline solution which will solve this issue. However it rises some follow-up ideas wich @cxw42 wrote above.

Add mandatory minimum lengths that all conforming implementations must support.

I don't think this is needed. Editor config is a general INI-like format for representing data with added semantics for certain values. Looking at some other general data formats (yaml, json, ini) none of them specifies a minimum supported key/value length. [1] Also note that we already have an implicit minimum supported length resulting from the pre-defined supported pairs. I think sticking to this implicit minimum limit is better than introducing yet another abitrary numbers to the spec.

Permit implementations to define their own limits

Generally I agree with this. However I'd argue that it is not necessary to state this explicitly in the spec. Implicitly, there will always be limits on how big data each implementation or running instance of implementation can handle. Again, using other data formats like json or yaml: they do not stress this explicitly in the spec.

Error handling

The other problem we touched here is error handling - how should cores and plugins communicate syntax erros to the user? While it is related, I'd argue that it's a separate issue that should be discussed elsewhere. Resolution of the main problem of this issue (the lenth limits) is independent of how or if the error handling will be changed.

I created another issue where we can discuss that.

[1] On the other hand YAML for example specifies a maximum length (1024 chars) for the key in key-value pairs.

(edited by cxw42: link directly to the supported-pairs section)

@cxw42
Copy link
Member

cxw42 commented May 18, 2020

@goodhoko Sure, we can discuss error handling and reporting in a separate issue.

I believe these things should be given in the specification, for the reasons below. Whenever a program has a bug, we can add a test case to make sure the bug doesn't recur. We can't do that here, but we can add language to the specification so that this issue will be settled for the future.

Add mandatory minimum lengths that all conforming implementations must support.

I don't think this is needed.

I believe we do need a required minimum length in order to clearly state what can be included in the tests. You are correct that we have an implicit limit, defined by the passing tests. If we state that limit expressly, we will be less likely to accidentally add tests later that violate the limits.

Permit implementations to define their own limits

Generally I agree with this. However I'd argue that it is not necessary to state this explicitly in the spec.

Defining this in the spec is a way of respecting core and plugin developers. By clearly stating what is and is not conforming, we give developers confidence in their work.

Example: Suppose a core could not handle keys of a certain length. If implementation-defined limits were expressly permitted in the spec, the core would still be conforming, and the issue would go to the core as an implementation bug. If the specification did not expressly permit implementation-defined limits, it would be unclear whether the problem were with the specification or the implementation.

I look forward to your thoughts!

@xuhdev
Copy link
Member

xuhdev commented May 19, 2020

@goodhoko The limit exists for the ease of implementation of core libraries, given the fact that core libraries would be implemented in many different programming languages (but I couldn't find exactly where this was discussed). The thought was that as long as the limit is large enough, it shouldn't bother users (as far as I can remember).

@ppalaga
Copy link

ppalaga commented May 19, 2020

I do not mind having the minimal supported length limits, but please let's make them reasonably large. How about 1024 for keys and 4096 for values?

@goodhoko, I'd let ec4j have no explicit upper limits immediately after the core-tests change.

@xuhdev
Copy link
Member

xuhdev commented May 19, 2020

I do not mind having the minimal supported length limits, but please let's make them reasonably large. How about 1024 for keys and 4096 for values?

I have no problem for increasing the limit.

@cxw42
Copy link
Member

cxw42 commented May 20, 2020

Would it be helpful if I opened a draft PR in the specification to serve as the subject of a vote?

@goodhoko
Copy link
Author

I believe we do need a required minimum length in order to clearly state what can be included in the tests. You are correct that we have an implicit limit, defined by the passing tests. If we state that limit expressly, we will be less likely to accidentally add tests later that violate the limits.

Defining this in the spec is a way of respecting core and plugin developers. By clearly stating what is and is not conforming, we give developers confidence in their work.

Ack. I must agree that testing and developer experience are worth adding the lower limits even if their values might be arbitrary.

The limit exists for the ease of implementation of core libraries, given the fact that core libraries would be implemented in many different programming languages [...] The thought was that as long as the limit is large enough, it shouldn't bother users (as far as I can remember).

Thanks for the context. I think that removing the limits and permitting implementations to introduce their own limits does not make implementation any harder, so this does not block us from doing so.

I do not mind having the minimal supported length limits, but please let's make them reasonably large. How about 1024 for keys and 4096 for values?

Yes these seem like a sane for key-value pairs.

There's also the current upper limit of 4096 characters for section name length. In the planned schema it makes sense to also include the lower limit for section name. I'd go with 1024 chars too. What do you think?

Would it be helpful if I opened a draft PR in the specification to serve as the subject of a vote?

Honestly, I was really hyped to do that myself. 😌 Buf of course if you already have something ready, I won't object. 👍

@cxw42
Copy link
Member

cxw42 commented May 20, 2020

@goodhoko I don't have anything ready :) . If you'd like to, feel free!

lower limit for section name. I'd go with 1024 chars too.

Fine with me!

goodhoko added a commit to goodhoko/specification that referenced this issue May 21, 2020
…mits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
goodhoko added a commit to goodhoko/editorconfig-core-test that referenced this issue May 21, 2020
…mits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
goodhoko added a commit to goodhoko/editorconfig-core-test that referenced this issue May 21, 2020
…h limits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
goodhoko added a commit to goodhoko/specification that referenced this issue May 21, 2020
…th limits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
@goodhoko
Copy link
Author

Here we go

goodhoko added a commit to goodhoko/editorconfig-core-test that referenced this issue May 21, 2020
…th limits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
goodhoko added a commit to goodhoko/specification that referenced this issue May 22, 2020
…th limits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
goodhoko added a commit to goodhoko/editorconfig-core-test that referenced this issue May 22, 2020
…th limits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
cxw42 added a commit to cxw42/editorconfig-vim that referenced this issue May 26, 2020
goodhoko added a commit to goodhoko/editorconfig-core-test that referenced this issue May 26, 2020
…th limits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
goodhoko added a commit to goodhoko/editorconfig-core-test that referenced this issue May 26, 2020
…th limits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
cxw42 added a commit to cxw42/editorconfig-vim that referenced this issue Jun 1, 2020
goodhoko added a commit to goodhoko/editorconfig-core-test that referenced this issue Jul 21, 2020
…th limits

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
xuhdev pushed a commit to editorconfig/specification that referenced this issue Dec 23, 2020
…th limits (#21)

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
xuhdev pushed a commit to editorconfig/editorconfig-core-test that referenced this issue Dec 28, 2020
…th limits (#41)

- Remove the upper limit for section name, key and value length.
- Permit implementations to define their own upper limits.
- Add minimum supported lengths each implementation must support.
@goodhoko
Copy link
Author

The spec and core tests are updated. I think we can close this issue!

I guess the next logical step is to notify maintainers of the core implementations. I'll try to do that using the list at https://editorconfig.org/.

Thanks to everyone who helped to push this to the finish! <3

@goodhoko
Copy link
Author

goodhoko commented Dec 28, 2020

For reference, here I'll link issues or PRs in the implementation repositories that track this update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants