What is the motivation to have hard limits on section and property [name] length? #429

goodhoko · 2020-05-15T15:31:38Z

The maximum length of a pair key is 50 characters and the maximum length of a pair value is 255 characters. Any key or value beyond these limits shall be ignored.

and also

The maximum length of a section name is 4096 characters. All sections exceeding this limit are ignored.

Why? Why should be rules exceeding the limits ignored?

Problem

The questions are a bit suggestive. Let my explain why I believe there should not be any limits defined in the spec. First, a concrete example.

Ktlint, linting tool for Kotlin, allows its users to disable linting rules by putting disabled-rules=[rule-names] key-value pair into the .editorconfig file. When the user puts a larger amount of rules there and exceeds the rather low limit of 255 chars the pair is silently ignored.

In this case the user will probably notice the problem but has no option to workaround it while keeping the intended functionality. The spec simply can't support his/her intentions.

In some other cases the problem might not become evident as quickly because the rules are specified to be ignored. In case of ec4j (which is used in ktlint) the rules are ignored completely silently - not even a warning is rised. That's not the best thing to do.

Looking at history of editor-config-c the limits have been altered several times in the past. In case of property name, first from 50 to 500 and later from 500 back to 50 again. Altering the limits, whenever an issue is raised that they're too restrictive, is not only confusing but can also cause incompatibilities between various implementations of the specification.

To wrap it up: The limits are abitrary and had already changed several times. They efectivelly render some use cases of the .editorconfig impossible. Further, the fact that exceeding the limit results in ignoring the rule, makes it very hard for users to notice a limit was exceeded.

Proposal

I propose to simply remove the limits from the spec. I can't think of any practical motivation for them - that's also why I'm asking for it in the first place. .)

At minimum rules exceeding the limits should be rejected as syntactical error not silently ignored.

Thanks .)

The text was updated successfully, but these errors were encountered:

cxw42 · 2020-05-15T20:37:11Z

I agree that both (1) removing limits and (2) reporting invalid keys could be useful. On the flip side, it is important that EditorConfig cores or plugins not crash, even in the presence of pathological input.

What about this? ---

Remove mandatory maximum limits from the specification.
Add mandatory minimum lengths that all conforming implementations must support.
Permit implementations to define their own limits
Define a new key _parse_error that must not appear in an .editorconfig file. A core can send _parse_error=true to a plugin to indicate that there was a problem parsing the file.
Require that any core that drops a key send a _parse_error to the plugin. Do not require that the parse error indicate the name of the key, since the key may have been dropped for having an overlong name.
Encourage plugins and editors to report parse errors or errors they themselves encounter.

What say you?

ppalaga · 2020-05-16T09:01:18Z

Thanks for bringing this up, @goodhoko

Remove mandatory maximum limits from the specification.

+1

Add mandatory minimum lengths that all conforming implementations must support.

+1, keeps the spec testable, at least the lower bound

Permit implementations to define their own limits

+1, permit implementations to define their own limits or no limits at all

Define a new key _parse_error that must not appear in an .editorconfig file. A core can send _parse_error=true to a plugin to indicate that there was a problem parsing the file.

Sounds hackish, but I cannot see a better way how to test a core with the current setup.

Require that any core that drops a key send a _parse_error to the plugin. Do not require that the parse error indicate the name of the key, since the key may have been dropped for having an overlong name.

Outputting Informative error messages should still be encouraged. How about _parse_error=<plaintext_error_message>?

Encourage plugins and editors to report parse errors or errors they themselves encounter.

+1

What should happen if there are multiple oversized lines? I would not mind requiring to output multiple _parse_error=<plaintext_error_message>

ppalaga · 2020-05-16T09:06:45Z

Or perhaps better _parse_errors=<plaintext_error_message1>, <plaintext_error_message2>, ...
Where each message could be something like path/to/.editorconfig@<line>:<column>: <message>. WDYT?

cxw42 · 2020-05-17T02:23:45Z

@ppalaga I like _parse_errors since we don't currently have any keys that appear multiple times in a section. I still think the mere presence of the key should indicate that an error happened, in case a plug-in can't handle the full-length value.

I agree with your proposed format (path@r:c). 1-based line, col numbers?

ppalaga · 2020-05-17T19:42:46Z

1-based line, col numbers?

Yes.

And the question is of course, whether the parser may output an error list longer than its own maximum? 8-]

cxw42 · 2020-05-17T22:19:49Z

:D For the record, I think _parse_errors values should not have a hard max length --- it should obey the same rules as all other keys,

goodhoko · 2020-05-18T10:53:32Z

Hello again, sorry for late response.

First, thanks for adding more context to this discussion. I'd say that we're trying to solve more than one problem at once here. I'll try to break this up into two individual problems that are in fact independent.

Length limits

This is the main issue of my original post. There're abitrary limits on what the editor config format can express. Of course, there always will be some, but the length-based limits are especially restrictive and can be easily removed without any breaking changes.

That said I think it may help to learn what was the original motivation for the limits in the first place. This initial commit is as far as I can dig. Maybe @xuhdev could explain why s/he put them in? Without understanding that, I'm afraid we might be overlooking some important details.

If it turns out there're no practical reasons to have the limits, we can (and probably should) remove them from the spec. That is the baseline solution which will solve this issue. However it rises some follow-up ideas wich @cxw42 wrote above.

Add mandatory minimum lengths that all conforming implementations must support.

I don't think this is needed. Editor config is a general INI-like format for representing data with added semantics for certain values. Looking at some other general data formats (yaml, json, ini) none of them specifies a minimum supported key/value length. [1] Also note that we already have an implicit minimum supported length resulting from the pre-defined supported pairs. I think sticking to this implicit minimum limit is better than introducing yet another abitrary numbers to the spec.

Permit implementations to define their own limits

Generally I agree with this. However I'd argue that it is not necessary to state this explicitly in the spec. Implicitly, there will always be limits on how big data each implementation or running instance of implementation can handle. Again, using other data formats like json or yaml: they do not stress this explicitly in the spec.

Error handling

The other problem we touched here is error handling - how should cores and plugins communicate syntax erros to the user? While it is related, I'd argue that it's a separate issue that should be discussed elsewhere. Resolution of the main problem of this issue (the lenth limits) is independent of how or if the error handling will be changed.

I created another issue where we can discuss that.

[1] On the other hand YAML for example specifies a maximum length (1024 chars) for the key in key-value pairs.

(edited by cxw42: link directly to the supported-pairs section)

cxw42 · 2020-05-18T12:00:29Z

@goodhoko Sure, we can discuss error handling and reporting in a separate issue.

I believe these things should be given in the specification, for the reasons below. Whenever a program has a bug, we can add a test case to make sure the bug doesn't recur. We can't do that here, but we can add language to the specification so that this issue will be settled for the future.

Add mandatory minimum lengths that all conforming implementations must support.

I don't think this is needed.

I believe we do need a required minimum length in order to clearly state what can be included in the tests. You are correct that we have an implicit limit, defined by the passing tests. If we state that limit expressly, we will be less likely to accidentally add tests later that violate the limits.

Permit implementations to define their own limits

Generally I agree with this. However I'd argue that it is not necessary to state this explicitly in the spec.

Defining this in the spec is a way of respecting core and plugin developers. By clearly stating what is and is not conforming, we give developers confidence in their work.

Example: Suppose a core could not handle keys of a certain length. If implementation-defined limits were expressly permitted in the spec, the core would still be conforming, and the issue would go to the core as an implementation bug. If the specification did not expressly permit implementation-defined limits, it would be unclear whether the problem were with the specification or the implementation.

I look forward to your thoughts!

xuhdev · 2020-05-19T07:22:20Z

@goodhoko The limit exists for the ease of implementation of core libraries, given the fact that core libraries would be implemented in many different programming languages (but I couldn't find exactly where this was discussed). The thought was that as long as the limit is large enough, it shouldn't bother users (as far as I can remember).

ppalaga · 2020-05-19T09:57:13Z

I do not mind having the minimal supported length limits, but please let's make them reasonably large. How about 1024 for keys and 4096 for values?

@goodhoko, I'd let ec4j have no explicit upper limits immediately after the core-tests change.

xuhdev · 2020-05-19T17:26:41Z

I do not mind having the minimal supported length limits, but please let's make them reasonably large. How about 1024 for keys and 4096 for values?

I have no problem for increasing the limit.

cxw42 · 2020-05-20T00:46:47Z

Would it be helpful if I opened a draft PR in the specification to serve as the subject of a vote?

goodhoko · 2020-05-20T13:17:05Z

I believe we do need a required minimum length in order to clearly state what can be included in the tests. You are correct that we have an implicit limit, defined by the passing tests. If we state that limit expressly, we will be less likely to accidentally add tests later that violate the limits.

Defining this in the spec is a way of respecting core and plugin developers. By clearly stating what is and is not conforming, we give developers confidence in their work.

Ack. I must agree that testing and developer experience are worth adding the lower limits even if their values might be arbitrary.

The limit exists for the ease of implementation of core libraries, given the fact that core libraries would be implemented in many different programming languages [...] The thought was that as long as the limit is large enough, it shouldn't bother users (as far as I can remember).

Thanks for the context. I think that removing the limits and permitting implementations to introduce their own limits does not make implementation any harder, so this does not block us from doing so.

I do not mind having the minimal supported length limits, but please let's make them reasonably large. How about 1024 for keys and 4096 for values?

Yes these seem like a sane for key-value pairs.

There's also the current upper limit of 4096 characters for section name length. In the planned schema it makes sense to also include the lower limit for section name. I'd go with 1024 chars too. What do you think?

Would it be helpful if I opened a draft PR in the specification to serve as the subject of a vote?

Honestly, I was really hyped to do that myself. 😌 Buf of course if you already have something ready, I won't object. 👍

cxw42 · 2020-05-20T15:02:18Z

@goodhoko I don't have anything ready :) . If you'd like to, feel free!

lower limit for section name. I'd go with 1024 chars too.

Fine with me!

…mits - Remove the upper limit for section name, key and value length. - Permit implementations to define their own upper limits. - Add minimum supported lengths each implementation must support.

…h limits - Remove the upper limit for section name, key and value length. - Permit implementations to define their own upper limits. - Add minimum supported lengths each implementation must support.

…th limits - Remove the upper limit for section name, key and value length. - Permit implementations to define their own upper limits. - Add minimum supported lengths each implementation must support.

goodhoko · 2020-05-21T10:14:59Z

Here we go

…th limits - Remove the upper limit for section name, key and value length. - Permit implementations to define their own upper limits. - Add minimum supported lengths each implementation must support.

editorconfig/editorconfig#429

…th limits - Remove the upper limit for section name, key and value length. - Permit implementations to define their own upper limits. - Add minimum supported lengths each implementation must support.

editorconfig/editorconfig#429

…th limits - Remove the upper limit for section name, key and value length. - Permit implementations to define their own upper limits. - Add minimum supported lengths each implementation must support.

…th limits (#21) - Remove the upper limit for section name, key and value length. - Permit implementations to define their own upper limits. - Add minimum supported lengths each implementation must support.

…th limits (#41) - Remove the upper limit for section name, key and value length. - Permit implementations to define their own upper limits. - Add minimum supported lengths each implementation must support.

goodhoko · 2020-12-28T22:43:44Z

The spec and core tests are updated. I think we can close this issue!

I guess the next logical step is to notify maintainers of the core implementations. I'll try to do that using the list at https://editorconfig.org/.

Thanks to everyone who helped to push this to the finish! <3

goodhoko · 2020-12-28T23:31:13Z

For reference, here I'll link issues or PRs in the implementation repositories that track this update.

ec4j: Remove section, key and value length limits ec4j/ec4j#111
python core: Support key, values and section names of arbitrary length editorconfig-core-py#36
javascript core: doesn't seem to implement limits atm
java core: doesn't seem to implement limits atm
ruby core: Support key, values and section names of arbitrary length editorconfig-core-ruby#8

goodhoko mentioned this issue May 15, 2020

Syntactically invalid .editorconfig silently ignored pinterest/ktlint#739

Closed

cxw42 added core code Documentation feature request file format labels May 15, 2020

goodhoko mentioned this issue May 18, 2020

Improve editorconfig parsing error handling #430

Open

cxw42 added a commit to cxw42/editorconfig-vim that referenced this issue May 26, 2020

Remove length limits on sections/keys/values

d69128f

editorconfig/editorconfig#429

cxw42 mentioned this issue May 26, 2020

Remove length limits on sections/keys/values editorconfig/editorconfig-vim#148

Draft

cxw42 added a commit to cxw42/editorconfig-vim that referenced this issue Jun 1, 2020

Remove length limits on sections/keys/values

3f6b71a

editorconfig/editorconfig#429

ffes mentioned this issue Aug 26, 2020

Segfault while reading .editorconfig generated by JetBrains IDE editorconfig/editorconfig-core-c#73

Closed

xuhdev mentioned this issue Dec 5, 2020

Should we remove the upper limit for section name, key and value length? editorconfig/editorconfig-vote#13

Closed

goodhoko closed this as completed Dec 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What is the motivation to have hard limits on section and property [name] length? #429

What is the motivation to have hard limits on section and property [name] length? #429

goodhoko commented May 15, 2020 •

edited

cxw42 commented May 15, 2020

ppalaga commented May 16, 2020

ppalaga commented May 16, 2020

cxw42 commented May 17, 2020

ppalaga commented May 17, 2020

cxw42 commented May 17, 2020

goodhoko commented May 18, 2020 •

edited by cxw42

cxw42 commented May 18, 2020

xuhdev commented May 19, 2020 •

edited

ppalaga commented May 19, 2020 •

edited

xuhdev commented May 19, 2020

cxw42 commented May 20, 2020

goodhoko commented May 20, 2020

cxw42 commented May 20, 2020

goodhoko commented May 21, 2020

goodhoko commented Dec 28, 2020

goodhoko commented Dec 28, 2020 •

edited

What is the motivation to have hard limits on section and property [name] length? #429

What is the motivation to have hard limits on section and property [name] length? #429

Comments

goodhoko commented May 15, 2020 • edited

Problem

Proposal

cxw42 commented May 15, 2020

ppalaga commented May 16, 2020

ppalaga commented May 16, 2020

cxw42 commented May 17, 2020

ppalaga commented May 17, 2020

cxw42 commented May 17, 2020

goodhoko commented May 18, 2020 • edited by cxw42

Length limits

Error handling

cxw42 commented May 18, 2020

xuhdev commented May 19, 2020 • edited

ppalaga commented May 19, 2020 • edited

xuhdev commented May 19, 2020

cxw42 commented May 20, 2020

goodhoko commented May 20, 2020

cxw42 commented May 20, 2020

goodhoko commented May 21, 2020

goodhoko commented Dec 28, 2020

goodhoko commented Dec 28, 2020 • edited

goodhoko commented May 15, 2020 •

edited

goodhoko commented May 18, 2020 •

edited by cxw42

xuhdev commented May 19, 2020 •

edited

ppalaga commented May 19, 2020 •

edited

goodhoko commented Dec 28, 2020 •

edited