[FEEDBACK] The data model could be simplified for options and attributes #716
Comments
I think we should explicitly say that this is so. It's not just DM that's affected. Translation tools or message authors shouldn't have to remember option orders. -- Side note: while I don't expect "normal people" (don't read too much into 'normal') to file feedback in this format, could we, as WG members, try to keep separate topics in separate issues? |
Fair point. I'll break this up into three, keeping the options here because that's what you already commented on. |
While I'm not opposed to this change, I'd like to note potential reasons to prefer arrays over maps:
|
I think this is an interesting problem. A message that passes through the data model and is reserialized might have declarations or variants in a different order. This could negatively affect leverage in naive translation memory tools that look only at the value of the message string.
We guard against this:
|
On the other hand, our current approach is rather prone to producing false negative results for comparisons, as the annotations |
A different approach:
Then, the
(Keeping Then, the same type can be used for both option maps and attribute maps:
This has the disadvantage that it allows constructing an option map that maps some option name onto an operand with no value. But it has the advantage of making option maps and attribute maps uniform. (This is what the data model in my ICU4C implementation does.) |
What's the benefit of allowing for an option with no value to be represented in the data model? |
I was going to say that there's no direct benefit, but the indirect benefit is that having a single name for operands -- which can be the right-hand side of an option, the right-hand side of an attribute, or the subject of an expression -- makes the data model easier to read and understand. However, thinking about it, this would mean it would be possible to construct a data model that can't be serialized to a syntactically correct message (e.g. It still seems like a bad code smell to me to repeat |
I'm fine with a map, and I think it is a good idea. The data model is there to describe what the data looks like, and to hint at how it works. See my comment on "map as an abstract data type" here: #718 (comment). See my old data model:
// The order matters.
// So we need a "special map" that keeps the insertion order.
export type OrderedMap<K, V> = Map<K, V>;
When one implements the data model in a certain programming language, they must use an order-preserving map if available (for example LinkedHashMap in Java), or implement something that works the same. For example, protobuffer v2 uses a list of pairs: It is still a map, conceptually. |
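A minimal sketch of the "list of pairs is still a map" point, assuming a TypeScript implementation. A JavaScript Map iterates in insertion order, so the alias needs no extra machinery there. The names Pair, fromPairs and toPairs are hypothetical and only illustrate the round trip; they are not part of the data model.

```typescript
// Illustrative alias: a JS Map already iterates in insertion order.
type OrderedMap<K, V> = Map<K, V>;

// Hypothetical pair shape, as a format without ordered maps might store it.
type Pair<V> = { name: string; value: V };

// Build an ordered map from a list of pairs, keeping the list order.
function fromPairs<V>(pairs: Pair<V>[]): OrderedMap<string, V> {
  const map: OrderedMap<string, V> = new Map();
  for (const pair of pairs) map.set(pair.name, pair.value);
  return map;
}

// Convert back to a list of pairs, in the map's insertion order.
function toPairs<V>(map: OrderedMap<string, V>): Pair<V>[] {
  const pairs: Pair<V>[] = [];
  map.forEach((value, name) => pairs.push({ name, value }));
  return pairs;
}

const pairOptions = fromPairs([
  { name: "foo", value: "42" },
  { name: "bar", value: "13" },
]);
const roundTripped = toPairs(pairOptions);
```

Here roundTripped lists foo before bar, the same order as the input, which is the property a reserializing tool would rely on to avoid reshuffling options.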
So I would propose changing Eemeli's proposal to something like this:
export type OrderPreservingMap<K, V> = Map<K, V>;
...
+ options: OrderPreservingMap<string, Literal | VariableRef>;
...
+ attributes: OrderPreservingMap<string, Literal | VariableRef | null>;
We can also define
export type OrderPreservingMap<K, V> = Map<K, V>;
export type LiteralOrVariableRef = Literal | VariableRef;
...
+ options: OrderPreservingMap<string, LiteralOrVariableRef>;
...
+ attributes: OrderPreservingMap<string, LiteralOrVariableRef | null>;
|
@mihnita Why would we want to explicitly preserve option order? |
I would like to understand that too. For .input and .local, clearly
the order is significant. Options seem quite different, and having the order
be significant seems likely to be error-prone.
|
Stas' argument. We want the data model to be used by tools, so that they can do refactoring and all kinds of cool stuff without writing a parser / serializer. But then a file gets edited in various editors, or refactored / auto-formatted by tools written in different programming languages, and you end up with big differences for nothing, messing up diff tools, version control, etc. And in general I would find it very annoying to write my options one way and have some dumb tool reshuffle them based on some invisible criteria (a hash value, most often). So it makes no difference in runtime functionality, but it does in user convenience. |
We can maybe have it non-normative? As in: WDYT? |
I would not oppose something like the proposed text, but I honestly don't think it's necessary to encourage or require tools to keep source order when parsing and reserialising option bags; this will happen naturally because it's so easy to do and, as mentioned, it prevents churn. |
Parts of this issue have been broken out into their own issues #717 and #718 after this was initially posted.
While working on some Python code, I ended up needing to put together a pythonic representation of the message data model. While doing so, I encountered a few places where I could apply some simplifications to the data model with no loss of fidelity.
I think we should consider applying this change to how options and attributes are represented in the data model:
As discussed most recently in #710, option names cannot be duplicated. This means that instead of an array of name+value wrapper objects, options could be represented by a 1:1 mapping. I'm using a JS Map here rather than an Object, as the former more explicitly does not include any prototype fields. In the JSON Schema, this could equivalently be represented by a JSON Object.
One explicit benefit of this change would be that messages coming from JSON interchange could not include any Duplicate Option Name errors.
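As a rough sketch of the difference described above, assuming simplified stand-ins for the real data model types: with the array form a consumer has to check for duplicate names itself, while with a map (or a JSON object) a name can only occur once. Literal and VariableRef here are reduced to the minimum needed for the example, and OptionEntry, optionsFromArray and optionsFromJson are hypothetical helper names.

```typescript
// Hypothetical, simplified value types; the real data model has more detail.
type Literal = { type: "literal"; value: string };
type VariableRef = { type: "variable"; name: string };
type OptionValue = Literal | VariableRef;

// Array form: nothing stops two entries from sharing a name.
type OptionEntry = { name: string; value: OptionValue };

// Map form: a name can appear at most once, by construction.
type Options = Map<string, OptionValue>;

// Converting from the array form needs an explicit duplicate check.
function optionsFromArray(entries: OptionEntry[]): Options {
  const result: Options = new Map();
  for (const entry of entries) {
    if (result.has(entry.name)) {
      throw new Error(`Duplicate option name: ${entry.name}`);
    }
    result.set(entry.name, entry.value);
  }
  return result;
}

// A JSON object maps onto the same shape with no duplicate check needed:
// JSON.parse keeps only the last value for a repeated key.
function optionsFromJson(json: string): Options {
  const parsed = JSON.parse(json) as Record<string, OptionValue>;
  const result: Options = new Map();
  for (const key of Object.keys(parsed)) result.set(key, parsed[key]);
  return result;
}
```

Note that the JSON path silently drops an earlier duplicate rather than reporting it, so a tool that wants to flag such input would still need to detect it before parsing.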
The same applies for attributes as for options, but with the added note that values may be null because the attribute value is optional.
Separately, we probably should add a requirement for functions not to assign any meaning to option order, such that :func foo=42 bar=13 and :func bar=13 foo=42 would always be synonymous.
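One way to picture the "synonymous" requirement, as a sketch only: a comparison of two option maps that ignores the order in which the options were written. The OptValue shape and the sameOptions and lit helpers are hypothetical and much simpler than the real data model.

```typescript
// Hypothetical simplified option value: a literal string or a variable name.
type OptValue = { type: "literal" | "variable"; value: string };
type OptionMap = Map<string, OptValue>;

// True when both maps hold the same names with the same values,
// regardless of the order in which the options were inserted.
function sameOptions(a: OptionMap, b: OptionMap): boolean {
  if (a.size !== b.size) return false;
  let same = true;
  a.forEach((av, name) => {
    const bv = b.get(name);
    if (!bv || bv.type !== av.type || bv.value !== av.value) same = false;
  });
  return same;
}

// Small helper to build literal values for the example.
const lit = (value: string): OptValue => ({ type: "literal", value });

// Options of ":func foo=42 bar=13" and ":func bar=13 foo=42".
const firstOrder = new Map<string, OptValue>([["foo", lit("42")], ["bar", lit("13")]]);
const secondOrder = new Map<string, OptValue>([["bar", lit("13")], ["foo", lit("42")]]);

const synonymous = sameOptions(firstOrder, secondOrder);
```

Both maps keep their own insertion order for reserialization, yet synonymous is true, so source order can be preserved for tooling without it carrying any meaning.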