Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaner diffs for inserted array items #38

Open
lukechilds opened this issue Sep 20, 2017 · 2 comments
Open

Cleaner diffs for inserted array items #38

lukechilds opened this issue Sep 20, 2017 · 2 comments

Comments

@lukechilds
Copy link

lukechilds commented Sep 20, 2017

Isolated test case as requested in avajs/ava#1521

If you append an item to an array and diff:

var concordance = require('concordance');
const orig = [
  'foo',
  'bar',
  'fizz',
  'buzz'
];
const copy = Array.from(orig);
copy.push('hi')
concordance.diff(orig, copy);

You get neat output:

  [
    'foo',
    'bar',
    'fizz',
    'buzz',
+   'hi',
  ]

However if you insert anywhere else and diff:

var concordance = require('concordance');
const orig = [
    'foo',
    'bar',
    'fizz',
    'buzz'
];
const copy = Array.from(orig);
copy.unshift('hi')
concordance.diff(orig, copy);

The output is quite hard to follow after the inserted item:

  [
-   'foo',
+   'hi',
-   'bar',
+   'foo',
-   'fizz',
+   'bar',
-   'buzz',
+   'fizz',
+   'buzz',
  ]
@lukechilds lukechilds changed the title Cleaner diffs for prepended array items Cleaner diffs for inserted array items Sep 20, 2017
@lukechilds
Copy link
Author

Repro with git producing clean results:

$ cat orig
[
    'foo',
    'bar',
    'fizz',
    'buzz'
];

$ cat copy
[
    'hi',
    'foo',
    'bar',
    'fizz',
    'buzz'
];

$ git diff orig copy
diff --git a/orig b/copy
index 5d484f0..f4066df 100644
--- a/orig
+++ b/copy
@@ -1,4 +1,5 @@
 [
+    'hi',
     'foo',
     'bar',
     'fizz',

@ninevra
Copy link
Contributor

ninevra commented Jul 6, 2021

I looked into this a bit, out of curiosity. This issue is specific to arrays of strings; it only occurs when string items of two arrays get diffed against eachother. What's going on is that the first pair of strings ('foo', 'hi') are diffed (producing - 'foo', + 'hi') and then the next pair of strings are diffed ('bar', 'foo') (producing - 'bar', + 'foo') and so on.

The issue is caused by the fact that string descriptors provide diffDeep(), and PrimitiveItem descriptors' prepareDiff() method (which would ordinarily be responsible for lining up the items of the two arrays) short-circuits when its value's descriptor provides diffDeep().

Simply removing that short-circuiting step solves this issue, but also prevents string items from ever being deeply diffed against eachother. E.g., the original example correctly produces

  [
+   'hi',
    'foo',
    'bar',
    'fizz',
    'buzz',
  ]

but diffing two arrays of similar strings against eachother produces e.g.

diff(['foo\nbar\nfizz\nbuzz'], ['hi\nfoo\nbar\nfizz\nbuzz'])

  [
-   `foo␊
-   bar␊
-   fizz␊
-   buzz`,
+   `hi␊
+   foo␊
+   bar␊
+   fizz␊
+   buzz`,
  ]

instead of the current result,

  [
+   hi␊
    `foo␊
    bar␊
    fizz␊
    buzz`,
  ]

(which is incidentally itself a little bugged, the opening backtick is on the wrong line).

I'm not sure there's a perfect solution here, at least with the current greedy diff algorithm; without an edit-script-minimizing algorithm, (edited: I don't understand the algorithm as well as I thought, sorry) it doesn't seem obvious whether an item should be treated as inserted or modified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants