Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zk graph --format json is not handling note titles with inner double quotes #389

Open
2 tasks done
tjex opened this issue Jan 31, 2024 · 7 comments
Open
2 tasks done
Assignees
Labels
bug Something isn't working
Projects
Milestone

Comments

@tjex
Copy link
Member

tjex commented Jan 31, 2024

Check if applicable

  • I have searched the existing issues (required)
  • I'm willing to help fix the problem and contribute a pull request

Describe the bug

I have four notes with inner double quotes in their titles, e.g. This is a "note" title.

Running zk graph --json > notes.json dumps json fine except for these notes. In their place, is a singular comma:

    3 ▎   {filename:h5u1.md,filenameStem:h5u1,path:h.....<valid json continues>
E   2 ▎   ,     ■ Value expected
    1 ▎   {filename:jbig.md,filen......<valid json continues>

The weird thing is that there's only three instances of this, despite there being four notes with this quote pattern. Additionally, all four notes are not in the json file. In other words, three have been replaced with a singular column, and one has vanished altogether.

Additionally, it seems like notes with double quotes in their title create some strange behaviour.
it is exclusively those notes that are printed to stdout, and after the printing of each note's body, my zsh environment variables are printed as well. Which is super weird, and would freak me out if I didn't know that this project is a legitimate project run by honest people haha.

I'm sure that will be fixed, once the strings get handled, but just wanted to put it out there.

How to reproduce?

  1. run zk graph --json > notes.json and check if it's already bug free
  2. create a note This is a "test" note
  3. run zk index
  4. run zk graph --json > notes2.json

Theoretically, there should be a singular line with a comma

zk configuration

# zk configuration file
#
# Uncomment the properties you want to customize.

# NOTE SETTINGS
#
# Defines the default options used when generating new notes.
[note]
language = "en"
default-title = "untitled"
filename = "{{id}}"
extension = "md"
template = "default.md"

# Path globs ignored while indexing existing notes.
ignore = [
   "d/**",
]

# Configure random ID generation.
id-charset = "alphanum"
id-length = 4
id-case = "lower"

# EXTRA VARIABLES
#
# A dictionary of variables you can use for any custom values when generating
# new notes. They are accessible in templates with {{extra.<key>}}
[extra]

#key = "value"

# GROUP OVERRIDES
# Omitting `paths` is equivalent to providing a single path equal to the name of
# the group. This can be useful to quickly declare a group by the name of the
# directory it applies to.

[group.daily]
paths = ["d/2024"]

[group.daily.note]
filename = "{{format-date now}}"
extension = "md"
template = "daily.md"


# MARKDOWN SETTINGS
[format.markdown]

link-format = "markdown"
# in case for some reason, a white space file makes it into my vault
link-encode-path = true 
link-drop-extension = true

# use YAML tags only
hashtags = false
colon-tags = false
multiword-tags = false


# EXTERNAL TOOLS
[tool]

# Command used to preview a note during interactive fzf mode.
# Set it to an empty string "" to disable preview.

# bat is a great tool to render Markdown document with syntax highlighting.
#https://github.com/sharkdp/bat
#fzf-preview = "bat -p --color always {-1}"


# LSP
[lsp]

[lsp.diagnostics]
# Each diagnostic can have for value: none, hint, info, warning, error
# wiki-title = "hint"
dead-link = "error"

[lsp.completion]
# Show the note title in the completion pop-up, or fallback on its path if empty.
note-label = "{{title-or-path}}"
# Filter out the completion pop-up using the note title or its path.
#note-filter-text = "{{title}} {{path}}"
# Show the note filename without extension as detail.
#note-detail = ""


# NAMED FILTERS
[filter]

# Matches the notes created the last two weeks. For example:
#    $ zk list recents --limit 15
recents = "--sort created- --created-after 'last two weeks'"


# COMMAND ALIASES

[alias]

# Show a random note.
lucky = '$EDITOR $(zk list --format path --sort random --limit 1)'

daily = 'zk new --no-input "$ZK_NOTEBOOK_DIR/d/`date +%Y`"'
s = 'zk edit $ZK_NOTEBOOK_DIR/i/149x.md'
random-orphan = '$EDITOR $(zk list --orphan --format path --sort random -n1 -x d -x g)'
rg = 'rg $1 $(zk list --format path --delimiter " " $2 $3 $4 $5)'
conf = '$EDITOR "$ZK_NOTEBOOK_DIR/.zk/config.toml"'

# Edit the last modified note.
last = "zk edit --limit 1 --sort modified- $@"

# Edit the notes selected interactively, "recents" is a filter from above
recent = "zk edit recents --interactive"

# Print paths separated with colons for the notes found with the given
# arguments. This can be useful to expand a complex search query into a flag
# taking only paths. For example:
#   zk list --link-to "`zk path -m potatoe`"
#path = "zk list --quiet --format {{path}} --delimiter , $@"

# Returns the Git history for the notes found with the given arguments.
# Note the use of a pipe and the location of $@.
#hist = "zk list --format path --delimiter0 --quiet $@ | xargs -t -0 git log --patch --"

Environment

zk 0.14.0
system: Darwin 22.6.0 arm64
@tjex tjex self-assigned this Jan 31, 2024
@tjex tjex added the bug Something isn't working label Jan 31, 2024
@tjex tjex added this to Backlog in zk via automation Jan 31, 2024
@tjex
Copy link
Member Author

tjex commented Jan 31, 2024

@mcDevnagh @julio-lopez @jurica

Would it be a case of manually sanitizing the note titles pre-json output? e.g, in the internal/core/note_format.go file?

I had a look at the go json.Marshal documentation.

It states something about a syntax like below being possible?

	Filename     string                 `json:"filename,string"`

Which should format it appropriately for wrapping in double quotes, but it didn't work on first try.

@mcDevnagh
Copy link
Contributor

We shouldn't have to sanitize.
encoding/json does the sanitation for us.
For example,

package main

import "encoding/json"

type Test struct {
	Title string
}

func main() {
	json, err := json.Marshal(Test{
		Title: `"`,
	})

	if err != nil {
		panic(err)
	}
	print(string(json))
}

outputs

{"Title":"\""}

Is json.Marshal outputting everything in zk graph --json? This would be an upstream issue, but I'd find that very surprising. More likely we're calling json.Marshal for each note, and ignoring errors (or simply logging them). If that is the case, the best place to start fixing this would be to look at that error.

@tjex
Copy link
Member Author

tjex commented Feb 14, 2024

Ok great, thanks for this. As a heads up, I won't be able to look at this (or anything else) until March 17. I'll be on holiday 🏝️

@tjex
Copy link
Member Author

tjex commented Apr 1, 2024

note: narrowed this down a little bit and it actually looks like the culprit is osEnv() call in newNoteFormatter.

When it's getting called in graph.go, line 70, something about the double quoted title is breaking it. This is why it spits out the entire shell env variables to stdout. The double quotes in the file name is not the culprit as I'm using ids as file names.

Changing the file title from: what does "this" mean? to what does 'this' mean? executes without error.

@tjex
Copy link
Member Author

tjex commented Apr 1, 2024

It's actually due to the quotes not being escaped in link_format.go.

These functions are used to render the links within documents as well. So escaping the double quotes here will render links within notes with the backslashes: [[a linked \"note\"]]. This of course breaks the links.

@mcDevnagh
Copy link
Contributor

Does that mean that the true culprit is links to titles with quotes and not titles with quotes themselves?

@tjex
Copy link
Member Author

tjex commented Apr 6, 2024

In other words, quotes are being escaped properly everywhere in json output except the string for the filename's own link.

If I hardcode the problem field output to foo, the call to zk graph --format json returns json without error.

{
  "filename": "another \"test\" note.md",
  "filenameStem": "another \"test\" note",
  "path": "another \"test\" note.md",
  "absPath": "/Users/tjex/.local/src/zk-org/workbench/test-vault/another \"test\" note.md",
  "title": "another \"test\" note",
  "link": "foo",
  "lead": "",
  "body": "",
  "snippets": [],
  "rawContent": "# another \"test\" note\n\n\n",
  "wordCount": 4,
  "tags": [],
  "metadata": {},
  "created": "2024-04-01T03:11:03.540446352Z",
  "modified": "2024-04-01T03:11:05.217630323Z",
  "checksum": "a121f8d06da1bff78c5e5eb4816d0417ab2ec22317bad43baedb51a20c60df68"
}

Otherwise it would in effect return this invalid json:

{
  "filename": "another \"test\" note.md",
  "filenameStem": "another \"test\" note",
  "path": "another \"test\" note.md",
  "absPath": "/Users/tjex/.local/src/zk-org/workbench/test-vault/another \"test\" note.md",
  "title": "another \"test\" note",
+  "link": "[[another "test" note]]", <- the quotes around test are not escaped
  "lead": "",
  "body": "",
  "snippets": [],
  "rawContent": "# another \"test\" note\n\n\n",
  "wordCount": 4,
  "tags": [],
  "metadata": {},
  "created": "2024-04-01T03:11:03.540446352Z",
  "modified": "2024-04-01T03:11:05.217630323Z",
  "checksum": "a121f8d06da1bff78c5e5eb4816d0417ab2ec22317bad43baedb51a20c60df68"
}

With the initial idea for the fix in #400 , the output looks like this:

{
  "filename": "another \"test\" note.md",
  "filenameStem": "another \"test\" note",
  "path": "another \"test\" note.md",
  "absPath": "/Users/tjex/.local/src/zk-org/workbench/test-vault/another \"test\" note.md",
  "title": "another \"test\" note",
  "link": "[[another \"test\" note]]",
  "lead": "",
  "body": "",
  "snippets": [],
  "rawContent": "# another \"test\" note\n\n\n",
  "wordCount": 4,
  "tags": [],
  "metadata": {},
  "created": "2024-04-01T03:11:03.540446352Z",
  "modified": "2024-04-01T03:11:05.217630323Z",
  "checksum": "a121f8d06da1bff78c5e5eb4816d0417ab2ec22317bad43baedb51a20c60df68"
}

Links are still rendered correctly within documents:

# note title
[another "test" note](another%20%22test%22%20note)
[[another "test" note]]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
zk
Backlog
2 participants