Support new field type list #179

Open
4 tasks
peterdesmet opened this issue Mar 13, 2024 · 1 comment

peterdesmet commented Mar 13, 2024

CHANGELOG: https://datapackage.org/overview/changelog/#list-field-type-new

  • Decide on support: either don't support the type (i.e. read values as string) or split values on the delimiter into a vector.
  • Update documentation
  • Update tests
  • See also the suggestions in "Support delimited arrays" (#173)

khusmann commented Jun 7, 2024

I'm interested in this field type for representing multiselect items, although that will have to wait until the list field type can be extended with the categories property.

In the meantime, I'd vote for the latter approach (convert the cell to a vector using delimiter and load as list-columns).

For example, the CSV:

row_id,field1
1,"a,b,c"
2,"d,e"
3,"f"

with schema fields:

[
  {
    "name": "row_id",
    "type": "integer"
  },
  {
    "name": "field1",
    "type": "list",
    "delimiter": ",",
    "itemType": "string"
  }
]

would become:

library(tidyverse)
tibble(
  row_id = 1:3,
  field1 = list(
    c("a", "b", "c"),
    c("d", "e"),
    c("f")
  )
)
#> # A tibble: 3 × 2
#>   row_id field1   
#>    <int> <list>   
#> 1      1 <chr [3]>
#> 2      2 <chr [2]>
#> 3      3 <chr [1]>

Created on 2024-06-07 with reprex v2.0.2
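
A minimal sketch of how that conversion might look in base R, in case it helps the discussion. `parse_list_field()` is a hypothetical helper, not an existing frictionless function. The defaults (delimiter `,`, itemType `string`) follow my reading of the list field type in the spec, and only a few item types are handled here.

```r
# Hypothetical helper (not part of any package): turn a raw character
# column into a list-column, guided by a "list" field descriptor.
parse_list_field <- function(x, field) {
  # Assumed defaults when the descriptor omits these properties
  delimiter <- if (is.null(field$delimiter)) "," else field$delimiter
  item_type <- if (is.null(field$itemType)) "string" else field$itemType

  # Pick a cast function; date/time item types are left out of this sketch
  cast <- switch(
    item_type,
    string = as.character,
    integer = as.integer,
    number = as.numeric,
    boolean = as.logical,
    stop("Unsupported itemType: ", item_type)
  )

  # fixed = TRUE treats the delimiter literally rather than as a regex
  parts <- strsplit(x, delimiter, fixed = TRUE)
  lapply(parts, cast)
}

# Usage with the field descriptor from the example above
field <- list(name = "field1", type = "list", delimiter = ",", itemType = "string")
field1 <- parse_list_field(c("a,b,c", "d,e", "f"), field)
str(field1)
```

This assumes the column was first read as plain character (the "default to string" option), so the split can happen as a separate step after reading.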
