Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] transform specific meta variables #723

Closed
7 tasks done
HerringtonDarkholme opened this issue Nov 23, 2023 · 11 comments · Fixed by #860
Closed
7 tasks done

[feature] transform specific meta variables #723

HerringtonDarkholme opened this issue Nov 23, 2023 · 11 comments · Fixed by #860
Labels
enhancement New feature or request

Comments

@HerringtonDarkholme
Copy link
Member

HerringtonDarkholme commented Nov 23, 2023

⭐ Suggestion

It is impossible to apply rules/transformations to specific sub nodes in the matching.
We can only move multi-var match around in the fix.
It is possible to introduce a apply sub rules to matched variables / variable list.

Promposal:

rewriters: 
- id: re1
  rule: { pattern: subPattern }
  fix: fixer
- id: re2
  rule: {kind: subKind}
  fix: fixer2
rule:
  pattern: $$$LISTS
transform:
  NEW_LIST:
    applyRules:
      source: $$$LISTS
      rewrite: [re1, re2, re3]
      join: optionalJoiner
      sortBy: optionalKey
fix:  useNew($NEW_LIST)

💻 Use Cases

Suppose we have this code.

import {A, B, C} from 'lib'

We want to transform it to

import A from 'lib/a'
import B from 'lib/b'
import C from 'lib/c'

This can be achieved by something similar to

rewriters:
- id: importFromLib
  rule: { pattern: $IDENT, kind: import_kind }
  transform: 
    LIB_NAME:
      convert:
        source: $IDNET
        toCase: lowercase
  fix: import $IDENT from 'lib/$LIB_NAME'
rule:
  pattern: import {$$$IMPORTS} from 'lib'
transform:
  NEW_IMPORTS:
    applyRules:
      source: $$$IMPORTS
      rewrite: [ importFromLib ]
      join: '\n'                  # remove original code
fix: $NEW_IMPORTS
@HerringtonDarkholme HerringtonDarkholme added the enhancement New feature or request label Nov 23, 2023
@emosenkis
Copy link

This can be accomplished today by writing two rules and running them repeatedly until there are no more matches:

import {$FIRST, $$$REST} from 'lib'
->
import $FIRST from 'lib/$LOWERCASE_FIRST'
import $REST from 'lib'

and

import {$ONLY} from 'lib'
->
import $ONLY from 'lib/$LOWERCASE_ONLY'

@HerringtonDarkholme
Copy link
Member Author

List meta transformation may be related to this. We need a way to select a list of nodes.
#779

@emosenkis
Copy link

List comprehensions seem artificially limiting. What it i want to transform groups of N items together, e.g. convert key: value pairs to key=value. It would be much more powerful and add less complexity to the mental model if it was possible to run one or more rules over the contents of a list metavariable, thereby harnessing all the existing capabilities instead of adding new one-off functionality.

@HerringtonDarkholme
Copy link
Member Author

@emosenkis would you like to elaborate more on the grouping and converting? I am interested in alternative ways to transformations.

@emosenkis
Copy link

emosenkis commented Jan 9, 2024

I think what I was trying to express is that I don't think transformations are the right abstraction here. ast-grep is an incredibly powerful framework for transforming code and it doesn't make any sense to me to create a second, much less capable, mechanism for transforming code. Why not instead allow rules to be run ast-grep over the metavariables? The config might look something like this:

rule:
    id: dict-constructor-to-literal
    language: python
    pattern: dict($$$PARAMS)
   # not included here: filter out positional argument dict constructor calls, e.g. dict(list_of_pairs)
fix:
    { ${applyRules(dict-items, $$$PARAMS)} }
---
rule:
    id: kwarg-to-dict-literal
    language: python
    groups: [dict-items]
    pattern: $KEY=$VAL
    kind: keyword_argument
fix: "$KEY": $VAL

I'm combining a few ideas here, each of which can be accepted or rejected independently:

  • Add a transformation that can accept both regular and list metavariables as input and runs one or more ast-grep rules over the input, outputting the fixed result of applying those rules

  • Add a groups attribute to rules, allowing multiple rules to be identified by group IDs. Instead, the rule definitions could be required to be defined inline in the rule that invokes them or they could be referenced by individual rule IDs instead (both would work, but reduce reusability)

  • Borrowing from Bash and JS template string ${expression} syntax to allow function calls, not just variable names in variable interpolation. I like that because it avoids the need to define a new metavariable for every transformation, offering no way to inline metavariables that are only referenced once. It also offers a path towards consolidating the rule/fix and the transform systems into a single code-rewriting model, which would simplify the mental model and avoid all the gotchas resulting when each of the two mechanisms for transforming code only implements a subset of the capabilities required, resulting in awkward switching back and forth between the two systems. However, applying rules to metavariables could instead be implemented as just another transform, e.g.:

    transform:
        DICT_LITERAL_ARGS:
            applyRules:
                source: $$$PARAMS
                rule_group: dict-items
    

@HerringtonDarkholme
Copy link
Member Author

HerringtonDarkholme commented Jan 9, 2024

Thank you for the detailed explanation! I really appreciate your informative input! 😃

I will also share my thoughts about the suggestions:

Add a transformation that can accept both regular and list metavariables as input and runs one or more ast-grep rules over the input, outputting the fixed result of applying those rules

This is a nice idea and it will be very powerful! My concern is whether nesting rules/referencing rules will be too complex for users to understand.

Add a groups attribute to rules, allowing multiple rules to be identified by group IDs. Instead, the rule definitions could be required to be defined inline in the rule that invokes them or they could be referenced by individual rule IDs instead (both would work, but reduce reusability)

I would prefer the latter solution by using rule IDs at first. Fewer concepts can make it easier to learn. In future we can also add a new group concept if it is useful.

Borrowing from Bash and JS template string ${expression} syntax to allow function calls, not just variable names in variable interpolation.

The format of transformation is discussed in #436. I also proposed a mini-bash-like scripting language for fix templates. I eventually withdrew the proposal because it is too likely to evolve into another complex DSL hard to learn. I also had not designed the escaping mechanism for the fix lang, say, if users want to change 'string' + a + 'string' into string${variable}string. It will require ast-grep add escaping for ${}.

So I decided to release the YAML transformation at first and looks like the solution has worked fine.


I really like the idea of applying rule to a sub-region of code and composing a larger rewrite. Let me think more about it.

@HerringtonDarkholme
Copy link
Member Author

HerringtonDarkholme commented Jan 11, 2024

Contemplating on the essence of ast-grep rewriting, I found apply-rule is a nice fit for sub meta variable transformation.

ast-grep rewrite rule

The mental model of ast-grep is like:

  1. parse the source to AST
  2. find nodes based on rule/pattern. Note that every rewritten node does not overlap with each other.
  3. generate fix string based on matched nodes
  4. replace node text with generated fix

Rewrite list with Python's mindset

Previously I'm thinking about using list comprehension to transform multiple nodes. Indeed it is limited. I was previously imagining something like the pseudo Python code below

[ 
  rewrite(node)
  for node in meta_var_list
  if node.matches(filter_rule)
]

So that I can generate a list of fix strings that correspond to meta_var_list.
Furthermore, I can aggregate the list transformation results into one string like "comma separated item list"

','.join(rewrite for node in list if filter)

The YAML transformation for the fake code above will be like

rewrite_list:
  source: $$$METAVARS
  item: $NODE
  filter: 
    kind: xxx
  rewrite: rewrite($NODE)
  join: ','

What's missing in list comprehension?

  1. it introduces a new concept of filter_rule in transformation that is not used anywhere else
  2. it only manipulates the direct children of a node in the list
  3. it only supports multiple meta variables. It is common to change some specific nodes that cannot be captured by meta vars. In list compression, it can be supported by introducing more sources (like children of a node, sibling of a node). But that introduces more concepts :/
  4. it is simply impossible to apply different fixes on different nodes.

ast-grep's rule does not have the issues above. Since

  1. finding nodes is not limited to direct children nor meta variables.
  2. different rules can transform different nodes
  3. rules are our ol'friends! Nothing to learn

What's missing in rule-based rewrite?

Why can't we implement list comprehension based on our current rule?

  1. a code fix must be confined to the range of the matching node. There is no way to edit text outside of the node.
  2. a more powerful fixer [feature] Implement more powerful fix methods #656 can solve the issue above. But it introduces more concepts. Say, I want to generate import a from 'lib/a' after the original import. expandStart/prepend fail to achieve it at the moment. We need even more mechanisms to specify where to insert new text and possibly find a way to duplicate similar code insertions.
  3. code fix can only replace node. It is impossible to aggregate the node fixes into a new string. e.g. generate a comma separated item list.

@HerringtonDarkholme
Copy link
Member Author

HerringtonDarkholme commented Jan 11, 2024

The simpler way: applyRule

  1. applyRules make it possible to generate code fix that is not bound to a node range, making it more flexible to splice code fixes
  2. we need a way to join the codefixes from applyRules. So that we have a way to aggregate the result. so the joiner property may still be added.

Also, applyRule transformation is orthogonal to powerful fixer, i.e. node range expansion. Theoretically it is possible to achieve node expansion by using applyRule. But it is too knotty to do so:

  • find the parent node of the desired node
  • use the applyRule to rewrite the desired node
  • apply another rule to remove the nodes before/after the desired node.
  • replace the whole parent node with the new rewrite
Click to see the comparison of expansion and applyRule to remove a key-value pair in a dictionary/object

Expansion

rule:
  kind: pair  # find the pair
  has:
    field: key
    regex: test
fix:
  template: ''   # remove the pair
  expandEnd:   # remove the comma
    regex: ',' 

Apply Rules

utils:
  pairs:               # find pair
    kind: pair
    has:
      field: key
      regex: test
rule:
  kind: object   # find parent
  pattern: $OBJ
  has:
    matches: pairs
    stopBy: end
transform:
  NEW_OBJ:     # transform parent
    applyRules:
      source: $OBJ 
      rules: 
      - removePair       
      - removeComma 
rewriter: 
  removePair:
    rule: { matches: pairs }  # repeat matching
    fix: ''
  removeComma:
    rule: 
      regex: /,/
      follows: { matches: pairs } # repeat matching
    fix: ''
fix: '$NEW_OBJ'  # replace the parent

Concerns

  1. ast-grep will record matched meta-variables in a dictionary called MetaVarEnv. It is unclear how to populate the env in applyRules. More concretely, how to inherit meta vars from parent rules? Should one invocation of a subrule impact another invocation's MetaVarEnv ?
  2. Should we allow recursively calling sub-rule in sub-rule? If yes, how do we avoid cyclic dependency?
  3. Sub-rule should only be called in parent rule. ast-grep should ignore the rule defined as sub-rule, but how?

@HerringtonDarkholme
Copy link
Member Author

Withdrawn Proposal

⭐ Suggestion

It is impossible to transform a multiple-meta-variable match at the moment.

We can only move multi-var match around in the fix.
It is possible to introduce a list-comprehension like transformation to the YAML rule.

Promposal:

rule:
  pattern: $$$LISTS
transform:
  NEW_LIST:
    source: $$$LISTS
    item: $ITEM
    rewrite: rewriteWith($ITEM)
    includeUnnamed: true # optional, default to false
fix:  useNew($$$NEW_LIST)

💻 Use Cases

Suppose we have this code.

import {A, B, C} from 'lib'

We want to transform it to

import A from 'lib/a'
import B from 'lib/b'
import C from 'lib/c'

This can be achieved by something similar to

rule:
  pattern: import {$$$IMPORTS} from 'lib'
transform:
  NEW_IMPORTS:
    source: $$$LISTS
    item: $ITEM
    rewrite: import $ITEM from 'lib/$ITEM'
    includeUnnamed: false # optional

Questions

  1. should list transform produce one new transform variable or one multi-meta-var? Put it in another way, should people use the list transformed variable as $LIST or $$$LIST?
  2. what if I want to transform $ITEM? e.g. in the example above $ITEM is A but the actually we need a lowercase a in 'lib/a' as import source.
  3. Is it making the rule/fix system way too complex? I remember using bad YAML rules in my day job.

@HerringtonDarkholme
Copy link
Member Author

HerringtonDarkholme commented Jan 18, 2024

Status update!
This rule
image

rewrites to this

image

@HerringtonDarkholme HerringtonDarkholme changed the title [feature] support list meta variable transformation [feature] transform specific meta variables Feb 4, 2024
@HerringtonDarkholme
Copy link
Member Author

All sub-tasks are done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants