Add support for libraries/macros #153

Open
rkalis opened this issue Jul 4, 2023 · 3 comments
Labels
cashc-compiler Relates to the cashc compiler enhancement New feature or request refactor Internal refactoring

@rkalis
Member

rkalis commented Jul 4, 2023

In certain cases, contracts (or collections of contracts) reuse pieces of code. For readability and reusability of complex operations it would be good to allow defining libraries/macros that live outside of a contract and can be used by different contracts.

Under the hood, the compiler would then replace the library function call statement with its bytecode.

One option would be to use macros (aka string replacements). E.g.:

macro MULDIV(x, y, z) { x * y / z }

macro DO_STUFF() {
  int a = 5;
}

The benefit of this is that it gives the developer the most control over performance / byte size. But it can also easily hurt readability, e.g. using DO_STUFF() would cause confusion:

contract Test() {
  function spend() {
    DO_STUFF()
    require(a == 5); // Whoa, where did 'a' come from?
  }
}

So it likely makes more sense to go with a properly typed library system, even at the cost of slightly larger / less efficient contracts (note: size can probably be brought down by optimisations in the future if needed). We also think it's likely that the opcode limit will be increased at some point in the future.

So what does a library look like? For example:

library Math {
  function muldiv(int x, int y, int z) returns (int) {
    return x * y / z;
  }
  function divmul(int x, int y, int z) returns (int) {
    return x / y * z;
  }
}

A library is a collection of functions that get compiled individually. Each function takes parameters and can optionally return one value. This can be extended to multiple values in the future (by "upgrading" the tuple type to allow for larger tuples).

From the consuming contract's perspective:
When a library function is called in a contract, the compiler treats it as a built-in function call (e.g. abs()). In other words, it puts the args on top of the stack and replaces the call with the function's bytecode, just as it replaces abs() with OP_ABS, except that a library function's bytecode will be a (much) larger sequence of opcodes, e.g. OP_SWAP OP_MUL OP_DIV. The function-to-bytecode mapping is retrieved from the compiled "library artifact" (see below).
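To make the inlining idea concrete, here is a minimal TypeScript sketch. It is not taken from the cashc codebase: the `FunctionSymbol` shape, `symbolTable`, `compileCall` and `run` are all hypothetical names, and it assumes arguments are pushed left to right so that the last argument ends up on top of the stack. Under that assumption `muldiv` needs two `OP_ROT`s before the multiplication, so its bytecode differs from the `OP_SWAP OP_MUL OP_DIV` example above.

```typescript
// Hypothetical sketch, not the cashc implementation.
type Op = string | number; // a number stands for "push this integer"

interface FunctionSymbol {
  name: string;
  parameterCount: number;
  bytecode: Op[]; // consumes the arguments and leaves a single result
}

// Assumption: arguments are pushed left to right, so the last one is on top.
const symbolTable: Record<string, FunctionSymbol> = {
  abs: { name: 'abs', parameterCount: 1, bytecode: ['OP_ABS'] },
  // muldiv(x, y, z) = x * y / z, entered with [x, y, z] on the stack
  muldiv: {
    name: 'muldiv',
    parameterCount: 3,
    bytecode: ['OP_ROT', 'OP_ROT', 'OP_MUL', 'OP_SWAP', 'OP_DIV'],
  },
};

// Built-in and library calls are handled identically: push args, then inline bytecode.
function compileCall(name: string, args: number[]): Op[] {
  const symbol = symbolTable[name];
  if (symbol === undefined) throw new Error(`unknown function: ${name}`);
  if (args.length !== symbol.parameterCount) {
    throw new Error(`${name} expects ${symbol.parameterCount} argument(s)`);
  }
  return [...args, ...symbol.bytecode];
}

// Tiny stack interpreter covering only the opcodes used above.
function run(ops: Op[]): number[] {
  const stack: number[] = [];
  const pop = (): number => {
    const value = stack.pop();
    if (value === undefined) throw new Error('stack underflow');
    return value;
  };
  for (const op of ops) {
    if (typeof op === 'number') { stack.push(op); continue; }
    switch (op) {
      case 'OP_ABS': stack.push(Math.abs(pop())); break;
      case 'OP_SWAP': { const b = pop(); const a = pop(); stack.push(b, a); break; }
      case 'OP_ROT': { const c = pop(); const b = pop(); const a = pop(); stack.push(b, c, a); break; }
      case 'OP_MUL': { const b = pop(); const a = pop(); stack.push(a * b); break; }
      case 'OP_DIV': {
        const b = pop(); const a = pop();
        const q = a / b;
        stack.push(q < 0 ? Math.ceil(q) : Math.floor(q)); // truncate toward zero
        break;
      }
      default: throw new Error(`unsupported op: ${op}`);
    }
  }
  return stack;
}

const muldivResult = run(compileCall('muldiv', [6, 4, 3])); // leaves [8] on the stack
```

The point of the sketch is that `abs` and `muldiv` go through the same code path; only the length of the inlined bytecode differs.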

From the library's perspective:
Every function in a library is compiled independently. Compiling a library function is similar to compiling a contract with a single function, but there are a few notable differences:

  • We do not remove the final OP_VERIFY, because that only needs to happen at the end of a contract's execution, not at the end of a function call
  • We need to add a return statement that preserves the top stack value and cleans the rest of the stack (known to the function)
  • We need to create a library artifact interface / generation process to store function inputs/outputs/bytecode per function
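The "return" bullet above could be sketched as follows. This is only an illustration under assumptions: `emitReturnCleanup` and `applyCleanup` are hypothetical helpers, and the sketch assumes the return value is already on top of the stack with the function's own leftover items directly below it. It relies on `OP_NIP`, which removes the second-to-top stack item.

```typescript
// Hypothetical sketch: emit one OP_NIP per item the function knows about
// below its return value, so only the return value is left behind.
function emitReturnCleanup(knownItemsBelowResult: number): string[] {
  const ops: string[] = [];
  for (let i = 0; i < knownItemsBelowResult; i += 1) ops.push('OP_NIP');
  return ops;
}

// Tiny simulator, only to show the effect of the cleanup on a stack.
function applyCleanup(stack: number[], ops: string[]): number[] {
  const result = stack.slice();
  for (const op of ops) {
    if (op !== 'OP_NIP') throw new Error(`unsupported op: ${op}`);
    if (result.length < 2) throw new Error('stack underflow');
    result.splice(result.length - 2, 1); // drop the second-to-top item
  }
  return result;
}

// The caller's item (100) is untouched, the function's leftovers (7 and 8)
// are removed, and the return value (42) stays on top.
const cleanedStack = applyCleanup([100, 7, 8, 42], emitReturnCleanup(2));
```

Because the function only cleans up items it knows about, anything the caller had on the stack before the call is left in place.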

We also need to add import functionality, allowing importing libraries into contracts or into other libraries. For simplicity we can stick to 1 library/contract per file.

import "./Math.cash";
@rkalis rkalis added enhancement New feature or request refactor Internal refactoring cashc-compiler Relates to the cashc compiler labels Jul 4, 2023
@mr-zwets
Member

mr-zwets commented Nov 5, 2023

We also need to add import functionality, allowing importing libraries into contracts or into other libraries. For simplicity we can stick to 1 library/contract per file.

import "./Math.cash";

I'm thinking about how this would work in practice with syntax checking plugins. Would we call the function on the library like this?

int result = Math.muldiv(x, y, z);

If we want to use the function name directly, it might be a good idea to do explicit imports to allow for highlighting.

import { muldiv } from "Math.cash";

contract Example() {
    function test(int x, int y, int z) {
        int result = muldiv(x, y, z);
    }
}

However, code completion would work better with the reversed order:

from "Math.cash" import { muldiv };

@rkalis rkalis added this to the v1 milestone Nov 8, 2023
@rkalis
Member Author

rkalis commented Nov 8, 2023

@mr-zwets and I just had a call about this, some of the main points:

  • We won't generate "library artifacts" for now; instead we'll compile everything on the fly when compiling the main contract.
    • We may later want to add support for separate library compilation, so that artifacts rather than source code can be shared on NPM.
  • We'll allow for separate functions/libraries in a file together with contracts (or in separate files).
  • At the start of compilation, we'll "compile" libraries/functions by themselves and add these to the global symbol table.
    • From there, the compiler will treat these exactly the same as the builtin "global functions" (e.g. abs(x)).
    • We'll just need to extend the symbol table with "bytecode" for functions.
  • If you import from a separate file, the compiler version has to satisfy the pragma statements of all files. So a contract with pragma version ^0.9.0 and a library with pragma version ^0.8.0 won't compile together.
  • For import syntax we probably want to support both import "X.cash"; and import { y } from "X.cash";.
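The pragma point above can be illustrated with a small TypeScript sketch. It assumes npm-style caret ranges (where `^0.9.0` means `>=0.9.0 <0.10.0`) and only handles the `^x.y.z` form; all function names are hypothetical, and a real implementation would more likely rely on an existing semver library.

```typescript
// Hypothetical sketch: check one compiler version against the pragmas of all files.
interface Version { major: number; minor: number; patch: number }

function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (match === null) throw new Error(`invalid version: ${text}`);
  return { major: Number(match[1]), minor: Number(match[2]), patch: Number(match[3]) };
}

function compareVersions(a: Version, b: Version): number {
  if (a.major !== b.major) return a.major < b.major ? -1 : 1;
  if (a.minor !== b.minor) return a.minor < b.minor ? -1 : 1;
  if (a.patch !== b.patch) return a.patch < b.patch ? -1 : 1;
  return 0;
}

// Caret ranges allow changes that do not modify the left-most non-zero component.
function caretUpperBound(base: Version): Version {
  if (base.major > 0) return { major: base.major + 1, minor: 0, patch: 0 };
  if (base.minor > 0) return { major: 0, minor: base.minor + 1, patch: 0 };
  return { major: 0, minor: 0, patch: base.patch + 1 };
}

function satisfiesCaret(version: string, range: string): boolean {
  if (range.charAt(0) !== '^') throw new Error(`only caret ranges are supported: ${range}`);
  const base = parseVersion(range.slice(1));
  const candidate = parseVersion(version);
  return compareVersions(candidate, base) >= 0
    && compareVersions(candidate, caretUpperBound(base)) < 0;
}

// The compiler version must satisfy the pragma of every file involved.
function satisfiesAllPragmas(compilerVersion: string, pragmas: string[]): boolean {
  return pragmas.every((range) => satisfiesCaret(compilerVersion, range));
}

// No single version can satisfy both ^0.9.0 and ^0.8.0.
const mixedPragmasCompile = satisfiesAllPragmas('0.9.2', ['^0.9.0', '^0.8.0']);
```

With these semantics the example from the list fails for any compiler version, since the two ranges do not overlap.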

Steps to get there / checklist:

  • Update the symbol table to include compiled "bytecode" and update the current "global functions" accordingly.
  • Allow for standalone functions in a .cash file.
  • At the start of compilation, all standalone functions should get compiled and added to the GLOBAL_SYMBOL_TABLE.
    • Make a compiler distinction between functions that are part of a contract or standalone.
    • Add "return" statement for standalone functions to return something.
  • Add import functionality.
    • See how we can re-use Node.js' module resolution from inside the CashScript compiler.
    • Initially only allow import "X.cash"; syntax, where import means copy-paste the entire file contents, and then treat the file as a single .cash file for compilation purposes.
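The "import means copy-paste" step in the checklist above could look roughly like this. The sketch works on an in-memory map of file names to source text instead of real module resolution, treats import paths as exact keys, and ignores pragma lines; `flattenImports` and `IMPORT_PATTERN` are hypothetical names.

```typescript
// Hypothetical sketch: inline `import "X.cash";` lines by pasting file contents.
const IMPORT_PATTERN = /^\s*import\s+"([^"]+)";\s*$/;

function flattenImports(
  entry: string,
  files: Record<string, string>,
  seen: string[] = [],
): string {
  // Each file is pasted at most once, which also stops circular imports from looping.
  if (seen.indexOf(entry) !== -1) return '';
  seen.push(entry);

  const source = files[entry];
  if (source === undefined) throw new Error(`file not found: ${entry}`);

  const output: string[] = [];
  for (const line of source.split('\n')) {
    const match = IMPORT_PATTERN.exec(line);
    if (match === null) {
      output.push(line);
      continue;
    }
    const inlined = flattenImports(String(match[1]), files, seen);
    if (inlined !== '') output.push(inlined);
  }
  return output.join('\n');
}

const exampleFiles: Record<string, string> = {
  'Main.cash': 'import "Math.cash";\ncontract Test() {}',
  'Math.cash': 'function muldiv(int x, int y, int z) returns (int) { return x * y / z; }',
};

// The result is a single source text that can be compiled as one .cash file.
const flattened = flattenImports('Main.cash', exampleFiles);
```

Pasting each file only once is a design choice of this sketch; it avoids duplicate function definitions when two files import the same library.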

Extensions:

  • Support import { y } from "X.cash"; syntax.
  • Support library X { function y() {} } syntax in addition to standalone functions.
  • Allow for independent library compilations / artifacts.
  • Update "tuple" type to allow for multiple return values from a standalone function.

@rkalis
Member Author

rkalis commented Nov 22, 2023

We need to consider how to resolve imports for contracts / libraries that have (semi-)complex dependency graphs. This still needs more thought.
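One possible starting point, offered only as a sketch and not as a decided design, is a depth-first topological sort that compiles dependencies before their dependents and rejects circular imports. The graph shape (file name to list of imported file names) and the `resolveCompilationOrder` name are assumptions made for this example.

```typescript
// Hypothetical sketch: order files so that every dependency comes before its dependents.
function resolveCompilationOrder(
  graph: Record<string, string[]>,
  entry: string,
): string[] {
  const order: string[] = [];
  const state: Record<string, 'visiting' | 'done'> = {};

  const visit = (file: string, trail: string[]): void => {
    if (state[file] === 'done') return;
    if (state[file] === 'visiting') {
      throw new Error(`circular import: ${trail.concat(file).join(' -> ')}`);
    }
    const dependencies = graph[file];
    if (dependencies === undefined) throw new Error(`unknown file: ${file}`);

    state[file] = 'visiting';
    for (const dependency of dependencies) visit(dependency, trail.concat(file));
    state[file] = 'done';
    order.push(file); // pushed only after all of its dependencies
  };

  visit(entry, []);
  return order;
}

const exampleGraph: Record<string, string[]> = {
  'Main.cash': ['Math.cash', 'Utils.cash'],
  'Math.cash': ['Utils.cash'],
  'Utils.cash': [],
};

// Utils.cash is shared by both other files but appears only once, and first.
const compilationOrder = resolveCompilationOrder(exampleGraph, 'Main.cash');
```

Whether circular imports should be an error, as here, or silently tolerated is one of the open questions for this issue.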
