Implement Whisper in new concise nn.Module API #868

Open
LeshengJin wants to merge 3 commits into main
Conversation

LeshengJin (Contributor) commented Sep 5, 2023

The first version of Whisper in TVM. Try it out with `python tests/python/test_model_whisper.py`. A CUDA device is required.

Requires apache/tvm#15670 to be merged first.
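
For context, a minimal sketch of the `tvm.relax.frontend.nn` workflow that the test script exercises is shown below. `TinyModel`, the spec shapes, and the exact `jit` keyword set are illustrative assumptions, mirroring the call visible in the traceback further down; this is not code from the PR itself.

```python
# Hedged sketch of the concise nn.Module API (illustrative, not from this PR).
import tvm
from tvm.relax.frontend import nn


class TinyModel(nn.Module):
    """A stand-in model: one linear projection instead of the full Whisper stack."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(in_features=64, out_features=64)

    def forward(self, x: nn.Tensor):
        return self.proj(x)


model = TinyModel()
# The spec maps each exported method to the shape/dtype of its inputs.
mod_spec = {"forward": {"x": nn.spec.Tensor([1, 64], "float32")}}
# jit() traces the module into Relax, builds it for the target device, and
# returns a callable; the keyword set may differ slightly across TVM revisions.
compiled = model.jit(spec=mod_spec, device="cuda", out_format="torch", debug=True)
```

With `out_format="torch"`, the compiled methods are intended to exchange `torch.Tensor` inputs and outputs directly; running the snippet requires a CUDA-enabled TVM build.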

@raj-khare

I get the following error when I run the test file:

  File "/root/run.py", line 600, in <module>
    main()
  File "/root/run.py", line 579, in main
    model = model.jit(spec=mod_spec, target=target, device="cuda", out_format="torch", debug=True)
  File "/usr/local/lib/python3.10/dist-packages/tvm/relax/frontend/nn/core.py", line 524, in jit
    spec, vm, params = _compile(spec, device, pipeline, debug)  # pylint: disable=invalid-name
  File "/usr/local/lib/python3.10/dist-packages/tvm/relax/frontend/nn/core.py", line 513, in _compile
    relax_build(
  File "/usr/local/lib/python3.10/dist-packages/tvm/relax/vm_build.py", line 341, in build
    return _vmlink(
  File "/usr/local/lib/python3.10/dist-packages/tvm/relax/vm_build.py", line 247, in _vmlink
    lib = tvm.build(
  File "/usr/local/lib/python3.10/dist-packages/tvm/driver/build_module.py", line 294, in build
    rt_mod_host = _driver_ffi.tir_to_runtime(annotated_mods, target_host)
  File "tvm/_ffi/_cython/./packed_func.pxi", line 332, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 263, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./packed_func.pxi", line 252, in tvm._ffi._cy3.core.FuncCall3
  File "tvm/_ffi/_cython/./base.pxi", line 182, in tvm._ffi._cy3.core.CHECK_CALL
  File "/usr/local/lib/python3.10/dist-packages/tvm/_ffi/base.py", line 481, in raise_last_ffi_error
    raise py_err
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const+0x278) [0xffff9b8bc598]
  [bt] (7) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const+0x428) [0xffff9b8bd0b8]
  [bt] (6) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const+0x278) [0xffff9b8bc598]
  [bt] (5) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::transform::ModulePassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const+0x1c8) [0xffff9b8baeac]
  [bt] (4) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(+0x1f13674) [0xffff9c293674]
  [bt] (3) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(+0x1f13294) [0xffff9c293294]
  [bt] (2) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(+0x1f1067c) [0xffff9c29067c]
  [bt] (1) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x68) [0xffff9b5a36a8]
  [bt] (0) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::runtime::Backtrace[abi:cxx11]()+0x30) [0xffff9d3fd050]
  Did you forget to bind?
    Variable `B` is directly accessed by host memory (it is not contained in a thread environment or in the function arguments.
    Variable `A` is directly accessed by host memory (it is not contained in a thread environment or in the function arguments.
    Variable `matmul` is directly accessed by host memory (it is not contained in a thread environment or in the function arguments.
    Variable `matmul` is directly accessed by host memory (it is not contained in a thread environment or in the function arguments.
    Variable `matmul` is directly accessed by host memory (it is not contained in a thread environment or in the function arguments.
  File "/opt/mlc-llm/3rdparty/tvm/src/tir/analysis/verify_memory.cc", line 205
RuntimeError: Memory verification failed with the following errors:
# from tvm.script import tir as T

@T.prim_func
def matmul11(var_A: T.handle, var_B: T.handle, matmul: T.Buffer((T.int64(1), T.int64(16), T.int64(1), T.int64(64)), "float32")):
    T.func_attr({"target": T.target({"arch": "sm_87", "host": {"keys": ["cpu"], "kind": "llvm", "tag": ""}, "keys": ["cuda", "gpu"], "kind": "cuda", "max_num_threads": 1024, "max_shared_memory_per_block": 49152, "max_threads_per_block": 1024, "registers_per_block": 65536, "tag": "", "thread_warp_size": 32}), "tir.noalias": T.bool(True)})
    total_seq_len = T.int64()
    A = T.match_buffer(var_A, (T.int64(1), T.int64(16), T.int64(1), total_seq_len))
    B = T.match_buffer(var_B, (T.int64(1), T.int64(16), total_seq_len, T.int64(64)))
    for i1, i3, k in T.grid(T.int64(16), T.int64(64), total_seq_len):
        cse_var_1: T.int64 = i1 * T.int64(64) + i3
        matmul_1 = T.Buffer((T.int64(1024),), data=matmul.data)
        if k == T.int64(0):
            matmul_1[cse_var_1] = T.float32(0)
        A_1 = T.Buffer((total_seq_len * T.int64(16),), data=A.data)
        B_1 = T.Buffer((total_seq_len * T.int64(1024),), data=B.data)
        matmul_1[cse_var_1] = matmul_1[cse_var_1] + A_1[i1 * total_seq_len + k] * B_1[k * T.int64(64) + i1 * total_seq_len * T.int64(64) + i3]
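
For readers hitting the same failure: the verifier is reporting that the dynamic-shape `matmul11` kernel was never scheduled for the GPU, so its loops are plain serial loops and the generated host code would touch device buffers directly. Below is a minimal, hedged sketch of what the missing thread binding looks like with `tvm.tir.Schedule`; the kernel is a small static stand-in, not the failing function itself. In practice this binding is expected to come from the automatic GPU scheduling passes, so the error likely means the dynamic-shape matmul was not picked up by them.

```python
# Hedged sketch: why TVM asks "Did you forget to bind?" and how binding fixes it.
import tvm
from tvm.script import tir as T


@T.prim_func
def matmul(A: T.Buffer((16, 64), "float32"),
           B: T.Buffer((64, 64), "float32"),
           C: T.Buffer((16, 64), "float32")):
    # A small static stand-in for the dynamic-shape matmul in the error above.
    T.func_attr({"global_symbol": "main", "tir.noalias": True})
    for i, j, k in T.grid(16, 64, 64):
        with T.block("matmul"):
            vi, vj, vk = T.axis.remap("SSR", [i, j, k])
            with T.init():
                C[vi, vj] = T.float32(0)
            C[vi, vj] = C[vi, vj] + A[vi, vk] * B[vk, vj]


sch = tvm.tir.Schedule(tvm.IRModule({"main": matmul}))
i, j, k = sch.get_loops(sch.get_block("matmul"))
# Without these two binds, building for "cuda" raises the same
# "directly accessed by host memory" verification error.
sch.bind(i, "blockIdx.x")
sch.bind(j, "threadIdx.x")
rt_mod = tvm.build(sch.mod, target="cuda")  # requires a CUDA-enabled TVM build
```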
