[WIP, don't merge] unity.cpp -> ggml master #719
This is a WIP PR to sync unity.cpp from seamless_communication (https://github.com/facebookresearch/seamless_communication/tree/main/ggml) to examples/ under ggml. Sharing for visibility and looking for early feedback from the community and authors. Feel free to check the README for usage.
Questions about ggml changes needed with this PR:
(1) Would you prefer a separate repo for unity.cpp (like whisper.cpp / llama.cpp, we could have a standalone one: https://github.com/facebookresearch/seamless_communication/tree/main/ggml/unity.cpp), checking it into examples/ once it’s polished, or both?
(2) We use kaldi-native-fbank (KNF) for feature extraction and checked the whole library into ggml as examples/kaldi-native-fbank. Would you prefer a separate installation for the KNF lib? One caveat: with this lib bundled in, I needed to raise CMAKE_CXX_STANDARD from 11 to 14.
(3) We added several custom operators, including batch_norm, glu, and convolution-related ones. We then realized the convolution-related ones already exist on master (they didn’t when we started). One TODO item is to merge with them. We are wondering if there’s any ongoing effort on:
(a) unifying the depthwise conv & im2col ops into one single ggml_conv_1d op that takes groups=1 or groups=model_dim as an argument
(b) supporting fp32 for im2col. I currently run into issues when using im2col for our model, which uses fp32 all the way down, so the cause is likely related to fp32 <-> fp16 casts; still investigating.
(4) To convert fairseq2 checkpoints to ggml format, our script ggml_convert.py relies on the third-party ggml-python library https://github.com/abetlen/ggml-python (we keep a copy of ggml.py in our folder). Is there a plan to add the Python bindings to the ggml repo, so we could make sure they stay in sync?
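To make question (3a) more concrete, here is a minimal reference sketch of what a single grouped 1D convolution could compute. It is plain C++ and not ggml code: the function name conv_1d_grouped, the row-major layouts, and the restriction to stride 1 with no padding or dilation are all assumptions made for illustration. With groups=1 it reduces to a regular conv, and with groups equal to the channel count (model_dim) it reduces to a depthwise conv.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not a ggml API): naive grouped 1D convolution.
// Assumptions: stride 1, no padding, no dilation, row-major layouts:
//   input:  [c_in][len]
//   weight: [c_out][c_in / groups][k]
//   output: [c_out][len - k + 1]
std::vector<float> conv_1d_grouped(const std::vector<float>& input,
                                   const std::vector<float>& weight,
                                   int c_in, int c_out, int len, int k, int groups) {
    assert(groups > 0 && c_in % groups == 0 && c_out % groups == 0);
    assert(len >= k);
    const int in_per_group  = c_in / groups;   // input channels seen by each output channel
    const int out_per_group = c_out / groups;  // output channels produced per group
    const int out_len       = len - k + 1;

    std::vector<float> output(static_cast<std::size_t>(c_out) * out_len, 0.0f);
    for (int oc = 0; oc < c_out; ++oc) {
        const int g = oc / out_per_group;      // group this output channel belongs to
        for (int t = 0; t < out_len; ++t) {
            float acc = 0.0f;
            for (int ic = 0; ic < in_per_group; ++ic) {
                const int in_ch = g * in_per_group + ic;  // only channels of group g
                for (int j = 0; j < k; ++j) {
                    acc += input[in_ch * len + t + j] *
                           weight[(oc * in_per_group + ic) * k + j];
                }
            }
            output[oc * out_len + t] = acc;
        }
    }
    return output;
}
```

Calling it with groups=1 mixes all input channels into each output channel, while groups=c_in (with c_out=c_in) filters each channel independently. That is the behavior a unified op would need to cover for both existing code paths.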
Also, any comments on the best path to integrate unity.cpp with the awesome ggml would be appreciated. Thanks in advance!