A German instruct data set with about 50k samples. Generated based on the Alpaca approach. Useful for LLM finetuning.

perfood/alpaca-data-german

alpaca-data-german

English

This is a German instruct data set with about 50,000 samples. It was generated with GPT-3.5, following the Stanford Alpaca approach. It can be used to fine-tune open-source text-completion LLMs (e.g. Llama) to follow instructions in German.
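
As a rough illustration of how such samples might be prepared for fine-tuning, here is a minimal sketch. It assumes the records follow the Stanford Alpaca layout (`instruction`, `input`, `output`) and are stored as a JSON list; neither the field names nor the file format are confirmed by this README. The German section markers in the prompt template and the helper names `build_prompt` and `load_samples` are hypothetical choices for this example.

```python
import json


def build_prompt(record):
    """Join one instruct sample into a single training string.

    Assumes Alpaca-style keys ("instruction", "input", "output");
    the section markers below are an illustrative choice, not a
    template prescribed by the data set.
    """
    instruction = record["instruction"].strip()
    context = record.get("input", "").strip()
    response = record["output"].strip()

    parts = ["### Anweisung:\n" + instruction]
    if context:
        # Only include the input section when the sample has one.
        parts.append("### Eingabe:\n" + context)
    parts.append("### Antwort:\n" + response)
    return "\n\n".join(parts)


def load_samples(path):
    """Read a JSON file that holds a list of sample dicts (assumed format)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Made-up sample for illustration only; it is not taken from the data set.
sample = {
    "instruction": "Nenne die Hauptstadt von Deutschland.",
    "input": "",
    "output": "Berlin",
}
print(build_prompt(sample))
```

The resulting strings can then be tokenized and passed to whichever fine-tuning framework is in use; samples with an empty `input` simply omit that section.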

Citation

Please cite this data set as follows:

Zauleck, Julius P. P.; Thieme, Nils; Witt, Oliver; Perfood. (2023). alpaca-data-german - 50k German instruct samples. GitHub Repository

