
Are there sample CSV files for training data? #102

Closed Answered by psinger
manyam asked this question in Q&A

Yes. When you start the GUI, it generates a sample dataset out of the box. It is a Parquet file rather than a CSV, and you will find it in the data folder.

To see how it is created, you can check this Kaggle notebook: https://www.kaggle.com/code/philippsinger/openassistant-conversations-dataset-oasst1?scriptVersionId=127047926

And this code in the repository: https://github.com/h2oai/h2o-llmstudio/blob/main/app_utils/utils.py#L1888

Answer selected by pascal-pfeiffer