GitHub Search: Platform used to crawl, store and present projects from GitHub, as well as any statistics related to them
Updated Jun 3, 2024 - Java
Make datasets for RVC with a Gradio web UI
This Python script collects weather data automatically using GitHub Actions. It runs as a scheduled job, fetches the latest weather information for Bengaluru from the Open-Meteo API, and pushes the changes to the repository.
Marktplaats.nl (Dutch Classifieds) Listing Scraper
EDA tools and datasets generator for ML projects
[ACL 2024] CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
scalexi is a versatile open-source Python library, optimized for Python 3.11+, that focuses on facilitating low-code development and fine-tuning of diverse Large Language Models (LLMs).
A collection of publicly available Korean instruction datasets for training language models.
MPI-based distributed downloading tool for retrieving data from diverse domains.
Docker image gathering packers and tools for making datasets of packed executables and training machine learning models for packing detection
A little Python script for scraping data from grabcraft. The data can be used for things like machine learning and data analysis projects, and can be converted to litematica files with https://github.com/RandomGamingDev/grabcraft-to-schema. (Sadly, I can't release the dataset since you aren't allowed to share downloaded content.)
Data release for the ImageInWords (IIW) paper.
Reproduction of the 3D rotation augmentation of the 300W-LP face pose dataset
Building Training Datasets for Deep Learning Models in Software Engineering
An algorithm that generates conversation-history datasets for fine-tuning your own custom LLM or chatbot
Download YouTube video descriptions and comments without using the YouTube API.
Synthesising and embedding content from the trending stories on Hacker News
NFStream: a Flexible Network Data Analysis Framework.
Generate unicode glyph PNG images from FreeType fonts.