A fast and lightweight pure Python library for splitting text into semantically meaningful chunks.
Updated Jun 1, 2024 - Python
Efficiently handle JSON array files in Node.js with minimal memory usage. Well suited to processing large data volumes without running into memory limitations.
Given a directory of source code, find the project name and contributors, collect the source code, and output it all as JSON chunks with an upper token limit.
The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate the process of conducting experiments and evaluations using Azure Cognitive Search and the RAG pattern.
Alternative casync implementation
🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows
This project is a Streamlit application that allows users to upload PDF files, process them into text chunks, and perform similarity searches on the text. Users can adjust chunking parameters and view the most similar chunks retrieved.
Chunk array of objects by their size in JSON
Gene's SMTP server — receive Internet mail with less fuss
A Python app to practice music: "cut" sheet music into pieces and get better with chunking and deliberate practice.
This repository demonstrates a workflow that integrates LangChain with a vector store (Pinecone) to enable semantic search and question answering using large language models (LLMs).
📑 Split Laravel jobs into multiple separate job chunks
Ongoing - AstroGPT revolutionizes astrology with features like Kundli, daily horoscope, live chat, and a ChatGPT-powered bot. 🚀 Achieved 20% faster loading and 15% better responsiveness, supporting 7 languages and managing 10+ APIs efficiently. 💻 Tailwind CSS for sleek UI, upcoming dynamic search. 🔍 And a lot more!
A file system that can be used to compare different deduplication algorithms.
FastCDC implementation in Python https://pypi.org/project/fastcdc/
Question-Answering App Over Your Own Data with LlamaIndex and Elasticsearch!
Command line front end for longtail synchronization tool
A Webpack integration tool for CoCreate applications, enabling file watching, automated chunking, lazy loading, and file uploading. It leverages CoCreate.config for streamlined project builds and development workflows.