🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Updated Dec 13, 2024 - Python
NCRF++, a Neural Sequence Labeling Toolkit. Easy to apply to any sequence labeling task (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Content-Addressable Data Synchronization Tool
Alternative casync implementation
A package for parsing PDFs and analyzing their content using LLMs.
A TensorFlow implementation of a Neural Sequence Labeling model that can tackle sequence labeling tasks such as POS Tagging, Chunking, NER, Punctuation Restoration, etc.
The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate the process of conducting experiments and evaluations using Azure Cognitive Search and the RAG pattern.
A fast and lightweight pure Python library for splitting text into semantically meaningful chunks.
An LLM GUI application that lets you interact with your files, offering dynamic parameters that can modify response behavior at runtime.
webpack 2, react hotloader 3, react router v4, code splitting and more
📑 Split Laravel jobs into multiple separate job chunks
An asynchronous event-driven HTTP client based on netty.
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Fast multi-threaded content-dependent chunking deduplication for Buffers in C++ with a reference implementation in JavaScript. Ships with extensive tests, a fuzz test and a benchmark.
Labelling Sequential Data in Natural Language Processing with R - using CRFsuite
Extract and align grammar patterns from English sentences.
FastCDC implementation in Python https://pypi.org/project/fastcdc/