Releases: tidyverse/ragnar
ragnar 0.2.1
- `ragnar_register_tool_retrieve()` now registers a tool that will not return
  previously returned chunks, enabling the LLM to perform deeper searches of
  a ragnar store with repeated tool calls (#106).
- Updates for ellmer v0.3.0 and duckdb v1.3.1 (#99).
- Improved docs and error message in `ragnar_store_insert()` (@mattwarkentin, #88).
- `ragnar_find_links()` can now parse `sitemap.xml` files. It also gains a
  `validate` argument, allowing for sending a `HEAD` request to each link and
  filtering out broken links (#83).
- `ragnar_inspector()` now renders all URLs as clickable links in the chunk markdown
  viewer, even if the URL is not a formal markdown link (#82).
- Before running examples and tests, we now check if ragnar can load DuckDB extensions.
  This fixes issues in environments where DuckDB pre-built binaries for extensions are not
  compatible with the installed DuckDB version (#94).
- Added `embed_lm_studio` to use LMStudio as an embedding provider (#100).
- Fixed a bug causing `ragnar_retrieve()` to fail when documents were inserted without
  an origin (#102).
- We now suppress a "Couldn't find ffmpeg or avconv" warning when importing markitdown while
  using `read_as_markdown()`. The warning would only be relevant for users doing
  audio transcription (#103).
- Added `embed_google_gemini` to use the Google Gemini API as an embedding provider (#105).
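
As a rough illustration of the retrieval tool described in the first item above, the sketch below registers a store with an ellmer chat. It assumes a store has already been built at the hypothetical path `example.ragnar.duckdb` and that an OpenAI API key is configured; the question text is only a placeholder.

```r
library(ragnar)

# Assumption: a store was previously created and indexed at this
# (hypothetical) location.
store <- ragnar_store_connect("example.ragnar.duckdb", read_only = TRUE)

# Any ellmer chat object should work here; chat_openai() is just one option.
chat <- ellmer::chat_openai()

# Register the store as a retrieval tool the model can call. As of 0.2.1,
# repeated tool calls should not return chunks that were already returned.
ragnar_register_tool_retrieve(chat, store, top_k = 10)

# Placeholder question; the model decides when to call the tool.
chat$chat("How can I subset a data frame with a logical vector?")
```

Because previously returned chunks are excluded, a model that calls the tool several times for the same topic should see new material on each call rather than the same top results.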
ragnar 0.2.0
- `ragnar_store_create()` gains a new argument: `version`, with default `2`.
  Store version 2 adds support for chunk deoverlapping on retrieval and automatic chunk augmentation with headings.
  To support these features, the internal schema and ingestion requirements are different.
  See `markdown_chunk()` and new S7 classes `MarkdownDocument` and `MarkdownDocumentChunks`.
  Backwards compatibility is maintained with `version = 1` (#58, #39, #36).
- `ragnar_store_create()` now supports Date and POSIXct classes supplied to `extra_cols`.
- `ragnar_store_create()` now supports remote MotherDuck databases specified with `md:<dbname>` as
  the `location` argument (#50).
- `ragnar_retrieve()` and friends gain a `filter` argument, adding support for efficiently
  filtering retrieval results.
- `ragnar_retrieve_bm25()` gains arguments `b`, `k`, and `conjunctive` (#56).
- `ragnar_retrieve_vss()` gains argument `query_vector`, supporting workflows that preprocess the query string before embedding.
- The set of valid `method` choices for `ragnar_retrieve_vss()` has been narrowed to ensure that an `HNSW`
  index scan is used.
- Passing a `tbl(store)` to `ragnar_retrieve()` is deprecated.
- New chunker `markdown_chunk()` with support for chunk heading context generation,
  semantic boundary selection, overlapping chunks, document segmentation, and more (#56).
- New function `ragnar_chunks_view()` for quickly previewing chunks (#42).
- `ragnar_register_tool_retrieve()` gains optional `name` and `title` arguments
  to allow for more descriptive tool registration. These values can also be set
  in `ragnar_store_create()` (#43).
- `ragnar_read()` and `read_as_markdown()` now accept paths
  that begin with `~` (@topepo, #46, #48).
- Changes to `read_as_markdown()` HTML conversion (#40, #51):
  - New arguments `html_extract_selectors` and `html_zap_selectors` provide a flexible way to
    exclude some HTML page elements from being included in the converted markdown.
  - Code blocks now include the language, if available.
  - Fixed handling of nested code fences in markdown output.
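
A minimal sketch of how several of the 0.2.0 changes above might fit together: a version 2 store, the new `markdown_chunk()` chunker, and the new `ragnar_retrieve_bm25()` arguments. The store path, the source URL, and the argument values are illustrative assumptions rather than recommendations, and the embedding call assumes an OpenAI API key is configured.

```r
library(ragnar)

# Hypothetical on-disk location for the store.
store <- ragnar_store_create(
  "example.ragnar.duckdb",
  embed = \(x) embed_openai(x, model = "text-embedding-3-small"),
  version = 2  # the default in 0.2.0; spelled out here for emphasis
)

# Example page only; any URL or local file that read_as_markdown()
# can convert should work.
chunks <- read_as_markdown("https://r4ds.hadley.nz/base-R.html") |>
  markdown_chunk()

ragnar_store_insert(store, chunks)
ragnar_store_build_index(store)

# BM25 retrieval with the new arguments. The values of k and b shown are
# the conventional BM25 settings, used here purely for illustration.
ragnar_retrieve_bm25(
  store,
  "subset a data frame with a logical vector",
  top_k = 3,
  k = 1.2,
  b = 0.75,
  conjunctive = FALSE
)
```

In standard BM25, `k` controls how quickly repeated term matches saturate and `b` controls how strongly scores are normalized by document length; a conjunctive query requires every query term to be present in a chunk.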
ragnar 0.1.0
- Initial CRAN submission.