F The Docs

Exactly what it sounds like

P.S.: No offence is intended to documentation writers, nor to any documentation. I love the effort you guys put into writing good, (ahm) easy-to-follow manuals for software ❤️

This was created as a fun side project to learn some RAG. Please do not get mad at me if this doesn't meet your expectations.

Huh?

Do you ever get tired of just reading documentation?
Do you wish for a program that you can just ask questions and get answers from?
Well, this piece of majesticity is (probably) just for you!

So what IS FTheDocs?

Glad you asked. FTheDocs is a super advanced documentation querying tool that you can use to get information out of documentation faster and (maybe) more efficiently.
It first embeds all your docs. Then it embeds your query and finds the piece of documentation whose embedding is closest to the embedding of the query.
In short, it's a vector database querying tool.
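
To make "closest embedding" a bit more concrete, here is a minimal sketch in plain Python. The three-number vectors and the `closest_doc` helper are made up for illustration. FTheDocs relies on a real embedding model and a vector database, so its internals will differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def closest_doc(query_vec, doc_vecs):
    """Return the key of the document whose vector is most similar to the query."""
    return max(doc_vecs, key=lambda name: cosine_similarity(query_vec, doc_vecs[name]))

# Toy embeddings (hypothetical numbers, not produced by any real model).
docs = {
    "how to open a socket": [0.9, 0.1, 0.0],
    "how to close a socket": [0.1, 0.9, 0.0],
    "byte order explained": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend this is the embedded question

best = closest_doc(query, docs)
print(best)
```

Cosine similarity is only one common way to compare embeddings. Whatever metric the underlying vector database uses, the idea is the same: the query becomes a vector, and the nearest document vector wins.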

How does it work?

You first stuff the entire documentation (which you should scrape and store in either a .txt or a .json file) into FTheDocs.
FTheDocs then parses those documents and builds a 'collection' out of them, taking into account any settings you have given it (YOU CAN CUSTOMIZE IT).
Then it drops you into 'asking mode', where you can ask the collection whatever you want to know from the docs.

Oooook, how do I get started?

THAT'S EASY. Just git clone this repo :)

git clone https://github.com/muaaz-ur-habibi/fthedocs.git

Then just go into the directory and run the command

python fthedocs.py --help

to be presented with the help menu. Or just read the documentation below for more details.

Some Features:

Pretty Console UI using Rich library
Question-Answer style querying system
Verbose output of whatever process is currently on-going (still working on this)
Settings to allow the program to be fitted according to your documentation
Free AND Open-source

Documentation:

Basic Usage

When cloning the repo, you also clone a test.txt and a test.json file. These are scraped versions of Beej's C Sockets Guide, and they are the test documents that I used for testing FTheDocs. You can use them to play around with it as well.

python fthedocs.py --file test.txt

This is the most basic way to use FTheDocs. This command will load the test.txt file and put you in asking mode using the default settings.

Using JSON

The argument `--file` is used for .txt files. To use .json files, use the argument `--json` to specify a .json file
When using JSON, the `--json-path` argument becomes compulsory

JSON Path

This is the sequence of keys FTheDocs needs to follow inside the .json file in order to reach the target text, which it then converts into a list of texts.
Think of it like this:

{
  "main": {
    "key_1": {
      "key_2": ["target_text_as_list"]
    }
  }
}

In order to reach all the desired texts, in this case ["target_text_as_list"], FTheDocs needs to take the path 'main->key_1->key_2'

Naturally, there will be limitations. In this case:

The path MUST include a 'LIST' parameter for FTheDocs to iterate over. Whether that list is a list of dictionaries or a list of strings doesn't matter
Multiple paths can NOT be specified in a single run

To specify this, use the argument: `--json-path "PATH|TO|LIST|OF|TEXTS"`

It also helps to know that the LIST parameter is how you tell FTheDocs where in the path the list sits (in other words, 'here is the list of dictionaries').

Alternatively, if you have a list of lists and only wish to use one of them, LIST can be indexed like any list (for example, LIST[0] picks the first element, at index 0).

Too much to swallow? You bet. Go ahead and open the test.json file for me. I'll show you a real example.

In here, you see there is a main dictionary, which has a "css" key, whose value is a LIST. Inside that LIST is another LIST, only one tho. That LIST is a list of dictionaries, with different HTML element properties. But what we are looking for is the "content" key of those dictionaries.

What will be the json path of this file?

Think about it for a second...

That would be "css|LIST[0]|LIST|content"

First FTheDocs goes into 'css'. There it finds a LIST, but we only need the 0th element, so we specify 'LIST[0]'. After that comes another LIST, this one containing all the dictionaries, and the key we need from each of them is 'content'.
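
If it helps, the walk described above can be sketched in a few lines of Python. This is not the code FTheDocs uses. `resolve_json_path` is a hypothetical helper, the sample data (including the "tag" key) is invented to match the shape described for test.json, and the handling of `LIST` and `LIST[n]` is an assumption based on the explanation above.

```python
def resolve_json_path(data, path):
    """Follow a 'key|LIST[0]|LIST|key' style path and return the values reached."""
    current = [data]  # every node reached so far
    for step in path.split("|"):
        next_nodes = []
        for node in current:
            if step == "LIST":
                # Bare LIST: fan out over every element of the list.
                next_nodes.extend(node)
            elif step.startswith("LIST[") and step.endswith("]"):
                # LIST[n]: keep only the element at index n.
                next_nodes.append(node[int(step[5:-1])])
            else:
                # Anything else is treated as a dictionary key.
                next_nodes.append(node[step])
        current = next_nodes
    return current

# Hypothetical data shaped like the description of test.json.
sample = {
    "css": [
        [
            {"tag": "p", "content": "first paragraph"},
            {"tag": "p", "content": "second paragraph"},
        ]
    ]
}

texts = resolve_json_path(sample, "css|LIST[0]|LIST|content")
print(texts)
```

The result is a flat list of the "content" strings, which is the kind of list of texts the path is meant to lead to.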

Limits

You can also specify a starting and an ending point for the part of the document to be added. In most cases this is a range of lines. This is also a command-line argument.
The syntax is: `starting_point:ending_point`
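
As a rough illustration of what such a limit does, here is a small Python sketch. `apply_limit` is a hypothetical helper, and it assumes zero-based line numbers with an exclusive end, like a Python slice. FTheDocs may count differently, so check the help menu before relying on exact boundaries.

```python
def apply_limit(lines, limit):
    """Keep only the lines selected by a 'start:end' string."""
    start_text, end_text = limit.split(":")
    start, end = int(start_text), int(end_text)
    # Assumption: zero-based start, exclusive end (Python slice semantics).
    return lines[start:end]

lines = ["line 0", "line 1", "line 2", "line 3", "line 4"]
selected = apply_limit(lines, "1:3")
print(selected)
```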

Settings

CUSTOMIZING (kinda)

There are quite a few things you can change. Some of them directly impact the results. Others not so much or not at all

Changing the collection name
Changing the document I.D name
Changing the parsing separator
Changing the amount of queried results
Concatenating a set number of documents
Changing the concatenating character
Showing these settings when building the collection

Just from reading, I'm pretty sure you can figure out which ones have an impact on the result.

Collection Name:
This changes the created collection's name.
Document I.D Name:
This changes the document I.D starting string
Parsing separator:
This changes the splitting criteria for each line. So if this is a '.', each line of the text/json file will be split further on each '.'. Can be useful for more separation
Amount of Queried Results:
Changes the number of results that are returned back to the user, usually in ascending order of close match. NOTE: setting this to anything other than 1 will disable the 'Query around' functionality (explained later on)
Concatenating Docs:
You can also concatenate an integer number of docs together, to create a bigger document. Think of it as the opposite of Parsing Separator
Change Concatenation Character:
Concatenate on a custom character. E.g. with the docs 'document 1' and 'document 2' and the character '. ' (spaces matter), the concatenated document will be 'document 1. document 2'
Showing the Settings:
Doesn't do much. Just displays the settings when building the collection, in case you realise you messed up a setting or two
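
Since the parsing separator and the concatenation settings pull in opposite directions, a tiny Python sketch may help. Both function names are hypothetical, and trimming whitespace and dropping empty pieces is an assumption made here. This only illustrates the idea and is not how FTheDocs implements it.

```python
def split_on_separator(lines, separator):
    """Split every line further on the separator, dropping empty pieces."""
    pieces = []
    for line in lines:
        # Assumption: surrounding whitespace is trimmed from each piece.
        pieces.extend(p.strip() for p in line.split(separator) if p.strip())
    return pieces

def concatenate_docs(docs, group_size, joiner):
    """Join every group_size consecutive docs into one bigger doc."""
    return [
        joiner.join(docs[i:i + group_size])
        for i in range(0, len(docs), group_size)
    ]

split_docs = split_on_separator(["document 1. document 2", "document 3"], ".")
merged = concatenate_docs(split_docs, 2, ". ")
print(split_docs)
print(merged)
```

Smaller pieces tend to give more precise matches, while bigger pieces carry more context in each result, so it is worth trying a few combinations on your own docs.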

Those were the basics of FTheDocs. In case you don't understand something OR encounter an issue/problem, you can always open up an issue, and I'll make sure to find some time to respond :)

Limitations:

No file types supported other than .txt and .json
Cannot save a previous collection
Cannot scrape the documentation for you
Cannot format the documentation for you (in json)
It isn't an AI, so you can't ask it just anything
Cannot give you emotional support (I tried)
