- Add the following models:
- Provide the chat history in the `context_aware_answer`.
- Experiment with agentic patterns:
- https://weaviate.io/blog/what-is-agentic-rag
- https://github.com/neural-maze/agentic_patterns
- https://www.youtube.com/watch?v=ApoDzZP8_ck
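As a starting point for the agentic-pattern experiments above, here is a minimal sketch of the reflection pattern (generate, critique, revise). The `llm` callable and the "OK" approval signal are assumptions for illustration; swap in a real chat client (e.g. Ollama) for actual use.

```python
# Minimal sketch of the "reflection" agentic pattern: a generator step
# followed by a critic step, looping until the critic approves.
# `llm` is a stand-in for any chat-completion call.

def reflect_loop(llm, task: str, max_rounds: int = 3) -> str:
    draft = llm(f"Answer the task:\n{task}")
    for _ in range(max_rounds):
        critique = llm(f"Critique this answer to '{task}':\n{draft}")
        if "OK" in critique:  # critic signals approval (assumed convention)
            break
        draft = llm(
            f"Task: {task}\nAnswer: {draft}\n"
            f"Critique: {critique}\nRevise the answer."
        )
    return draft

# Toy LLM stub so the loop runs without a model server.
def toy_llm(prompt: str) -> str:
    return "OK" if "Critique" in prompt else "42"

print(reflect_loop(toy_llm, "What is 6*7?"))  # -> 42
```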
- Function calling in LLMs: https://medium.com/@danushidk507/function-calling-in-llm-e537b286a4fd
- Ollama Tool support
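A rough sketch of what Ollama tool support could look like from Python. The tool schema follows the OpenAI-style JSON shape that the Ollama client's `chat(..., tools=[...])` accepts; the actual call is left commented out since it needs a running Ollama server, and the model name is an assumption.

```python
# Sketch of function calling with the Ollama Python client.

def get_weather(city: str) -> str:
    """Toy tool implementation."""
    return f"Sunny in {city}"

# OpenAI-style tool schema, as accepted by ollama.chat(tools=[...]).
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# import ollama  # requires a running Ollama server
# resp = ollama.chat(model="llama3.1", messages=[...], tools=[weather_tool])
# for call in resp["message"].get("tool_calls", []):
#     if call["function"]["name"] == "get_weather":
#         print(get_weather(**call["function"]["arguments"]))

print(get_weather("Oulu"))  # -> Sunny in Oulu
```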
- Google Search with LLM:
- https://huggingface.co/blog/nand-tmp/google-search-with-llm
- https://blog.nextideatech.com/how-to-use-google-search-with-langchain-openai/
- https://medium.com/@reynxzz/rag-with-gemini-google-search-and-bq-vector-search-for-content-personalization-08fe7dab6b33
- https://newspaper.readthedocs.io/en/latest/
- https://github.com/AstraBert/PrAIvateSearch
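For the search-grounded answering idea above, a small sketch: pull page text with `newspaper` (newspaper3k) and assemble a grounding prompt. The prompt format is an assumption; the fetch step is kept in a function because it needs the package installed and network access.

```python
# Sketch: fetch a page with newspaper3k and build a grounding prompt.

def build_grounded_prompt(question: str, snippets: list[str]) -> str:
    # Number each snippet so the model can cite sources as [1], [2], ...
    context = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    return (
        "Answer the question using only the sources below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

def fetch_article_text(url: str) -> str:
    # Requires `pip install newspaper3k` and network access.
    from newspaper import Article
    article = Article(url)
    article.download()
    article.parse()
    return article.text

prompt = build_grounded_prompt("Who won?", ["Team A won 2-1."])
print(prompt.splitlines()[0])
```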
- System-level safety:
- https://huggingface.co/meta-llama/Llama-Guard-3-1B (GGUF: https://huggingface.co/tensorblock/Llama-Guard-3-1B-GGUF)
- Llama Guard 3-1B is a Llama-3.2-1B model fine-tuned for content safety classification.
- https://huggingface.co/meta-llama/Llama-Guard-3-8B
- Llama Guard 3-8B is a Llama-3.1-8B model fine-tuned for content safety classification.
- https://huggingface.co/meta-llama/Llama-Guard-3-11B-Vision
- Llama Guard 3 Vision is a Llama-3.2-11B model fine-tuned for content safety classification.
- Experiment with multimodal LLMs using Llama 3.2 Vision 11B (text + images in / text out):
  - The model is currently not supported by `llama.cpp` (ggerganov/llama.cpp#9643). It is supported by Ollama, so we need to use the Python API to create an additional client.
  - Llama 3.2 Vision 11B requires at least 8 GB of VRAM, and the 90B model requires at least 64 GB of VRAM.
  - Take also a look here: https://huggingface.co/unsloth
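A sketch of the additional Ollama client mentioned above: the Python client's chat messages take an `images` field for multimodal models. The model tag `llama3.2-vision` is an assumption; the network call is kept in a function since it needs a running server.

```python
# Sketch of a multimodal request through the Ollama Python client.

def vision_message(prompt: str, image_path: str) -> dict:
    # Ollama chat messages carry images via the `images` field.
    return {"role": "user", "content": prompt, "images": [image_path]}

def describe_image(image_path: str) -> str:
    import ollama  # requires `ollama pull llama3.2-vision` (~8 GB VRAM)
    resp = ollama.chat(
        model="llama3.2-vision",
        messages=[vision_message("Describe this image.", image_path)],
    )
    return resp["message"]["content"]

msg = vision_message("Describe this image.", "photo.png")
print(msg["images"])  # -> ['photo.png']
```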
- Explore long term memory:
- https://help.openai.com/en/articles/8590148-memory-faq
- https://ai.gopubby.com/long-term-memory-for-agentic-ai-systems-4ae9b37c6c0f
- https://github.com/mem0ai/mem0
- Explore also the structure of the repo https://github.com/mem0ai/mem0/tree/main/mem0 and the vector store implementation.
- https://github.com/letta-ai/letta
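To get a feel for the `add`/`search` surface that long-term-memory libraries like mem0 expose, here is a toy stand-in with naive token-overlap retrieval instead of a real vector store. Only the interface shape mirrors those libraries; everything else is illustrative.

```python
# Toy long-term memory: add facts, retrieve the most relevant ones.

class TinyMemory:
    def __init__(self) -> None:
        self._items: list[str] = []

    def add(self, text: str) -> None:
        self._items.append(text)

    def search(self, query: str, k: int = 3) -> list[str]:
        # Rank stored items by token overlap with the query
        # (a real implementation would use vector similarity).
        q = set(query.lower().split())
        scored = sorted(
            self._items,
            key=lambda t: len(q & set(t.lower().split())),
            reverse=True,
        )
        return scored[:k]

mem = TinyMemory()
mem.add("User prefers answers in Italian")
mem.add("User is allergic to peanuts")
print(mem.search("language the user prefers", k=1))
# -> ['User prefers answers in Italian']
```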
- Investigate Chroma batch querying: https://github.com/langchain-ai/langchain/blob/907c758d67764385828c8abad14a3e64cf44d05b/libs/community/langchain_community/vectorstores/chroma.py#L42
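On the batch-querying point: Chroma's `collection.query` already accepts a list of `query_texts` and returns results per query, so batching is mostly about chunking a large query list into reasonably sized calls. A sketch, with the Chroma call kept in a function since it needs `chromadb` installed:

```python
# Sketch of batch querying against a Chroma collection.

def chunked(items: list[str], size: int) -> list[list[str]]:
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i : i + size] for i in range(0, len(items), size)]

def batch_query(collection, queries: list[str],
                n_results: int = 4, batch_size: int = 32):
    results = []
    for batch in chunked(queries, batch_size):
        # One call per chunk; Chroma handles the per-query fan-out.
        results.append(collection.query(query_texts=batch, n_results=n_results))
    return results

print(chunked(["a", "b", "c"], 2))  # -> [['a', 'b'], ['c']]
```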
- Make a Docker container.
- Test Flash attention:
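One way to test Flash Attention with Hugging Face transformers is to pass `attn_implementation="flash_attention_2"` to `from_pretrained` (this needs the `flash-attn` package and a supported GPU); with llama.cpp/Ollama it is a runtime flag instead. A sketch, with the kwargs builder separated out so it can be checked without loading a model:

```python
# Sketch: toggling Flash Attention 2 when loading a transformers model.

def load_kwargs(use_flash: bool) -> dict:
    kwargs = {"torch_dtype": "auto", "device_map": "auto"}
    if use_flash:
        # Requires `pip install flash-attn` and an Ampere+ GPU.
        kwargs["attn_implementation"] = "flash_attention_2"
    return kwargs

def load_model(name: str, use_flash: bool = True):
    from transformers import AutoModelForCausalLM  # heavy import, kept local
    return AutoModelForCausalLM.from_pretrained(name, **load_kwargs(use_flash))

print(load_kwargs(True)["attn_implementation"])  # -> flash_attention_2
```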
- Investigate V-RAG (Vision RAG) https://github.com/Softlandia-Ltd/vision-is-all-you-need