noco-ai edited this page Feb 24, 2024 · 23 revisions

Overview

Spell Book is a project to create a UI for interacting with different types of AI models. It focuses on LLMs and on using them in conjunction with other AI models to create cool applications.

Features

AI Assistant

The AI assistant integrates large language models such as Llama 2 and Mixtral for engaging in dynamic conversations.

  • Function Calling and Model Routing: Users can ask questions in a chat session that invoke a function. These functions can call a third-party API or another AI model to accomplish tasks outside an LLM's abilities, such as generating music or artwork.
  • Conversation Customization: Users can modify conversation settings to adjust generation and routing preferences, enhancing personalization.
  • Conversation Management: Features include the ability to save, access, and delete conversations, with each digital ally maintaining its own list for better organization.
  • Speech Recognition and TTS: Supports voice interactions and can provide vocal responses through xTTS.
  • Content Regeneration and Editing: Allows for regenerating or editing conversation turns to correct the dialogue or adjust the flow as needed.
  • Shortcut and Pinning Functions: Enables quick access to chat abilities or language models via shortcuts and allows for function or skill pinning for efficient future routing.
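
The routing described above (pinning, shortcuts, then intent detection) can be pictured with a small sketch. Everything here is an assumption made for illustration: the `ChatAbility` shape, the `routeMessage` function, the ability names, and the 🎨 and 🎵 shortcuts are hypothetical, and a plain keyword match stands in for whatever intent detection Spell Book actually uses. Only the 🧪 shortcut comes from this page.

```typescript
// Hypothetical shape of a routable chat ability; not Spell Book's real types.
interface ChatAbility {
  name: string;
  shortcut: string;   // marker typed in the message to force this ability
  keywords: string[]; // stand-in for real intent detection
}

// Example registry. Only the 🧪 shortcut is taken from the wiki; the rest are made up.
const abilities: ChatAbility[] = [
  { name: "image_generator", shortcut: "🎨", keywords: ["draw", "image"] },
  { name: "music_generator", shortcut: "🎵", keywords: ["music", "song"] },
  { name: "dynamic_functions", shortcut: "🧪", keywords: [] },
];

// Returns the name of the ability to call, or null to let the language model answer.
function routeMessage(
  message: string,
  available: ChatAbility[],
  pinned?: string
): string | null {
  // 1. A pinned ability takes priority over everything else.
  if (pinned !== undefined) {
    const match = available.filter((a) => a.name === pinned)[0];
    if (match) return match.name;
  }

  // 2. An explicit shortcut in the message selects its ability directly.
  const byShortcut = available.filter(
    (a) => message.indexOf(a.shortcut) !== -1
  )[0];
  if (byShortcut) return byShortcut.name;

  // 3. Otherwise guess the intent (here: naive keyword matching).
  const lower = message.toLowerCase();
  const byKeyword = available.filter((a) =>
    a.keywords.some((k) => lower.indexOf(k) !== -1)
  )[0];
  return byKeyword ? byKeyword.name : null;
}
```

For example, `routeMessage("Please draw a castle", abilities)` selects the hypothetical image generator, while the same message with `"music_generator"` pinned is sent to the music generator instead. A message that matches nothing returns `null`, meaning the language model handles it on its own.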

Application Manager

The Application Manager allows administrators to enable and disable applications and chat abilities for all users. The interface is simple: each application or chat ability is listed with a description and an install/uninstall button. Some applications and chat abilities also have configuration values, such as API keys or default models to route to.

Chat Abilities

Also known as function calling. When this is enabled, the UI attempts to decipher the intent of the user and call a TypeScript class that can perform tasks such as generating an image or finding the real-time weather for a city. As of v0.3.0, the abilities include:

  • Bing News: Uses the Bing News API to search for current news stories, then attempts to read and summarize the news articles.
  • Current Weather: Uses the AccuWeather API to get the current conditions for a location by name.
  • Dynamic Functions: The dynamic function ability lets you create simple TypeScript functions using GPT-4, which are then stored and reused when you ask similar questions. The out-of-the-box database includes 35+ generated functions for solving common math problems, something small LLMs are terrible at. To use the skill, append 🧪 to the question and GPT-4 will try to generate a function to solve the problem instead of the language model handling it.
  • FTP Transfer: This chat ability can upload files from the workspace to a server using FTP.
  • Image Analyzer: The image analyzer ability lets you run classification and object detection on images in the chat session. If no file name is given the ability will use the last image in the chat session.
  • Image Generator: This ability allows users to generate images based on prompts. The quality and accuracy of the generated images are dependent on the provided negative prompt and settings such as steps, guidance scale, height, and width.
  • Language Translator: The translator ability uses the Alma models to perform translation from one language to another. Specify the text you want to translate and the language you want to translate it into; the model can reliably guess the input language.
  • Music Generator: This ability uses the MusicGen models from Meta to create music clips of up to 30 seconds based on a text prompt.
  • Telnyx SMS: The Telnyx SMS and MMS ability allows you to send outgoing SMS and MMS messages using the Telnyx API. This ability requires that you have a Telnyx API key and an outgoing phone number configured.
  • Text to Speech: The text to speech ability takes a text string as input and outputs a WAV file of a human voice speaking that text.
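
As a rough illustration of the Dynamic Functions flow described in the list above, the sketch below checks for the 🧪 marker and looks for a stored function that fits the question. It is only a sketch under stated assumptions: the `StoredFunction` shape, the helper names, the two sample entries, and the word-overlap score are all hypothetical. This page does not say how Spell Book decides that two questions are similar, and the GPT-4 generation step is left out entirely.

```typescript
// Hypothetical record for a previously generated function.
interface StoredFunction {
  description: string;
  fn: (...args: number[]) => number;
}

const MARKER = "🧪";

// True when the question ends with the 🧪 marker.
function wantsDynamicFunction(question: string): boolean {
  const trimmed = question.trim();
  return trimmed.slice(-MARKER.length) === MARKER;
}

// Lowercased, de-duplicated alphanumeric words; the marker is dropped by the split.
function tokens(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0)
    .filter((t, i, all) => all.indexOf(t) === i);
}

// Fraction of the description's words that also appear in the question.
// This is an assumed stand-in for the real similarity check.
function coverage(question: string, description: string): number {
  const q = tokens(question);
  const d = tokens(description);
  if (d.length === 0) return 0;
  return d.filter((t) => q.indexOf(t) !== -1).length / d.length;
}

// Returns the best stored function at or above the threshold, or null.
// On null, a real system would ask GPT-4 to generate a new function.
function findStored(
  question: string,
  store: StoredFunction[],
  threshold: number = 0.6
): StoredFunction | null {
  let best: StoredFunction | null = null;
  let bestScore = 0;
  for (const entry of store) {
    const score = coverage(question, entry.description);
    if (score >= threshold && score > bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  return best;
}

// Two made-up entries standing in for the out-of-the-box database.
const store: StoredFunction[] = [
  { description: "percentage of a number", fn: (p, n) => (p / 100) * n },
  { description: "area of a circle from radius", fn: (r) => Math.PI * r * r },
];
```

With this store, a question such as "what is the area of a circle with radius 3 🧪" reuses the stored circle-area function, while an unrelated question returns `null`, which is the point where a new function would be generated and saved.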