AI MicroCore: A Minimalistic Foundation for AI Applications

MicroCore is a collection of Python adapters for Large Language Models and Vector Databases / Semantic Search APIs. It lets you communicate with these services in a convenient way, makes them easily switchable, and separates business logic from implementation details.

It defines interfaces for features typically used in AI applications, so you can keep your application as simple as possible and try various models and services without changing your application code.

You can even switch between text completion and chat completion models using configuration alone.

Thanks to its LLM-agnostic MCP integration, MicroCore connects MCP tools to any language model, whether it is accessed through an API provider that does not support MCP or run locally via PyTorch inference or arbitrary Python functions.

The basic example of usage is as follows:

from microcore import llm

while user_msg := input('Enter message: '):
    print('AI: ' + llm(user_msg))

🔗 Links

💻 Installation

Install as a PyPI package:

pip install ai-microcore

Alternatively, you may just copy the microcore folder to your project sources root.

git clone git@github.com:Nayjest/ai-microcore.git && mv ai-microcore/microcore ./ && rm -rf ai-microcore

📋 Requirements

Python 3.10 / 3.11 / 3.12 / 3.13 / 3.14

⚙️ Configuring

Minimal Configuration

Having OPENAI_API_KEY in OS environment variables is enough for basic usage.

Similarity search features will work out of the box if you have the chromadb pip package installed.

Configuration Methods

There are a few options available for configuring microcore:

Use microcore.configure(**params)
💡 All configuration options should be available in IDE autocompletion tooltips
Create a .env file in your project root; examples: basic.env, Mistral Large.env, Anthropic Claude 3 Opus.env, Gemini on Vertex AI.env, Gemini on AI Studio.env
Use a custom configuration file: mc.configure(DOT_ENV_FILE='dev-config.ini')
Define OS environment variables

For the full list of available configuration options, you may also check microcore/config.py.

Installing vendor-specific packages

For models that do not work via the OpenAI API, you may need to install additional packages:

Anthropic Claude 3

pip install anthropic

Google Gemini via AI Studio

pip install google-generativeai

Google Gemini via Vertex AI

pip install vertexai

📌 Additionally, to work through Vertex AI you need to install the Google Cloud CLI and configure authorization.

Local language models via Hugging Face Transformers

You will need to install transformers and a deep learning library of your choice (PyTorch, TensorFlow, Flax, etc.).

See transformers installation.

Priority of Configuration Sources

Configuration options passed as arguments to microcore.configure() have the highest priority.
The priority of configuration file options (.env by default or the value of DOT_ENV_FILE) is higher than OS environment variables.
💡 Setting USE_DOT_ENV to false disables reading configuration files.
OS environment variables have the lowest priority.
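
The precedence rules above can be pictured as a layered dictionary merge. The following is a minimal sketch of that idea only, not MicroCore's actual implementation; `resolve_config` and its parameters are hypothetical names introduced for illustration.

```python
def resolve_config(os_env: dict, file_options: dict, explicit: dict,
                   use_dot_env: bool = True) -> dict:
    """Merge configuration sources from lowest to highest priority."""
    merged = dict(os_env)            # lowest priority: OS environment variables
    if use_dot_env:
        merged.update(file_options)  # middle: options from the configuration file
    merged.update(explicit)          # highest: arguments passed to configure()
    return merged


# Illustrative values only
os_env = {"EMBEDDING_DB_HOST": "env-host", "EMBEDDING_DB_PORT": "8000"}
file_options = {"EMBEDDING_DB_HOST": "file-host", "PROMPT_TEMPLATES_PATH": "tpl"}
explicit = {"PROMPT_TEMPLATES_PATH": "my_templates_folder"}

resolved = resolve_config(os_env, file_options, explicit)
without_file = resolve_config(os_env, file_options, explicit, use_dot_env=False)
print(resolved)
print(without_file)
```

With `use_dot_env=False` the file layer is skipped, so the value from the OS environment is used for `EMBEDDING_DB_HOST`.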

Vector Databases

Vector database functions are available via microcore.texts.

ChromaDB

The default vector database is Chroma. To use vector database functions with ChromaDB, you need to install the chromadb package:

pip install chromadb

By default, MicroCore uses the ChromaDB PersistentClient (if the corresponding package is installed). Alternatively, you can run Chroma as a separate service and configure MicroCore to use HttpClient:

from microcore import configure

configure(
    EMBEDDING_DB_HOST='localhost',
    EMBEDDING_DB_PORT=8000,
)

Qdrant

To use vector database functions with Qdrant, you need to install the qdrant-client package:

pip install qdrant-client

Configuration example

from microcore import configure, EmbeddingDbType
from sentence_transformers import SentenceTransformer

configure(
    EMBEDDING_DB_TYPE=EmbeddingDbType.QDRANT,
    EMBEDDING_DB_HOST="localhost",
    EMBEDDING_DB_PORT="6333",
    EMBEDDING_DB_SIZE=384,  # number of dimensions of the SentenceTransformer model used below
    EMBEDDING_DB_FUNCTION=SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2"),
)

🌟 Core Functions

llm(prompt: str, **kwargs) → str

Performs a request to a large language model (LLM).

Asynchronous variant: allm(prompt: str, **kwargs)
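
As a rough illustration of why the asynchronous variant is useful, the sketch below sends several prompts concurrently with `asyncio.gather`. To keep it runnable without an API key, it uses `fake_allm`, a hypothetical stand-in that only mimics the `allm(prompt, **kwargs)` call shape; `ask_all` is likewise an illustrative helper, not part of MicroCore.

```python
import asyncio


async def fake_allm(prompt: str, **kwargs) -> str:
    """Hypothetical stand-in for microcore.allm so the sketch runs offline."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"echo: {prompt}"


async def ask_all(prompts: list[str], ask=fake_allm) -> list[str]:
    """Send all prompts concurrently; results keep the order of `prompts`."""
    return await asyncio.gather(*(ask(p) for p in prompts))


answers = asyncio.run(ask_all(["1+2=", "3+4="]))
print(answers)
```

In a real application you would presumably pass `ask=allm` after `from microcore import allm`.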

from microcore import *

# Will print all requests and responses to console
use_logging()

# Basic usage
ai_response = llm('What is your model name?')

# You may also pass a list of strings as the prompt
# - For chat completion models elements are treated as separate messages
# - For completion LLMs elements are treated as text lines
llm(['1+2', '='])
llm('1+2=', model='gpt-4')

# To specify a message role, you can use a dictionary or message classes
llm(dict(role='system', content='1+2='))
# equivalent
llm(SysMsg('1+2='))

# The returned value is a string
assert '7' == llm([
    SysMsg('You are a calculator'),
    UserMsg('1+2='),
    AssistantMsg('3'),
    UserMsg('3+4='),
]).strip()

# But it contains all fields of the LLM response in additional attributes
for i in llm('1+2=?', n=3, temperature=2).choices:
    print('RESPONSE:', i.message.content)

# To use response streaming you may specify the callback function:
llm('Hi there', callback=lambda x: print(x, end=''))

# Or multiple callbacks:
output = []
llm('Hi there', callbacks=[
    lambda x: print(x, end=''),
    lambda x: output.append(x),
])
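
The comments above say that list prompts are treated as separate messages for chat models and as text lines for completion models. The sketch below illustrates that distinction in plain Python. It is not MicroCore's internal code: `to_chat_messages` and `to_completion_text` are hypothetical helpers, and the default `user` role for plain strings is an assumption.

```python
def to_chat_messages(prompt, default_role: str = "user") -> list[dict]:
    """Chat-style view: every list element becomes a separate message."""
    items = prompt if isinstance(prompt, list) else [prompt]
    messages = []
    for item in items:
        if isinstance(item, dict):
            # Dictionaries already carry an explicit role
            messages.append({"role": item["role"], "content": item["content"]})
        else:
            # Plain strings get the (assumed) default role
            messages.append({"role": default_role, "content": str(item)})
    return messages


def to_completion_text(prompt) -> str:
    """Completion-style view: list elements are joined as text lines."""
    items = prompt if isinstance(prompt, list) else [prompt]
    lines = [i["content"] if isinstance(i, dict) else str(i) for i in items]
    return "\n".join(lines)


chat_view = to_chat_messages(["1+2", "="])
text_view = to_completion_text(["1+2", "="])
print(chat_view)
print(repr(text_view))
```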

tpl(file_path, **params) → str

Renders a prompt template with the given params.

Full-featured Jinja2 templates are used by default.

Related configuration options:

from microcore import configure

configure(
    # 'tpl' folder in current working directory by default
    PROMPT_TEMPLATES_PATH='my_templates_folder',
)

texts.search(collection: str, query: str | list, n_results: int = 5, where: dict = None, **kwargs) → list[str]

Similarity search
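
To make the idea of similarity search concrete, here is a toy sketch that ranks texts by cosine similarity of word-count vectors and returns the top `n_results`. It only illustrates the concept: MicroCore delegates the real work to the configured vector database and embedding function, and the `embed`, `cosine`, and `search` helpers below are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (real setups use a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(texts: list[str], query: str, n_results: int = 5) -> list[str]:
    """Return the n_results texts most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(embed(t), q), reverse=True)
    return ranked[:n_results]


docs = [
    "The cat sat on the mat",
    "Stock prices fell sharply today",
    "A cat and a dog played in the garden",
]
top = search(docs, "cat on a mat", n_results=2)
print(top)
```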

texts.find_one(collection: str, query: str | list) → str | None

Find the most similar text

texts.get_all(collection: str) → list[str]

Return all texts in the collection

texts.save(collection: str, text: str, metadata: dict = None)

Store text and related metadata in the embeddings database

texts.save_many(collection: str, items: list[tuple[str, dict] | str])

Store multiple texts and related metadata in the embeddings database

texts.clear(collection: str)

Clear the collection

API providers and models support

MicroCore supports all models and API providers that expose an OpenAI-compatible API.

List of API providers and models tested with MicroCore:

API Provider	Models
OpenAI	All GPT-4 and GPT-3.5-Turbo models; all text completion models (davinci, gpt-3.5-turbo-instruct, etc.)
Microsoft Azure	All OpenAI models, Mistral Large
Anthropic	Claude 3 models
MistralAI	All Mistral models
Google AI Studio	Google Gemini models
Google Vertex AI	Gemini Pro & other models
Deep Infra	deepinfra/airoboros-70b, jondurbin/airoboros-l2-70b-gpt4-1.4.1, meta-llama/Llama-2-70b-chat-hf, and other models with an OpenAI-compatible API
Anyscale	meta-llama/Llama-2-70b-chat-hf, meta-llama/Llama-2-13b-chat-hf, meta-llama/Llama-7b-chat-hf
Groq	LLaMA2 70b, Mixtral 8x7b, Gemma 7b
Fireworks	Over 50 open-source language models

Supported local language model APIs:

HuggingFace Transformers (see configuration examples here).
Custom local models: provide your own function for chat / text completion, with sync or async inference.
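
As a rough sketch of what such a custom function might look like, the example below defines a trivial "model" that accepts a prompt plus keyword arguments and returns a string, mirroring the `llm(prompt: str, **kwargs) → str` call shape documented above. The exact contract MicroCore expects and the configuration options used to register the function are not shown in this README, so treat the signature as an assumption and check microcore/config.py; `echo_model` and `async_echo_model` are hypothetical names.

```python
import asyncio


def echo_model(prompt, **kwargs) -> str:
    """Hypothetical 'local model': returns the prompt text in upper case."""
    if isinstance(prompt, list):
        # Flatten a list of strings / message dicts into text lines
        prompt = "\n".join(
            p["content"] if isinstance(p, dict) else str(p) for p in prompt
        )
    return str(prompt).upper()


async def async_echo_model(prompt, **kwargs) -> str:
    """Async variant with the same (assumed) contract."""
    return echo_model(prompt, **kwargs)


reply = echo_model(["hello", "world"])
async_reply = asyncio.run(async_echo_model("hi"))
print(reply)
print(async_reply)
```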

🖼️ Examples

Code review tool

Performs LLM-based code review of changes in git .patch files, in any programming language.

Image analysis (Google Colab)

Determine the number of petals and the color of the flower from a photo (gpt-4-turbo)

Benchmark LLMs on math problems (Kaggle Notebook)

Benchmarks the accuracy of 20+ state-of-the-art models on olympiad math problems. Covers running local language models via HuggingFace Transformers and parallel inference.

Other examples

Python functions as AI tools

@TODO

🤖 AI Modules

This is an experimental feature.

Tweaks the Python import system to automatically set up the MicroCore environment based on metadata in module docstrings.

Usage:

import microcore.ai_modules
