FieldWise AI Backend

Overview

A backend server that uses Genkit for AI-driven conversation and transcription services. It relies on OpenAI's GPT-4 for conversation, Whisper for audio transcription, and Google Cloud Text-to-Speech (TTS) for speech synthesis. Built with Express.js, it provides routes for AI conversations, audio transcription, and TTS conversion.

Note: This project currently uses only the OpenAI-powered features (GPT-4 conversations and Whisper transcription). The Google Cloud TTS integration and Firebase authentication are available but not in active use.

Features

AI Conversations
- Context-aware, human-like responses using GPT-4o
- Multiple language support
- Customizable conversation scripts

Audio Transcription
- High accuracy with OpenAI's Whisper
- Handles common audio formats (MP3, MP4, MPEG, MPGA, WAV, WebM)

Text-to-Speech
- Converts text to natural-sounding speech with Google Cloud TTS
- Supports multiple languages and voices

Authentication
- Firebase authentication secures the endpoints (can be bypassed by setting BYPASS_AUTH=true)
- Easy integration for user management

Installation

Clone the repository:

git clone <repository-url>
cd <repository-directory>

Install dependencies:
```
npm install
```

Set up environment variables by creating a .env file:

OPENAI_API_KEY=your_openai_api_key
GOOGLE_API_KEY=your_google_api_key
GOOGLE_CLOUD_CREDENTIALS=./secret_keys/google_cloud_key.json
FIREBASE_SERVICE_ACCOUNT_KEY=./secret_keys/firebase_key.json
PORT=4000
BYPASS_AUTH=true

Add your Firebase service account key to ./secret_keys/firebase_key.json.
Add your Google Cloud service account key to ./secret_keys/google_cloud_key.json.
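
The variables above can be checked once at startup so that a missing key fails fast. The following is a minimal sketch, not the project's actual startup code: `loadConfig` is a hypothetical helper, the default port of 4000 is taken from the example .env, and treating only the literal string "true" as enabling the bypass is an assumption.

```javascript
// Sketch only: loadConfig is a hypothetical helper, not part of the project.
function loadConfig(env) {
  // The OpenAI key backs the only features currently in active use.
  if (!env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required");
  }

  // Assumption: fall back to the port used in the example .env file.
  const port = Number.parseInt(env.PORT ?? "4000", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return {
    openaiApiKey: env.OPENAI_API_KEY,
    port,
    // Assumption: only the literal string "true" enables the bypass.
    bypassAuth: env.BYPASS_AUTH === "true",
    // Optional while Google Cloud TTS and Firebase are not in active use.
    googleCloudCredentials: env.GOOGLE_CLOUD_CREDENTIALS ?? null,
    firebaseServiceAccountKey: env.FIREBASE_SERVICE_ACCOUNT_KEY ?? null,
  };
}
```

A typical call would be `const config = loadConfig(process.env);` once the .env file has been loaded into the process environment.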

Usage

Start the Server

npm start

API Endpoints

Health Check

GET /: Simple health check route.
```
curl http://localhost:4000/
```

AI Conversation

POST /ai/converse: Starts or continues a conversation.

Request Body:

{
   "language": "string",     // ISO 639-1 language code (e.g., "en", "es")
   "script": "string",       // Required. Defines conversation context and rules
   "history": [              // Mandatory. Array of messages
      {
         "role": "user" | "assistant",
         "content": "string"
      }
   ]
}

language: Determines the language of the AI responses.
script: Required. Contains the conversation rules, context, and flow logic. Example script:

Name: Relational Databases
Topics:
1. What are relational databases?
2. How do they work?
3. What are the benefits of using them?
...etc

history: Required. The conversation so far, as an array of messages, each with a role and content.
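
As an illustration of the request shape described above, a client might validate and send the body roughly as follows. This is a sketch under assumptions: `buildConverseRequest` and `converse` are hypothetical names, the server is assumed to run locally on port 4000 with BYPASS_AUTH=true, the body is assumed to be sent as JSON, and an empty history is assumed to be acceptable when starting a conversation. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Hypothetical helper: validates and assembles a /ai/converse request body.
function buildConverseRequest({ language, script, history }) {
  if (typeof script !== "string" || script.trim() === "") {
    throw new Error("script is required");
  }
  if (!Array.isArray(history)) {
    throw new Error("history is required and must be an array");
  }
  for (const message of history) {
    if (message.role !== "user" && message.role !== "assistant") {
      throw new Error(`Unexpected role: ${message.role}`);
    }
    if (typeof message.content !== "string") {
      throw new Error("Each message needs a string content");
    }
  }
  return { language, script, history };
}

// Hypothetical client call; the base URL is an assumption.
async function converse(body, baseUrl = "http://localhost:4000") {
  const response = await fetch(`${baseUrl}/ai/converse`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Validating on the client keeps malformed histories from reaching the server; whether the server performs the same checks is not stated here.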

Response Body:

{
   "reply": "string",              // AI's response message. Can contain special tokens.
   "feedback": "string",           // Negative-only feedback on user's input
   "correctnessPercent": number    // Accuracy score of user's input (0-100%)
}

reply: May contain special tokens that start with @:
- @END_CONVERSATION: indicates that the conversation has finished

feedback: Feedback on the user's input, containing only constructive negative criticism. If there is none, the value is @NO_FEEDBACK.
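
The special tokens described above are easiest to handle in one place on the client. The sketch below shows one possible way to do that; `interpretConverseResponse` is a hypothetical helper, and stripping the token from the displayed text is a design assumption rather than documented behavior.

```javascript
const END_CONVERSATION = "@END_CONVERSATION";
const NO_FEEDBACK = "@NO_FEEDBACK";

// Hypothetical helper: turns a raw /ai/converse response into display-ready data.
function interpretConverseResponse({ reply, feedback, correctnessPercent }) {
  return {
    // Remove the token so it is not shown to the user (an assumption).
    text: reply.replaceAll(END_CONVERSATION, "").trim(),
    // True when the reply signals that the conversation is over.
    finished: reply.includes(END_CONVERSATION),
    // Map the "no feedback" marker to null so callers can test for it easily.
    feedback: feedback.trim() === NO_FEEDBACK ? null : feedback,
    correctnessPercent,
  };
}
```

A caller can then stop prompting the user once `finished` is true and skip the feedback panel when `feedback` is null.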

Audio Transcription

POST /ai/transcribe: Transcribes an audio file.

Request: multipart/form-data with an audio field

{
   "audio": "file"    // Supported formats: 'mp3', 'mp4', 'mpeg', 'mpga', 'wav', 'webm' (max 25MB)
}

Response Body:

{
   "transcript": "string"    // Transcribed text from the audio file
}
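
A client can check the format and size limits listed above before uploading. This is a sketch under assumptions: `checkAudioFile` and `transcribe` are hypothetical names, the 25MB limit is read as 25 × 1024 × 1024 bytes, the format is inferred from the file extension, and the server is assumed to run locally on port 4000 with BYPASS_AUTH=true. It uses the global `fetch`, `FormData`, and `Blob` available in Node.js 18 and later.

```javascript
const SUPPORTED_EXTENSIONS = ["mp3", "mp4", "mpeg", "mpga", "wav", "webm"];
// Assumption: "25MB" is interpreted as 25 * 1024 * 1024 bytes.
const MAX_BYTES = 25 * 1024 * 1024;

// Hypothetical helper: checks the extension and size before any upload.
function checkAudioFile(filename, sizeInBytes) {
  const extension = filename.split(".").pop().toLowerCase();
  if (!filename.includes(".") || !SUPPORTED_EXTENSIONS.includes(extension)) {
    return { ok: false, reason: `Unsupported format: ${filename}` };
  }
  if (sizeInBytes > MAX_BYTES) {
    return { ok: false, reason: "File exceeds the 25MB limit" };
  }
  return { ok: true };
}

// Hypothetical client call; bytes is a Buffer or Uint8Array with the file contents.
async function transcribe(bytes, filename, baseUrl = "http://localhost:4000") {
  const result = checkAudioFile(filename, bytes.byteLength);
  if (!result.ok) {
    throw new Error(result.reason);
  }
  const form = new FormData();
  // The field name "audio" comes from the request description above.
  form.append("audio", new Blob([bytes]), filename);
  const response = await fetch(`${baseUrl}/ai/transcribe`, {
    method: "POST",
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const { transcript } = await response.json();
  return transcript;
}
```

Leaving the Content-Type header unset lets `fetch` add the multipart boundary itself.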

Text-to-Speech

POST /ai/text-to-speech: Converts text to speech.

Request Body:

{
   "text": "string",         // Text to convert to speech
   "languageCode": "string", // BCP-47 language code (e.g., "en-US")
   "name": "string"          // Voice name (e.g., "en-US-Standard-A")
}

Response: Returns an audio file in audio/mpeg format.
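
Since this endpoint returns binary audio rather than JSON, the client reads the response as bytes. The following is a sketch under assumptions: `buildTtsRequest` and `synthesize` are hypothetical names, the default language and voice simply reuse the examples from the request body above, the body is assumed to be sent as JSON, and the server is assumed to run locally on port 4000 with BYPASS_AUTH=true.

```javascript
// Hypothetical helper: assembles a /ai/text-to-speech request body.
// The defaults reuse the example values from the request description.
function buildTtsRequest(text, languageCode = "en-US", name = "en-US-Standard-A") {
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("text must be a non-empty string");
  }
  return { text, languageCode, name };
}

// Hypothetical client call; resolves to the raw audio/mpeg bytes.
async function synthesize(body, baseUrl = "http://localhost:4000") {
  const response = await fetch(`${baseUrl}/ai/text-to-speech`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // Read the body as binary data instead of parsing it as JSON.
  return new Uint8Array(await response.arrayBuffer());
}
```

The returned bytes can be written to a .mp3 file or streamed to an audio player.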

Testing

Run tests using Jest:

npm test
