A standalone translation utility for multilingual web projects. This tool can be integrated into any project to manage the automatic translation of content using AI-powered translation services.
- Clone this repository into your project:
git clone https://github.com/your-org/translator.git
cd translator
chmod +x bin/*
The translator uses three main configuration files, all located in your project root:
File | Purpose | Default Template |
---|---|---|
translator.config.yaml |
Project-specific settings | translator/config/translator.config.template.yaml |
translator.models.yaml |
Translation model configuration | translator/config/translator.models.template.yaml |
translator.role.tpl |
Translation role template | translator/config/translator.role.template.tpl |
If any of these files are not found in your project root, the default templates from the translator directory will be used.
- Setup: Clone the translator repository into your project
- Configure: Create the configuration files in your project root
- Run: Execute the translation scripts
The translator will automatically:
- Load project configuration
- Load model definitions
- Find or create translation roles
- Process your content
Create a translator.config.yaml
file in your project root directory:
# Project translation configuration
# Source and target directories
source_directory: content/english # Path to your source content
target_directory: content # Parent directory containing all languages
# Translation parameters
translation_chunk_size: 6144 # Maximum size of text chunks for translation
main_branch: master # Main branch for git diff operations
# MD5 tracking file for changes
md5_file: translation.json # File to track content changes
# Role template location (project-specific)
role_template: translator.role.tpl # Template for translation roles
A template file is available at translator/config/translator.config.template.yaml
.
Create a translator.models.yaml
file in your project root directory to configure the AI models used for translation:
# Translation models configuration
# Models are sorted by token price in ascending order (input/output)
models:
- name: openai:gpt-4o-mini
priority: 1
price_notes: $0.15 / $0.6
- name: claude:claude-3-5-haiku-latest
priority: 2
price_notes: $0.8 / $4
# Add more models as needed
A template file is available at translator/config/translator.models.template.yaml
.
If either of these configuration files are not found, default values will be used.
The translation role template (translator.role.tpl
) defines the instructions given to the AI model for translation tasks. This is a critical component that affects translation quality and accuracy.
- Primary location:
<project_root>/translator.role.tpl
- Default template:
translator/config/translator.role.template.tpl
Create a translator.role.tpl
file in your project root with instructions for the translation model. Here's a template:
You are a professional translator to $LANGUAGE language. You are specialized in translating Hugo template and markdown files with precise line-by-line translation. Your task is to:
1. Translate the entire document provided
2. Preserve the original document's structure exactly: all original formatting, spacing, empty lines and special characters
3. DO NOT translate:
- YAML front matter keys
- Hugo shortcodes and constructs
- Code blocks
- File paths
- Variables
- HTML tags
4. Do not ask questions or request continuations
5. ENSURE each line in the original corresponds to the same line in the translated version
6. If there is nothing to translate, just leave it as is and respond with original document, do not comment your actions
Translate the following document exactly as instructed.
The template uses the following environment variables for substitution:
$LANGUAGE
: Replaced with the target language name during role creation
You can customize the role template to provide more specific instructions for your content types. For example:
- Add specific terminology to maintain across translations
- Include special instructions for domain-specific content
- Define rules for handling particular content elements
During translation:
- The translator checks if the role already exists in the aichat roles directory
- If not, it creates the role using your template or the default template
- When translating, files are sent to the AI model with the appropriate role instruction
The role files are created in the aichat roles directory as translate-to-<language>.md
.
To automatically translate all content:
./translator/bin/auto-translate [project_directory]
If project_directory
is not specified, the parent directory of the translator folder will be used as the project directory.
To synchronize translations with the source files (preserving the exact line structure):
./translator/bin/sync-translations [project_directory]
The translator tool requires the following dependencies:
aichat
- CLI tool for interacting with AI modelsjq
- Command-line JSON processoryq
- Command-line YAML processorenvsubst
- Substitutes environment variables in shell format strings
To add a new language, create a directory with the language name inside your target_directory
. The auto-translate tool will automatically detect and translate content for all language directories found.
The translator expects the following directory structure:
project/
├── content/ # target_directory
│ ├── english/ # source_directory
│ │ ├── file1.md
│ │ └── folder/
│ │ └── file2.md
│ ├── spanish/ # target language
│ └── french/ # target language
└── translator/ # translator directory
├── bin/
└── config/
The translation process works as follows:
- Scan for markdown files in the source directory
- Calculate MD5 hash of each file to detect changes
- Create a diff if the file has changed since last translation
- For each target language, translate the file or apply the diff
- Synchronize the line structure between source and translation
- Update the MD5 hash record for the file
The translator maintains a JSON file (defined by md5_file
in config) that tracks the MD5 hash of each source file. This ensures that only files that have changed are retranslated.
- If translations are failing, check the output for specific error messages.
- Ensure your AI service API keys are properly configured for the
aichat
tool. - For line-count mismatch errors, the tool will attempt multiple models based on priority until a good translation is found.
This project is licensed under the MIT License - see the LICENSE file for details.