
Argos Train

Argos Translate | Tutorial | Video tutorial

Argos Train trains an OpenNMT PyTorch Transformer model and a SentencePiece tokenizer and packages them with Stanza data as an Argos Translate package. Argos Translate packages, which are zip archives with a .argosmodel extension, can be used with Argos Translate, LibreTranslate, and Dot Lexicon.
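Because a .argosmodel package is an ordinary zip archive, its contents can be inspected with standard tools. The following is a minimal sketch using Python's standard library; the helper name `list_package_files` and the example path are hypothetical, and nothing is assumed here about which files a package contains.

```python
import zipfile
from pathlib import Path


def list_package_files(package_path):
    """Return the sorted member names of a zip archive such as a .argosmodel file."""
    with zipfile.ZipFile(package_path) as archive:
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Hypothetical path; point this at a package you have downloaded or trained.
    path = Path("en_es.argosmodel")
    if path.exists():
        for name in list_package_files(path):
            print(name)
```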

Pre-trained Argos Translate packages are available for download. If you have trained packages that you are willing to share, please get in contact so that they can be published on the Argos Translate package index.

Training example

$ su argosopentech
$ source ~/argos-train-init


$ argos-train
From code (ISO 639): en
To code (ISO 639): es
From name: English
To name: Spanish
Version: 1.0


Package saved to /home/argosopentech/argos-train/run/en_es.argosmodel


Data from data-index.json is used for training. Argos Translate primarily uses data from the Opus project.

To train a model with custom data, run argos-train-init and then add an entry to data-index.json containing a link to download your custom data package. Data packages are zipped directories with a .argosdata extension. Each one contains a source file and a target file, with parallel data on corresponding lines, and a metadata.json file.
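As an illustration of the data package layout described above, here is a rough sketch that zips a directory holding a source file, a target file, and a metadata.json file into a .argosdata archive. The helper name `build_data_package`, the member file names `source` and `target`, and every metadata field shown in the usage example are assumptions made for illustration; check an existing entry in data-index.json for the fields Argos Train actually expects.

```python
import json
import zipfile
from pathlib import Path


def build_data_package(out_dir, name, source_lines, target_lines, metadata):
    """Write <out_dir>/<name>.argosdata, a zip holding one directory with
    source, target, and metadata.json (file names are assumed)."""
    # Parallel data must line up: line i of source pairs with line i of target.
    if len(source_lines) != len(target_lines):
        raise ValueError("source and target must have the same number of lines")
    package_path = Path(out_dir) / f"{name}.argosdata"
    with zipfile.ZipFile(package_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(f"{name}/source", "\n".join(source_lines) + "\n")
        archive.writestr(f"{name}/target", "\n".join(target_lines) + "\n")
        archive.writestr(f"{name}/metadata.json", json.dumps(metadata, indent=4))
    return package_path


if __name__ == "__main__":
    # Hypothetical metadata fields, shown only to make the example concrete.
    example_metadata = {
        "name": "my-custom-data",
        "from_code": "en",
        "to_code": "es",
        "size": 2,
    }
    path = build_data_package(
        ".",
        "data-custom-en_es",
        ["Hello", "Thank you"],
        ["Hola", "Gracias"],
        example_metadata,
    )
    print(path)
```

The length check is there because the training data is only useful if each source line has its translation on the same line number of the target file.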


A Docker image is available at argosopentech/argostrain.

docker run -it argosopentech/argostrain /bin/bash

Run training



CUDA required, tested on


Licensed under either the MIT or CC0 License (same as Argos Translate).