From Transkribus Wiki
Revision as of 14:07, 19 December 2016 by Philip Kahle (Talk | contribs)


To train a new HTR model using the new API, first create a configuration XML. Besides the training parameters (the example below includes the default values), the mandatory fields are:

  • a model name
  • a description
  • the language
  • the collection ID where the input documents can be found

The input for training is described in the TrainList section of the XML. It is made up of train elements, each of which includes:

  • the document ID
  • a list of pages, where each page includes
    • the page ID
    • the ID of the transcript version that should be used for training

Optionally, a test set can be specified analogously in the TestList element.

The training descriptor should then look like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <modelName>Test Model</modelName>
    <description>A description</description>
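
The structure described above (mandatory fields, a TrainList of train elements, and an optional TestList) could be generated programmatically. The following is a minimal Python sketch under stated assumptions, not the definitive schema: only modelName, description, TrainList, TestList and train are names taken from this page. The root element name, the test element, and the language, colId, docId, page, pageId and tsId elements are hypothetical placeholders, and the IDs in the usage line are made up.

```python
import xml.etree.ElementTree as ET


def build_descriptor(model_name, description, language, collection_id,
                     train_docs, test_docs=None):
    """Build a training descriptor and return it as an XML string.

    train_docs and test_docs map a document ID to a list of
    (page_id, transcript_id) tuples.
    """
    # ASSUMPTION: the root element name is not given on this page.
    root = ET.Element("trainConfig")
    ET.SubElement(root, "modelName").text = model_name
    ET.SubElement(root, "description").text = description
    # ASSUMPTION: "language" and "colId" are placeholder element names.
    ET.SubElement(root, "language").text = language
    ET.SubElement(root, "colId").text = str(collection_id)

    def add_list(list_tag, item_tag, docs):
        lst = ET.SubElement(root, list_tag)
        for doc_id, pages in docs.items():
            item = ET.SubElement(lst, item_tag)
            # ASSUMPTION: the child element names below are placeholders.
            ET.SubElement(item, "docId").text = str(doc_id)
            for page_id, transcript_id in pages:
                page = ET.SubElement(item, "page")
                ET.SubElement(page, "pageId").text = str(page_id)
                ET.SubElement(page, "tsId").text = str(transcript_id)

    add_list("TrainList", "train", train_docs)
    if test_docs:
        # The test set is optional, so TestList is only written when given.
        add_list("TestList", "test", test_docs)
    return ET.tostring(root, encoding="unicode")


# Usage with made-up IDs: one document, one page, one transcript version.
xml_text = build_descriptor("Test Model", "A description", "German", 2,
                            {62: [(2371, 4840)]})
```

Passing the page and transcript IDs as tuples keeps each page tied to exactly one transcript version, which mirrors the list structure described above.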

That XML is then sent via POST to the API, and the call returns the job ID of the training.
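
Submitting the descriptor might look like the following Python sketch, which uses only the standard library. The endpoint URL is a placeholder, since the actual address is not given here. Authentication is left out, and reading the job ID as the plain-text response body is an assumption about the response format.

```python
import urllib.request

# ASSUMPTION: placeholder URL; substitute the real training endpoint.
TRAIN_URL = "https://example.org/htr/train"


def build_train_request(descriptor_xml, url=TRAIN_URL):
    """Prepare (but do not send) the POST request carrying the descriptor."""
    return urllib.request.Request(
        url,
        data=descriptor_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def submit_training(request):
    """Send the request and return the response body as the job ID.

    ASSUMPTION: the server answers with the job ID as plain text.
    """
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8").strip()
```

Separating request construction from sending makes it possible to inspect the request (method, headers, body) before anything goes over the network.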