Difference between revisions of "Users guide"

From Transkribus Wiki

Revision as of 08:03, 7 September 2016

Transkribus Expert Interface

This is the detailed User's guide for the Graphical User Interface of Transkribus.

Transkribus is an expert and multi-purpose tool. It comes with a large number of features and therefore requires some background knowledge. Once you are familiar with some of the key concepts you will be able to enjoy the full advantages offered by the platform.

Documents on the Transkribus platform are visible only to you as the owner of your collection and to those Transkribus users to whom you give permission to view and edit your collection. Note: Since your documents cannot be viewed by the public, you can also work safely with copyrighted documents and benefit from the copyright exceptions offered by the EU Directive on Copyright for individual (private) use as well as for research and education.

Look and Feel

Elements of the screen

The Transkribus screen consists of five important elements:

  • The menu bar at the top of the screen
  • The tabs on the left hand side - mainly used to provide information to the user
  • The tabs on the right hand side - mainly used to activate something
  • The canvas and its specific menu bar - mainly used to show the image of the page and segmented blocks, lines, etc.
  • The text editor and its specific menu bar - used to edit the text

Move borders of elements

To adapt the layout to your screen, you can move the border of any element.

Dock, undock and make elements invisible

  • Use the "Docked, Undocked, Invisible" buttons at the top menu bar to configure your screen according to your needs. "Dock" means that the element is fixed, "undocked" means that you can move it around and "invisible" hides the element (assuming that you don't need it). If you close the window (in case of "Undocked") then it is automatically docked.
  • It is recommended that you use this feature if you are doing repetitive activities or working with 2 or 3 screens!

Menu bar on top

  • Main Menu
    • Contains a collection of commands, most of which are also available in other prominent places in the tool. Only the commands that are unique to this menu are explained here.
      • You can check for updates of the tool and install them directly, or search for an older or specific version of the tool. Versions ending in -Snapshot are highly experimental and are created mainly to test new features.
      • Change viewing settings lets you adjust the line width, colour and so on in the canvas to your preferences. Different tasks may call for different settings.
  • Login/Logout
    • Logs in to the Transkribus cloud server. The login session expires after a period of inactivity, after which the user must log in again.
  • Define Docking states for the tab folder on the left, the tab folder on the right and the transcription widget on the bottom:
    • docked: Strongly integrated in the GUI
    • undocked: Not tied to the main GUI anymore. The window can be displayed on a second screen for example.
    • invisible: If the user does not need the functionalities of the view at the moment it can be made invisible and other areas of the GUI receive more space
  • Open local folder
    • Open a local document by choosing a local folder containing images and page files
  • Close document
    • Close the current document
  • Save page
    • Stores a new version of the transcription. All versions of a page can be loaded via the Versions Tab
  • Reload document
    • Refreshes the document in all current views
  • Export document
    • Export the current document. A detailed description of the export options and settings is given in Export Documents
  • Page navigation
    • First, last, next and previous page. You can also navigate to a specific page.
  • Reload page
    • Refreshes the page view and metadata wherever necessary
  • Open transcript source
    • Shows the transcript source in an XML viewer with a text search facility
  • Show different segmentation types
    • Show printspace: F1
    • Show text regions: F2
    • Show lines: F3
    • Show base lines: F4
    • Show words: F5
  • Bug report
    • Send a bug report or feature request

The Canvas

The Canvas and Canvas Menu Bar

Initially the canvas does not show any image. Once you open a local document or - after logging in - select a collection and a document in that collection, the first page of the loaded document will show up.

Which coloured elements you see on the image depends on two things: whether segmentation has already taken place, and which structure types are selected for display, since each structure type can be shown or hidden in the canvas individually. The settings can be found in the Main menu, and there are also shortcuts for them:

  • Show printspace: F1
  • Show text regions: F2
  • Show lines: F3
  • Show base lines: F4
  • Show words: F5

All visible structure elements are selectable, and the Canvas, the Text Editor and the Structure Tab are closely linked: wherever you select a structural element, it is also selected in the other connected parts of Transkribus. This gives a clear view of image, text and hierarchy, provided that all parts are available. In our opinion this is a big advantage over transcription approaches in which image and text are not linked and the transcriber has to keep track of the current region, line or word manually.

All image and shape related operations can be found in the

Canvas Menu Bar

Descriptions from the very left according to the "mouse over" text:

  • Selection mode: The "usual mode" if you work with the image. Important to finish a process, e.g. adding baselines, or adding text regions.
  • Zoom mode: If selected, you can zoom into the image by dragging and holding the left mouse button.
  • Zoom in: Enlarges the image within the canvas
  • Zoom out: Shrinks the image within the canvas

Note: Alternatively use the mouse scroll wheel to zoom in or out.

  • Fit to Page
    • Fit to Page: Fits the page completely to the canvas (Tip: Pressing the mouse scroll wheel does this as well)
    • Original Size: 1:1 display of the image
    • Fit to width: Fits the image to the left and right border
    • Fit to height: Fits the image to the top and bottom border
  • Rotate
    • Rotate left or right: Image is rotated left or right
  • Translate image
    • Translate left, right, top, bottom: Image moves to the left, right, bottom and top.

Note: Alternatively, press and hold the left or the right mouse button to move the image or a selected element

  • Focus selected object: The selected object, e.g. a text region or line region is focused

Note: Alternatively you can double-click on the element and it will be focused as well

  • Enable shape editing: If selected, text and line regions can be edited. All these editing operations can be undone with the 'Undo' button found at the end of the tool bar
  • Add a printspace: If selected you can add a printspace by clicking into the image. A printspace is not needed for a good transcription but might be helpful when transcribing e.g. printed books and converting them into a printable version.
  • Add a text region: If selected you can add a text region. Text regions are necessary for further processing.

Note: Borders of text regions should be near to the text, but need not be "perfect". In most cases simple rectangles are sufficient.

  • Add a line: If selected you can add a line region. Line regions are necessary for further processing.
  • Add a baseline: If selected a baseline is added to an existing line region (or the line region is created automatically). Baselines are necessary for further processing, since the HTR engine takes them as reference point.
  • Add a word: If selected, a word is added to an existing line region. Words are not necessary for further processing.

Note: The following shape editing features can be applied to existing shapes. These shapes must be selected beforehand, either on the image canvas or in the structure tree (reachable via the Structure Tab at the left of the canvas)

  • Remove a shape: removes all preselected shapes
  • Add point to selected polygon: Used to manually correct the shape to reach a better enclosure of the text on the image
  • Remove point from selected polygon: See feature above - the other way round
  • Splits a shape into subshapes horizontally: Sometimes necessary to make two shapes out of one, e.g. if one text region contains two columns
  • Splits a shape into subshapes vertically: Used to correct a wrong segmentation, e.g. to split one big text region into several (logical) paragraphs, or one line into two.
  • Splits a shape into subshapes by a user defined line: The split line is freely definable, allowing more advanced splits
  • Merge selected shapes: At least two selected structure shapes merge into one new shape
  • Simplifying selected polygon: The selected polygon is reduced to a much simpler shape with fewer points by a built-in algorithm. The parameter controls the strength of the simplification: the higher the value, the more points of the polygon are removed.
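
The effect of the simplification parameter can be illustrated with the Ramer-Douglas-Peucker algorithm, a common method for thinning out point lists. This is only a sketch: the guide does not say which algorithm Transkribus actually uses, and the names `simplify` and `tolerance` are hypothetical. The sketch treats the outline as an open point list and drops every point that lies closer than `tolerance` to the straight line between its kept neighbours, so a higher value removes more points.

```python
import math


def _distance_to_line(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        # a and b coincide, fall back to the plain point distance
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def simplify(points, tolerance):
    """Simplify a list of (x, y) points with Ramer-Douglas-Peucker."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the point farthest from the line between the two end points.
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_line(points[i], first, last)
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        # Every intermediate point is close enough to be dropped.
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


outline = [(0, 0), (10, 1), (20, 0), (30, 12), (40, 0)]
print(simplify(outline, 2))   # small value: only the near-collinear point is dropped
print(simplify(outline, 20))  # large value: only the end points remain
```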

The Text Editor

The text editor and the text editor menu bar

The Text Editor is strongly connected with the Canvas. More precisely the full text - either automatically detected or transcribed - is connected with the image on line or word level. If a line is selected and therefore highlighted in the Canvas the same happens simultaneously in the text editor. So Transkribus is made for fast and comfortable transcription or correction of text. All tools needed for that task can be found in the

Text Editor Menu Bar

  • Line based/Word based: There are two modes for working with the text of a document. Use the "Line based" mode for HTR results and the "Word based" display for OCR text. The reason is that HTR currently does not provide word coordinates, and word regions (segments) are not needed to train the HTR engine.
  • Region: Easily change text regions back and forth or jump directly to the region number you type in
  • Change font: Change global display font of the text fields to your favorite settings. Note: this is only for display purposes, not for later exporting the text.
  • Toggle line bullets: Show or hide line numbering. If the line bullets are green, an HTR has been performed and the suggestion editor can be used!
  • Alignment: Left / Center / Right alignment of text
  • Visible line above and beneath: If selected, the current line is never placed directly at the upper or lower border of the editor, so the lines above and below it remain visible
  • Show styles: Toggle each of the following options to render them in the editor
    • Font type styles: serif, monospace, letter spaced
    • Text style: normal, italic, bold, bold & italic
    • Other: underlined, strikethrough, etc.
    • Tags: coloured underlines for tags
  • Delete text: at different levels
    • Delete text of current region
    • Delete text of current line
    • Delete text of current word
  • Enable autocomplete
    • Note: This is a very limited experimental feature and well known from your mobile phone: It tries to suggest words during writing. At the moment the bag of words is restricted to the current page but will be extended to a language dictionary in the future.
  • Long dash: Inserts a long dash ('Geviertstrich')
  • Angled dash: Inserts an angled dash (not sign). It is strongly recommended to use this angled dash at the end of a line
  • Undo: Undo last text change (Shortcut: ctrl + z)
  • Redo: Redo last undone text change (ctrl + y)
  • HTR suggestions: Toggle visibility of the suggestion editor using HTR results. Note: This option is only available if an HTR process was carried out on this document. The green bullets at the beginning of the line indicate that an HTR result is available for the line!
    • The suggestion editor is essentially a table in which each column lists the next-best hits for one word of the line in the text editor. A single click on a word in column n replaces the n-th word of the line in the text editor. If the line contains fewer words than the table has columns, the clicked word is appended at the end of the line, unless the user double-clicks; in that case the word at the current cursor position is replaced.
  • CATTI: Enables the CATTI server suggestion mode. It is only available for documents where HTR was performed and wordgraphs therefore exist.
    • This feature offers interactive transcription with suggestions based on the so-called wordgraph (a complex structure for storing the results of the HTR process). When the user types a word, the best possible continuation for the whole remaining line shows up. Pressing 'CTRL'+'n' brings up the next-best hit. Altogether this feature can speed up correction significantly. Note: We recommend using 'CATTI' and 'HTR suggestions' at the same time so that all options are available to the user.

  • Reload: Reloads the wordgraph editor

The Tabs

Transkribus offers a total of nine tabs in order to process a loaded document. These tabs are:

Left hand side:

  • Documents: Display, select, upload, delete, share, and search documents and collections
  • Structure: A tree view showing the segmentation of a page and rendering of blocks, lines, and words
  • Jobs: Displays all the jobs applied to a document or a collection
  • Versions: Shows all versions of the transcription process
  • Pages: Shows thumbnails of the document

Right hand side:

  • Metadata: Enables users to render structural metadata of the page
  • Tools: Offers a number of tools for automated processing
  • Virtual Keyboard: Displays special character and language sets
  • Tagging (Beta): Helps you enrich your transcription with tags, such as abbreviations/extensions

Documents Tab

Start Collection Manager within the Documents Tab View

Within the Documents tab users can

  • display general information about the document
  • change the metadata of the document (title, date, language, etc.)
  • define and add an Editorial Declaration
  • select a collection
  • select a document
  • call the collection manager
  • upload new documents either individually (slow) or via FTP (fast!) as a batch job
  • search for collections and documents

Document metadata

  • Loaded doc: Shows the title and unique ID of the document within the Transkribus Cloud
  • Current collection: Shows the ID and the name of the collection from which the document is opened
  • Current file: Indicates the original file name.

Note: All this information can also be found at the top of the Transkribus interface

  • Document metadata...

The metadata window allows users to add some basic metadata to the document: Title, author, Date of Upload, Genre, Writer, Language(s), Script type, Date of writing and Description

Editorial Declaration

To produce a Scholarly Digital Edition of a text that complies with the state of the art in the field, it is important to provide a comprehensive and transparent description of how the transcription was created. To support users in this task we have introduced a specific mechanism in Transkribus. A more detailed description can be found in the Section Editorial Declaration.

Server documents

Though it is possible to work with Transkribus locally most operations require that documents are stored on the server ("Transkribus cloud"). The following information is therefore only valid for documents residing on the Transkribus servers.

Collections

  • Refresh collections (e.g. after uploading several documents into a newly created collection)
  • Call the Collection Manager

Collection Manager

Collection Manager Overview

In the information view on the left, users can choose among all collections they are a member of and immediately get a list of all documents contained in the selected collection, with information about ID, title, number of pages, and so on.

The Collection Manager can be used to add other documents to a collection or to change the membership of a document. To start the Collection Manager, click the symbol above the collections drop-down box. A new window opens in which collections can be managed in the following ways:

  • Create a new collection or remove an empty collection
  • User management: Add/Remove users of the selected collection. To support this task the users of Transkribus can be found inside the manager with help of several search fields like first name, last name or email address. Furthermore the role of an already added user is changeable. At this moment, the following roles with corresponding rights exist:
    • ‘Owner’: Has all rights for their collection. Can manage the collection by adding documents, adding users, changing user roles
    • ‘Editor’: Can transcribe documents they are connected to and can add other transcribers
    • ‘Transcriber’: Can work with the document
  • Document management:
    • Documents can be added to or removed from the selected collection.
    • The manager knows all the uploaded documents of each single user and shows them on the lower right hand side where they can be selected and added.
    • Note: One document can belong to several collections!
    • The collection manager can also be used to delete documents from the server.

Upload Documents

  • Upload a document to the current collection: The user selects one directory with images and uploads it to the server.
    • Note: This type of upload works very well for documents of up to 500 MB!
  • Ingest documents from FTP storage: This is a two-step upload procedure and requires an FTP client, but it is fast and reliable, and large numbers of files and documents can be uploaded at once in a convenient way.
    • FTP Server: Use the following server address for FTP upload: ftp://transkribus.eu/. We recommend using a dedicated FTP client, e.g. FileZilla, which is free open-source software.
    • You will need your Transkribus user name and password to connect to the Transkribus FTP server
    • Copy and paste documents from your hard drive or network storage to your private Transkribus FTP folder
    • Within the Transkribus client, click the button "Ingest documents from FTP storage", select the relevant documents and ingest them into an existing collection or a newly created one.
  • Note: You can only upload selected directories/folders, but not single image files. (Nevertheless there is the chance to replace single image files once you discover an erroneous image in an existing document.)
  • (NEW) Upload document via METS URL over the web.
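
For users who prefer scripting the first step of the FTP ingest instead of using a graphical FTP client, the copy step might be sketched with Python's standard `ftplib` module. This is an illustration under assumptions, not an official client: the host name comes from the address given above, while the function names, the list of image suffixes and the choice to create one remote folder per document are hypothetical. The second step (ingesting from FTP storage) still happens in the Transkribus client.

```python
from ftplib import FTP
from pathlib import Path

# Assumed set of image types; the guide does not list the accepted formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def upload_folder(ftp, local_dir):
    """Upload all images in local_dir into a remote folder of the same name."""
    local_dir = Path(local_dir)
    ftp.mkd(local_dir.name)   # fails if the remote folder already exists
    ftp.cwd(local_dir.name)
    uploaded = []
    for path in sorted(local_dir.iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            with path.open("rb") as handle:
                ftp.storbinary(f"STOR {path.name}", handle)
            uploaded.append(path.name)
    ftp.cwd("..")             # return to the private FTP root folder
    return uploaded


def main(user, password, local_dir):
    """Connect with the Transkribus user name and password and upload one folder."""
    with FTP("transkribus.eu") as ftp:
        ftp.login(user=user, passwd=password)
        for name in upload_folder(ftp, local_dir):
            print("uploaded", name)
```

Only whole folders are uploaded here, which matches the note above that single image files cannot be ingested on their own.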

Structure Tab

Structure Tab View

The Structure tab enables users to navigate quickly through the segmented page image.

The structure of each page is hierarchical. A page can be divided into

  • print space
  • text regions
    • lines
      • base lines
      • words as the lowest structure type in this hierarchy and
  • separator regions

A text region consists of 1-n lines and a line has a base line and may have several words as child elements. For each of these structure types some information is shown, e.g. the text contained in the corresponding area (if already transcribed or automatically detected), coordinates of the outline given as a point list, unique ID and reading order.
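
The hierarchy described above can be made visible outside the tool by walking a PAGE XML file. The following is a minimal sketch using Python's standard XML parser. The sample document is hand-written for illustration, not exported from Transkribus, and the element names are assumed to follow the PAGE schema (2013-07-15 version); the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, assumed to resemble a PAGE file with one region and two lines.
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="0001.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r_1">
      <Coords points="100,100 900,100 900,300 100,300"/>
      <TextLine id="tl_1">
        <Coords points="100,100 900,100 900,200 100,200"/>
        <Baseline points="100,180 900,180"/>
        <TextEquiv><Unicode>first line</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="tl_2">
        <Coords points="100,200 900,200 900,300 100,300"/>
        <Baseline points="100,280 900,280"/>
        <TextEquiv><Unicode>second line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""

STRUCTURE_TYPES = {"PrintSpace", "TextRegion", "SeparatorRegion",
                   "TextLine", "Baseline", "Word"}


def local(tag):
    """Strip the XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]


def text_of(element):
    """Return the transcribed text directly attached to an element, if any."""
    for child in element:
        if local(child.tag) == "TextEquiv":
            for unicode_el in child:
                if local(unicode_el.tag) == "Unicode":
                    return unicode_el.text or ""
    return None


def outline(element, depth=0):
    """Collect (depth, type, id, text) rows for all structure elements."""
    rows = []
    for child in element:
        name = local(child.tag)
        if name in STRUCTURE_TYPES:
            rows.append((depth, name, child.get("id"), text_of(child)))
            rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(SAMPLE)
page = next(el for el in root if local(el.tag) == "Page")
for depth, name, element_id, text in outline(page):
    print("  " * depth + name, element_id or "", text or "")
```

The printed tree mirrors what the Structure tab shows: regions at the top level, lines below them, and baselines as children of lines.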

The Structure Tab widget also contains a small tool bar. It can be used to expand or collapse the structure tree, to assign unique IDs to all elements according to their current sorting, and to determine the reading order automatically from the coordinates. The order can also be changed manually by clicking on a reading order number and typing a new one. Because handwritten documents (and printed ones too) vary so widely, direct control of the reading order is very important. The reading order is always computed for the currently selected element: if a text region is selected, the order is calculated for its text lines. The reading order can be deleted as well.

Beyond that, the ‘Delete’ button (a red cross) clears the whole page content and must be used very carefully. If the structure of the current page is deleted accidentally, it can be restored: either press the ‘Undo’ button in the canvas toolbar, or decline to save the changes when changing the page or whenever the ‘Unsaved changes’ dialog appears.
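
To give an idea of how a reading order can be derived from coordinates, as mentioned for the Structure tab tool bar, here is a deliberately simple sketch. It is an assumption for illustration only: the guide does not describe the rule Transkribus applies, and the function names are hypothetical. Shapes are ordered top to bottom and then left to right by the upper left corner of their bounding box, with outlines given as point lists in the form `x1,y1 x2,y2 ...`.

```python
def parse_points(points):
    """Turn a points string 'x1,y1 x2,y2 ...' into a list of (x, y) tuples."""
    return [tuple(int(value) for value in pair.split(",")) for pair in points.split()]


def reading_order(shapes):
    """Return shape ids sorted top to bottom, then left to right."""
    def key(item):
        _, points = item
        xs, ys = zip(*parse_points(points))
        return (min(ys), min(xs))
    return [shape_id for shape_id, _ in sorted(shapes.items(), key=key)]


lines = {
    "tl_3": "100,300 900,300 900,380 100,380",
    "tl_1": "100,100 900,100 900,180 100,180",
    "tl_2": "100,200 900,200 900,280 100,280",
}
print(reading_order(lines))
```

A rule this simple breaks down for multi-column pages or marginal notes, which is one reason why manual control of the reading order matters.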

Jobs Tab

Jobs Tab View

The Jobs tab gives an overview of all jobs performed on this document, and also on all other documents to which the user has access rights.

A table gives details about the jobs.

Type: The type of the job.

  • Create Document: A document is uploaded to the Transkribus Cloud
  • OCR: An OCR job has been performed with Abbyy FineReader
  • HTR: An HTR job was applied

State: The status of the job.

  • WAITING: The server is busy but the job is queued
  • RUNNING: The job was started at the server and is now being performed.
  • FINISHED: The job was performed correctly.
  • CANCELLED: The user aborted the job.
  • FAILED: The job could not be performed by the server.

Further columns of the table:

  • Creation date: Indicates when the job was created by the user
  • Doc-Id: The unique identification number of the document
  • User-Id: The unique identification number of the user who started the job. Note: all jobs of all users are shown who share rights with you within a collection.
  • Page: The page number, i.e. the image number within the document. -1 if the job relates to the whole document.
  • Description: A more detailed message from the server
  • ID: A unique identification code for the job itself

Versions Tab

Versions Tab View

This feature is only available for remote documents! For local documents only the current version is stored and can be loaded.

Transkribus stores several versions of a document on the server. Each version represents a complete PAGE XML file which includes all the information of the segmentation and transcription.

The versions tab gives an overview of all versions which were created during the workflow. Each time the user or a tool stores a new result, a new version is generated.

The main advantages of several versions are that

  • you always have an overview of the changes performed by you or other users
  • you may go back to an earlier version to restart the work
  • you do not lose work as could happen if there is only one version which always gets overwritten
  • you may carry out some experiments and compare different versions of a document/transcription.

Pages Tab

The Pages Tab is a simple thumbnail overview. It provides quick access to the pages (by double clicking the page where you want to jump to) and also shows the original filenames.

Metadata Tab

Metadata Tab View

This tab is used to edit general data about the progress of the transcription, structural metadata about the page and the segmented elements, and the text style of the transcription.

Edit status:

  • New, In Progress, Done and Final can be applied to a document, or more precisely to a version of a document. Set the status and save the document to apply it.

The idea is that the status makes it easier to organise work within a team. E.g. a user who is responsible for the segmentation step can mark that work as finished with "Done".

Page type: Provides a number of page types.

  • front-cover, back-cover, title, table-of-contents, index, content, blank, other are currently offered.

Links: Connects two segmented elements, e.g. two lines, or two blocks with each other.

  • Press CTRL button and mark two elements with the mouse. Then press the "Link" Button.
  • The link is represented in the PAGE file and displayed in the text field.
  • Example: A link between line one and line four of a document may look like this: tl_1 <--> tl_4
  • Display the link: Select the text representation of the link in the link field, e.g. tl_1 <--> tl_4

Tools Tab

  • Transkribus includes a number of automated services/tools which can be called via the Interface. These tools run in the Transkribus cloud and are hosted by the University of Innsbruck.
  • Some of the tools are also operated on the High Performance Cluster LEO3 of the Central Computing Service for which we are highly thankful!

Detect Regions

Status

  • Experimental
  • Needs enhancement

Behaviour

  • Text regions are detected on single pages
  • Already available text regions are deleted/overwritten

Background

  • HTR processing needs correctly detected text regions and baselines
  • In the future it is planned to have integrated solutions available where text regions and baselines are detected in one process

Provider

  • National Centre for Scientific Research (NCSR) – Demokritos in Greece/Athens.

Contact

Detect Lines and Baselines

Status

  • Beta version
  • Can be used for productive work

Behaviour

  • Detects line regions and baselines in text regions.
  • Note: For HTR purposes only baselines are necessary, therefore no need to correct line regions.

Background

  • The PAGE format which is used internally in TRANSKRIBUS requires that each baseline is part of a line region. Therefore the tool needs to produce line regions although the line regions are not used for further processing (and can therefore be ignored in the correction process).

Provider

  • National Centre for Scientific Research (NCSR) – Demokritos in Greece/Athens.

Contact

Detect Baselines

Status

  • Beta version
  • Can be used for productive work

Behaviour

  • Note: This is a tool with a very special purpose: if line regions are already available, the tool will detect the corresponding baselines
  • Needs correct line regions as input
  • Detects baselines within line regions

Background

  • In some rare cases researchers may already have correct line regions available. These line regions can be enriched with baselines.

Provider

  • National Centre for Scientific Research (NCSR) – Demokritos in Greece/Athens.

Contact

Start OCR for page / Start OCR for document

Note: This is an external tool for recognizing printed text - not handwritten text!

Status

  • Productive

Behaviour

  • All pages/images of the document are processed with ABBYY FineReader 11 SDK
  • Select one or more languages
  • Select “combined” if Gothic text and Roman Typeface are used within one document
  • Select “OldGerman”, “OldEnglish”, etc. to activate the recognition of the long “s” in Roman Type Face books
  • The document is processed from scratch, manually segmented text blocks are not taken into account.

Background

  • ABBYY FineReader is one of the leading OCR engines worldwide.
  • We have implemented only a very small set of the features provided by the ABBYY SDK.
  • UIBK runs a powerful ABBYY FineReader SDK Cluster and is able to process large amounts of documents.

Provider

  • ABBYY FineReader
  • University Innsbruck, Digitisation and Digital Preservation group

Credits

  • ABBYY FineReader for 15 years of cooperation
  • This implementation is based on the infrastructure set up during the Europeana Newspaper Project (2013-2015) coordinated by the Staatsbibliothek Berlin: http://www.europeana-newspapers.eu/

Structure Analysis

Status

  • Beta version. Can be used for production

Behaviour

  • Structure Analysis (SA) requires as input a page that has already been processed with an OCR engine.
  • Based on several rules it will detect
    • page numbers
    • headers (=running titles) and
    • footnotes (regions, not single footnotes)
  • Note: The detected structure values appear in the “Structure” tab on the left hand side.

Background

  • As part of the IMPACT project (2008-2012) University of Innsbruck, Digitisation and Digital Preservation group developed several rule sets for processing historical printed documents
  • The rule sets can easily be extended to other document types as well.

Provider

  • University Innsbruck, Digitisation and Digital Preservation group

Credits

Contact

HTR Processing

Status

  • Experimental
  • Do not use for production

Behaviour

  • Trained HTR models can be selected and applied to one page
  • Note: HTR is a sophisticated system in which character sets and language models need to work together. At the current stage, HTR needs to be trained separately for each document or collection of documents. The more data becomes available, the greater the chance that these models can be merged and the training phase shortened.
  • Words which are not in the lexicon will not be recognized.
  • Characters (e.g. special characters) which the HTR engine did not see during training cannot be recognized either

Available HTR models

  • Reichsgericht_Training
    • Trained on German Kurrent from the early 20th century. Three writers.
    • Only a very limited vocabulary based on juridical texts was used for training.
  • Forrest Collection 1-3
    • Trained on specimens, mainly written by George Forrest.
    • Very limited test set and vocabulary.
  • Bozen HS37a
    • Trained on 100 pages of German Kurrent text from the Bozen collection. Several writers, very limited vocabulary.
  • Zwettl 30
    • Trained on about 30 pages of German Kurrent text from the 17th century. Several writers.
  • Frisch
    • Trained on 100 pages of a printed book (Gothic letter) with German text from the 17th century.
  • MarineLives
    • Trained on 30 pages of English texts from the 18th century. Limited vocabulary.
    • No lexicon currently available in the background, therefore limited applicability

Background

  • This is one of the very first implementations worldwide for processing handwritten historical texts out-of-the-box.

Provider

  • Technical University Valencia, Pattern Recognition and Human Language Technology

Contact

Compute Accuracy

Status

  • Productive

Behaviour

  • Compares one version of the text with another, typically a reference page (ground truth) with an automatically produced version of the same page.
  • Note: The segmentation must not be changed between the two versions, since the tool requires that the same lines appear in both.

Background

  • The tool reports a Word Error Rate (WER) and a Character Error Rate (CER). The WER is usually much higher than the CER, because a single wrong character is enough to make the whole word count as an error.

Example

  • Reference text: "nahme, daß der Beschluß vom 23. August 1901 unter allen Umständen rechts¬"
  • HTR recognized text: "nahme daß der Beschluß vom 2 . August 1901 unter allen Umständen ."
  • The reference text has 12 words, and 4 word errors are counted:
    • "nahme" should be: "nahme,"
    • "2 ." should be: "23."
    • "." was inserted incorrectly
    • One word is missing and needs to be inserted: "rechts¬".
  • With 4 errors against 12 reference words, WER = 4/12 = 33%.
  • The same is done for characters: the reference text has 73 characters (including blanks), and 9 of them are incorrect or missing, so CER = 9/73, about 12%.
  • Note: Calculated in this way, the WER is a strict measure. Human readers are still able to understand the general message of a text if the WER is around 30% and the CER around 15%.
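The counting in the example can be reproduced with a short script. This is only a sketch: it assumes that both rates are the plain edit distance (substitutions, insertions and deletions) divided by the length of the reference, which matches the figures above but is not confirmed as the exact method of the tool. The function names are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (word lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # element of ref is missing in hyp
                            curr[j - 1] + 1,      # extra element in hyp
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Number of edits divided by the length of the reference."""
    return edit_distance(ref, hyp) / len(ref)


reference = "nahme, daß der Beschluß vom 23. August 1901 unter allen Umständen rechts¬"
recognized = "nahme daß der Beschluß vom 2 . August 1901 unter allen Umständen ."

wer = error_rate(reference.split(), recognized.split())  # compared word by word
cer = error_rate(reference, recognized)                  # compared character by character
print(f"WER = {wer:.0%}, CER = {cer:.0%}")               # prints "WER = 33%, CER = 12%"
```

Splitting on blanks treats "2" and "." as two separate words, which is why one substitution and one insertion are counted at that position.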

Provider

  • Technical University Valencia, Pattern Recognition and Human Language Technology

Contact

Virtual Keyboards Tab

Virtual keyboards extend the range of available letters and signs dramatically and can be further extended if needed.

  • Open the Transkribus directory on your computer.
  • Open the file: "virtualKeyboards.xml"
  • Here you can add Unicode blocks beyond those included in the standard edition.
  • Note: It should also be possible to add alphabets which are written from right to left, such as Arabic or Hebrew.
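Before adding a block it can help to see which characters a code point range actually contains. The sketch below only lists the assigned characters of a range with the Python standard library; it makes no assumption about the structure of "virtualKeyboards.xml", which is not described here. The function name is made up for illustration, and the range shown is the beginning of the Hebrew letters.

```python
import unicodedata


def block_characters(first, last):
    """Return (code point, character, name) for each assigned character in the range."""
    rows = []
    for cp in range(first, last + 1):
        ch = chr(cp)
        name = unicodedata.name(ch, None)  # None for unassigned code points
        if name is not None:
            rows.append((f"U+{cp:04X}", ch, name))
    return rows


# First letters of the Hebrew block (the full block is U+0590..U+05FF).
for code, ch, name in block_characters(0x05D0, 0x05D4):
    print(code, ch, name)
```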

Tagging (Beta) Tab

Abbreviation tags
Find Tags ...
... and Normalize Tags

Transcriptions of a (historical) text often require some kind of "meta information". For example, an abbreviated word can be "explained" with its expansion, or an "unclear" word can be marked.

For this purpose we provide a comprehensive tagging system which enables you either to use "predefined tags" or to define your own tag set. The predefined tags have the advantage that several special features can be connected with them.

Behaviour of tags

  • Predefined tags are displayed in italic
  • Tags can only be set within the text editor
  • Tags cannot be set for blocks, only for lines and words.
  • Tags are displayed in the text editor.
  • The colour of tags is produced automatically but can be changed by simply clicking the colour.
  • A word or text section may be part of several tags.
  • Tags can be defined in more detail with "attributes". E.g. the abbreviation tag has the attribute "expansion".
  • When exporting a document, tags can be displayed in various formats. See Export Documents.

Predefined tags

  • abbrev: For words abbreviated with tildes, loops, superscript letters etc. Expansions (as far as known) can be entered.
  • unclear: For uncertain readings and/or illegible words.
  • person: For person names
  • speech: For direct speech
  • place: For place names
  • textStyle: For specific styles (combined text features)
  • organization
  • ...

Find tags

One very important and helpful feature in the Tagging Tab is the 'Find tags' functionality.

The scope is set first in the additional search window: tags are searched either in the current region, the page, the document or even the whole collection. The search can be restricted to a tag name (for example 'place') and/or a tag value (e.g. 'Paris'). A further restriction is a property facet such as 'country' with a value such as 'France'. Property facets are chosen from the tag attributes, which differ for each tag type. Facets can be added or deleted, and their number is in principle unlimited. Once the search terms and restrictions are set, the search can be started, and the results are shown directly in the same window. Each result shows the tag value, the context (some words before and after the tag, to give a broader understanding) and the location of the tag.
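The way the restrictions combine can be pictured as a filter in which every restriction that is given must match. The following is a minimal sketch of that logic only; the Tag class, the find_tags function and the sample data are hypothetical and are not part of Transkribus.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    name: str    # tag name, e.g. "place"
    value: str   # tagged text, e.g. "Paris"
    page: int    # page on which the tag occurs
    attributes: dict = field(default_factory=dict)


def find_tags(tags, name=None, value=None, pages=None, **facets):
    """Return the tags that satisfy every restriction that was given."""
    hits = []
    for tag in tags:
        if name is not None and tag.name != name:
            continue
        if value is not None and tag.value != value:
            continue
        if pages is not None and tag.page not in pages:
            continue
        # Each property facet must match the corresponding tag attribute.
        if any(tag.attributes.get(key) != wanted for key, wanted in facets.items()):
            continue
        hits.append(tag)
    return hits


tags = [
    Tag("place", "Paris", 1, {"country": "France"}),
    Tag("place", "Paris", 2, {"country": "USA"}),
    Tag("person", "Goethe", 2, {"firstname": "Johann"}),
]

hits = find_tags(tags, name="place", value="Paris", country="France")
print([(t.value, t.page) for t in hits])
```

Leaving a restriction out widens the search, so find_tags(tags, pages={2}) returns every tag on page 2 regardless of its name.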

With the displayed result set you can perform the following actions:

  1. Use it to get an overview of the tags in your document with respect to their number, types, characteristics and so on.
  2. Double-click a result to display the corresponding image area in the canvas and the corresponding text area in the text editor.
  3. Use the Previous or Next button to browse through the tags in the canvas and text editor.
  4. Multiselect tags with the same tag name (ctrl + mouse click) in order to normalize the attributes of these tags.
    • Example: Select two 'person' tags and press 'Normalize' to open the Normalization window, which lists all attributes of the chosen tags. Fill in the attribute values you want all selected tags to have and then press 'Normalize selected tags'. In our example we could have selected two tags with the value 'Goethe' and normalized the first name of both to 'Johann Wolfgang von'. This can lead to a highly standardized tag set. Please note that normalization does not work across different tag names, since they have different attributes.
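The effect of normalization can be sketched as writing the same attribute values into every selected tag, and refusing to do so when the selection mixes tag names. The dictionary layout and the function name below are assumptions made for illustration, not the internal data model of Transkribus.

```python
def normalize_tags(selected, new_attributes):
    """Give all selected tags the same attribute values (same tag name required)."""
    names = {tag["name"] for tag in selected}
    if len(names) != 1:
        raise ValueError("normalization requires tags with the same tag name")
    for tag in selected:
        tag["attributes"].update(new_attributes)
    return selected


selected = [
    {"name": "person", "value": "Goethe", "attributes": {"firstname": "J. W."}},
    {"name": "person", "value": "Goethe", "attributes": {}},
]
normalize_tags(selected, {"firstname": "Johann Wolfgang von"})
print([tag["attributes"]["firstname"] for tag in selected])
```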

Segmentation of images

Key concepts

The segmentation of images into text regions, line regions and baselines is one of the most important tasks within Transkribus. Compared to a purely text-based transcription it is an extra step and additional work, but it also provides many advantages. The segmentation step is needed

  • to train the HTR model
  • to recognize handwritten text with a trained model
  • to export text and images in various formats, such as a searchable PDF.

Note: All the segmented elements, such as printspace, text region, line region or baseline are stored in the PAGE file with their coordinates.
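To illustrate how such coordinates can be read back, here is a minimal sketch using the Python standard library. It assumes the 2013-07-15 version of the PAGE schema, in which each Coords and Baseline element carries a points attribute of the form "x,y x,y ..."; the files written by your Transkribus version may use a different schema version, and the sample XML is invented for this example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the PAGE schema version used in this sketch.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="page1.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <Coords points="100,100 900,100 900,400 100,400"/>
      <TextLine id="r1l1">
        <Coords points="110,120 890,120 890,180 110,180"/>
        <Baseline points="110,170 890,170"/>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def parse_points(points):
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_segmentation(xml_text):
    """Collect the text regions with their lines and baselines."""
    root = ET.fromstring(xml_text)
    regions = []
    for region in root.iterfind(".//pc:TextRegion", NS):
        lines = []
        for line in region.iterfind("pc:TextLine", NS):
            baseline = line.find("pc:Baseline", NS)
            lines.append({
                "id": line.get("id"),
                "coords": parse_points(line.find("pc:Coords", NS).get("points")),
                "baseline": parse_points(baseline.get("points")) if baseline is not None else [],
            })
        regions.append({
            "id": region.get("id"),
            "coords": parse_points(region.find("pc:Coords", NS).get("points")),
            "lines": lines,
        })
    return regions


for region in read_segmentation(SAMPLE):
    print(region["id"], region["coords"])
    for line in region["lines"]:
        print("  ", line["id"], "baseline:", line["baseline"])
```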

All tools for segmentation are listed, with a short explanation of their functionality, in the section The Canvas.

The following main rules are applied to the segmentation of images:

  • Manual transcription of text can only be done if text regions and line regions are marked in the page image.
  • Therefore the first step is usually to draw text regions on the page image.
  • The next step is then to draw baselines; the line regions can be generated automatically.
  • Note: All regions can be defined either as polygon, or as rectangle. In most cases rectangles are fully sufficient for the purpose of transcribing a text correctly or for training the HTR engine.

How to

Aim: Identification of Text Regions (TR), Line Regions (LR) and Baselines (BL) within the text. This can be done automatically or manually.

Since text regions depend very strongly on the type of your documents, you should first get a general idea of what your text regions should look like. After that we suggest identifying the text regions manually, as it is very unlikely that the tool will identify all text regions exactly as you want, although you can give the automatic detection a try. You can then use the automated tool 'Detect Line Regions and Baselines' and make manual corrections where necessary. Automatic segmentation works well if the lines do not overlap frequently.

The segmentation process may be more comfortable if the panes are undocked or invisible. Working with two monitors is best.

Transcribe Documents

Export Documents

Export Document, e.g. PDF with tag highlighting

After clicking the ‘Export Document’ button on the main tool bar the export dialog opens. Here Transkribus shows all possible export formats. Several of these export formats can be selected and exported at once:

  1. Image/Page(Alto)/Mets: Exports the images plus the full text in either the Page or the Alto format (selectable via the export options tab), together with the METS file describing the document.
  2. PDF: Exports a PDF in which the full text layer lies under the image layer. Additionally, extra text pages can be exported, which means that after each image page a text page with a clean layout is added. Another option highlights/underlines all the (custom) tags in the exported document. The last page of the PDF then shows an overview of these tags. Clicking on a tag starts a search for this tag in the PDF viewer in use.
  3. TEI: Either ‘Zone per region’ or ‘Zone per line’ can be selected. The default is ‘Zone per region’, which gives a much simpler TEI structure than ‘Zone per line’.
  4. RTF: As with the PDF export, the assigned tags can be exported. When this option is chosen, all tag names and the corresponding tags are listed at the end of the document. The option ‘Word based’ means that tags assigned to words are exported as well.
  5. Tag Export: Exports the tags only, as an Excel document. The first sheet gives an overview of all exported tags, whereas each subsequent sheet shows the tags of one type, with the sheet name corresponding to the tag name. The columns on each sheet show the tag attributes, and each row contains one tag.
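The sheet layout described for the Tag Export can be pictured with a small sketch that groups tags by name. It builds plain lists of rows rather than a real Excel file, and the sheet name "Overview", the column headers and the sample tags are assumptions made for illustration only.

```python
from collections import defaultdict


def build_sheets(tags):
    """Return {sheet name: rows}: one overview sheet plus one sheet per tag name."""
    per_type = defaultdict(list)
    for tag in tags:
        per_type[tag["name"]].append(tag)

    # First sheet: one row per exported tag.
    sheets = {"Overview": [["tag", "value"]] + [[t["name"], t["value"]] for t in tags]}

    # One further sheet per tag name; the columns are that tag type's attributes.
    for name, group in per_type.items():
        columns = sorted({key for t in group for key in t["attributes"]})
        rows = [["value"] + columns]
        for t in group:
            rows.append([t["value"]] + [t["attributes"].get(c, "") for c in columns])
        sheets[name] = rows
    return sheets


tags = [
    {"name": "place", "value": "Paris", "attributes": {"country": "France"}},
    {"name": "person", "value": "Goethe", "attributes": {"firstname": "Johann Wolfgang von"}},
    {"name": "place", "value": "Wien", "attributes": {}},
]

for sheet_name, rows in build_sheets(tags).items():
    print(sheet_name, rows)
```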

Some options in the export dialog apply to several or all export formats. For all formats the pages to export can be selected. By default all pages are included in the export, but each page can easily be included or excluded. For most export formats the custom tags can be exported too. To include or exclude individual tags, press the button ‘select Tags’ and (de)select the checkboxes for those tags in the window that opens. The export location can be chosen freely. Pressing ‘OK’ completes the export; a warning may appear if the file already exists, in which case the user can decide to overwrite the file or cancel the operation.

Shortcuts

Here you find a comprehensive list of shortcuts in Transkribus which will allow you to work not only faster but also to use the keyboard instead of the mouse (much better for your arms and shoulders).

  • Alt C: Applies a tag more quickly to a selected word. If you want to apply the same tag several times, select the tag with the check box, mark a word or a section of text and press "Alt C". The selected tag will be applied.

Did you know?

  • If the line bullets in the transcription window at the bottom are green, an HTR has been performed and the suggestion editor can be used.
  • You can change the parent of shapes (i.e. of lines, baselines and words) by dragging and dropping them onto the new parent in the "Structure" tab on the left.
  • You can resize a shape keeping its aspect ratio by holding the shift key while resizing it
  • Holding the shift key while moving a shape also moves its child shapes
  • Press the escape button at any time to jump back to selection mode in the canvas area
  • You can move the image by holding either the right or the left mouse button. When using the left button, make sure the mouse is not over a selected shape, or this shape will be moved instead.
  • You can multiselect shapes by holding the ctrl key while selecting them one by one or drawing a selection rectangle for all shapes that shall be selected.