Frequently Asked Questions (FAQ)


Why not just use full-text search?

Full-text search indexes only the single best candidate that the handwritten text recognition (HTR) produces for each word. Keyword spotting (KWS) instead indexes a list of many possible candidates, so even if the HTR is not 100% accurate, you might still find what you are looking for.
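The difference can be illustrated with a minimal sketch. The candidate words, the confidence values and the function names below are invented for illustration and do not describe how the search tool is actually implemented.

```python
# Hypothetical recognizer output for one handwritten word: each candidate
# reading is paired with a confidence between 0 and 1 (invented values).
candidates = [("Mutti", 0.55), ("Matti", 0.30), ("Motti", 0.10)]


def full_text_match(query, candidates):
    """Full-text style lookup: only the single best candidate is searchable."""
    best_word, _ = max(candidates, key=lambda c: c[1])
    return query == best_word


def kws_match(query, candidates):
    """KWS style lookup: every candidate is searchable.

    Returns the confidence of the matching candidate, or None if no
    candidate matches the query.
    """
    for word, confidence in candidates:
        if word == query:
            return confidence
    return None


# The word on the page is really "Matti", but the recognizer ranked "Mutti" first.
full_text_hit = full_text_match("Matti", candidates)
kws_hit = kws_match("Matti", candidates)
```

In this sketch the full-text style lookup misses the word because it was not the top candidate, while the KWS style lookup still finds it, at a lower confidence.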


Some of the results I’m getting are not the word I’m looking for.

You might want to set the confidence filter to a higher percentage.

However, you might sometimes get false positives even at high confidence levels, especially on pages that are harder to read and thus have a higher character error rate. The higher the character error rate, the more likely it is that the HTR has misrecognized characters in the candidates that KWS indexes. On the other hand, if you are searching for words on hard-to-read pages, you might want to set the confidence filter lower, since the word you’re looking for might be further down the list of candidates.

Some letters and letter combinations are also prone to being confused with each other, even at high confidence levels (a and u, k and h, St and H, etc.), which can produce false positives.
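The trade-off behind the confidence filter can be sketched as follows. The hits, the confidence values and the `filter_hits` function are hypothetical and only illustrate the idea, not the tool's actual behaviour.

```python
# Hypothetical KWS hits for the query "Matti". Each entry holds the word that
# is actually written on the page and the confidence of the hit (invented).
hits = [
    ("Matti", 0.92),  # correct hit on a clearly written page
    ("Mutti", 0.81),  # false positive: a and u confused, despite high confidence
    ("Matti", 0.35),  # correct hit on a hard-to-read page
    ("Masto", 0.12),  # false positive with low confidence
]


def filter_hits(hits, min_confidence):
    """Keep only the hits whose confidence is at or above the threshold."""
    return [hit for hit in hits if hit[1] >= min_confidence]


strict = filter_hits(hits, 0.80)   # high confidence filter
lenient = filter_hits(hits, 0.30)  # lower confidence filter
```

With the strict filter, the false positive at 0.81 still gets through and the correct hit at 0.35 is lost. The lenient filter recovers that hit, at the cost of a longer result list to check.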

Read more: https://makingamodernarchive.blogspot.com/2019/07/keyword-spotting-effective-search-tool.html


I’m viewing a page in the image and text view. Why do some lines in the text field correspond to only part of a line in the image?

The lines in the text field correspond to the automatically detected lines in the image. When a page’s layout is difficult for the automatic line detection, the detected lines won’t match the lines on the page perfectly.


Will there be more material to search in the future?

Yes.


Can I download the transcriptions and images?

You can download a page’s transcription as a .txt file. Downloading the images is not possible yet.