The newly released PDF Impress 10 comes with a multi-language OCR (optical character recognition) engine that can capture text in your native language. The OCR engine is based on Tesseract. English, German, French and Spanish are supported by default, and more languages can be added. Follow these steps to add your language data pack.

  1. Visit https://github.com/tesseract-ocr/tessdata and download your language data pack. Version 3.02 is required (e.g. tesseract-ocr-3.02.tur.tar.gz for Turkish).
  2. Unpack the download and copy the language data file (e.g. tur.traineddata for Turkish) into the folder C:\Program Files (x86)\BinaryNow\PDFImpress 10\tessdata.
  3. Run PDF Impress Tools (restart it if it is already running).
  4. Click the Scan icon (top left).
  5. Go to Settings and, under PDF options, select the new language from the pull-down menu.
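
If you install several language packs, steps 1 and 2 can be scripted. The following Python sketch is not part of PDF Impress; it is a minimal illustration that assumes the download is a .tar.gz archive containing one or more .traineddata files at any depth, and that PDF Impress is installed in the default folder named in step 2. The function name install_language_pack is hypothetical.

```python
import shutil
import tarfile
from pathlib import Path

# Assumption: default install location from step 2; adjust if yours differs.
TESSDATA_DIR = Path(r"C:\Program Files (x86)\BinaryNow\PDFImpress 10\tessdata")


def install_language_pack(archive, tessdata_dir=TESSDATA_DIR):
    """Copy every *.traineddata file found in a .tar.gz archive into tessdata_dir.

    Returns the list of files that were written.
    """
    tessdata_dir = Path(tessdata_dir)
    if not tessdata_dir.is_dir():
        raise FileNotFoundError(f"tessdata folder not found: {tessdata_dir}")

    installed = []
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            # Keep only the file name so archive sub-folders are flattened.
            name = Path(member.name).name
            if member.isfile() and name.endswith(".traineddata"):
                target = tessdata_dir / name
                with tar.extractfile(member) as source, open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                installed.append(target)
    return installed
```

For example, install_language_pack("tesseract-ocr-3.02.tur.tar.gz") would place tur.traineddata in the tessdata folder. Writing under C:\Program Files (x86) normally requires administrator rights, so run the script from an elevated prompt, then continue with step 3.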


You can learn more and download a trial of PDF Impress 10 here. Owners of previous versions (2013, 2014) can purchase a discounted upgrade in the BinaryNow online store. Scan2Encrypt provides an alternative, single-task solution for scanning with OCR.

