INTRODUCTION
This document provides an overview of the recognition engines available for both
EMC® Captiva® Capture and EMC Captiva Advanced Recognition. It outlines what each
recognition engine provides and notes which engines are included and which are
licensed separately. It also offers tips on which engine is recommended for
certain use cases.
BARCODE RECOGNITION
WESTERN OCR
BARCODE RECOGNITION
CHECK READING
CAPTIVA CAPTURE
GENERAL-USE OCR
EMC Captiva General-Use OCR is the standard engine included with the Captiva Capture Standard
and Enterprise Capture Server.
With the inclusion of Advanced Zonal OCR/ICR and its superior handprint recognition into EMC
Captiva Advanced Recognition, the current Standard Handprint/General-Use ICR engine will be
deprecated in the near future. Existing projects that use Standard Handprint/General-Use ICR
should be migrated to use the handprint engine included with Advanced Zonal ICR as soon as
possible.
Barcode support
EMC Captiva General-Use OCR/ICR engine also provides barcode recognition and supports the
following barcode types:
Codabar, Codabar with start-stop char transmit, Code 128, Code 128 with check digit transmit,
Code 39, Code 39 full ASCII mode, Code 39 with check digit control and transmit, Code 39 with
start-stop char transmit, EAN8/13, EAN/UPC with 2 and 5 digit supplement, ITF (2 of 5 interleaved),
ITF with check digit control and transmit, Postnet code, UCC Code 128, UPC-A, UPC-E (6-digit)
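Several of the symbologies above carry a check digit. As a point of reference, the following minimal sketch shows the arithmetic behind the standard EAN-13 check digit (alternating weights 1 and 3, modulo 10). It is not the Captiva engine's implementation, and the function names are hypothetical.

```python
def ean13_check_digit(first12: str) -> int:
    """Return the EAN-13 check digit for the first 12 digits of a code."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 ASCII digits")
    # Counting from the left, odd positions weigh 1 and even positions weigh 3.
    # enumerate() is 0-based, so an odd index is an even position.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code: str) -> bool:
    """Check that a 13-digit string ends with the correct check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])
```

A check like this can be applied to a decoded value before it is accepted, so that a misread digit is caught instead of passing through unnoticed.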
Language recognition support
Afrikaans, Albanian, Aymara, Basque, Bemba, Blackfoot, Brazilian-Portuguese, Breton, Bugotu,
Bulgarian (Cyrillic), Byelorussian (Cyrillic), Catalan, Chamorro, Chechen, Chinese (Simplified),
Chinese (Traditional), Corsican, Croatian, Crow, Czech, Danish, Dutch, English, Eskimo (Inuit),
Esperanto, Estonian, Faroese, Fijian, Finnish, French, Frisian, Friulian, Gaelic (Irish), Gaelic
(Scottish), Galician, Ganda, German, Greek, Guarani, Hani, Hungarian, Icelandic, Ido, Indonesian,
Interlingua, Italian, Japanese, Kabardian, Hawaiian, Kasub, Kawa, Kikuyu, Kongo, Korean,
Kpelle, Kurdish, Latin, Latvian, Lithuanian, Luba, Lule Sami, Luxembourgian, Macedonian (Cyrillic),
Malagasy, Malay, Malinke, Maltese, Maori, Mayan, Mia, Minankabaw, Mohawk, Moldavian (Cyrillic),
Nahuatl, Northern Sami, Norwegian, Nyanja, Occidental, Ojibway, Papiamento, Pidgin English, Polish,
Portuguese, Provencal, Quechua, Rhaetic, Romanian, Romany, Ruanda, Rundi, Russian (Cyrillic),
Sami (Lappish), Samoan, Sardinian, Serbian (Cyrillic), Serbian (Latin alphabet), Shona, Sioux,
Slovakian, Slovenian, Somali, Sorbian (Wend), Sotho, Southern Sami, Spanish, Sundanese, Swahili,
Swazi, Swedish, Tagalog, Tahitian, Tinpo, Tongan, Tswana (Chuana), Tun, Turkish, Ukrainian
(Cyrillic), Visayan, Welsh, Wolof, Xhosa, Zapotec, Zulu
Note: Twenty languages are installed by default, and up to 123 languages are available, including
English, Simplified Chinese, Japanese, and Korean.
Barcode support
EMC Captiva Capture East Euro/APAC OCR module also performs barcode recognition and supports
the following barcode types:
Codabar, Code 128, Code 39, Code 93, EAN 13, EAN 8, IATA 25, Industrial 25, Interleaved 25,
Matrix 25, PostNet, UCC 128, UPC-E
Language recognition support
Abkhaz, Adyghe, Afrikaans, Agul, Albanian, Altaic, Armenian (Eastern), Armenian (Grabar),
Armenian (Western), Avar, Aymara, Azerbaijani (Cyrillic), Azerbaijani (Latin), Bashkir, Basque,
Belarussian, Bemba, Blackfoot, Breton, Bugotu, Bulgarian, Buryat, Catalan, Chamorro, Chechen,
Chinese (PRC), Chinese (Taiwan), Chukcha, Chuvash, Corsican, Crimean Tatar, Croatian, Crow,
Czech, Danish, Dargwa, Dungan, Dutch (Netherlands), Dutch (Belgium), English, Eskimo (Cyrillic),
Eskimo (Latin), Esperanto, Estonian, Even, Evenki, Faroese, Fijian, Finnish, French, Frisian, Friulian,
Gaelic Scottish, Gagauz, Galician, Ganda, German, German (new spelling), German (Luxembourg),
Greek, Guarani, Hani, Hausa, Hawaiian, Hebrew, Hungarian, Icelandic, Ido, Indonesian, Ingush,
Interlingua, Irish, Italian, Japanese, Kabardian, Kalmyk, Karachay-Balkar, Karakalpak, Kasub, Kawa,
Kazakh, Khakas, Khanty, Kikuyu, Kirghiz, Kongo, Korean, KoreanHangul, Koryak, Kpelle, Kumyk,
Kurdish, Lak, Lappish, Latin, Latvian, Lezgin, Lithuanian, Luba, Macedonian, Malagasy, Malay,
Malinke, Maltese, Mansi, Maori, Mari, Maya, Miao, Minankabaw, Mohawk, Mongol, Mordvin, Nahuatl,
Nenets, Nivkh, Nogay, Norwegian, Norwegian (Bokmal), Norwegian (Nynorsk), Nyanja, Occidental,
Ojibway, Old English, Old French, Old German, Old Italian, Old Spanish, Ossetian, Papiamento, Pidgin
English, Polish, Portuguese (Brazil), Portuguese (Portugal), Provencal, Quechua, Rhaeto-Romanic,
Romanian, Romanian (Moldavia), Romany, Ruanda, Rundi, Russian (Old Spelling), Russian,
Samoan, Selkup, Serbian (Cyrillic), Serbian (Latin), Shona, Sioux, Slovak, Slovenian, Somali,
Sorbian, Sotho, Spanish, Sunda, Swahili, Swazi, Swedish, Tabassaran, Tagalog, Tahitian, Tajik,
Tatar, Thai, Tinpo, Tongan, Tswana, Tun, Turkish, Turkmen, Tuvan, Udmurt, Uighur (Cyrillic),
Uighur (Latin), Ukrainian, Uzbek (Cyrillic), Uzbek (Latin), Visayan, Welsh, Wolof, Xhosa, Yakut,
Yiddish, Zapotec, Zulu
Several formal languages are also supported: Basic, C++, Cobol, Fortran, Java, Pascal, chemical
formulas, E13B, and CMC7.
BARCODE RECOGNITION
EMC Captiva Capture Barcode Recognition carries out barcode recognition. Its settings are
divided into two groups by barcode type:
• 1D barcode parameters
• 2D barcode parameters
2D barcodes:
PDF-417, QR, Data Matrix
EMC CAPTIVA ADVANCED RECOGNITION (FORMERLY KNOWN AS DISPATCHER)
EMC Captiva Advanced Recognition supports two processing types: zonal recognition and full-text
recognition. Zonal recognition is used to capture data from structured forms where field location is
the same from one image to the next. Full-text recognition is primarily used for semi-structured or
unstructured documents where text does not reside in a static location from one image to the next.
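The difference between the two processing types can be sketched in a few lines. The example below assumes that recognition has already produced a list of words with pixel positions. The `Word` and `Zone` types and both functions are hypothetical illustrations, not part of any Captiva API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    x: int  # left edge in pixels
    y: int  # top edge in pixels


@dataclass(frozen=True)
class Zone:
    name: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, w: Word) -> bool:
        return self.left <= w.x < self.right and self.top <= w.y < self.bottom


def zonal_capture(words, zones):
    """Structured forms: read each field from its fixed rectangle."""
    return {z.name: " ".join(w.text for w in words if z.contains(w)) for z in zones}


def full_text_capture(words, label):
    """Semi-structured pages: find the label anywhere and take the next word."""
    for i, w in enumerate(words[:-1]):
        if w.text.rstrip(":").lower() == label.lower():
            return words[i + 1].text
    return None


words = [Word("Invoice", 40, 30), Word("No:", 120, 30), Word("A-1001", 180, 30),
         Word("Total:", 40, 400), Word("59.90", 120, 400)]
zones = [Zone("invoice_no", left=170, top=20, right=300, bottom=50)]

print(zonal_capture(words, zones))        # value read from the fixed zone
print(full_text_capture(words, "total"))  # value found by searching the text
```

If the same content is shifted down the page, the zone comes back empty while the label search still finds its value, which is why full-text recognition suits documents whose layout varies.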
EMC Captiva Advanced Recognition supports several recognition types, including machine-printed
text, hand-printed text, check marks, courtesy amount recognition (CAR), legal amount recognition
(LAR), and MICR/CMC7.
The following information provides a breakdown on all the recognition options for EMC Captiva
Advanced Recognition and indicates which engines are included.
GENERAL-USE OCR
The General-Use OCR is the standard engine included with EMC Captiva Advanced Recognition.
With the inclusion of Advanced Zonal OCR/ICR and its superior handprint recognition into EMC
Captiva Advanced Recognition, the current Standard Handprint/General-Use ICR engine will be
deprecated in the near future. Existing projects that use Standard Handprint/General-Use ICR
should be migrated to use the handprint engine included with Advanced Zonal ICR as soon as
possible.
WESTERN OCR
The Western OCR performs full-text and zonal capture of machine-printed text and is included in
EMC Captiva Advanced Recognition.
Barcode support
EMC Captiva Advanced Recognition East Euro/APAC OCR module also performs barcode recognition
and supports the following barcode types:
Codabar, Code 128, Code 39, Code 93, EAN 13, EAN 8, IATA 25, Industrial 25, Interleaved 25,
Matrix 25, PostNet, UCC 128, UPC-E
Language recognition support
Abkhaz, Adyghe, Afrikaans, Agul, Albanian, Altaic, Armenian (Eastern), Armenian (Grabar),
Armenian (Western), Avar, Aymara, Azerbaijani (Cyrillic), Azerbaijani (Latin), Bashkir, Basque,
Belarussian, Bemba, Blackfoot, Breton, Bugotu, Bulgarian, Buryat, Catalan, Chamorro, Chechen,
Chinese (PRC), Chinese (Taiwan), Chukcha, Chuvash, Corsican, Crimean Tatar, Croatian, Crow,
Czech, Danish, Dargwa, Dungan, Dutch (Netherlands), Dutch (Belgium), English, Eskimo (Cyrillic),
Eskimo (Latin), Esperanto, Estonian, Even, Evenki, Faroese, Fijian, Finnish, French, Frisian, Friulian,
Gaelic Scottish, Gagauz, Galician, Ganda, German, German (new spelling), German (Luxembourg),
Greek, Guarani, Hani, Hausa, Hawaiian, Hebrew, Hungarian, Icelandic, Ido, Indonesian, Ingush,
Interlingua, Irish, Italian, Japanese, Kabardian, Kalmyk, Karachay-Balkar, Karakalpak, Kasub, Kawa,
Kazakh, Khakas, Khanty, Kikuyu, Kirghiz, Kongo, Korean, KoreanHangul, Koryak, Kpelle, Kumyk,
Kurdish, Lak, Lappish, Latin, Latvian, Lezgin, Lithuanian, Luba, Macedonian, Malagasy, Malay,
Malinke, Maltese, Mansi, Maori, Mari, Maya, Miao, Minankabaw, Mohawk, Mongol, Mordvin, Nahuatl,
Nenets, Nivkh, Nogay, Norwegian, Norwegian (Bokmal), Norwegian (Nynorsk), Nyanja, Occidental,
Ojibway, Old English, Old French, Old German, Old Italian, Old Spanish, Ossetian, Papiamento, Pidgin
English, Polish, Portuguese (Brazil), Portuguese (Portugal), Provencal, Quechua, Rhaeto-Romanic,
Romanian, Romanian (Moldavia), Romany, Ruanda, Rundi, Russian (Old Spelling), Russian,
Samoan, Selkup, Serbian (Cyrillic), Serbian (Latin), Shona, Sioux, Slovak, Slovenian, Somali,
Sorbian, Sotho, Spanish, Sunda, Swahili, Swazi, Swedish, Tabassaran, Tagalog, Tahitian, Tajik,
Tatar, Thai, Tinpo, Tongan, Tswana, Tun, Turkish, Turkmen, Tuvan, Udmurt, Uighur (Cyrillic),
Uighur (Latin), Ukrainian, Uzbek (Cyrillic), Uzbek (Latin), Visayan, Welsh, Wolof, Xhosa, Yakut,
Yiddish, Zapotec, Zulu
Several formal languages are also supported: Basic, C++, Cobol, Fortran, Java, Pascal, chemical
formulas, E13B, and CMC7.
CAPTIVA ADVANCED RECOGNITION
BARCODE RECOGNITION
EMC Captiva Advanced Recognition Barcode Recognition carries out barcode recognition. Its
settings are divided into two groups by barcode type:
• 1D barcode parameters
• 2D barcode parameters
CHECK READING
The EMC Captiva Advanced Recognition Check Reading engine is used for automatic reading of
business and personal checks, deposit slips, and cash-in and cash-out documents. It can read
hand-printed, handwritten, and machine-printed documents and recognizes the entire check
efficiently. It has been used successfully for check and remittance processing.
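A check carries its amount twice: in digits (the courtesy amount, CAR) and in words (the legal amount, LAR). When both are read, a downstream workflow can compare them. The sketch below shows one way to do that comparison. It is an assumption about how recognized values might be used, not a description of the Check Reading engine, and it handles only simple English amounts below one million. All names are hypothetical.

```python
from decimal import Decimal

_UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
_TENS = {w: (i + 2) * 10 for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split())}


def legal_amount_to_decimal(text: str) -> Decimal:
    """Parse e.g. 'One hundred twenty-three and 45/100' into Decimal('123.45')."""
    tokens = text.lower().replace("-", " ").replace(",", " ").split()
    total = current = cents = 0
    for tok in tokens:
        if tok in ("and", "dollars", "only"):
            continue  # filler words carry no value
        if "/" in tok:  # fractional part written as NN/100
            num, den = tok.split("/")
            if den != "100":
                raise ValueError(f"unsupported fraction: {tok!r}")
            cents = int(num)
        elif tok in _UNITS:
            current += _UNITS[tok]
        elif tok in _TENS:
            current += _TENS[tok]
        elif tok == "hundred":
            current *= 100
        elif tok == "thousand":
            total += current * 1000
            current = 0
        else:
            raise ValueError(f"unrecognized word: {tok!r}")
    return Decimal(total + current) + Decimal(cents) / 100


def amounts_agree(courtesy: str, legal: str) -> bool:
    """Compare a courtesy amount such as '$2,500.00' with a legal amount in words."""
    car = Decimal(courtesy.strip().lstrip("$").replace(",", ""))
    return car == legal_amount_to_decimal(legal)
```

A mismatch between the two values is a natural trigger for manual review, because it suggests that at least one of the two fields was misread.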
Feature                     General-Use OCR   Western OCR   East Euro/APAC OCR   Advanced OCR/ICR          Prime Recognition
Specialty font types        Dot matrix,       None          Farrington 7B,       Dot matrix, typewriter,   Dot matrix
                            customizable                    OCR A/B              OCR A/B, MICR E13B,
                                                                                 MICR CMC7
Pre-configured OCR engine   No                No            Yes                  No                        No
settings for specific
applications and countries
Multi-engine voting         Yes               No            Yes                  No                        Yes
Captiva Capture
Captiva Advanced
Recognition
languages
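The feature comparison lists multi-engine voting without describing how results are combined. As a rough illustration of the general idea only, the sketch below takes the value each engine returned for one field and accepts it when enough engines agree. The function name, the `min_agree` threshold, and the tie handling are assumptions and do not reflect how Captiva implements voting.

```python
from collections import Counter
from typing import Optional


def vote(candidates: list[str], min_agree: int = 2) -> Optional[str]:
    """Return the value most engines agree on, or None if there is no clear winner."""
    # Ignore engines that returned nothing for this field.
    counts = Counter(c.strip() for c in candidates if c and c.strip())
    if not counts:
        return None
    ranked = counts.most_common(2)
    value, top = ranked[0]
    tie = len(ranked) > 1 and ranked[1][1] == top
    if top < min_agree or tie:
        return None  # no clear winner: leave the field for review
    return value


# Two of three hypothetical engines agree; the third confused 0 with O.
print(vote(["A-1001", "A-1OO1", "A-1001"]))
```

Returning `None` instead of guessing keeps disagreements visible, so that uncertain fields can be routed to an operator.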
CONTACT US
To learn more about how EMC products, services, and solutions can help solve your business and IT
challenges, contact your local representative or authorized reseller, or visit us at www.EMC.com.
EMC2, EMC, the EMC logo, are registered trademarks or trademarks of EMC Corporation in the
United States and other countries. VMware are registered trademarks or trademarks of VMware,
Inc., in the United States and other jurisdictions. All other trademarks used herein are the
property of their respective owners. © Copyright 2014 EMC Corporation. All rights reserved.
Published in the USA. 06/14 EMC Product Description Guide H4755.6
EMC believes the information in this document is accurate as of its publication date. The
information is subject to change without notice.