
EMC CAPTIVA RECOGNITION

INTRODUCTION
This document provides an overview of the recognition engines available with EMC® Captiva®
Capture and EMC Captiva Advanced Recognition. It outlines what each recognition engine
provides and notes which engines are included and which are licensed separately. It also
offers tips on which engine is recommended for certain use cases.

EMC PRODUCT DESCRIPTION GUIDE


Table of Contents
EMC CAPTIVA CAPTURE (FORMERLY KNOWN AS INPUTACCEL)
    GENERAL-USE OCR
    EAST EURO/APAC OCR
    BARCODE RECOGNITION
EMC CAPTIVA ADVANCED RECOGNITION (FORMERLY KNOWN AS DISPATCHER)
    GENERAL-USE OCR
    WESTERN OCR
    EAST EURO/APAC OCR FOR CAPTURE
    ADVANCED ZONAL OCR/ICR AND FULL PAGE OCR
    BARCODE RECOGNITION
    CHECK READING
THIRD PARTY RECOGNITION ENGINE SUPPORT
    PRIME RECOGNITION FOR EMC CAPTIVA CAPTURE
CAPTIVA RECOGNITION CAPABILITIES
GENERAL GUIDELINES
ADDITIONAL INFORMATION

EMC CAPTIVA CAPTURE (FORMERLY KNOWN AS INPUTACCEL)
EMC Captiva Capture supports two processing types: zonal recognition and full-text recognition.
Zonal recognition is used to capture data from structured forms where field location is the same
from one image to the next. Full-text recognition is used to capture all text on the page for full-text
search and archiving purposes.
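
As an illustration only, the difference between the two processing types can be sketched in Python. The sketch assumes a page has already been turned into rows of text. The Zone class, both function names, and the sample coordinates are hypothetical and are not part of any Captiva API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A fixed rectangle on the page (hypothetical, measured in text cells)."""
    name: str
    top: int
    left: int
    height: int
    width: int


def extract_zones(page, zones):
    """Zonal processing: read only the fixed rectangles, one value per zone."""
    result = {}
    for zone in zones:
        rows = page[zone.top:zone.top + zone.height]
        pieces = [row[zone.left:zone.left + zone.width].strip() for row in rows]
        result[zone.name] = " ".join(p for p in pieces if p)
    return result


def full_text(page):
    """Full-text processing: keep everything on the page for search/archiving."""
    return "\n".join(row.rstrip() for row in page)


# The same zone definitions work for every page that shares the form layout.
FORM_ZONES = [Zone("name", 0, 6, 1, 14), Zone("date", 1, 6, 1, 14)]
```

Because the zones are defined once per form layout, the same FORM_ZONES list can be applied to every image of that form, which is why zonal recognition suits structured forms.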

CAPTIVA CAPTURE

GENERAL-USE OCR
EMC Captiva General-Use OCR is the standard engine included with the Captiva Capture Standard
and Enterprise Capture Server.
With the inclusion of Advanced Zonal OCR/ICR and its superior handprint recognition into EMC
Captiva Advanced Recognition, the current Standard Handprint/General-Use ICR engine will be
deprecated in the near future. Existing projects that use Standard Handprint/General-Use ICR
should be migrated to use the handprint engine included with Advanced Zonal ICR as soon as
possible.

Processing type: Full-text, Zonal
Recognition type: Machine printed, 1D barcodes, ICR
Licensing: Machine printed – delivered as standard. 1D barcodes – delivered as standard. ICR requires an additional license. Note: The three processing types can be run at the same time.
Notes:
• Recognizes machine print including dot matrix, OCR-A, and OCR-B fonts
• Uses up to three engines for optimal accuracy
• Provides support for a wide variety of output formats, including TIFF, PDF, PDF/A
• Can carry out full page form recognition, ICR, OCR, multi-line and omni-font characters recognition and barcode recognition
• Over 100 languages are available
• Engine requires a license
• Supports 2 recognition types:
  o Character recognition
  o Barcode recognition
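
The notes above mention that up to three engines are used for optimal accuracy. The source does not describe how the engine results are combined, so the following is only a minimal sketch of one common approach, character-level majority voting. It assumes every engine returns a string of the same length; the function name and the reject character are hypothetical.

```python
from collections import Counter


def vote(readings, reject="?"):
    """Combine equal-length readings character by character.

    The most frequent character at each position wins. When the top two
    candidates are tied, the position is marked with the reject character
    so that it can be reviewed by an operator.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    if len({len(r) for r in readings}) != 1:
        raise ValueError("this sketch assumes readings of equal length")

    combined = []
    for chars in zip(*readings):
        ranked = Counter(chars).most_common()
        best_char, best_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == best_count
        combined.append(reject if tied else best_char)
    return "".join(combined)
```

With three readings, a single engine that confuses the letter O with the digit 0 is outvoted by the other two. With two readings, any disagreement is a tie and is flagged instead of guessed.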

Barcode support
EMC Captiva General-Use OCR/ICR engine also provides barcode recognition and supports the
following barcode types:
Codabar, Codabar with start-stop char transmit, Code 128, Code 128 with check digit transmit,
Code 39, Code 39 full ASCII mode, Code 39 with check digit control and transmit, Code 39 with
start-stop char transmit, EAN8/13, EAN/UPC with 2 and 5 digit supplement, ITF (2 of 5 interleaved),
ITF with check digit control and transmit, Postnet code, UCC Code 128, UPC-A, UPC-E (6-digit)
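
Several of the barcode types above carry a check digit. As a hedged illustration of what check digit control means, the sketch below computes the standard EAN-13 check digit (alternating weights 1 and 3 over the first twelve digits). The function names are hypothetical, and the sketch says nothing about how the Captiva engine itself validates barcodes.

```python
def ean13_check_digit(first12: str) -> int:
    """Return the EAN-13 check digit for the first twelve digits."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 ASCII digits")
    # Positions are weighted 1, 3, 1, 3, ... starting from the left.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code: str) -> bool:
    """Check that a 13-digit code ends with the expected check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])
```

A recognized value that fails this check was misread in at least one digit, which is why transmitting and controlling the check digit is useful during capture.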
Language recognition support
Afrikaans, Albanian, Aymara, Basque, Bemba, Blackfoot, Brazilian-Portuguese, Breton, Bugotu,
Bulgarian (Cyrillic), Byelorussian (Cyrillic), Catalan, Chamorro, Chechen, Chinese (Simplified),
Chinese (Traditional), Corsican, Croatian, Crow, Czech, Danish, Dutch, English, Eskimo (Inuit),
Esperanto, Estonian, Faroese, Fijian, Finnish, French, Frisian, Friulian, Gaelic (Irish), Gaelic
(Scottish), Galician, Ganda, German, Greek, Guarani, Hani, Hungarian, Icelandic, Ido, Indonesian,
Interlingua, Italian, Japanese, Kabardian, Hawaiian, Kasub, Kawa, Kikuyu, Kongo, Korean,
Kpelle, Kurdish, Latin, Latvian, Lithuanian, Luba, Lule Sami, Luxembourgian, Macedonian (Cyrillic),
Malagasy, Malay, Malinke, Maltese, Maori, Mayan, Mia, Minankabaw, Mohawk, Moldavian (Cyrillic),
Nahuatl, Northern Sami, Norwegian, Nyanja, Occidental, Ojibway, Papiamento, Pidgin English, Polish,
Portuguese, Provencal, Quechua, Rhaetic, Romanian, Romany, Ruanda, Rundi, Russian (Cyrillic),
Sami (Lappish), Samoan, Sardinian, Serbian (Cyrillic), Serbian (Latin alphabet), Shona, Sioux,
Slovakian, Slovenian, Somali, Sorbian (Wend), Sotho, Southern Sami, Spanish, Sundanese, Swahili,
Swazi, Swedish, Tagalog, Tahitian, Tinpo, Tongan, Tswana (Chuana), Tun, Turkish, Ukrainian
(Cyrillic), Visayan, Welsh, Wolof, Xhosa, Zapotec, Zulu
Note: Twenty languages are installed by default and up to 123 languages are available, including
English, Simplified Chinese, Japanese and Korean.

EAST EURO/APAC OCR


EMC Captiva Capture East Euro/APAC OCR module performs optical character recognition of scanned
or imported images and exports the image and index data to more than 25 different word
processing and text formats. It recognizes many languages, with a specialty in Eastern
European and Asia Pacific languages. The module is an add-on to the Captiva Capture Standard
and Enterprise Capture Servers.
Note: The East Euro/APAC OCR engine can be used for zonal extraction without Advanced
Recognition, but this is not recommended. In this mode, the data returned from the OCR engine
is not UIM data but data stored in IA Values. To present the data in Captiva Desktop, custom
code must be written to pass the IA Values to UIM data. Furthermore, the zonal setup is
performed in the module itself and not through Captiva Designer. It is therefore strongly
recommended to use the East Euro/APAC OCR engine with Advanced Recognition (formerly known
as Dispatcher) instead.

Processing type: Full-text, Zonal (not recommended, per the note above)
Recognition type: Machine printed
Licensing: Unlimited recognition per client
Notes:
• Uses a single recognition engine. Configure and define different zones for different content types, including text, pictures, tables, barcodes, and check marks.
• Recognizes multiple recognition languages simultaneously
• Supports many popular output formats including PDF
• Supports more than one hundred languages
• Not recommended for zonal but appropriate for full-text recognition of machine-printed text
• Recognizes many types of barcodes
• Can process documents corresponding to one or more languages at a time

Barcode support
EMC Captiva Capture East Euro/APAC OCR module also performs barcode recognition and supports
the following barcode types:
Codabar, Code 128, Code 39, Code 93, EAN 13, EAN 8, IATA 25, Industrial 25, Interleaved 25,
Matrix 25, PostNet, UCC 128, UPC-E
Language recognition support
Abkhaz, Adyghe, Afrikaans, Agul, Albanian, Altaic, Armenian (Eastern), Armenian (Grabar),
Armenian (Western), Avar, Aymara, Azerbaijani (Cyrillic), Azerbaijani (Latin), Bashkir, Basque,
Belarussian, Bemba, Blackfoot, Breton, Bugotu, Bulgarian, Buryat, Catalan, Chamorro, Chechen,
Chinese (PRC), Chinese (Taiwan), Chukcha, Chuvash, Corsican, Crimean Tatar, Croatian, Crow,
Czech, Danish, Dargwa, Dungan, Dutch (Netherlands), Dutch (Belgium), English, Eskimo (Cyrillic),
Eskimo (Latin), Esperanto, Estonian, Even, Evenki, Faroese, Fijian, Finnish, French, Frisian, Friulian,
Gaelic Scottish, Gagauz, Galician, Ganda, German, German (new spelling), German (Luxembourg),
Greek, Guarani, Hani, Hausa, Hawaiian, Hebrew, Hungarian, Icelandic, Ido, Indonesian, Ingush,
Interlingua, Irish, Italian, Japanese, Kabardian, Kalmyk, Karachay-Balkar, Karakalpak, Kasub, Kawa,
Kazakh, Khakas, Khanty, Kikuyu, Kirghiz, Kongo, Korean, KoreanHangul, Koryak, Kpelle, Kumyk,
Kurdish, Lak, Lappish, Latin, Latvian, Lezgin, Lithuanian, Luba, Macedonian, Malagasy, Malay,
Malinke, Maltese, Mansi, Maori, Mari, Maya, Miao, Minankabaw, Mohawk, Mongol, Mordvin, Nahuatl,
Nenets, Nivkh, Nogay, Norwegian, Norwegian (Bokmal), Norwegian (Nynorsk), Nyanja, Occidental,
Ojibway, Old English, Old French, Old German, Old Italian, Old Spanish, Ossetian, Papiamento, Pidgin
English, Polish, Portuguese (Brazil), Portuguese (Portugal), Provencal, Quechua, Rhaeto-Romanic,
Romanian, Romanian (Moldavia), Romany, Ruanda, Rundi, Russian (Old Spelling), Russian,
Samoan, Selkup, Serbian (Cyrillic), Serbian (Latin), Shona, Sioux, Slovak, Slovenian, Somali,
Sorbian, Sotho, Spanish, Sunda, Swahili, Swazi, Swedish, Tabassaran, Tagalog, Tahitian, Tajik,
Tatar, Thai, Tinpo, Tongan, Tswana, Tun, Turkish, Turkmen, Tuvan, Udmurt, Uighur (Cyrillic),
Uighur (Latin), Ukrainian, Uzbek (Cyrillic), Uzbek (Latin), Visayan, Welsh, Wolof, Xhosa, Yakut,
Yiddish, Zapotec, Zulu

Several formal languages are also supported: Basic, C++, Cobol, Fortran, Java, Pascal, chemical
formulas, E13B, CMC7.

BARCODE RECOGNITION
EMC Captiva Capture Barcode Recognition carries out barcode recognition. Barcode Recognition
settings support two barcode types:
• 1D barcode parameters
• 2D barcode parameters

Processing type: Zonal
Recognition type: 1D barcodes, 2D barcodes
Licensing: Delivered as standard
Notes:
• Can detect several barcodes of different types in a document
• Configuration files can be customized to detect specific 1D or 2D barcodes
• A resolution of 300 DPI is recommended for smaller-sized barcodes; anything lower could render the barcode unreadable by the engine
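
The 300 DPI recommendation can be related to how many pixels cover the narrowest bar of a barcode. The following is a rough sketch, assuming the narrowest bar width (the X-dimension) is known in mils, that is, thousandths of an inch. The function name and the sample X-dimension are illustrative assumptions; the source does not state how many pixels per bar the engine needs.

```python
def pixels_per_module(dpi: float, x_dimension_mils: float) -> float:
    """Pixels covering the narrowest bar: DPI times bar width in inches."""
    if dpi <= 0 or x_dimension_mils <= 0:
        raise ValueError("dpi and x_dimension_mils must be positive")
    return dpi * x_dimension_mils / 1000.0


if __name__ == "__main__":
    # Compare scan resolutions for a hypothetical 10 mil barcode.
    for dpi in (150, 200, 300):
        print(dpi, "DPI:", pixels_per_module(dpi, 10), "pixels per narrow bar")
```

The smaller the barcode, the fewer pixels remain per bar at a given resolution, which is consistent with the guidance to scan small barcodes at 300 DPI.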

Barcode Recognition support


EMC Captiva Capture Barcode Recognition supports the recognition of the following barcode types:
1D barcodes:
Add 2, Add 5, Airline 2 of 5 (IATA 2 of 5), Australian Post, BCD Matrix, Codabar, Code 2 of 5
(Industry 2 of 5), Code 32, Code 39, Code 39 Extended, Code 93, Code 93 Extended, Code 128,
UCC/EAN 128, DataLogic 2 of 5, EAN 8, EAN 13, Intelligent Mail, Interleaved 2 of 5, Invert 2 of 5,
Matrix 2 of 5, Patch Code, Royal Post, UPC-A, UPC-E, and PostNet.

2D barcodes:
PDF-417, QR, Data Matrix
EMC CAPTIVA ADVANCED RECOGNITION (FORMERLY KNOWN AS DISPATCHER)
EMC Captiva Advanced Recognition supports two processing types: zonal recognition and full-text
recognition. Zonal recognition is used to capture data from structured forms where field location is
the same from one image to the next. Full-text recognition is primarily used for semi-structured or
unstructured documents where text does not reside in a static location from one image to the next.
EMC Captiva Advanced Recognition supports several recognition types, including machine-printed
text, hand-printed text, check marks, courtesy amount recognition (CAR), legal amount recognition
(LAR), and MICR/CMC7.
The following information provides a breakdown on all the recognition options for EMC Captiva
Advanced Recognition and indicates which engines are included.

CAPTIVA ADVANCED RECOGNITION

GENERAL-USE OCR
The General-Use OCR is the standard engine included with EMC Captiva Advanced Recognition.
With the inclusion of Advanced Zonal OCR/ICR and its superior handprint recognition into EMC
Captiva Advanced Recognition, the current Standard Handprint/General-Use ICR engine will be
deprecated in the near future. Existing projects that use Standard Handprint/General-Use ICR
should be migrated to use the handprint engine included with Advanced Zonal ICR as soon as
possible.

Processing type: Full-text, Zonal
Recognition type: Machine printed, 1D barcodes, ICR
Licensing: Machine printed – delivered as standard. 1D barcodes – delivered as standard. ICR requires an additional license. Note: The three processing types can be run at the same time.
Notes:
• Recognizes machine print including dot matrix, OCR-A, and OCR-B fonts
• Uses up to three engines for optimal accuracy
• Provides support for a wide variety of output formats, including TIFF, PDF, PDF/A
• Can carry out full page form recognition, ICR, OCR, multi-line and omni-font characters recognition and barcode recognition
• Over 100 languages are available
• Engine requires a license
• Supports 2 recognition types:
  o Character recognition
  o Barcode recognition

Barcode support
EMC Captiva Advanced Recognition General-Use OCR/ICR engine also provides barcode recognition
and supports the following barcode types:
Codabar, Codabar with start-stop char transmit, Code 128, Code 128 with check digit transmit,
Code 39, Code 39 full ASCII mode, Code 39 with check digit control and transmit, Code 39 with
start-stop char transmit, EAN8/13, EAN/UPC with 2 and 5 digit supplement, ITF (2 of 5 interleaved),
ITF with check digit control and transmit, Postnet code, UCC Code 128, UPC-A, UPC-E (6-digit)

Language recognition support


Afrikaans, Albanian, Aymara, Basque, Bemba, Blackfoot, Brazilian-Portuguese, Breton, Bugotu,
Bulgarian (Cyrillic), Byelorussian (Cyrillic), Catalan, Chamorro, Chechen, Chinese (Simplified),
Chinese (Traditional), Corsican, Croatian, Crow, Czech, Danish, Dutch, English, Eskimo (Inuit),
Esperanto, Estonian, Faroese, Fijian, Finnish, French, Frisian, Friulian, Gaelic (Irish), Gaelic
(Scottish), Galician, Ganda, German, Greek, Guarani, Hani, Hungarian, Icelandic, Ido, Indonesian,
Interlingua, Italian, Japanese, Kabardian, Hawaiian, Kasub, Kawa, Kikuyu, Kongo, Korean,
Kpelle, Kurdish, Latin, Latvian, Lithuanian, Luba, Lule Sami, Luxembourgian, Macedonian (Cyrillic),
Malagasy, Malay, Malinke, Maltese, Maori, Mayan, Mia, Minankabaw, Mohawk, Moldavian (Cyrillic),
Nahuatl, Northern Sami, Norwegian, Nyanja, Occidental, Ojibway, Papiamento, Pidgin English, Polish,
Portuguese, Provencal, Quechua, Rhaetic, Romanian, Romany, Ruanda, Rundi, Russian (Cyrillic),
Sami (Lappish), Samoan, Sardinian, Serbian (Cyrillic), Serbian (Latin alphabet), Shona, Sioux,
Slovakian, Slovenian, Somali, Sorbian (Wend), Sotho, Southern Sami, Spanish, Sundanese, Swahili,
Swazi, Swedish, Tagalog, Tahitian, Tinpo, Tongan, Tswana (Chuana), Tun, Turkish, Ukrainian
(Cyrillic), Visayan, Welsh, Wolof, Xhosa, Zapotec, Zulu
Note: Twenty languages are installed by default and up to 123 languages are available, including
English, Simplified Chinese, Japanese and Korean.


WESTERN OCR
The Western OCR performs full-text and zonal capture of machine-printed text and is included in
EMC Captiva Advanced Recognition.

Processing type: Full-text, Zonal
Recognition type: Machine printed
Licensing: Delivered as standard
Notes: Specialty is Western languages

Language recognition support


English, German, Danish, Spanish, Finnish, French, Dutch, Italian, Norwegian, Portuguese, and
Swedish.
Note: The Western OCR engine cannot read Asian characters, but can be used on Asian operating
systems to read Western characters.

EAST EURO/APAC OCR FOR CAPTURE


EMC Captiva Advanced Recognition East Euro/APAC OCR for Capture module performs optical
character recognition of scanned or imported images and recognizes several languages with a
specialty towards Eastern European and Asia Pacific languages. The East Euro/APAC OCR for
Capture is an add-on to EMC Captiva Advanced Recognition.

Processing type: Full-text, Zonal
Recognition type: Machine printed
Licensing: Unlimited recognition per client
Notes:
• Uses a single recognition engine. Configure and define different zones for different content types, including text, pictures, tables, barcodes, and check marks.
• Recognizes multiple recognition languages simultaneously
• Supports many popular output formats including PDF
• Supports more than one hundred languages
• Appropriate for zonal and full-text recognition of machine-printed text
• Recognizes many types of barcodes
• Can process documents corresponding to one or more languages at a time

Barcode support
EMC Captiva Advanced Recognition East Euro/APAC OCR module also performs barcode recognition
and supports the following barcode types:
Codabar, Code 128, Code 39, Code 93, EAN 13, EAN 8, IATA 25, Industrial 25, Interleaved 25,
Matrix 25, PostNet, UCC 128, UPC-E
Language recognition support
Abkhaz, Adyghe, Afrikaans, Agul, Albanian, Altaic, Armenian (Eastern), Armenian (Grabar),
Armenian (Western), Avar, Aymara, Azerbaijani (Cyrillic), Azerbaijani (Latin), Bashkir, Basque,
Belarussian, Bemba, Blackfoot, Breton, Bugotu, Bulgarian, Buryat, Catalan, Chamorro, Chechen,
Chinese (PRC), Chinese (Taiwan), Chukcha, Chuvash, Corsican, Crimean Tatar, Croatian, Crow,
Czech, Danish, Dargwa, Dungan, Dutch (Netherlands), Dutch (Belgium), English, Eskimo (Cyrillic),
Eskimo (Latin), Esperanto, Estonian, Even, Evenki, Faroese, Fijian, Finnish, French, Frisian, Friulian,
Gaelic Scottish, Gagauz, Galician, Ganda, German, German (new spelling), German (Luxembourg),
Greek, Guarani, Hani, Hausa, Hawaiian, Hebrew, Hungarian, Icelandic, Ido, Indonesian, Ingush,
Interlingua, Irish, Italian, Japanese, Kabardian, Kalmyk, Karachay-Balkar, Karakalpak, Kasub, Kawa,
Kazakh, Khakas, Khanty, Kikuyu, Kirghiz, Kongo, Korean, KoreanHangul, Koryak, Kpelle, Kumyk,
Kurdish, Lak, Lappish, Latin, Latvian, Lezgin, Lithuanian, Luba, Macedonian, Malagasy, Malay,
Malinke, Maltese, Mansi, Maori, Mari, Maya, Miao, Minankabaw, Mohawk, Mongol, Mordvin, Nahuatl,
Nenets, Nivkh, Nogay, Norwegian, Norwegian (Bokmal), Norwegian (Nynorsk), Nyanja, Occidental,
Ojibway, Old English, Old French, Old German, Old Italian, Old Spanish, Ossetian, Papiamento, Pidgin
English, Polish, Portuguese (Brazil), Portuguese (Portugal), Provencal, Quechua, Rhaeto-Romanic,
Romanian, Romanian (Moldavia), Romany, Ruanda, Rundi, Russian (Old Spelling), Russian,
Samoan, Selkup, Serbian (Cyrillic), Serbian (Latin), Shona, Sioux, Slovak, Slovenian, Somali,
Sorbian, Sotho, Spanish, Sunda, Swahili, Swazi, Swedish, Tabassaran, Tagalog, Tahitian, Tajik,
Tatar, Thai, Tinpo, Tongan, Tswana, Tun, Turkish, Turkmen, Tuvan, Udmurt, Uighur (Cyrillic),
Uighur (Latin), Ukrainian, Uzbek (Cyrillic), Uzbek (Latin), Visayan, Welsh, Wolof, Xhosa, Yakut,
Yiddish, Zapotec, Zulu
Several formal languages are also supported: Basic, C++, Cobol, Fortran, Java, Pascal, chemical
formulas, E13B, CMC7.

ADVANCED ZONAL OCR/ICR AND FULL PAGE OCR


EMC Captiva Advanced Recognition Advanced Zonal OCR/ICR and Full Page OCR is an engine that
provides full-text and zonal capture of machine-printed text and zonal capture of hand-printed text.
Beginning with Captiva 7.0, Advanced Zonal OCR/ICR and Full Page OCR is included as part of EMC
Captiva Advanced Recognition.

Recognition name: Advanced Zonal OCR/ICR
Processing type: Full-text, zonal
Recognition type: Machine-print, ICR
Licensing: Delivered as standard
Notes: Preferred choice for applications that require a very high degree of accuracy and that include both machine and hand print recognition requirements

Recognition name: Advanced Full Page OCR
Processing type: Full-page
Recognition type: Machine-print
Licensing: Delivered as standard
Notes: Preferred choice for applications that require a very high degree of accuracy, and that include both machine and hand print recognition requirements

Language recognition support


Australia, Austria, Azerbaijan, Belgium, Brazil, Bulgaria, Canada, Central America, Central Europe,
Croatia, Czech Republic, Denmark, Estonia, Faroese, Finland, France, Germany, Great Britain,
Greece, Hungary, International (for OCR and MICR classifiers only), Ireland, Italy, Liechtenstein,
Lithuania, Luxembourg, Malaysia, Netherlands, New Zealand, Norway, Poland, Portugal, Romania,
Russia, Rwanda, Scandinavia, Slovakia, Slovenia, Somali, South Africa, South America, Spain,
Sweden, Switzerland, Thailand, Turkey, United States, and Western Europe

BARCODE RECOGNITION

EMC Captiva Advanced Recognition Barcode Recognition carries out barcode recognition. Barcode
Recognition settings support two barcode types:
• 1D barcode parameters

• 2D barcode parameters

Processing type: Zonal
Recognition type: 1D barcodes, 2D barcodes
Licensing: Delivered as standard
Notes:
• Can detect several barcodes of different types in a document
• Configuration files can be customized to detect specific 1D or 2D barcodes
• A resolution of 300 DPI is recommended for smaller-sized barcodes; anything lower could render the barcode unreadable by the engine

Barcode Recognition support


EMC Captiva Advanced Recognition supports the recognition of the following barcode types:
1D barcodes:
Add 2, Add 5, Airline 2 of 5 (IATA 2 of 5), Australian Post, BCD Matrix, Codabar, Code 2 of 5
(Industry 2 of 5), Code 32, Code 39, Code 39 Extended, Code 93, Code 93 Extended, Code 128,
UCC/EAN 128, DataLogic 2 of 5, EAN 8, EAN 13, Intelligent Mail, Interleaved 2 of 5, Invert 2 of 5,
Matrix 2 of 5, Patch Code, Royal Post, UPC-A, UPC-E, and PostNet.
2D barcodes:
PDF-417, QR, Data Matrix

CHECK READING
The EMC Captiva Advanced Recognition Check Reading engine is used for automatic reading of
business and personal checks, deposit slips, and cash-in and cash-out documents. It can read
hand-printed, handwritten, and machine-printed documents and performs entire check
recognition efficiently. It has been successfully used for check and remittance processing.

Processing type: Zonal
Recognition type: Handwritten, Machine printed, MICR/CMC7, CAR/LAR
Licensing: License with two processing types that can be activated separately, with the possibility of running both at the same time:
• Check Reading US
• Check Reading France
Notes:
• Requires a license in order to work in development and production environments
• Needs to be registered using the Check Reading license management system

THIRD PARTY RECOGNITION ENGINE SUPPORT
PRIME RECOGNITION FOR EMC CAPTIVA CAPTURE
Prime Recognition is a high-accuracy, high-reliability optical character recognition (OCR) module
that works with other EMC Captiva Capture modules as part of a complete document capture
system. The OCR module is an add-on to the Captiva Capture Standard and Enterprise Capture
Servers and is available through EMC Select.

Processing type: Full-text, zonal
Recognition type: Machine printed, Mark sense
Licensing: Requires optional license
Notes:
• Includes voting between different engine results
• Provides a wide variety of output formats, including text and/or image PDF, PDF-A, JBIG, XML UTF8/16
• Recognizes a variety of page layouts and sizes
• Recognizes multiple languages simultaneously
• Supports lexical checks

Language recognition support


The following languages are recognized by the Prime Recognition engine:
Danish, Dutch, French, German, Italian, Norwegian, Portuguese, Spanish, Swedish, U.K. English,
U.S. English, Japanese, Korean, Traditional and Simplified Chinese, and Russian.
CAPTIVA RECOGNITION CAPABILITIES
The following highlights the features supported by various EMC Captiva Capture and Advanced
Recognition engines:

Engines compared: General-Use OCR, Western OCR, Advanced OCR/ICR, East Euro/APAC OCR, Prime Recognition

Product integration
• General-Use OCR: Captiva Capture and Advanced Recognition
• Western OCR: Advanced Recognition
• Advanced OCR/ICR: Advanced Recognition
• East Euro/APAC OCR: Capture and Advanced Recognition
• Prime Recognition: Capture

Recognition type
• General-Use OCR: Machine printed, 1D barcodes
• Western OCR: Machine printed
• Advanced OCR/ICR: Machine printed
• East Euro/APAC OCR: Machine printed
• Prime Recognition: Machine printed, Mark sense

Specialty font types
• General-Use OCR: Dot matrix, customizable
• Western OCR: None
• Advanced OCR/ICR: Farrington 7B, OCR A/B
• East Euro/APAC OCR: Dot matrix, typewriter, OCR A/B, MICR E13B, MICR CMC7
• Prime Recognition: Dot matrix

Character type settings
• General-Use OCR: Alpha, numeric, alphanumeric, upper/lower case
• Western OCR: Alpha, numeric, alphanumeric, amount, upper/lower case, customized
• Advanced OCR/ICR: Alphanumeric, amount, numeric
• East Euro/APAC OCR: Alphanumeric, numeric, upper/lower case, superscript, subscript, italics, special characters
• Prime Recognition: Alpha, numeric, alphanumeric, upper/lower case, subscript, superscript

Other features (values in the order General-Use OCR, Western OCR, Advanced OCR/ICR, East Euro/APAC OCR, Prime Recognition)
• Character height/pitch settings: No, Yes, Yes, No, No
• Customize character set: Yes, Yes, No, No, No
• Image Enhance tools: No, Yes, No, No, Yes
• Accurate/balanced/fast setting: Yes, No, No, Yes, Yes
• Pre-configured OCR engine settings for specific applications and countries: No, No, Yes, No, No
• Multi-engine voting: Yes, No, Yes, No, Yes
• Dictionary support: Yes, No, No, Yes, Yes
• Logical context checking: No, No, Yes, No, Yes
• Detect multi-line fields: No, No, No, Yes, No
• Multi-core capable: No, No, No, Yes (2), No
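
When shortlisting engines, the feature comparison above can be treated as a simple lookup structure. The sketch below is illustrative only: it transcribes four of the Yes/No features from the comparison, and the dictionary layout, feature keys, and function name are hypothetical choices, not part of any Captiva tool.

```python
ENGINES = (
    "General-Use OCR",
    "Western OCR",
    "Advanced OCR/ICR",
    "East Euro/APAC OCR",
    "Prime Recognition",
)

# Values follow the ENGINES order; transcribed from the comparison above.
FEATURES = {
    "multi_engine_voting":      (True, False, True, False, True),
    "dictionary_support":       (True, False, False, True, True),
    "logical_context_checking": (False, False, True, False, True),
    "detect_multi_line_fields": (False, False, False, True, False),
}


def engines_with(*required):
    """Return the engines that support every required feature."""
    unknown = [name for name in required if name not in FEATURES]
    if unknown:
        raise KeyError(f"unknown feature(s): {unknown}")
    return [
        engine
        for index, engine in enumerate(ENGINES)
        if all(FEATURES[name][index] for name in required)
    ]
```

A shortlist produced this way is only a starting point; candidate engines should still be tested against a representative set of documents.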
GENERAL GUIDELINES
The following information provides more detail on several of the key engines in Captiva Capture and
Advanced Recognition. To determine the best engine for a given application, you should always test
with your set of documents.

Captiva Capture

General-Use Recognition
Key strengths:
• Good, all-purpose machine-print engine
• 2-way and 3-way voting
• Available in many languages
General use case:
• General use capture applications for machine print
• Ideal for applications that require full-text recognition and/or capturing a few index

Prime Recognition
Key strengths:
• Very accurate with machine-print
• Voting between engines
• Dynamic speed/accuracy balance, based on image quality
General use case:
• Preferred choice for applications with high accuracy and speed requirements, and high variability of image quality, fonts, etc. between documents

East Euro/APAC OCR
Key strengths:
• Available in many languages, with focus on recognition of Eastern European and Asia Pacific languages
General use case:
• Preferred choice for Eastern European and Asia Pacific languages

Captiva Advanced Recognition

East Euro/APAC OCR
Key strengths:
• Available in many languages, with focus on recognition of Eastern European and Asia Pacific languages
General use case:
• Preferred choice for Eastern European and Asia Pacific languages

Western OCR
Key strengths:
• Good, all-purpose machine-print engine
• Very accurate on numeric fields
General use case:
• Preferred choice for free-form data capture applications in Western languages

General-Use OCR/ICR
Key strengths:
• Good, all-purpose machine-print engine
• Available in many languages
General use case:
• General use capture applications, whether machine or hand print

Advanced Zonal OCR/ICR and Advanced Full Page OCR
Key strengths:
• Very accurate with machine-print and hand-print
• Available in multiple processing speeds
General use case:
• Preferred choice for applications that require a very high degree of accuracy, and that include both machine- and hand-print recognition requirements

ADDITIONAL INFORMATION
• In addition to the wide array of recognition options in Capture and Advanced
Recognition, Captiva also provides a software development kit that allows third-
party recognition engines to be incorporated into the EMC Captiva Capture
product
• EMC Captiva Capture governs the capture process and allows recognition data to
be passed between the various Capture and Advanced Recognition modules
• Licensing for the add-on engines is per client connection. To determine the
number of connections required, first determine the number of computers that
will be performing keyword based document classification, data extraction, and
rubber band OCR. That will determine the number of client connections you will
need for the add-on OCR engines.
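
The counting rule in the last bullet can be sketched as follows. This is a minimal illustration under stated assumptions: the task labels, the workstation names, and the function name are hypothetical, and actual license requirements should be confirmed with EMC.

```python
# Tasks that, per the guidance above, require a client connection.
LICENSED_TASKS = {
    "keyword_classification",
    "data_extraction",
    "rubber_band_ocr",
}


def client_connections_needed(computers):
    """Count computers that perform at least one of the licensed tasks.

    `computers` maps a computer name to the collection of tasks it runs.
    A computer running several licensed tasks still counts once.
    """
    return sum(
        1 for tasks in computers.values() if LICENSED_TASKS & set(tasks)
    )
```

For example, a scan-only workstation does not add to the count, while a workstation that performs both data extraction and rubber band OCR counts as a single connection.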

CONTACT US
To learn more about how EMC
products, services, and solutions can
help solve your business and IT
challenges, contact your local
representative or authorized reseller—
or visit us at www.EMC.com.

EMC², EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the
United States and other countries. VMware is a registered trademark or trademark of VMware,
Inc., in the United States and other jurisdictions. All other trademarks used herein are the
property of their respective owners. © Copyright 2014 EMC Corporation. All rights reserved.
Published in the USA. 06/14 EMC Product Description Guide H4755.6

EMC believes the information in this document is accurate as of its publication date. The
information is subject to change without notice.

www.EMC.com
