Exploring IBM Datacap Taskmaster: A Solution Showcase

Lab Exercises

An IBM Proof of Technology


PoT.IS.11.8.004.00

Copyright IBM Corporation, 2011 US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM Software

Contents
Lab 1 Accounts Payable Invoice Processing
    1.1 Overview
    1.2 Processing Invoices Using the Taskmaster APT Thick Client
    1.3 Exploring the Taskmaster Web Interface
Lab 2 Quick and Easy Document Setup with Taskmaster Flex
    2.1 Overview
    2.2 Using Taskmaster Flex Manager
    2.3 Exploring the Taskmaster Flex Application
Lab 3 Reporting with IBM Datacap Taskmaster RV2
    3.1 Overview
    3.2 Viewing Reports
    3.3 Creating a Filter for a Report
    3.4 Creating a Dashboard of Reports
Lab 4 Datacap Studio Deep Dive
    4.1 Overview of Datacap Studio
    4.2 Rulesets, Rules, Functions, and Actions
    4.3 Lab Scenario
    4.4 Starting the Datacap Studio Application Wizard
    4.5 Setting Up the Document Hierarchy
    4.6 Configuring Scanning, Page Identification, and Field Extraction
    4.7 Creating the Documents and Fields
    4.8 Testing the Configuration
    4.9 Configuring and Testing Visual Verification
    4.10 Creating a Simple Validation Rule
Lab 5 IBM Datacap NENU Monitoring
    5.1 Overview
    5.2 Creating a NENU Configuration Using Datacap Studio
    5.3 Testing NENU
Lab 6 Integrating IBM Datacap Taskmaster with the FileNet P8 ECM Repository
    6.1 Overview
    6.2 Updating the Application Configuration in Datacap Studio
    6.3 Testing the Updated Application
Lab 7 IBM Datacap Taskmaster and Email Integration
    7.1 Overview of Datacap Connector for Email and Electronic Documents
    7.2 Lab Overview
    7.3 Getting Started
Lab 8 Batch Splitting
    8.1 Lab Overview
    8.2 The Sample Check Processing Application
    8.3 Updating the Application
    8.4 Testing the Updated Application
Appendix A. Notices
Appendix B. Trademarks and Copyrights

Overview
Welcome to the IBM Datacap Taskmaster Proof of Technology! IBM Datacap Taskmaster is a powerful tool for capturing content (regardless of the content type), extracting important indexing and application data, and then storing both the content and data into various backend systems. It helps you eliminate labor-intensive document preparation and manual data entry, thus expediting the Capture process, as well as improving data accuracy. IBM Datacap Taskmaster can help by:

- Capturing content from a variety of input points, including scanners, multifunction devices (MFDs), fax servers, email systems, and file systems
- Supporting remote users for both scanning and indexing/verification without having to deploy distributed servers
- Extracting machine print, handprint, checkbox, and bar code data with multiple recognition engines (OCR/ICR, OMR) to reduce manual data entry
- Applying advanced validations, such as database lookups, math calculations, and checksums, to ensure accurate data
- Enabling design and deployment of complex capture applications without expensive programming
- Integrating seamlessly with a variety of IBM and non-IBM ECM repositories, including IBM Content Manager Enterprise Edition, IBM FileNet Content Manager, IBM FileNet Image Services, Microsoft SharePoint, OpenText Livelink, and EMC Documentum
- Providing all this capability as a web service that can integrate with your line-of-business applications to provide Capture capability at any point in your enterprise's processes

Introduction
In this Proof of Technology (PoT), you will have the opportunity to explore many different aspects of the IBM Datacap Taskmaster capture solution. This is a hands-on exploration and there are several labs for you to perform. Some are geared towards end users and focus on the standard Taskmaster end user interfaces. Other labs are designed with administrators and developers in mind. Those labs delve into the deep technical details of IBM Datacap Taskmaster.

The labs do not assume any previous experience with IBM Datacap Taskmaster, nor with any other Capture solutions. Additionally, there isn't any specific sequence to the labs. You may choose to do the labs that are of the most interest to you, and in any sequence. No lab depends on the successful completion of any other lab. Your instructor will outline which labs are most applicable to you, based upon the role you play in your organization.

Lab 1 Accounts Payable Invoice Processing

1.1 Overview

Virtually every company needs to be able to process invoices. Most companies have some form of A/P software that is used to track, pay, report on, control payment of, and archive inbound invoices. Information needs to be read off the invoice and entered into those systems. Manual processing is costly and error-prone. Automation to provide straight-through processing is the goal. However, invoices can come from dozens, if not hundreds, of different sources. And two invoices from the same company can have critical information located in different areas if the number of line items differs from invoice to invoice.

Taskmaster's APT solution is an out-of-the-box, ready-to-use invoice capture application. Inbound invoices are scanned and matched against an existing fingerprint database of known invoices. Key data elements are recognized through OCR/ICR and validated against business rules. Manual verification lets an end user correct low-confidence recognition reads and adjust fields that don't abide by established business rules. All the extracted information, along with the image of the invoice itself, is then available for export to third-party LOB applications, databases, and ECM repositories.

1.2 Processing Invoices Using the Taskmaster APT Thick Client

In this first lab, we will process a batch of invoices. Some of the invoices are from vendors with whom we have already done business, so the fingerprints for their invoices are already in our APT fingerprint database. Other invoices are from a vendor whose invoices we have never seen. We will see how this is dealt with dynamically, instead of having to go back to the IT department.
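
To make the idea of matching a page against a fingerprint database more concrete, here is a toy sketch in Python. It is not how Taskmaster APT implements fingerprints, which this guide does not describe. It assumes a fingerprint is simply the set of coarse grid cells where a page has printed content, and that pages are compared by overlap. The function names, the threshold value, and the sample data are all hypothetical.

```python
# Toy illustration only: the real fingerprint format and matching logic are
# not described in this guide. A "fingerprint" here is the set of coarse
# grid cells (column, row) in which a page has printed content.

def similarity(a: set, b: set) -> float:
    """Jaccard similarity: shared cells divided by all cells seen in either page."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def match_fingerprint(page: set, database: dict, threshold: float = 0.8):
    """Return the name of the best-matching fingerprint, or None if none is close enough."""
    best_name, best_score = None, 0.0
    for name, cells in database.items():
        score = similarity(page, cells)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical database holding one known invoice layout.
known = {"stinger_wellhead": {(0, 0), (5, 0), (0, 3), (1, 3), (2, 3), (5, 9)}}

same_layout = {(0, 0), (5, 0), (0, 3), (1, 3), (2, 3), (5, 9)}
new_layout = {(2, 0), (2, 1), (0, 6), (4, 6), (4, 9)}

print(match_fingerprint(same_layout, known))  # stinger_wellhead
print(match_fingerprint(new_layout, known))   # None
```

A result of None corresponds to the situation later in this lab where an invoice from a new vendor arrives and an operator has to teach the system the layout.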

Because we are performing all the steps to process the batch on a single computer, it will be necessary to switch between the role of an end user and that of a Taskmaster server that is doing background processing. In a real customer environment, the end user activity (such as scanning and verification) would be done at a user desktop, and background processes would run on a server in the computer room.

1.2.1 Scanning the Invoices

Under normal circumstances, you would place all the invoices in an actual scanner and click an icon to initiate scanning. For our lab, the images are already scanned and are on the filesystem. So we will use something called virtual scanning. This is similar to processing content that is already available as an image, such as faxes or email attachments.

__1. Start the APT Client application by clicking Start -> All Programs -> Datacap -> Applications -> APT -> APT Client.

__2. You will be prompted to log on to the APT Client. Use the default user of admin and enter a password of admin.

__3. This brings up the main APT client interface. Note that there are two panes or sections.

The top section (Operations window) has icons for all the user interfaces as well as background processes. A regular end user would only see icons for functions such as scanning and verification, whereas an administrator (such as yourself) sees all available functions.

The bottom section is the Job Monitor and shows the administrator the status of all the batches being processed by Taskmaster. We will use the Job Monitor to manually process our batches through the capture workflow.

__4. We will simulate the scanning of the invoices. Double click on the Scan icon in the Operations window.

__5. A dialog box opens with a list of Job definitions to choose from. A job essentially states which Taskmaster configuration you want to use to process the batch.

Select the Demo job at the top of the list and click the OK button.

__6. The demo job has been configured to use virtual scanning instead of an actual scanner. A window will appear showing you the status of the job. It should complete in a very short period of time.

Note that if you were using a physical scanner, some type of scanner interface window would appear. You would be able to see the images as they were being scanned and to make modifications (for example, rearranging pages, rotating an image, or deleting images). The specific functions that would appear depend on the scanner you are using and the capabilities of the scanner software driver.

__7. A notification window appears when the batch has been created.

Click the Stop button. If you don't click the Stop button before the timeout period expires, Taskmaster assumes that you want to create another batch, and the Select Job window appears again (the timeout period is completely configurable).

In this case, simply click the Cancel button.

__8. Click on the Job Monitor window so that it is the active window. Press F5 to refresh the Job Monitor.

Note that your batch has been created and that it is pending a task called Batch Profiler.
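
The way a batch moves from one pending task to the next can be pictured as a pointer into an ordered list of tasks. The sketch below is only a mental model, not Taskmaster's job engine: the task order is the one seen in this lab, while the class name, the batch ID, and the "finished" label are hypothetical.

```python
from dataclasses import dataclass

# Task order as seen in this lab; a real Taskmaster job defines its own tasks.
WORKFLOW = ["Scan", "Batch Profiler", "Verify", "Export"]


@dataclass
class Batch:
    batch_id: str
    next_task: int = 0  # index into WORKFLOW of the task the batch is waiting for

    @property
    def status(self) -> str:
        """Status text similar to what an administrator watches in a job monitor."""
        if self.next_task >= len(WORKFLOW):
            return "finished"
        return f"pending {WORKFLOW[self.next_task]}"

    def run_pending_task(self) -> None:
        """Complete the pending task and queue the batch for the next one."""
        if self.next_task < len(WORKFLOW):
            self.next_task += 1


batch = Batch("demo-batch-1")  # hypothetical batch ID
batch.run_pending_task()       # the virtual scan completes
print(batch.status)            # pending Batch Profiler
```

Each double click on a batch ID in the Job Monitor during this lab plays the role of one run_pending_task call.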

1.2.2 Using the Background Processor

__9. As we noted previously in this lab guide, we will be performing tasks done by both end users and background servers. Once the batch is created, the normal workflow would result in one of the background servers being notified that a batch was ready for processing. That server would then perform tasks like putting the pages together as documents, verifying the document integrity, cleaning up the images (despeckle, deskew, horizontal/vertical line removal, etc.), doing recognition, locating fields on the page, and performing preliminary data validation. We will initiate this background processing manually.

In the Job Monitor, double click on the batch's ID number (as shown above by the red arrow).

__10. You will be prompted to confirm that you want to execute the selected batch.

Click the Yes button.

__11. A window will appear indicating that Batch Profiler is running. This is the name Taskmaster gives the background processing task. It will take a few seconds as Taskmaster's Rule Runner service does all the background processing necessary to process the invoices.

Rule Runner is the service that does all the heavy lifting in Taskmaster. It does all the image cleanup, recognition, and other background tasks mentioned previously. One key competitive advantage that Taskmaster has over other capture solutions is that its Rule Runner engine can be called as a web service. This allows external applications to take advantage of Rule Runner. For example, you could initiate a job to capture and process images from within a BPM workflow. Rule Runner's availability as a web service allows for true in-process capture. Capture is no longer limited to a front-end application!

__12. A window pops up when Batch Profiler has completed processing the batch.

Click the OK button to continue.

__13. Click on the Job Monitor window and refresh it by pressing the F5 key.

Note that the status has been updated to indicate that the batch is pending the Verification step. Double click on the batch ID (as shown above by the red arrow) to initiate Verification.

You will be asked to confirm if you want to execute the selected batch. Click OK to continue.

Note: Another way you could have started the Verification task would have been to double click the Verify icon in the Operations window. This is the way most end users would initiate Verification.

1.2.3 Verifying the Invoices

__14. Verification is the process of visually ensuring that the required data elements on the invoice were located correctly and that recognition results are accurate. You can configure the Verification module to display only the pages that contain low-confidence recognition results. You can also configure it to display ALL documents, regardless of confidence.
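
The two display modes described above amount to a filter over the pages in a batch. The following Python sketch illustrates that idea under stated assumptions: each page carries a list of per-character confidence scores between 0 and 1, and a page counts as low confidence when any character falls below a threshold. The data layout, the 0.90 threshold, and the function name are hypothetical and are not taken from Taskmaster.

```python
def pages_needing_review(pages, threshold=0.90, show_all=False):
    """Return the names of the pages an operator should be shown.

    pages: list of dicts with a 'name' and a 'confidences' list
           (one score per recognized character, 0.0 to 1.0).
    """
    if show_all:
        # Mode 2: display every document regardless of confidence.
        return [p["name"] for p in pages]
    # Mode 1: display only pages with at least one low-confidence character.
    return [
        p["name"]
        for p in pages
        if min(p["confidences"], default=1.0) < threshold
    ]


# Made-up recognition results for two pages.
pages = [
    {"name": "invoice_1", "confidences": [0.99, 0.98, 0.97]},
    {"name": "invoice_2", "confidences": [0.99, 0.62, 0.95]},
]

print(pages_needing_review(pages))                 # ['invoice_2']
print(pages_needing_review(pages, show_all=True))  # ['invoice_1', 'invoice_2']
```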

The Verification window we are looking at has the recognition results on the left hand side of the screen and the image of the invoice on the right. You can size the windows to your preference, as well as zoom the image.

This first invoice has been processed with high confidence. The system has seen invoices from Stinger Wellhead Protection before, and therefore knows where all the data elements are located. Elements such as the PO Number, invoice total, tax, and invoice date have all been located and recognized correctly.

You can tab from one field to the next and the related part of the image will be highlighted.

__15. Let's take a look at the individual line items that appear in the invoice.

Note that Taskmaster APT knows that there are 5 line items on the invoice. This is one of the key strengths of Taskmaster APT: it can handle variability in a document. The number of line items will vary from invoice to invoice, and Taskmaster APT can accommodate that. Contrast that with other applications, which use a fixed template approach to handling similar documents.

You can click the Next button to go from one line item to the next. As you do so, note how the highlighted portion of the image also changes.

A teal or light blue background in a field means that the field was recognized with very high confidence. If a field has a yellow background, one or more characters were read with less than optimal confidence. The characters in question are shown in red. The operator can visually verify the OCR results and make changes if necessary.

__16. Once the operator has looked through all the fields and line items, they can move on to the next document or the next problem. There are several ways to move from one problem or document to the next. You can use icons on the user interface or hot keys. Many experienced users eventually prefer hot keys because they are much faster. But icons are easier for new users, so we'll click the next problem icon.

Look to the top of the screen, just under the tool bar, for the arrow pointing to the right with the question mark (as shown above by the red arrow). Click this icon to move to the next problem document.

__17. The next document in the batch appears.

Review the results for this invoice.

__18. Note that there is a button just under the shipping field marked TIO. You can click this button to see the original TIFF image as it was created by the scanner, before any image cleanup was done.

Click the TIO button as indicated by the red arrow.


__19. Notice the difference in the images. The original image (shown below) has all sorts of horizontal and vertical lines on it.

The cleaned up image doesn't have any of those lines. The ability to remove lines (and other image cleanup functions) can greatly improve the accuracy of the OCR/ICR engines. However, your enterprise may choose to store the original version of the image for legal purposes. The choice is up to you.

__20. Click the next problem icon to advance to the next invoice in the batch.

__21. The third invoice in the batch appears.

Here we have something a little different. One of the fields (invoice date) has a light red background. This indicates that a validation error occurred. In this case, the invoice date, though recognized correctly, fails validation because it is an invalid date: Feb 29 can only occur in a leap year, and 2009 is not a leap year. The operator must correct it. Change the date to 2/28/09 to get around this validation error.

__22. Taskmaster is capable of much more than simple validations like ensuring dates are valid or that something matches a datatype check (e.g., is numeric only). More complex, business-specific validations can be enforced. For example, Taskmaster APT is configured to ensure that the total amount of the invoice is equal to the sum of the line items (including shipping and tax).
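
Both kinds of validation mentioned here, a calendar check on the date and a cross-field check on the total, are easy to express in code. The Python sketch below shows the logic only; Taskmaster rules are configured in Datacap Studio, not written this way. The invoice total of 3166.44 comes from this lab, while the individual line, shipping, and tax amounts are made up so that they add up to that total.

```python
from datetime import datetime
from decimal import Decimal


def is_valid_date(text: str) -> bool:
    """True if text is a real calendar date in month/day/two-digit-year form."""
    try:
        datetime.strptime(text, "%m/%d/%y")
        return True
    except ValueError:
        return False


def total_matches(invoice_total, line_totals, shipping, tax) -> bool:
    """True if the invoice total equals the line items plus shipping and tax."""
    expected = sum((Decimal(x) for x in line_totals), Decimal("0"))
    expected += Decimal(shipping) + Decimal(tax)
    return Decimal(invoice_total) == expected


print(is_valid_date("2/29/09"))  # False: 2009 is not a leap year
print(is_valid_date("2/28/09"))  # True

# Hypothetical breakdown that sums to the invoice total used in this lab.
LINE_TOTALS = ["1500.00", "1200.00", "300.00"]
print(total_matches("3166.44", LINE_TOTALS, shipping="16.44", tax="150.00"))  # True
print(total_matches("3166.45", LINE_TOTALS, shipping="16.44", tax="150.00"))  # False
```

Decimal is used instead of float so that currency amounts compare exactly.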

Change the Invoice Total to something other than its true value. Now try to click the next problem icon to move to the next invoice. You'll get a validation error that says:

Click No. Change the Invoice Total back to its correct value of $3166.44 and click the next problem icon. You will now be able to move to the next invoice.

1.2.4 Defining a New Fingerprint

__23. The fourth invoice in the batch is now displayed.

Note that virtually all the fields are blank! This is because Taskmaster APT has never seen an invoice from this vendor before and therefore doesn't know where to find most of the fields. Some of the fields, like Invoice Date and Invoice Total, were found through a location technique that searches for specific text on the page that tells APT where the data is. But APT requires a human to tell it where the majority of the field data is located. With most capture products, this means going back to the capture administrator and asking them to create a new template for the invoice. Taskmaster APT doesn't require that. Instead, we will use a Taskmaster feature called Click and Key, which lets an end user dynamically add a new fingerprint to the fingerprint database.

__24. The first thing we want to do is ensure that the vendor we are paying is someone we are authorized to pay. Most companies have a vendor database that tracks approved suppliers. Taskmaster APT has database lookups directly integrated into the Verification process.

Enter the first few characters of the vendor name (in this case JWS) and click the button marked Lookup Vendor.

__25. A window pops up showing the vendors whose names start with JWS.
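
Conceptually, a lookup like this is a prefix query against the approved vendor table. The Python sketch below shows one way to express that with an in-memory SQLite database. It is an illustration only: the table name, the column name, and the vendor "JWS Supply Co." are invented for the example, and this guide does not describe the database Taskmaster APT actually queries.

```python
import sqlite3

# In-memory stand-in for an approved vendor database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendors (name TEXT)")
conn.executemany(
    "INSERT INTO vendors (name) VALUES (?)",
    [
        ("JWS of Colorado Inc.",),
        ("JWS Supply Co.",),            # made-up vendor for illustration
        ("Stinger Wellhead Protection",),
    ],
)


def lookup_vendor(connection, prefix):
    """Return vendor names that start with the given prefix.

    The prefix is passed as a bound parameter, never concatenated into the SQL.
    Wildcard characters (% and _) inside the prefix are not escaped in this sketch.
    """
    rows = connection.execute(
        "SELECT name FROM vendors WHERE name LIKE ? ORDER BY name",
        (prefix + "%",),
    )
    return [row[0] for row in rows]


for name in lookup_vendor(conn, "JWS"):
    print(name)
```

The operator then picks one row from the result, which is what the pop-up window in this step offers.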

Double click on the entry for JWS of Colorado Inc.

__26. Now we will start using the Click and Key feature to tell Taskmaster APT where to get the rest of its fields. Tab to the Remittance_Zip field and click on the zip code for the vendor.

When you move the cursor to a string of text, it will get highlighted in yellow. Click on the string to select it. The selected ZIP code will show up in the Remittance Zip field.

If you want to make things a little easier for you, you can right click on the image view window and select one of the zoom options. Zoom to width is probably a good idea.

__27. Tab to the Invoice_Number field and click on the invoice number, located in the top right corner of the invoice.

__28. Tab to the Tax field and click on the sales tax amount in the lower left part of the invoice.

__29. Now tab to the PO_Number field and click the purchase order number on the image.

__30. Now we can start telling Taskmaster APT where the line item detail is. The great thing about APT is that you only have to tell it where the first line item is, and it will figure out all the remaining line items after that.

Click the Add button in the Details section of the screen (as shown above by the red arrow). The line item number should change to 1 of 1.

__31. Tab to the ItemID field and click on the first item number (the string 201 on the image).

__32. Tab to the Qty field and click on the first quantity amount (the string 6 on the image).

__33. Tab to the ItemDesc field. In this case, we want to select multiple words, not just a single string. There are two ways we can do this. You can hold down the Shift key while you click on each word in the string. An easier way is to simply hold down the left mouse button while you draw a box around the desired area.
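
Drawing a box to pick up several words is a rectangle overlap test against the bounding box of each recognized word. The Python sketch below illustrates that idea. The coordinates and the description text "Hydraulic hose" are made up, since this guide does not reproduce the invoice image, and the code is not Taskmaster's implementation.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    left: float
    top: float
    right: float
    bottom: float


def words_in_box(words, left, top, right, bottom):
    """Return the text of every word whose box overlaps the drawn rectangle,
    joined in reading order (top to bottom, then left to right)."""
    hits = [
        w for w in words
        if w.left < right and w.right > left and w.top < bottom and w.bottom > top
    ]
    hits.sort(key=lambda w: (w.top, w.left))
    return " ".join(w.text for w in hits)


# One hypothetical line item row; positions are illustrative pixel values.
row = [
    Word("201", 20, 100, 50, 112),
    Word("6", 80, 100, 90, 112),
    Word("Hydraulic", 150, 100, 210, 112),
    Word("hose", 215, 100, 245, 112),
    Word("65.00", 320, 100, 360, 112),
]

# A box dragged around the description column only.
print(words_in_box(row, left=140, top=95, right=300, bottom=115))  # Hydraulic hose
```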

__34. Tab to the Price field and click on the first unit price amount (the string 65.00 on the image).

__35. Tab to the LineTotal field and click on the first line total amount (the string 390.00 on the image).

__36. Verify that your screen matches the following screen capture.

__37. We have only found the first line item. We need to find ALL the line items. Click on the Find Details button at the bottom of the Details section.
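
One plausible way to extend a single line item to the rest of the table is to remember the horizontal position of each field clicked on the first row, then accept every later row that has a value in all of those columns. The Python sketch below illustrates that idea only; it is an assumption for teaching purposes, not a description of how Find Details works internally. The first row uses the values from this lab (201, 6, 65.00, 390.00); the other rows and all coordinates are invented, and the sketch assumes every token on a row shares exactly the same y value.

```python
def find_details(tokens, first_row_y, column_xs, tolerance=5):
    """Collect line item rows from OCR tokens.

    tokens:      list of (text, x, y) tuples.
    first_row_y: y position of the line item the operator keyed.
    column_xs:   x position of each field clicked on that first row.
    Returns one list of cell texts per row that has a value in every column.
    """
    rows = {}
    for text, x, y in tokens:
        if y < first_row_y:
            continue  # skip column headers above the first line item
        for col, col_x in enumerate(column_xs):
            if abs(x - col_x) <= tolerance:
                rows.setdefault(y, {})[col] = text
    complete = []
    for y in sorted(rows):
        cells = rows[y]
        if len(cells) == len(column_xs):  # ignore partial rows such as totals
            complete.append([cells[c] for c in range(len(column_xs))])
    return complete


tokens = [
    ("Item", 20, 80), ("Qty", 80, 80), ("Price", 320, 80), ("Total", 400, 80),
    ("201", 20, 100), ("6", 80, 100), ("65.00", 320, 100), ("390.00", 400, 100),
    ("305", 20, 120), ("2", 80, 120), ("40.00", 320, 120), ("80.00", 400, 120),
    ("410", 20, 140), ("1", 80, 140), ("25.00", 320, 140), ("25.00", 400, 140),
    ("Tax", 320, 180), ("34.65", 400, 180),
]

items = find_details(tokens, first_row_y=100, column_xs=[20, 80, 320, 400])
for item in items:
    print(item)
```

With this sample data the function returns three rows, and the trailing tax line is dropped because it does not fill every column.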

__38. Note how the details section is updated to show you that you are looking at the first of three line items.

You can click the Next button to see each of the line item details.

Here we see the details for the second line item on the invoice.

__39. We have completed defining the new invoice fingerprint. We can immediately add it to the fingerprint database by clicking on the New button.

A little fingerprint is displayed to indicate that the fingerprint database has been updated for this new invoice.

__40. Click the icon to go to the next problem document.

__41. Now we have another invoice from JWS of Colorado. Taskmaster APT knows that a fingerprint for this invoice exists in the system. This is indicated by the appearance of the Sticky Available button.

Click the Sticky Available button.

__42. Taskmaster APT will use the fingerprint information from the previous invoice to locate and extract all the necessary data elements from this new invoice.

__43. Click the next problem icon to advance to the last invoice in our batch.

Note that this is a two-page invoice. You can move from page 1 to page 2 by clicking on the buttons labeled < and >.

__44. Note that on page 2 of the invoice, Taskmaster APT was able to continue to locate the additional line item details, even though the header information was repeated.

__45. Click the next problem icon.

We have reached the end of the batch and there are no more problems to verify. Click the Yes button to finish the batch.

Click the OK button on the notification message saying that the batch has completed verification.

__46. Refresh the Job Monitor screen by pressing the F5 key.

Note that the job is pending export. The export step is where we might integrate with an ECM repository such as IBM CM8 or FileNet P8. It's also the step where our changes to the fingerprint database are confirmed and updates are published for other users to be able to use. So double click on the batch ID and let the Export job complete.

IMPORTANT!! Don't miss this step!! If you forget to export the batch, then the fingerprint database won't get updated with the new fingerprint you created.

Congratulations!! You've completed the first Datacap Taskmaster lab. Let's move on to the next lab exercise.

1.3 Exploring the Taskmaster Web Interface

In this lab, we will take a quick look at one of the web-based user interfaces for scanning and verifying batches. One of the key strengths of Datacap Taskmaster is that ALL functions that can be executed from thick clients can also be performed from thin, web-based clients without installing any additional software (including web plugins) and without the need for remote servers. We'll go through the same APT application, but using a browser interface.

1.3.1 Processing an Invoice Batch from a Web Browser

__47. Launch the Internet Explorer browser from the tool bar at the bottom of your screen.

After IE starts up, click the link on the top toolbar to go directly to the Login screen for Datacap Taskmaster Web.

__48. The login panel will be displayed.

Ensure that the application name is APT, the user ID and password are both admin, and the station is 1.

__49. The main end user interface for Taskmaster APT Web is displayed.

There are three basic end user functions: Scan, Upload, and Verify. Click the Scan link.

__50. The list of available jobs is displayed.

Click on the link for the Web Demo job.


__51. As with the last lab exercise, instead of actual scanning, we will virtually scan existing documents that are on our hard drive.

Click the Browse button on the top right side of the screen.

__52. Browse to C:\Datacap\APT\images\Input and select the first document in that directory.

Select the first of the images and click the Open button.

Ensure that the check box is selected to tell Taskmaster that you want to virtually scan more than a single image. Setting the Expected pages value to zero means that you want to process all the images in that directory location. Click the Scan button.

__53. Thumbnails of the images are displayed, along with a confirmation message.

Click the OK button.

__54. At this point, you could examine the scanned pages in more detail, rearrange the pages, etc.

For now, let's just click the Done button to move to the next step.

__55. A confirmation message appears saying that the batch is finished.

Click the OK button.

You now have the option to click the Continue button to scan another batch. We are done with scanning, so just click the Stop button.

__56. You are returned to the main Operations screen.

At this point, all the images that we've scanned are kept on your local workstation. They are not stored on the Taskmaster server until you upload them. The advantage of this approach is that remote users can run their scanners at rated speeds without worrying about the bandwidth of their network connection. Because the upload is a separate step, it can be scheduled to run when network bandwidth is at its best.

For now, click the Upload button to store the scanned documents on the Taskmaster Server.

The documents will be transferred to the server.


Keep in mind that we are running the labs with everything on a single server. In a real environment, the web-based scanning and verification could be performed by someone hundreds, if not thousands, of miles from the actual Taskmaster server. This is why the Upload step is necessary.

Once the batch is uploaded, you'll get a confirmation message.

Click the Stop button since we have no more batches to upload.

__57. Return to your Taskmaster APT thick client.

__58. Refresh the Job Monitor window by pressing the F5 key.


Note that there is now a web job that is awaiting the Batch Profiler background process. Double click on the batch ID for that job to start the background processor.

Click the Yes button to indicate you want to execute the selected batch.
Please keep in mind that in a real production environment, there would be a separate server that would continually monitor Taskmaster for batches to execute. We are manually processing the batches just for the purposes of this lab exercise.

__59. The background process will start.


A message window will be displayed when the background processing is complete.

Click the OK button.

__60. Refresh the Job Monitor window by pressing the F5 key.

Now our job is ready for verification, so we will switch from our background processing role to the role of a remote end user who will verify the batch of invoices.

__61. Return to the Internet Explorer browser. You should be at the main Operations screen. If not, click the Operations link at the top of the screen.

Click on the Verify link.


__62. We are processing the same images that we processed in the first lab exercise.

Since we have seen all these images before, let's just quickly advance through the batch.

The icon for the next problem document is a little different on the web interface. It's the yellow arrow (pointing to the right) with the exclamation mark, as indicated above by the red arrow. Move to the next problem document.

__63. Let's keep moving, so click on the next problem link again.


Note that the same validation error comes up with the invalid date on the third invoice. Correct the date to 2/28/09 and move on to the next document.

__64. The next document is from JWS of Colorado. Note that the invoice is processed without having to click a Sticky Available button. This is because the export process that we ran at the end of Lab 1 has updated the fingerprint database for all users. Now anyone who processes an invoice from this vendor will have access to the updated fingerprint. At some point in going through these documents, select the Disp Snippet check box.

__65. This causes a larger snippet of the image to be displayed.

This function is also available in the thick client by using the Ctrl+S hot key combination. Again, anything that can be done in the thick client can also be done in the thin client!

__66. Continue using the next problem icon to move through the rest of the documents in the batch. When you've reached the end, you'll be prompted to finish the batch.

Click OK.

Click OK.


Click the Stop button to indicate you don't want to verify any more batches.

Congratulations! You've completed the Taskmaster Accounts Payable lab.


Lab 2 Quick and Easy Document Setup with Taskmaster Flex

2.1 Overview

The Taskmaster APT application is a highly specialized Accounts Payable application that is in use by hundreds of enterprises for handling large volumes of invoices. Let's step back and take a look at a more general type of application. A standard application that comes out of the box with the Taskmaster product is called Taskmaster Flex. Taskmaster Flex is a simple way to set up new document classes and then use Taskmaster's Click and Key technology to define the form layouts. Whereas with APT we were only dealing with a single document class (invoices), we will deal with multiple document classes in Flex. We will start by setting up a basic document type, then move on to examining the client interface.

2.2 Using Taskmaster Flex Manager

In this part of the lab, we'll examine the part of Taskmaster Flex that allows us to set up new document types.

__67. Let's say that we're going to start capturing emails with Datacap. Our users have asked that we add a new document type to Datacap called Email. Each email will be indexed by the Subject, To, From, and CC fields. Start the Taskmaster Flex Manager by clicking Start -> All Programs -> Datacap -> Applications -> Flex -> Taskmaster Flex Manager.


__68. The Taskmaster Flex Configuration screen opens.

There are two tabs for this configuration screen: Index Fields and Document Classes. Index fields are fields that you might want to OCR on a document, then use that information later to search a content repository. We are currently on the Index Fields tab. The list of existing fields is shown on the left. The characteristics of the selected field are shown on the right side.

__69. Click on the Document Class tab.


All defined document classes are shown in the list on the left-hand side. Selecting one of the existing document classes shows all the index fields that are used for that document class.

__70. Go back to the Index Fields tab. We will define the four index fields that we want to use for the emails.

Enter From for the Index Field Name. Select Alphanumeric from the Data Type drop down list. Enter a Min Length of 5 and a Max Length of 30. Enter Z for the Picture String. The picture string determines the list of allowable characters. A picture string of Z means any printable ASCII character is allowed. (The complete list of Picture Strings is in the Flex Quick Start manual.) Select No from the Required drop down list. Select Yes from the Find by Zone drop down list. This means that we will be defining a zone using the Click and Key method. Once the zone is defined, we will perform OCR on that field. Click the Save button when you have finished entering all the field characteristics.

__71. Now we'll define the Subject index field.


Change the Index Field Name to Subject. Leave everything else the same as before, except change the Max Length to 128. Click the Save button.

__72. Define the To index field.

Change the Index Field Name to To. Change the Max Length to 30. Click Save.

__73. Lastly, we will define the CC field.


You can use all the same parameters that you used for the previous field, except change the Index Field Name to CC. Click Save.

__74. We are done defining the index fields.

Click on the Document Class tab. Click on the Add button.


__75. Enter Email as the name of our new document class.

Click OK.

__76. Email will be added to your list of document classes.

__77. Now we'll add our index fields to the document class.

Select From from the list of Available Indexes, then click the Add button. Repeat the process with the To, Subject, and CC indexes.

Note that you can change the order of the index fields by using the Up/Down buttons.

__78. Click Save to save your changes.

A confirmation message will be displayed.

Click OK.

__79. We're done with adding our new document class. Close the Flex Configuration client.


2.3 Exploring the Taskmaster Flex Application

__80. Start the Flex Client by clicking Start -> All Programs -> Datacap -> Applications -> Flex -> Flex Client.

Enter admin for the password and click OK.

__81. Like Taskmaster APT, the Flex Client has two sections: Operations and Job Monitor. We are seeing both because we are logged on as an administrator. End users would typically only see the Operations screen.

Double-click the Scan icon to create a new batch.

__82. A list of possible job configurations is displayed.

Select the first job (Demo Job) and click the OK button.


__83. We are doing virtual scanning, which is essentially importing existing images that are on our hard drive. You'll see the progress window.

A notification message will be displayed when the scan is complete.

Click the Stop button to indicate you don't want to create another batch. (You have 10 seconds to click the Stop button before another batch is created. If this happens, just wait for the batch to be completed and you'll get this message again. Click Stop. You can delete the unneeded batches in the next step by selecting the unneeded batch in the Job Monitor and pressing the Delete key on your keyboard.)

__84. Go to the Job Monitor and refresh it by pressing F5.

Note that the job is awaiting the Rule Runner background process. Double-click the job ID (as indicated above by the red arrow) to start the background process.


Click the Yes button to confirm that you do want to execute the selected batch. The background Rule Runner process will be launched.

A notification message will be displayed when Rule Runner is complete.

Click OK.

__85. Return to the Job Monitor and refresh it by pressing F5.

Note that the job is now ready for verification. You can start the Flex verification client by double-clicking the job ID.


Alternatively, recall that you can also do it by clicking on the Verify icon in the Operations window.

__86. The first document in the batch is displayed in the Verification client.

Recall that in Flex, we are handling batches which could contain many different classes of documents. Since we have never seen this document before, Flex does not know what document class to use.

Click on the drop down list to see the available list of document classes. Select the AP Invoice document class.


__87. The left-hand side of the screen changes and displays the different attribute values associated with an AP Invoice.

__88. Note that there is a button marked Locate just to the right of the attribute fields.

Click the Locate button. Locate is a very powerful capability in Taskmaster. It can use either a specific text pattern or a keyword file to locate information on a form without knowing its physical location. An example of a specific text pattern would be something like a Social Security Number, which is always of the form nnn-nn-nnnn (where n is a numeric digit). Another way to locate information is to look for a specific piece of text (e.g. Invoice Number) and then use the data near that text as the value for our field. We can put all the possible specific text strings in something called a keyword file.
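The two Locate strategies described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Taskmaster implements it: the function names find_by_pattern and find_by_keyword, the sample keyword list, and the assumption that the value is the first token after the keyword are all hypothetical.

```python
import re

# Pattern for a Social Security Number of the form nnn-nn-nnnn.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical stand-in for a keyword file: labels that may precede the value.
INVOICE_KEYWORDS = ["Invoice Number", "Invoice No.", "Invoice #"]


def find_by_pattern(text, pattern=SSN_PATTERN):
    """Return the first substring that matches the pattern, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


def find_by_keyword(text, keywords=INVOICE_KEYWORDS):
    """Return the token that follows the first keyword found, or None."""
    for keyword in keywords:
        # Allow an optional colon and whitespace between the label and the value.
        match = re.search(re.escape(keyword) + r"\s*:?\s*(\S+)", text, re.IGNORECASE)
        if match:
            return match.group(1)
    return None


ocr_text = "Invoice Number: 48213  Taxpayer SSN 123-45-6789"
print(find_by_pattern(ocr_text))
print(find_by_keyword(ocr_text))
```

Neither function needs to know where on the page the value sits, which is the point of the Locate capability: the pattern or the nearby label identifies the data.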


__89. See how two of the fields were automatically found for us.

The date is found by looking through the OCR data for patterns of the type mm/dd/yyyy. The PO number is found through the use of a keyword file, which lists specific text strings to look for, such as Purchase Order #, PO#, Order Number, etc.

__90. Now we will handle the remaining fields.

The Vendor Name is displayed on the invoice as a graphic. As such, it cannot be OCR'd and needs to be manually entered.

__91. Tab to the Invoice Number field and click on the invoice number located in the top right-hand corner.


__92. Tab to the Terms field and then draw a box around the payment terms, which are located just above the line items.

__93. Tab to the Total field and click on the part of the image where the invoice total is located.

__94. Now click on the icon for the next problem.

__95. The next document in the batch is displayed. It's an income tax form.

Select Tax Form from the document class drop down list and then click on the Locate button.

__96. Note that Tax Forms have a different set of attributes than AP Invoices. Each document class can have its own metadata structure.


The Locate function was able to automatically find the Social Security Number since it has a well-known structure. Now we just have to use the Click and Key feature to show Flex where the Client name is found.

Tab to the Client field and draw a box around the name at the top of the tax form. You may need to zoom in on the image to make drawing the box easier for you. Click the next problem icon to move to the next document in the batch.


__97. Another invoice from a different customer is displayed. Select AP Invoice from the document class drop down list.

Click the Locate button to find as many of the fields as possible. Using the same process as you did with the first AP invoice, use the Click and Key feature to populate all the remaining fields. Then click the next problem icon to advance to the last document in our batch.

__98. The last document is an email, so select Email from the document class drop down list.

Tab to the From field and draw a box around the name of the person who sent the email (Tom Stuart).


Tab to the To field and draw a box around the name of the person who received the email (Thomas Simalchik).

Tab to the CC field and draw a box around the name of the person who was carbon copied on the email (Scott Blau).

Lastly, tab to the Subject field and draw a box around the email subject line (Taskmaster Flex).

__99. Click the next problem icon. There aren't any more documents in the batch.

Select Yes to finish the batch.


Click the OK button to confirm the completion of the verification task.

__100. We've seen how we can just click on part of a document to easily enter data for indexing purposes. We've also seen the ability to automatically locate information without even knowing its specific location. Additionally, information about the physical location of zones is maintained in the system. As a result, the next time we see the same forms (e.g. another invoice from Sloss Industries, or a 1040 tax form), Taskmaster will be able to automatically find the data without us having to click on a zone.

Congratulations! You've completed this last lab exercise for the Taskmaster Flex application.


Lab 3 Reporting with IBM Datacap Taskmaster RV2

3.1 Overview

IBM Datacap Taskmaster comes with a robust reporting tool that allows you to view a wide variety of predefined reports, create new reports, dynamically filter reports, and then export results to PDF or Excel format. You can even create a dashboard of multiple reports which refreshes itself automatically, thus giving you a real-time view of your enterprise's Datacap environment. In this lab, we will examine some of the basic reporting capabilities in IBM Datacap Taskmaster RV2. Note that, at this time, we are unable to lead you through custom report creation, as this requires a license of Microsoft Visual Studio 2008, which we're unable to distribute on VMware images.

3.2 Viewing Reports

__101. RV2 reports are viewed via a web browser, so you can see reports at any time from anywhere. Start a Firefox browser by clicking on the Firefox icon on the toolbar.

Start RV2 by clicking on the RV2 Login link on the Firefox Bookmarks toolbar.

__102. Log in to RV2 using a user ID/password of admin/admin and station 1.


__103. A list of all predefined reports is shown on the left-hand side. When you select a report, you can run it for a single application or for multiple applications. There's also the capability to filter the data in these reports. We'll examine filtering shortly. Let's take a look at some of these reports. The first report in the list is Problem Batches. Leave that report selected, select All from the list of applications, and click the Run Report button.

__104. A problem batch is any batch that has either aborted or has been in a running state for over two hours.
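The definition above amounts to a simple predicate. The following Python sketch shows the logic only; the status strings "aborted" and "running", the function name is_problem_batch, and the sample timestamps are assumptions for illustration and are not taken from the RV2 report definition.

```python
from datetime import datetime, timedelta

# Threshold from the text: a batch running for over two hours is a problem.
RUNNING_LIMIT = timedelta(hours=2)


def is_problem_batch(status, started_at, now):
    """Return True if the batch aborted or has been running for over two hours."""
    if status == "aborted":
        return True
    return status == "running" and (now - started_at) > RUNNING_LIMIT


now = datetime(2011, 8, 1, 12, 0)
print(is_problem_batch("running", datetime(2011, 8, 1, 9, 30), now))
print(is_problem_batch("running", datetime(2011, 8, 1, 11, 0), now))
```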

__105. Return to the main list of reports by clicking on the Reports link in the upper left corner.


__106. Let's look at a few more reports. Select Current Batches from the list of reports. Select All from the Application list, and click on Run Report.

This shows you a list of all batches in the system, what stage of their workflow they are in, what their status is, what the priority is, and when they were created. You can sort the report output by any of these criteria by clicking on the column headings.


__107. Go back to the main RV2 page and select the Current Stations report (select All applications and click Run Report).

This shows you a listing of all active stations, and who is logged on. Because we're running on a single server, the only station ID we've been using is number 1. Your report may look different from the screen capture; it will depend on the number of activities you have running at this time.


__108. Go back to the main RV2 screen by clicking on the Reports link. Let's take a look at the Station Activity report for all applications.

This report shows you how many batches were processed during different points in the day by each station. Again, our output is quite simple because we have only been processing a small amount of data on a single station. But you can use this kind of report to see if any of the background stations are processing batches at an unusual rate. Unusual processing rates can be an indicator of a potential problem.

__109. Go back to the main RV2 page and select the Scan Summary from the list of reports. Select All applications before you run the report.


Expand the summary detail for the admin user by clicking on the plus sign next to the user name.

This scan summary is specific to thin client scanning. You could always modify this report definition if you wanted to include thick client scanning.

__110. Take some time to review some of the predefined reports. Because of the limited amount of data on these Proof of Technology images, it's best to view the data for All the applications. This will give you a better idea of the kind of knowledge that you can get from these reports.

3.3 Creating a Filter for a Report

__111. Let's say that you want to create a filter for the Current Batches report. You don't want to actually modify the report, but just selectively filter out certain batches from your report. For example, let's say that we only want to see current batches that are on hold for some reason.


Select Current Batches from the list of reports on the main RV2 screen and then click on the Manage Filters link at the center-top of the screen.

__112. There shouldn't be any filters currently on your image for this report type.

Enter the name Held Batches as the filter name and click the Add button as shown above to add a new filter.

__113. You can use basically any column as a filter criterion.

Select Status from the field drop down list. Select Equal To as the condition. Enter Hold to complete the Search Criteria.


Click the Save button to save your filtering criteria. This will allow us to reuse the filter later.

__114. Now let's view the report.

Click the Run Report button.

Now we only see the current batches that are in Hold status. We've saved this filter so, in the future, you can select this filter when you run the report from the main RV2 screen and dynamically filter the report results as shown above.
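Conceptually, the Held Batches filter is a (field, condition, value) triple applied to each row of the report. The Python sketch below illustrates that idea under stated assumptions: the sample rows, the apply_filter function, and the dictionary representation are hypothetical, and only the Equal To condition from this exercise is modeled.

```python
# Hypothetical rows of a Current Batches report.
current_batches = [
    {"Batch": "B001", "Status": "Hold"},
    {"Batch": "B002", "Status": "Pending"},
    {"Batch": "B003", "Status": "Hold"},
]

# The filter saved in this exercise, expressed as data.
held_batches_filter = {"field": "Status", "condition": "Equal To", "value": "Hold"}


def apply_filter(rows, report_filter):
    """Return only the rows that satisfy the filter criterion."""
    if report_filter["condition"] != "Equal To":
        raise ValueError("only the Equal To condition is modeled in this sketch")
    wanted = report_filter["value"]
    return [row for row in rows if row.get(report_filter["field"]) == wanted]


for row in apply_filter(current_batches, held_batches_filter):
    print(row["Batch"], row["Status"])
```

Because the filter is stored separately from the report definition, the same report can be rerun with or without it.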

3.4 Creating a Dashboard of Reports

__115. Click on the Dashboard link in the upper left corner of the screen.


__116. We currently don't have any reports added to our dashboard.

Select a report type, such as Problem Batches, from the Report drop down list. Select All from the Application list.

__117. The selected report will automatically be displayed.

Note the shaded corner of the report window. You can click and drag this corner to size the window to any size you like. Resize the report window so that it takes up the top left quarter of the screen.


__118. Let's add another report to the dashboard. Click the Add link at the top of the screen.


__119. Another report window appears. It may actually appear on top of your existing report window. You can always move it by clicking on the blue title bar and dragging it to the desired location.

Move the new report window so that it's beside the Problem Batches report window. Select the Current Batches report for All applications. Repeat this process for the Scan Summary report (or any report that you want, for that matter) and position it wherever you like. Here's an example of what the dashboard could look like:


__120. Note that there is a drop down list at the top of the screen to set a Refresh rate.

You can set the Refresh rate to update your dashboard as often as every minute, thereby giving you a real-time view of your enterprise's Taskmaster system.


__121. Return to the main Reports screen.

__122. Select Current Batches from the list of reports and run the report.

__123. When the report completes running, note that there are links to create PDF and Excel versions of the report.

Click on the PDF link.

__124. A PDF version of the report is displayed. You can use standard PDF functions to save/print the PDF file.


Congratulations! You've completed the RV2 lab. You should now have an idea of the kind of reporting capabilities that come with the IBM Datacap Taskmaster system.


Lab 4 Datacap Studio Deep Dive

4.1 Overview of Datacap Studio

Datacap Studio is a rich development environment which allows a user to easily develop, modify, and test new Datacap applications without needing programming or development skills. A Datacap application can be thought of as the processing rules for a batch of documents. The batch can contain one or more document types, each with differing processing requirements. When creating a Datacap application, you typically start by defining the document hierarchy and then create the rulesets that are applied to different elements within the document hierarchy. Rulesets are composed of rules which are, in turn, composed of predefined actions. The document hierarchy describes the structure of the batch. It describes the different types of documents that can occur in a batch; the structure of each document type, which includes the different types of pages that can appear in a document; and the various fields that can occur on a given page.

The four elements of a document hierarchy are the batch, the documents in a batch, the pages in a document, and the fields on a page. Rules can be bound to the different elements within a document hierarchy. You may have rules that are only executed once per batch (e.g. connecting to a lookup database when the batch is opened). There may be rules that are executed once per document (for example, uploading a document to an ECM repository). You could have rules that are only executed once per page (for example, examining the page to see if it is a blank page). And finally, there could be rules that are executed once per field (for example, verifying that a field like SSN is in your customer database). You use the actions (which are reusable and located in the Actions library) to create functions. A function can be thought of as a group of actions that work together to perform a specific task. We will look at rulesets, rules, functions, and actions in more detail shortly. Let's take a quick look at Datacap Studio.
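To make the four-level hierarchy concrete, here is a minimal Python sketch that models a batch and counts how many times a rule bound to each level would execute. It is a teaching model only, not Datacap's data structures: the classes Batch, Document, and Page, the count_rule_runs function, and the sample batch contents are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    page_type: str
    fields: List[str] = field(default_factory=list)


@dataclass
class Document:
    doc_type: str
    pages: List[Page] = field(default_factory=list)


@dataclass
class Batch:
    name: str
    documents: List[Document] = field(default_factory=list)


def count_rule_runs(batch):
    """Count how often a rule bound to each hierarchy level would execute."""
    pages = [page for doc in batch.documents for page in doc.pages]
    return {
        "batch": 1,                        # once per batch
        "document": len(batch.documents),  # once per document
        "page": len(pages),                # once per page
        "field": sum(len(page.fields) for page in pages),  # once per field
    }


# Hypothetical sample batch: two documents, three pages, two fields.
sample = Batch("APT", [
    Document("Invoice", [
        Page("Main Page", ["Vendor Number", "Invoice Total"]),
        Page("Trailing Page"),
    ]),
    Document("Other", [Page("Other")]),
])
print(count_rule_runs(sample))
```

Binding a rule at the right level keeps expensive work, such as a database connection, from being repeated for every page or field.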


Note that there are three tabs at the top of the Datacap Studio interface.

1. Rulemanager: this is where the main configuration for your application is done.
2. Zones: this is where you identify any fingerprints that you might use for classification purposes, as well as any zones you might want to set up for OCR/ICR.
3. Test: this is a test environment for your application.

We'll focus for the moment on the Rulemanager page. Notice how the Rulemanager page itself is divided into three main sections. Looking from left to right are the following:

1. On the far left is the Document hierarchy. This is where the structure of the batch is defined. The illustration above is for the APT application, which does invoice processing. The batch (called APT, meaning Accounts Payable Transactions) is made up of three possible document types: Invoices, Separator Pages, and Other (a catch-all for anything that cannot be classified). The Invoice document type can have a Main Page, a Trailing Page, an Attachment Separator page, and the Attachment itself. On the Main Page are many fields, such as the Vendor Number and Invoice Total.
2. The middle section is where all of our rulesets are. A ruleset is made up of one or more rules. Rules are bound to different elements within the document hierarchy. For example, we might have a VScan rule which controls the scanning of the batch. You would use PageID and ImageFix rules on individual pages. You might use an Export rule to export invoice data to a line of business database.


3. The far right is where the Action library and Task profiles are managed. The Action library (shown in the above illustration) is where all the reusable actions that come with Taskmaster are organized. You simply click on actions to make use of them when creating your rules. The Task profile (not shown in the above illustration) describes the order in which rules are applied to the document hierarchy. For example, you would want to run Scanning rules first, before running PageID, which would have to run before you could run Recognition rules.

We will examine all parts of Datacap Studio in more detail as we go through this lab exercise.
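The ordering constraint that a Task profile expresses can be sketched as a list of tasks plus a check that each task runs only after the tasks it depends on. This Python fragment is an illustration only: the task names follow the examples in the text, while the PREREQUISITES table and the order_is_valid function are hypothetical and do not reflect how Taskmaster stores a Task profile.

```python
# Order in which tasks run; names follow the examples in the text.
TASK_PROFILE = ["VScan", "PageID", "Recognition", "Export"]

# Hypothetical dependencies: each task lists the tasks that must run before it.
PREREQUISITES = {
    "PageID": ["VScan"],
    "Recognition": ["PageID"],
    "Export": ["Recognition"],
}


def order_is_valid(profile, prerequisites):
    """Return True if every task appears after all of its prerequisites."""
    seen = set()
    for task in profile:
        if any(needed not in seen for needed in prerequisites.get(task, [])):
            return False
        seen.add(task)
    return True


print(order_is_valid(TASK_PROFILE, PREREQUISITES))
print(order_is_valid(["PageID", "VScan", "Recognition"], PREREQUISITES))
```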

4.2 Rulesets, Rules, Functions, and Actions

The key to Datacap Taskmaster is the Rules paradigm. It is a unique method for configuring Capture applications. It stresses efficient reusability and reduces, if not outright eliminates, the need to do any custom scripting. We will take a moment to examine this important aspect of Taskmaster configuration. We've introduced the concept of rulesets, rules, functions, and actions. Let's look at these more closely.

Actions
Actions are our most basic elements and they perform very specific tasks. An action may perform OCR, connect to a database, or return information about a field. Actions can return information (e.g. the results of a SQL call) but all actions will return a Boolean value (true or false) indicating the success of the action. An example of an action is PDFDocumentToImage. This action takes a PDF document and converts it to a multipage TIFF image. The action returns true if the conversion completes successfully.

Functions
Functions are groups of actions. The actions within a function are executed in sequence until one action returns false. If all actions return true, then the function returns true. Let's look at an example. Let's say we've created a function that will be used to determine if a field is a proper ZIP code. The function could look like this:

Function: Is_5_Digit_ZIP_Code
    IsFieldPercentNumeric(100)
    MinimumLength(5)
    MaximumLength(5)

The first action returns true if 100% of the characters in the field are numeric. The second action returns true if the field is at least 5 characters long. The third action returns true if the field is at most 5 characters long. The actions are executed one after the other as long as all actions return true. So if the field value is 28010, then all three actions will return true and the function will return true. But let's say the field value was 28O1O (with capital letter O's instead of zeros). Then the first action would return false. The remainder of the actions would not be executed, and the entire function would return false.
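The short-circuit behavior of a function can be sketched in Python as follows. This is an analogy, not Datacap's action library: the Python function names only mirror the action names in the example, and the assumption that IsFieldPercentNumeric(n) means "at least n percent of the characters are digits" is inferred from the text.

```python
def is_field_percent_numeric(value, percent):
    """True if at least `percent` percent of the characters are digits (assumed meaning)."""
    if not value:
        return False
    digits = sum(ch.isdigit() for ch in value)
    return digits * 100 >= percent * len(value)


def minimum_length(value, length):
    return len(value) >= length


def maximum_length(value, length):
    return len(value) <= length


def run_function(actions, value):
    """Run the actions in order and stop at the first one that returns False."""
    # all() stops evaluating the generator as soon as one action fails.
    return all(action(value) for action in actions)


IS_5_DIGIT_ZIP_CODE = [
    lambda value: is_field_percent_numeric(value, 100),
    lambda value: minimum_length(value, 5),
    lambda value: maximum_length(value, 5),
]

print(run_function(IS_5_DIGIT_ZIP_CODE, "28010"))
print(run_function(IS_5_DIGIT_ZIP_CODE, "28O1O"))
```

For the value 28O1O, the first action fails, so the two length checks are never evaluated, matching the behavior described above.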


Rules
Rules are a collection of functions. One big difference between rules and functions is that functions execute actions until an action returns false, whereas rules execute functions until a function returns true. Those familiar with programming can think of the logic associated with functions as being equivalent to a series of logical AND conditions. The logic associated with rules is equivalent to a series of logical OR conditions. Let's use an example to see why and how this works. Let's build on our ZIP code example. We could have a rule that looks like the following:

Rule: Is_A_Valid_ZIP_Code
    Function1: Is_5_Digit_ZIP_Code
        IsFieldPercentNumeric(100)
        MinimumLength(5)
        MaximumLength(5)
    Function2: Is_9_Digit_ZIP_Code
        IsFieldPercentNumeric(90)
        MinimumLength(10)
        MaximumLength(10)

The second function returns true if 9 out of the 10 characters in the field are numeric and the field is exactly 10 characters long. This means a field value of something like 28010-8990 would return true. We would continue testing the field against possible ZIP code conditions until one of them returns true. There's no reason to continue testing the ZIP code if the first function returns true, so we stop.
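The OR logic of a rule can be sketched the same way. In this Python illustration each function is an AND of its checks and the rule is an OR over its functions; the helper names are hypothetical, and treating the percentage as a minimum is an assumption drawn from the example.

```python
def percent_numeric(value):
    """Percentage of characters in the value that are digits."""
    if not value:
        return 0.0
    return 100 * sum(ch.isdigit() for ch in value) / len(value)


def is_5_digit_zip_code(value):
    # Function 1: every check must pass (logical AND).
    return percent_numeric(value) >= 100 and len(value) == 5


def is_9_digit_zip_code(value):
    # Function 2: 9 of 10 characters numeric, exactly 10 characters long.
    return percent_numeric(value) >= 90 and len(value) == 10


def run_rule(functions, value):
    """Try each function in order and stop at the first one that returns True."""
    # any() stops evaluating as soon as one function succeeds (logical OR).
    return any(function(value) for function in functions)


IS_A_VALID_ZIP_CODE = [is_5_digit_zip_code, is_9_digit_zip_code]

for candidate in ["28010", "28010-8990", "2801"]:
    print(candidate, run_rule(IS_A_VALID_ZIP_CODE, candidate))
```

Note that this sketch would also accept a 10-character value with the hyphen in the wrong position; a stricter check would need an additional action, which the example in the text does not include.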

Rulesets
Rulesets are groups of related rules. For example, you might associate all the rules that validate your OCR results into a single ruleset. An example of this might look like the following:

Ruleset: Validations
    Rule1: Is_A_Valid_ZIP_Code
    Rule2: Is_Date_Valid
    Rule3: Customer_Number_In_Database

As we've noted before, rules are linked to different elements within the document hierarchy. Rulesets are linked to tasks in the task profile. We'll examine this in more detail when we look into workflow.


4.3

Lab Scenario

The scenario we will follow in our lab is a simple one. As part of a larger financing application, we want to verify an applicants income. We will do this by requesting the applicant submit a Verification of Employment form and a copy of their most recent W2 statement. This will then be used by our line of business application to determine if the applicants income meets our requirements. We will need to be able to automatically classify these two document types and the extract some data from them. Because we are doing a short lab, there are some elements of a production configuration that we will not concern ourselves with. The main idea is to understand the capabilities of Datacap Studio in setting up a new application, and the ease with which these capabilities can be used.

4.4 Starting the Datacap Studio Application Wizard

The first step in creating a new application is using the Datacap Studio Application Wizard to create a skeleton application. The wizard will create a simple document hierarchy, define some commonly used rulesets, and bind those rulesets to the typical parts of the document hierarchy. In this part of the lab, we'll create our skeleton application and take a look at what the wizard has defined for us.

__125. Start Datacap Studio by clicking Start -> All Programs -> Datacap -> Datacap Studio -> Datacap Studio.


__126. Datacap Studio will ask if you want to connect to an existing application. We are creating a new application so just click on the Close button.

__127. Datacap Studio opens with an empty application area. Look in the top right corner of the window. There are some icons there, including one for the Application Wizard.

Click on the icon immediately to the right of Settings as shown above by the red arrow.


__128. The Application Wizard opens.

Click Next.

__129. The wizard gives you the option of creating a new application, copying an existing one, or converting one from a previous version of Taskmaster.

Select the option to Create a new RRS application and click Next


__130. Enter Income_Verification as the name of the new application. Leave the directory locations for the next two fields as C:\Datacap

Click Next.

__131. At this point, the Application Wizard allows you to enter some preliminary information about your application.

We will define the document hierarchy later so just click Next


__132. The Application Wizard asks if you want to define any fingerprints.

Again, click Next to skip this step. We'll do it later.

__133. The Application Wizard asks if you want to enter any sample images.

Again, click Next to skip this step


__134. The Application Wizard is done collecting information about the new application.

Click the Finish button to have the skeleton application created.

__135. A status window appears when the Application Wizard has created the skeleton application.

It creates a folder for the application with the Datacap directory structure. It also creates a folder for all batches in process, as well as a folder where all process information is stored. Databases containing application and batch information are created (by default, they are created as Microsoft Access databases, but you can replace these with Microsoft SQL Server or Oracle databases for production purposes). Click the Close button to shut down the wizard.


__136. Now we will connect to our newly created application.

Click on the Connection icon as shown above by the red arrow.

__137. The list of applications is presented.

Select the Income_Verification application and click Next.


__138. You will need to log on to Taskmaster to update the application.

Enter a password of admin and click on Finish.


__139. The application is presented. Let us take a look at what is included in the skeleton application. Expand all elements of the Document hierarchy tab on the left side of the screen.

A batch called Income_Verification has been created. Under it is a default document type called Document. This document type is a generic document that includes a single field called Field. There is a default page type called Other, which is assigned to all pages until they have gone through some classification process. You'll notice that there are Open and Close sections for each batch, document, page and field. Rules can be executed when a batch/document/field is opened or closed for a particular task (more on tasks later). In the skeleton application, the VScan, Create Docs, Set Export Params, Set Fingerprint Params, Batch Document Integrity Check, and ImageFix Load Settings rules are all associated at the batch level. Many of these rules have to do with setting global parameters that are used for all content within the batch.


__140. Now let's focus on the other side of the Datacap Studio window. Look on the far right side. The Actions library tab is displayed. Click on the Task profiles tab.

These are the tasks that constitute the workflow for our skeleton application. New tasks can be created and added to the profile if necessary. The Application Wizard automatically creates basic tasks that virtually all applications use.

__141. Expand all the tasks in the Task profiles and let's take a look at the specific rulesets that make up the tasks.

Each task is linked to one or more rulesets. Note that the same ruleset can appear in multiple tasks. For example, the Validate ruleset appears in both the Rulerunner task (which is a background task) as well as the Verify task (which is an interactive task that is driven by a user interface). This is because you may have to perform some of the validation rules in the background while others may have to wait until the user actually types in information on the indexing screen.
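The task-to-ruleset relationship can be pictured as a simple mapping. The sketch below is illustrative only: it lists just the rulesets that this lab names for each task, so the lists are deliberately partial, and the helper tasks_using is a hypothetical name rather than anything provided by Taskmaster.

```python
from typing import Dict, List

# Partial task profiles: only rulesets explicitly named in this lab are listed.
TASK_PROFILES: Dict[str, List[str]] = {
    "PageID": ["ImageFix", "PageID"],
    "Rulerunner": ["CreateDocs", "Validate"],
    "Verify": ["Validate"],
}


def tasks_using(ruleset: str) -> List[str]:
    """Return the tasks whose profile includes the given ruleset."""
    return [task for task, rulesets in TASK_PROFILES.items() if ruleset in rulesets]
```

Calling tasks_using("Validate") returns both the background Rulerunner task and the interactive Verify task, which mirrors the point that one ruleset can be shared by several tasks.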


__142. Now let's examine the rules in the rulesets. These are in the middle section of the Studio. Locate the VScan ruleset and expand all the elements within it.

The VScan ruleset has only one rule in it, which is also called VScan (VScan stands for virtual scan). This rule scans all the documents in a directory to create a batch of documents. You can see the actions that are used to create the function. The directory location to be monitored is set, as is the maximum number of files to be scanned. Then the scan actually occurs and the batch is created.

__143. Expand the elements of the ImageFix ruleset.

There are two rules here. The first one (ImageFixLoadSettings) is bound at the batch level and is executed once per batch. It tells Taskmaster which image enhancement settings to use. The second rule is bound at the page level and executed on every page with a page type of Other. This rule applies the image enhancement settings and saves the original page with an extension of tio, which means tiff original. Therefore, the original image can be maintained, if necessary.

Note: As an aside, associating the image enhancement rules with the Other page type ensures that every page gets enhanced. However, let's say that only certain pages need to be enhanced (perhaps for OCR purposes). You could bind the enhancement rule only to those page types. This would reduce the amount of processing that Taskmaster would have to do.


__144. Now expand the elements of the PageID ruleset.

The SetFingerprintParams rule is bound to the batch and executed when the batch is first created. It sets the parameters for use of fingerprint matching as the page identification method. The location of the fingerprint database is specified, the area of the image to limit the fingerprint search is defined, and the minimum classification confidence level of 70% is set. The PageID rule is bound to the page level. Every page is initially classified as page type Other. All Other pages execute this rule to see if a valid fingerprint can be found to help identify the correct page type. The first action analyzes the image, and the second action uses the analysis information to search the fingerprint database.

__145. The skeleton application has a very basic Recognize ruleset associated with it. Expand the Recognize ruleset.

This is just a general purpose recognition ruleset. The ReadZones action looks at the fingerprint for the image and determines the physical locations of all the zones that are associated with the page. Then the RecognizePageFieldsOCR_S action uses the ScanSoft OCR engine to attempt to read the data in those zones.


__146. Expand the rules in the Export ruleset.

The Set Export Params rule is executed at the batch level. It sets the path for the file that will contain all the metadata for each document, and writes some basic header information. The Export Page Fields rule is executed at the page level and writes out all the information for the page. Now that we've examined the skeleton application that is created by the Application Wizard, we can start creating our own application.

4.5 Setting Up the Document Hierarchy

In this part of the lab, we will examine our sample documents and set up the document hierarchy for the documents. Recall that in our lab scenario, we are receiving documents used to verify a customer's income. These documents consist of a W-2 tax form, and a standard Verification of Employment letter that is filled out by the customer's employer.

__147. Let's examine our sample documents. Open Windows Explorer and navigate to C:\Sample Images\Technical Deep Dive.


__148. Select any one of the W2 images and take a quick look at it by double clicking on the file.

We will use OCR technology to extract the following fields from the W2 (shown highlighted):

Borrower
Employer Identification Number
Income
Borrower's Social Security Number


__149. Choose one of the Verification letters and take a quick look at it by double clicking on the file.

We will be extracting the following fields:

Employer
Lender
Borrower
YTD Income
Past year's income

Close the image.


__150. Now we will start defining the document structure within Datacap Studio. We will need to lock the Document hierarchy in order to modify it. This is a protection mechanism designed to ensure no one makes accidental modifications.

Click on the icon resembling a lock at the top of the document hierarchy.

__151. As we previously noted, the batch is called Income_Verification and there are two objects that the Application Wizard has created for us. There is the default page type of Other and a single default document called Document. We will rename the default document type.

Expand the hierarchy if necessary and single click on the document until it changes appearance as shown above. This means that you can update the name. Change the name of the document from Document to W2.

__152. The Application Wizard has created a single default page for this document. We will rename it also.

Again, single click on the page name and change it from Page to W2_Main_Page.


__153. Expand the W2_Main_Page. We will now rename the default field that the Application Wizard created.

Single click on the default field and change its name to Borrower.

__154. Three additional fields will be needed for this page.

Right click on W2_Main_Page and select Add multiple -> Fields. Enter 3 for the number of fields you want to add, then press Enter.


__155. Note that three new fields have been added, named Field1, Field2, and Field3.

__156. Use the same process that you used to change the default field name to Borrower to change the name of the new fields to EIN, Income, and SSN. The hierarchy should look as follows:


__157. The Application Wizard creates one default document but our application will need two document types. So now we will add a second document.

Right click on the batch name (Income_Verification) and select Add -> Document. A new document called Document1 will be added to your hierarchy.

__158. Single click on the new document name and change it to Verification_Letter.


__159. There aren't any pages yet in our new document. The Verification Letter is a single page document so we need to add one page to this document. Right click on the Verification_Letter document type and click Add -> Page.

__160. A new page will be displayed called Page1 (you may need to expand the hierarchy).

Single click on the page name and change it to Verification_Letter_Main_Page.


__161. The main page of the Verification Letter has 5 fields associated with it. Right click on the page name and click Add multiple -> Fields -> 5.

__162. Change the name of the first new field to Borrower. You will see a message box appear telling you that the field name already exists somewhere else.

You can choose to use the properties of the existing field called Borrower. This means that any rules and properties associated with the existing field will be used for this new field. Click Yes.


__163. Change the name of the remaining fields to Employer, Lender, Income_Last, and Income_YTD. Your document hierarchy should now look like this.

Take the time to confirm that everything matches the above picture.


__164. Now we are going to define some properties that determine the document integrity of our documents. Document integrity refers to whether a particular object is mandatory, the maximum and minimum number of those objects, and the order that they appear in. Document integrity applies to the document, page, and field level. There are three properties called MAX, MIN, and ORDER that determine document integrity. Right click on the W2 document type and select Manage Variables

__165. A small dialog box appears. The section on Object general information should already be expanded for you.

Leave the variables as they are. Saying that the maximum and minimum number of documents is 0 means that the documents are completely optional within a batch. In other words, you could scan a batch of Income_Verification documents without having any W2s in the batch. Click Done in the upper right corner.


__166. Right click on the W2_Main_Page page type and select Manage Variables. Note that the Max, Min, and Order values are 1.

This means that if a W2 document is scanned in, you must have 1 and only 1 main page (and obviously, it's the first page in the document). In other words, you are saying that the W2 is a single page document. Click Done.

__167. Right click on the Borrower field type and select Manage Variables. Change the Max and Min values to 1.

This means that there can only be one occurrence of the Borrower field on a single W2. There isn't an order to the fields. Click Done.

__168. Repeat the above step for all the remaining fields in the W2 document type. They should all have Max and Min values of 1 and an Order of 0.

__169. Change the Max, Min, and Order variables for the Verification_Letter_Main_Page so that they are equal to 1 (in other words, the Verification Letter is a single page document).

__170. Change the settings for each of the fields in the Verification_Letter_Main_Page. They should all have Max and Min values of 1 and an Order of 0.
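To make the effect of the MAX, MIN, and ORDER settings concrete, here is a minimal sketch of how such a check could behave. It is not Taskmaster's implementation: the Integrity class and check_integrity function are hypothetical names, and treating a value of 0 as "no constraint" is an assumption based on the description of optional W2 documents above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Integrity:
    minimum: int = 0  # 0 is treated as "no minimum" (assumption)
    maximum: int = 0  # 0 is treated as "no maximum" (assumption)
    order: int = 0    # 0 is treated as "no required position" (assumption)


def check_integrity(found: List[str], rules: Dict[str, Integrity]) -> List[str]:
    """Compare the object types found (in order) against the configured limits."""
    problems = []
    for name, rule in rules.items():
        count = found.count(name)
        if rule.minimum and count < rule.minimum:
            problems.append(f"{name}: found {count}, need at least {rule.minimum}")
        if rule.maximum and count > rule.maximum:
            problems.append(f"{name}: found {count}, allowed at most {rule.maximum}")
        if rule.order and count and found.index(name) != rule.order - 1:
            problems.append(f"{name}: expected at position {rule.order}")
    return problems


# Page-level settings for the W2 document as configured in this lab.
W2_PAGES = {"W2_Main_Page": Integrity(minimum=1, maximum=1, order=1)}
```

For example, check_integrity(["W2_Main_Page"], W2_PAGES) reports no problems, while an empty page list reports a missing main page.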


__171. It's time to save and unlock the document hierarchy.

__i. Click on the diskette icon to save your changes.
__ii. Click on the lock icon to unlock the DCO.

We have finished defining the document hierarchy.

4.6 Configuring Scanning, Page Identification, and Field Extraction

We've completed defining the structure of our batch, including the types of documents that will comprise the batch, and the page/field characteristics of each document. Now we're ready to set up scanning, and then configure how we'll identify our documents.

__172. First we'll make a few minor changes to our VScan ruleset. In order to do so, we have to lock the ruleset.

Click on the VScan ruleset name (shown highlighted above) and click on the lock icon as indicated by the arrow.

__173. Expand the VScan ruleset, as well as all its rules and functions. The first action, SetSourceDirectory, tells Taskmaster where to look for images during the virtual scan process. There is a registry variable called vscanimagedir whose value is C:\Datacap\Income_Verification\images. Copy all the sample images from C:\Sample Images\Technical Deep Dive to C:\Datacap\Income_Verification\images.
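If you prefer to script the copy rather than use Windows Explorer, a short Python sketch such as the following would do the same job. The function name copy_sample_images is hypothetical; the two directories in the comment are the ones named in the step above.

```python
import shutil
from pathlib import Path
from typing import List


def copy_sample_images(source: Path, target: Path) -> List[str]:
    """Copy every file in `source` into `target` and return the copied file names."""
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.iterdir()):
        if path.is_file():  # skip any subdirectories
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied


# Example call for this lab (run on the lab machine):
# copy_sample_images(Path(r"C:\Sample Images\Technical Deep Dive"),
#                    Path(r"C:\Datacap\Income_Verification\images"))
```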


__174. Note that the SetMaxImageFiles action is set to 4. This means the maximum size of a batch is only 4 pages. This is pretty small, so let's change this to a larger value.

__i. Click on the SetMaxImageFiles action. Look at the right side of the window. Near the bottom is the Properties tab. This is where we enter the parameters for the actions.

__ii. Change the parameter from 4 to 20. Press the Enter key to make your change effective. The action should now look as follows:


__175. Save your changes.

__i. Click on the diskette icon to save your changes.
__ii. Click on the lock icon. A drop down appears. Select Publish ruleset. This makes your changes available to everyone.

__176. Now let's test the scanning function. Recall that there is an integrated test environment built into Datacap Studio.

Click on the Test tab in the top left corner of the screen.

__177. This will take you to the test environment. Look at the Workflow section, which is in the top left corner.


By default, the Application Wizard creates three different workflows. We will focus on the Main Job. The rest of the workflows are best left for a more detailed workflow discussion. The Main Job consists of 5 tasks: VScan, PageID, Rulerunner, Verify, and Export. Note that these are the same tasks that were listed in the Task profiles back on the initial Datacap Studio page.

__178. Right click on the VScan task and select New.

__179. A new batch is created for us, but we have yet to run any of the rules associated with the VScan task.

Click the green arrow as indicated above. This will start the VScan task.

__180. The task should complete quite quickly and a message box will appear.

Click on the button to Advance the batch to the next step.


__181. Look at the Runtime batch hierarchy, which is in the middle of the left side of the screen.

This shows you the status of the current batch. Note that six pages have been scanned in. They are internally named TMnnnnnn where nnnnnn is a sequential six digit number. Next to the image name is the page type. Recall that the default page type for newly scanned images is Other. This will change when we run the PageID task. You can click on any of the pages in the batch and the associated image will appear in the middle of the Studio.

__182. We are done with this batch for now. Let's just cancel it.

Right click on the batch in the Workflow window and select Cancel.


__183. There's more to the batch than just the image files. Open Windows Explorer and navigate to C:\Datacap\Income_Verification\batches. There will be a subdirectory in there for your new batch. The name of the subdirectory will be based on the date and time that the batch was created. Open the batch directory.

You will see your six image files. There is also a log and an XML file. Logs and XML files are created for each task in the Task profiles.

__184. Open the VScan.xml file (double clicking on it should automatically open it within Internet Explorer; respond Yes to any messages asking if you want to allow scripts to run).

A lot of good diagnostic information is stored in these XML files. We have only scanned the batch so there's not a lot at this point. But it shows you the original source image filename as well as the current page type and status. Close the XML file.
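For a quick look at one of these files without a browser, a generic walk over the XML with Python's standard library is enough. This sketch makes no assumption about the element or attribute names Taskmaster writes; it simply lists whatever is in the file. The function name list_elements is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def list_elements(xml_path: str) -> List[Tuple[str, Dict[str, str]]]:
    """Return (tag, attributes) for every element in the file, in document order."""
    root = ET.parse(xml_path).getroot()
    return [(element.tag, dict(element.attrib)) for element in root.iter()]


# Example: point it at the VScan.xml file inside your batch directory, then
# print each entry to see which tags and attributes are present.
# for tag, attributes in list_elements("VScan.xml"):
#     print(tag, attributes)
```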


__185. Return to the Rulemanager tab in Datacap Studio.

__186. Expand the PageID section of the Task profiles (on the right side of the screen).

There are two rulesets that make up the PageID task: ImageFix and PageID.

__187. Go to the Rulesets section of the Rulemanager page (in the middle) and expand all of the elements associated with the ImageFix ruleset.

By default, the Application Wizard bound some of these rules to parts of the document hierarchy when it built the skeleton application. Let's see where they were bound.


__188. Instead of looking through the Document hierarchy to see where the rules were bound, we can simply sync the rule to the hierarchy. Let's see how to do this.

__i. Click on the ImageFix Load Settings rule.
__ii. Look at the bar that separates the Rulesets tab from the Document hierarchy tab. There are arrows pointing to the left and to the right. Click on the arrow pointing toward the Document hierarchy.

The Document hierarchy will automatically change its view and show you where the ImageFix Load Settings rule is bound. Notice that the ImageFix Load Settings is bound at the Batch level. In other words, when the batch is created (and after the images are scanned), the settings for image enhancement are loaded.


__189. Now click on the Enhance Image rule and select the sync views arrow that points towards the Document hierarchy.

All images initially come in with a page type of Other. This is why we bind the Enhance Image rule to the Other page type. This way, all images will get enhanced as soon as the ImageFix ruleset runs. We'll see this when we run our test of this task.

__190. Now let's take a look at the PageID ruleset. Expand all elements within the PageID ruleset.

There are two rules in this ruleset. The Set Fingerprint Params rule tells Taskmaster where the fingerprint directory is stored and what area of the image to search when looking for the fingerprint, and it sets the minimum page identification confidence level. The PageID rule actually does the analysis of the image and looks it up in the fingerprint database.
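The role of the minimum confidence level can be illustrated with a toy matcher. This is a stand-in, not Taskmaster's fingerprint algorithm: the feature vectors, the similarity measure, and the identify_page function are all hypothetical, and only the 70% threshold and the fallback to page type Other come from this lab.

```python
from typing import Dict, Sequence, Tuple

MIN_CONFIDENCE = 0.70  # the 70% minimum confidence set by Set Fingerprint Params


def similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Toy measure: fraction of positions where two feature vectors agree."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def identify_page(features: Sequence[int],
                  fingerprints: Dict[str, Sequence[int]]) -> Tuple[str, float]:
    """Return the best matching page type, or Other if no match is confident enough."""
    best_type, best_score = "Other", 0.0
    for page_type, fingerprint in fingerprints.items():
        score = similarity(features, fingerprint)
        if score > best_score:
            best_type, best_score = page_type, score
    if best_score < MIN_CONFIDENCE:
        return "Other", best_score
    return best_type, best_score
```

A page that matches no stored fingerprint at 70% or better keeps the page type Other, which is why unidentified pages remain visible as Other in the batch.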


__191. Let's examine where these rules are bound to the document hierarchy. Select the Set Fingerprint Params rule and then click the sync views arrow pointing towards the Document hierarchy.

This rule is bound at the batch level, which makes sense since you are setting parameters that will apply to page identification for the entire batch.

__192. Now click on the Page ID rule, and then click on the sync views arrow pointing towards the Document hierarchy to see where it's bound.

This rule is bound at the page level for the Other page type. Recall that all pages are set as Other when they initially are scanned. So binding the page identification rule to the Other page type ensures all images attempt to use fingerprint identification.


__193. One nice feature of Datacap Studio is that all the documentation for the Taskmaster actions is contained online. You can click on an action in the action library and retrieve the information for it. However, the library contains literally hundreds of actions and searching through the library could be time consuming. So there's another Sync Views button which can help us. Let's say that we wanted to know the details behind the AnalyzeImage action.

__i. Click on the AnalyzeImage action from the Rulesets part of Studio.
__ii. Click on the Actions library tab on the right side of Studio. On the bar separating the Rulesets tab from the Actions library is another set of sync view arrows. Click on the sync view arrow that points towards the Actions library.

Notice how the Recog_Shared action set is expanded and the Analyze Image action is highlighted.

__194. Right click on the Analyze Image action in the Actions library and select Information.

The online documentation for the action appears. Keep this in mind as you go through the rest of this lab. Close the Information window.


__195. We've reviewed all the rules for the ImageFix and PageID rulesets. No changes were needed. Let's set up some fingerprints so we can test the configuration.

Click on the Zones tab.

__196. The top left corner of the Zones tab is the Fingerprint section. Since this is a brand new application, there won't be any fingerprints defined yet. We'll add some. Remember that there can be many different fingerprints for a single document type (e.g. a form may change from year to year but it still contains the same essential information). So we will first create a fingerprint class which represents all the fingerprints for a given document type.

Right click on <New> and select Add fingerprint class.

__197. Enter W2 as the name for the new fingerprint class.

Click OK.


__198. Right click on <New> and select Add fingerprint class. Enter Verification as the name for the second fingerprint class.

Click OK. The Fingerprint section of your Zones tab should look like this:

__199. Now we will add the individual fingerprints. Right-click on the W2 fingerprint class and select Add fingerprint


__200. When you add a fingerprint, you are essentially adding a sample image. Taskmaster will analyze the properties of the image and store those properties in the fingerprint database.

Navigate to C:\Sample Images\Technical Deep Dive and select one of the W2 images.

__201. You will be asked if you want to enhance the image using the default image enhancement settings.

Select Yes.

__202. The Image Enhancement window opens. There will be two images (the original on the left, the enhanced version on the right). You can also review all the image enhancement settings along the right side of the window. Maximizing the window will make things easier to see. Scroll through the image enhancement settings. Note that there is a section for line removal.

We will see line removal when we enhance the image.


__203. Enhancement has NOT been run yet, which is why the two images look the same.

Click the green arrow, which will actually run the image enhancement.

__204. Note how the image on the right changes. Notably, all the lines are removed. This will enhance recognition capabilities, especially when the characters are very close to the lines.


__205. Recall that in the Task profiles, we run the ImageFix ruleset prior to running the PageID ruleset. This means that page identification will be run on the enhanced version of the image. Therefore we want to save the enhanced version of the image as the fingerprint. (Go back to the Rulemanager tab if you need to review the order of the rulesets).

Click on the diskette icon and select Save image.

__206. You'll get a message saying that the image has been saved.

Click OK.

__207. Close the image enhancement window.

__208. The W2 section of the Fingerprints tab should look like the following:


Make sure the fingerprint itself is selected (as shown above).

__209. Now we have to associate the fingerprint with an actual page type.

From the Type pulldown, select W2_Main_Page.

__210. Ensure that the fingerprint now looks like the following:

__211. Now we will define the fingerprint for the Verification Letter.

Make sure the Verification fingerprint class is selected. Right click and select Add fingerprint


__212. Navigate to C:\Sample Images\Technical Deep Dive and select any of the verification images.

Click Open.

__213. We will repeat the process that we did for the W2.

Click Yes.


__214. Click the green arrow to run image enhancement.

__215. Click on the diskette icon and select Save image.


Click OK

Close the image enhancement window.

__216. Ensure that the new fingerprint is selected.

Select Verification_Letter_Main_Page from the Type dropdown.

__217. Now the Verification fingerprint should look like the following:


__218. The fingerprints have been defined. While we're here, we'll also identify the zones and associate them with the different fields.

Expand the Document hierarchy section of the Zones page.

__i. Click on W2_Main_Page in the Document hierarchy.
__ii. Click on the W2_Main_Page fingerprint from the list of fingerprints above the Document hierarchy.

The sample image should be displayed on the right side of the screen.


__219. Lock the W2_Main_Page hierarchy by clicking on the lock icon so that you can update it.

__220. Now we're going to define the actual zones for each of the fields.

Click on the Borrower field in the hierarchy.

__221. Draw a box around the section of the W2 that contains the borrower's name and address.


__222. Click on the EIN field in the hierarchy

__223. Draw a box around the part of the W2 that contains the employer's identification number.

__224. Click on the Income field in the hierarchy

__225. Draw a box around the part of the W2 that contains the income amount for the borrower.


__226. Click on the SSN field in the hierarchy.

__227. Draw a box around the part of the W2 that contains the borrower's SSN.


__228. Now we will define the zones for the Verification Letter.

__i. Select the Verification_Letter in the hierarchy.
__ii. Select the Verification_Letter_Main_Page from the list of fingerprints.

__229. Expand the Verification_Letter_Main_Page and click on the Borrower field in the hierarchy.

Draw a box around the part of the Verification Letter that contains the borrower's name.


__230. Click on the Lender field in the hierarchy.

Draw a box around the part of the Verification Letter that contains the lender's name and address.

__231. Click on the Income_Last field in the hierarchy. Draw a box around the part of the Verification Letter that contains last year's income.

__232. Click on the Income_YTD field in the hierarchy. Draw a box around the part of the Verification Letter that contains the year to date income.


__233. Click on the Employer field in the hierarchy. Draw a box around the part of the Verification Letter that contains the employer's name and address.

__234. We are done defining zones. Let's save our work.

__i. Click on the diskette icon to save your changes.
__ii. Click on the lock icon to unlock the hierarchy.

4.7 Creating the Documents and Fields

Let's recap. Up to now, we've done the following:

Defined the document hierarchy
Created fingerprints and associated them with our two page types
Created zones for each of the fingerprints and associated them with the fields for each page type


One thing that we haven't done yet is examine how the documents are constructed from the individual pages that have been scanned and identified. That is our next step.

__235. Return to the Rulemanager tab in Datacap Studio. Select the Task profiles tab and ensure that the Rulerunner task is expanded.

The first ruleset that is executed is the CreateDocs ruleset.

__236. Expand the CreateDocs ruleset in the middle section of the Rulemanager screen. The ruleset is made up of two rules: Create Docs and Create Fields. Click on the Create Docs rule (not the ruleset, but the rule, as shown highlighted below). Click on the sync views arrow pointing towards the Document hierarchy.

The Create Docs rule is bound at the batch level. This rule takes the individual pages and creates documents out of them using the page identification information, and the document integrity values of Max, Min, and Order that we saw before.
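As a rough illustration of what this step accomplishes, the Python sketch below groups identified pages into documents. It is not how Taskmaster implements Create Docs: the `PAGE_TO_DOCUMENT` mapping and the `create_docs` function are hypothetical, and the sketch assumes the simplest integrity setting, where every document holds exactly one page (Min = 1, Max = 1).

```python
from dataclasses import dataclass, field

# Hypothetical mapping from an identified page type to its parent document type.
PAGE_TO_DOCUMENT = {
    "W2_Main_Page": "W2",
    "Verification_Letter_Main_Page": "Verification_Letter",
}


@dataclass
class Document:
    doc_type: str
    pages: list = field(default_factory=list)  # page numbers within the batch


def create_docs(page_types):
    """Build one single-page document per identified page (assumes Min = Max = 1)."""
    documents = []
    for page_number, page_type in enumerate(page_types, start=1):
        doc_type = PAGE_TO_DOCUMENT.get(page_type)
        if doc_type is None:
            # A page still typed "Other" has no document it can belong to.
            raise ValueError(f"page {page_number} has unknown type {page_type!r}")
        documents.append(Document(doc_type, [page_number]))
    return documents


if __name__ == "__main__":
    for doc in create_docs(["W2_Main_Page", "Verification_Letter_Main_Page"]):
        print(doc.doc_type, doc.pages)
```

A real hierarchy with multi-page documents would also need to apply the Max, Min, and Order values, which this sketch deliberately leaves out.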


__237. There is a second rule in the CreateDocs ruleset called Create Fields. We identified where zones are located for a fingerprint, and associated those with the fields in our page types. But what we haven't done is examine the rule that tells Taskmaster to use that configuration information and create the fields. That rule is Create Fields. Click on the Create Fields rule. Click on the sync views arrow pointing towards the Document hierarchy.

Note that the rule is associated with the W2_Main_Page. This was done automatically by the Application Wizard when the skeleton application was originally built. But the skeleton application only had one default document and page type. We manually created the second document and page, which is the Verification Letter and Verification_Letter_Main_Page. Expand the Document hierarchy so that you can see the actions associated with opening a Verification_Letter_Main_Page (as shown above). Create Fields is not bound to this new page type. Our next task is to bind Create Fields with our second page type.


__238. Lock the Document hierarchy.

__i. Select the Verification_Letter_Main_Page.
__ii. Click on the lock icon to lock it for editing.

Note: You may not see the Global section when you are doing the exercise. This is not a problem; please continue.


__239. Now we'll bind the Create Fields rule to the page.

__i. Select the Create Fields rule.
__ii. Select the Open for the Verification_Letter_Main_Page.
__iii. Click the Add to DCO button on the bar separating the Rulesets from the Document hierarchy.

The Document hierarchy should now look like this.


__240. Go back to the W2_Main_Page in the Document hierarchy.

There are four other rules that are associated with the W2_Main_Page that are not bound to the Verification_Letter_Main_Page. Use the process that you used just now to bind the following rules to Verification_Letter_Main_Page.

Ruleset      Rule
Recognize    Recognize Page
Export       Export Page Fields
Validate     Validate Page
Routing      Routing Rule 1


__241. Recall that a default field called Field was created by the Application Wizard. This was for the W2_Main_Page and that field was renamed to Borrower. Take a look at that field in the Document hierarchy.

The Application Wizard automatically associated a rule called Fields Clean with the default field. You can see what this rule is doing if you expand the Clean ruleset. There is one action being executed, called DeleteAllMiscChars. This action gets rid of most special characters that are often introduced when noise interferes with recognition. It is added by default and may not be suitable for all applications. However, we will use this action since all our fields are purely alphanumeric.

__242. None of the fields on the W2_Main_Page, except for Borrower, have the Fields Clean rule bound to them. We will do it for the EIN field.

__i. Select the Fields Clean rule from the Clean ruleset.
__ii. Select the Open associated with the EIN field.
__iii. Click the Add to DCO button on the bar separating the Rulesets from the Document hierarchy.


The EIN field should now look like this:

__243. Repeat this process for all the remaining fields on the W2_Main_Page so that they look like this:


__244. Repeat this process for all the fields on the Verification_Letter_Main_Page. The one exception is the Borrower field. Recall that we inherited the properties of the Borrower field on the W2_Main_Page. That's why the Fields Clean rule is already associated with it. The fields for the Verification_Letter_Main_Page should all look as follows:

__245. Save your changes to the Document hierarchy.

__i. Click the diskette icon to save your changes.
__ii. Click the lock icon to unlock the hierarchy.


4.8 Testing the Configuration

It's time for another recap. So far, we have done the following:

Defined the document hierarchy
Created fingerprints and associated them with our two page types
Created zones for each of the fingerprints and associated them with the fields for each page type
Recognized that the Application Wizard automatically associated some pages and fields with the rules the wizard created. We bound those rules to the pages and fields that we manually added to the hierarchy.

Now let's test the configuration.

__246. We need to return to the test environment.

Click on the Test tab.

__247. We will create a new test batch.

Right click on VScan and select New. Click the green arrow to run the batch. When the task completes, click the Advance button to move to the next step in the workflow.


__248. The batch will be created with six unclassified pages (page type = Other).

Now let's run the page identification task. The task should already be selected (in the drop down list). Click the green arrow to run the Page ID task. Click Advance when the task is complete.


__249. The page types should have been set to either W2_Main_Page or Verification_Letter_Main_Page. Now we'll run the Rulerunner task.

The task name in the drop down list should be Rulerunner. Click the green arrow to run the Rulerunner task and click Advance after it completes.


__250. Note how the batch hierarchy changes after Rulerunner completes.

There are now additional elements for each page. These are the fields that we defined.


__251. Select the first Verification Letter and expand all the elements under it.

All the fields are listed along with the recognition results. Examine the rest of the pages to see what the OCR results are like.

__252. We previously looked at the PageId.xml file to see what kind of diagnostic information is available to us during processing. Now let's look at the XML files that are created after the Rulerunner task completes. Open Windows Explorer, navigate to C:\Datacap\Income_Verification\Batches, and look for the most recent batch folder.


Open the batch folder, and open TM000001.xml.

An XML file is created for each image, which is where we can see the OCR results for each field, as well as the character-by-character confidence levels. Look at some of the XML files for other pages (TM00005.xml is a good one to examine because of the variety of character confidence levels).
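To show how character-level confidences might be inspected programmatically, here is a minimal Python sketch. The XML layout, the element and attribute names, and the 0 to 10 confidence scale are invented for illustration and will not match the real TM000001.xml; adapt the names to what you see in the actual file.

```python
import xml.etree.ElementTree as ET

# Invented sample data: the real page file uses its own element and attribute names.
SAMPLE_PAGE_XML = """
<page id="TM000001">
  <field name="SSN">
    <char value="1" confidence="10"/>
    <char value="1" confidence="10"/>
    <char value="7" confidence="4"/>
  </field>
  <field name="EIN">
    <char value="9" confidence="9"/>
  </field>
</page>
"""


def low_confidence_chars(xml_text, threshold):
    """Return (field name, position, character, confidence) for characters below threshold."""
    root = ET.fromstring(xml_text)
    findings = []
    for field_el in root.iter("field"):
        for position, char_el in enumerate(field_el.iter("char")):
            confidence = int(char_el.get("confidence"))
            if confidence < threshold:
                findings.append(
                    (field_el.get("name"), position, char_el.get("value"), confidence)
                )
    return findings


if __name__ == "__main__":
    for finding in low_confidence_chars(SAMPLE_PAGE_XML, threshold=8):
        print(finding)
```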


__253. Now let's take a look at the Rulerunner.xml file.

There is additional diagnostic information in the Rulerunner.xml file for the batch, including more information about the page classification. You'll see a property called Confidence. The confidence of the page classification is rated on a scale of 0 to 1. You'll note that in the example shown here, the first image is classified with perfect confidence, which is not unexpected since this is the image that was used to define the fingerprint. Subsequent images are not perfect matches, so you'll see confidence levels like 0.9827 or 0.9838 (still highly confident). The Template_ID variable tells you which fingerprint the page was matched to (the template IDs are shown in Datacap Studio).

We've now completed configuring and testing Scanning, Page Classification, and Recognition. Right click on the test batch in Datacap Studio and cancel it. The next step is to move on to visual Verification.

4.9 Configuring and Testing Visual Verification

Of course, in a real environment, we would not be using the test facility of Datacap Studio to process batches. We would use the standard Taskmaster Clients for anything involving a user interface (such as scanning or verification). The verification interface is something that can be customized quite easily. Datacap Taskmaster has two options for building a verification interface: DotEdit and Batch Pilot. We will examine how to use Batch Pilot for building a simple, customized verification user interface.


__254. Let's start Batch Pilot.

Click Start -> All Programs -> Datacap -> Batch Pilot -> Batch Pilot.

__255. Click File -> Open Project.

__256. Navigate to C:\Datacap\Income_Verification\dco_Income_Verification. Select and open rrs_verify.bpp.


__257. Look at the bottom of the project (you may have to move the toolbar) for the batch view.

Expand the batch structure until you see the W2_Main_Page. Right click on the W2_Main_Page and select AutoForm. The AutoForm function automatically creates a default user interface for you.

__258. A simple interface is built for you.


There are three parts to the UI: field labels, image snippets, and data entry areas. You can rearrange and resize these elements, as well as change fonts and text for labels. Elements can be rearranged simply by dragging and dropping them from one location to another.

__259. Let's make some simple changes. The Borrower field contains a lot of information, so we might want to make the data entry area bigger. Also, let's say that your users have told you that they want the image snippet on top of the data entry area instead of next to it.

Feel free to change the UI so that it looks like the above example, or change it to a layout of your own choosing. Just be careful not to delete any elements.

__260. Click File -> Save Form As.

__261. Save the new form as rrs_VerifyW2.bpp.dcf.


__262. Now we'll create a UI for the Verification Letter.

Expand the batch view, right click on the Verification_Letter_Main_Page and select AutoForm.


__263. Reorganize and resize the UI elements like you did for the W2. Here's an example of what it might look like:


__264. Click File -> Save Form As.

Save the form as rrs_verifyVerLet.bpp.dcf.

__265. Click File -> Exit.

__266. Click Yes when prompted to save the project.


__267. Datacap Taskmaster uses industry standard OCR and ICR engines to provide highly confident recognition results. When configured correctly, most pages won't need any manual verification because the confidence levels will be so high. However, as we've seen from the XML files, there may occasionally be exceptions, and character recognition won't always produce highly confident results. There are some fields on some pages (like TM00005) where confidence levels for some characters were only moderate. At the same time, we don't want to have to verify every page. Our next task will be to configure confidence levels so that we're comfortable skipping most pages, and then to configure visual verification so that only low-confidence pages are shown.

First, let's take a look at the Routing ruleset. This was executed as part of the Rulerunner task. Return to the Rulemanager tab in Datacap Studio.

__268. Locate the Routing ruleset and expand all of the elements within.

Routing Rule 1 has a function which calls the ChkConfidence action. This action checks the confidences of all the fields on a page. The first parameter is the minimum acceptable confidence level. If any of the fields on a page have a confidence level lower than that acceptable level, then the page status is set to whatever value is in the second parameter. You can interpret this action as saying "if the confidence for any field is lower than 8 out of 10, then set the page status to 1", which indicates that a problem exists.

__269. Now we're going to update the Taskmaster Client configuration so that it filters out pages that don't have any problems, and only shows pages that have problems. Start the Taskmaster Client for the Income_Verification application by clicking Start -> All Programs -> Datacap -> Applications -> Income_Verification -> Income_Verification Client.

Log on with the userid/password combination admin/admin.
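The ChkConfidence behaviour described for Routing Rule 1 can be sketched in a few lines of Python. This is only an illustration of the logic, not the action's implementation: the `chk_confidence` function, the page dictionaries, the initial status of 0, and the 0 to 10 confidence scale are assumptions made for the example.

```python
def chk_confidence(page, min_confidence, problem_status):
    """Set page["STATUS"] to problem_status if any field falls below min_confidence."""
    if any(conf < min_confidence for conf in page["fields"].values()):
        page["STATUS"] = problem_status
    return page


if __name__ == "__main__":
    # Hypothetical pages: field name -> lowest confidence seen in that field.
    pages = [
        {"type": "W2_Main_Page", "STATUS": 0, "fields": {"SSN": 10, "EIN": 6}},
        {"type": "W2_Main_Page", "STATUS": 0, "fields": {"SSN": 10, "EIN": 9}},
    ]
    for page in pages:
        chk_confidence(page, min_confidence=8, problem_status=1)
    print([page["STATUS"] for page in pages])  # [1, 0]
```

Only the first page is flagged, because its EIN confidence of 6 is below the minimum of 8.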


__270. From the Taskmaster Client, click Settings -> Workflow.

__271. Expand the Main Job, and select the Verify task.

Click the Setup button.

__272. Click File -> Task Settings.


__273. This brings us to the settings for the Verify task.

Click the Filters tab. We will add filters so that only problem pages are shown.

__274. Recall that a page status equal to 1 means that there's a problem.

__i. Select the Verification_Letter_Main_Page from the Type list.
__ii. Select STATUS from the Property drop down.
__iii. Enter 1 as the Problem value.
__iv. Click the Add button.


__275. Now we'll do the same thing for the W2.

Select the W2_Main_Page, select STATUS, enter 1 as the problem value, and click the Add button.

__276. Now your settings should look like this:

Click OK.
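The effect of the two filters can be pictured with the following Python sketch. It is a hedged illustration only: `pages_to_verify`, the page dictionaries, and the tuple layout of `FILTERS` are hypothetical stand-ins for the Type, Property, and Problem value entries made on the Filters tab.

```python
# Hypothetical representation of the two filters: (page type, property, problem value).
FILTERS = [
    ("Verification_Letter_Main_Page", "STATUS", "1"),
    ("W2_Main_Page", "STATUS", "1"),
]


def pages_to_verify(pages, filters):
    """Return only the pages whose property value matches a filter for their type."""
    shown = []
    for page in pages:
        for page_type, prop, problem_value in filters:
            if page["type"] == page_type and str(page.get(prop)) == problem_value:
                shown.append(page)
                break  # one matching filter is enough to show the page
    return shown


if __name__ == "__main__":
    batch = [
        {"id": "TM000001", "type": "W2_Main_Page", "STATUS": 0},
        {"id": "TM000002", "type": "W2_Main_Page", "STATUS": 1},
        {"id": "TM000003", "type": "Verification_Letter_Main_Page", "STATUS": 1},
    ]
    print([page["id"] for page in pages_to_verify(batch, FILTERS)])
```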


__277. Click Done.

__278. Click Apply.


__279. Click Done.

We've finished setting up the filter, so now only problem pages will be displayed.

__280. Now let's try out the application using the Client instead of the Datacap Studio Test facility.


__281. There may be some entries in the Job Monitor that are left over from our testing. Let's clean things up a bit.

Select all the jobs in the Job Monitor (you can use the Shift key to select multiple jobs at once). Press the Delete key on your keyboard.

__282. You'll get two messages asking you to confirm your deletion request.

Click Yes.

Click the Yes to All button. Press F5 to refresh your Job Monitor. All the jobs should eventually disappear.

__283. Create a new batch by double clicking the VScan icon in the Operations window.


__284. A progress window will appear.

A message box appears when the task is complete.

Click Stop when the message box appears. If you don't click Stop within 10 seconds, another batch will be created.

__285. Make the Job Monitor the active window and press F5 to refresh it. You should see the new job in the Job Monitor.

Note that the job status shows it is pending the PageID task.

__286. Double click the Page ID icon to execute the next task.


__287. A status window appears.

A notification message appears when the task is complete.

Click Stop when the message appears.

__288. Refresh your Job Monitor. Note that the job is awaiting the Rulerunner task.

Double click the Rulerunner icon.


__289. Another status window appears.

Click Stop when the completion message appears.

__290. Refresh the Job Monitor again.

The job is now pending the Verify task.

__291. Double click the Verify icon.


__292. The verification user interface you created in Batch Pilot should appear.

Note how the low confidence characters are shown in red. You can simply press the Enter key to move from one field to another. The background color changes from yellow to blue, indicating that the field has been verified.

NOTE: The documents that actually require verification may differ from what is in the lab guide. This is because there may be slight differences between the way you've drawn the zones and the way we drew our zones when creating the lab guide.

__293. You will move to the next problem document automatically.

Again, press Enter to verify the low confidence fields and move through the fields. The batch should be complete after showing two problem documents. Remember, we configured Verify to filter out any documents that didn't have problems.


Click Yes to complete the batch.

__294. A completion message will be displayed.

Click Stop to stop the verification task.

Congratulations! You've completed the configuration and testing of a Taskmaster application from scratch.

4.10 Creating a Simple Validation Rule

The last step we will go through in creating our sample application is to add a simple validation rule. One very common requirement is to be able to add database lookups to the application. In our lab, we will validate that the SSN that is read on the W2 form actually exists in a database. If the SSN is not found, we will generate a validation error while the user is processing the document in the Verify task.


__295. Go back to your Datacap Studio session and make sure that you are on the Rulemanager tab. We have a simple customer database on our lab VMware image. The first task will be to create a connection to the database. We'll add a rule to the Validate ruleset.

Select the Validate ruleset from the Rulesets section.

__296. Expand the Validate ruleset.

There is one existing rule, which is bound at the Page level (you can see this by using the sync views function).


__297. We will add a rule at the page level to connect to our customer database. Expand the Lookup actions in the Actions library. Click the lock icon to lock the Rulesets in the middle pane.

__i. Select the OpenConnection action.
__ii. Select the Validate: Page Function 1 function.
__iii. Click the Add to function button on the bar separating the Actions library from the Rulesets pane.

__298. Update the parameter for the OpenConnection action with the following value, entered on a single line:

provider=microsoft.jet.oledb.4.0;data source=C:\CustomerDB.mdb; persist security info=false

With this update, we will create a connection to our customer database whenever we process a W2 main page.
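Because the parameter must be entered on one line, it can help to see its parts separately. The Python sketch below splits the connection string into its key/value settings; `parse_connection_string` is a hypothetical helper written for this illustration and is not part of Datacap.

```python
# The OpenConnection parameter from step 298, kept as one line.
CONNECTION_STRING = (
    r"provider=microsoft.jet.oledb.4.0;data source=C:\CustomerDB.mdb; "
    r"persist security info=false"
)


def parse_connection_string(text):
    """Split a semicolon-separated connection string into a dict of settings."""
    settings = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue  # ignore empty segments such as a trailing semicolon
        key, _, value = part.partition("=")
        settings[key.strip().lower()] = value.strip()
    return settings


if __name__ == "__main__":
    for key, value in parse_connection_string(CONNECTION_STRING).items():
        print(f"{key}: {value}")
```

The three settings are the OLE DB provider, the path to the database file, and the persist security info flag.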


__299. Right click on the Validate ruleset and select Add Rule. A default rule and function will be created for you. Rename the rule to Validate SSN. Rename the function to Lookup SSN.

__300. Go to the Actions library and expand the Validations set of actions. Select the SetIsOverrideable action.

This action, when bound to a field, will generate an error if the subsequent validation action fails. Users will not be able to complete processing a batch if a non-overrideable field fails validation.

__301. Add the SetIsOverrideable action to the function.

__i. Click on the SetIsOverrideable action.
__ii. Click on the Lookup SSN function.
__iii. Click the Add to function button.


__302. Update the parameter for SetIsOverrideable to false.

__303. Expand the Lookup set of actions in the Actions library.

__i. Select the SmartSQL action.
__ii. Click on the Lookup SSN function.
__iii. Click the Add to function button.

__304. Note that there are two parameters for the SmartSQL action. Update the parameters with the following values:

First parameter: Select SSN from CustTable Where SSN='+@F+';
Second parameter: No

This action will perform the actual lookup on the CustTable table in the Customer database. The lookup will be based on the current SSN field value. If the SSN is found, then the action will return true. It will return false if the SSN is not found. The action should look like this now:


__305. Save and publish the ruleset. Click on the diskette icon to save your changes. Click on the lock icon and select Publish ruleset.
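The pass/fail behaviour of the lookup configured in step 304 can be approximated in Python. This sketch is an assumption-laden stand-in, not the SmartSQL action itself: it uses an in-memory SQLite table in place of the Access database C:\CustomerDB.mdb, a hypothetical `ssn_exists` helper, and a bound query parameter instead of the `+@F+` substitution.

```python
import sqlite3


def ssn_exists(connection, ssn):
    """Return True if the SSN is present in CustTable, False otherwise."""
    row = connection.execute(
        "SELECT SSN FROM CustTable WHERE SSN = ?", (ssn,)
    ).fetchone()
    return row is not None


def build_demo_database():
    """Create an in-memory stand-in for the lab's customer database."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE CustTable (SSN TEXT PRIMARY KEY)")
    connection.execute("INSERT INTO CustTable (SSN) VALUES (?)", ("111-22-3333",))
    return connection


if __name__ == "__main__":
    db = build_demo_database()
    print(ssn_exists(db, "111-22-3333"))  # True
    print(ssn_exists(db, "999-99-9999"))  # False
```

A bound parameter is used here because it avoids quoting problems when the field value contains unexpected characters.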

__306. We created a new rule, so the rule needs to be bound to the Document hierarchy.

First, click on the lock icon in the Document hierarchy. Expand the W2 document so that you can see all the fields.

__307. We will bind the Validate SSN rule to the SSN field.

__i. Click on the Validate SSN rule in the Ruleset section.
__ii. Click on the SSN field in the Document hierarchy.
__iii. Click on the Add to DCO button.


__308. Click on the diskette icon to save your changes to the Document hierarchy. Click on the lock icon to unlock the hierarchy.

__309. We have made all the required changes to perform the lookup. We will test it now. Create and process a new batch from the Income_Verification client by clicking on the VScan, PageID, Rulerunner, and Verify icons.

__310. Note what the fields look like in the Verify client when you get to the first W2.

The SSN has a light red background. This means that, while the OCR was OK, the field has failed a validation action.

__311. Try to tab through all the fields so you can go to the next W2 document.

You will get an error message saying that the validation failed and must be corrected. It cannot be overridden.


__312. Change the SSN value to 111-22-3333. This is a value that is in our customer database.

Press Enter and you'll be able to move to the next document.

__313. The next document has a combination of confidence issues and another validation error due to the SSN not being in the customer database.

Change the SSN to 111-22-3333 so that you can complete processing the batch.

Note: The error message that is currently displayed is very generic and not meaningful to end users. There are ways to add additional information to the error message box, but that is outside the scope of this lab.

__314. Continue to process the rest of the batch (there is only one more document). You will get the message:


Click Yes. We've finished processing the batch.

Congratulations! You've completed the configuration and testing of a Taskmaster application from scratch.

NOTE: We've omitted the configuration of the export function. This is explored in a separate lab, which we encourage you to run through if you are interested.


Lab 5 IBM Datacap NENU Monitoring

5.1 Overview

NENU stands for New Enhanced Notification Utility. NENU is a monitoring and notification tool. You can use NENU to monitor Taskmaster applications, query batch information, change batch settings such as order or status, delete batches, send email notifications, and move batches from one application to another. NENU monitoring can be done manually through the NENU Manager, or set to run automatically at specific times using Windows Task Scheduler. Some sample uses of NENU include:

Notifying the system administrator when something has gone wrong with a Taskmaster component
Correcting expected problems
Generating data that will be used later for RV2 reporting
Deleting/archiving processed batches

It is this last use case that we will cover in our lab. This should give you an idea of how NENU monitoring is implemented and what its capabilities are. We will search for batches that have a status of Job Done. Those batches will be deleted from the application, the batches will be archived to an alternate location on the hard drive, and the database records associated with the batch will be moved to a different application for reference purposes.

5.2 Creating a NENU Configuration Using Datacap Studio

NENU monitoring settings are configured through Datacap Studio. NENU capabilities are implemented as Datacap actions. We'll create a Datacap application, but since we're not processing batches in the typical sense, we will remove much of the scaffolding that is automatically created by Datacap.

__315. Start Datacap Studio by clicking Start -> All Programs -> Datacap -> Datacap Studio -> Datacap Studio.


__316. We will create a new application instead of connecting to an existing one, so close the connection screen.

__317. Start the Datacap Application Wizard by clicking on the icon in the upper right corner of the Datacap Studio.

__318. The Application Wizard starts.

Click Next.


__319. Select the option to Create an RRS Application.

Click Next.

__320. Enter a name for the application, such as NENU_Application.

Click Next.


__321. We aren't going to create a typical application, so we will ignore most of the setup screens.

Click Next.

__322. Click Next again for the Fingerprints and Sample Images screen.

__323. Click Finish.


__324. A summary screen will be displayed when the skeleton application is created.

Click Close.

__325. Now we will connect to our newly created skeleton application.

Click on the icon for the Connection Wizard in the upper right corner of the screen.

__326. Select the NENU_Application.


__327. Enter admin for the userid and password.

__328. The skeleton application is created. It is set up with a simple document hierarchy. Typical rulesets and tasks that most applications use are automatically created.


__329. We aren't actually creating or processing batches of documents. Rather, we're going to monitor the progress of existing batches, so none of the typical rulesets are needed. Indeed, we're going to have to delete all of them.

Click on the lock icon on the Rulesets section of Datacap Studio.

__330. Select the VScan ruleset and click on the delete icon.

A confirmation message will appear.

Click Yes to confirm that you want to delete the VScan ruleset.

__331. The VScan ruleset is used by the sample document hierarchy. Deleting the ruleset means that the document hierarchy contains references to rules that no longer exist. A message appears informing us of that fact.

Click OK to acknowledge this condition.


__332. Repeat this process for ALL the rulesets. All the rulesets should be gone when you are done.

__333. Click the Add Ruleset icon.

A new ruleset will be added.

Change the name of the ruleset to AutoDelete.


__334. Select the Actions Library tab on the right hand side of the screen. Scroll through the actions. Locate and expand the NENU set of actions.

We will start adding NENU actions to our rule.

__335. Select Function1. Select the SetUser action from the Actions Library. Click the Add to function button that is on the bar between the Rulesets and Actions Library panes.


__336. Function1 should still be selected in the Rulesets pane.

__i. Select the SetPassword action from the Actions Library.
__ii. Click the Add to function button.

__337. Repeat this process to add the following actions to Function1:

SetStation
SetApplication
SetupOpenApplication
QuerySetStatus
ProcessRunSqlQuery
ProcessMoveBatches
ProcessMoveDBRecords


__338. Now we will start updating the parameters for these actions.

__i. Select the SetUser action from the Rulesets pane.
__ii. Enter admin in the parameter area under the Actions Library pane. This will update the action with the correct parameter.

__339. Update the remainder of the parameters as follows:

SetPassword: admin
SetStation: 1
SetApplication: APT (we will be monitoring the APT application)
QuerySetStatus: Job Done (the job status changes to Job Done when the final Export step is complete)
ProcessMoveBatches: C:\Completed_Batches (this is the directory location where completed batches will be archived)
ProcessMoveDBRecords: NENU_Application,,,False,admin,admin,1,True (after the batch is archived, we'll move the database records regarding the batch to this NENU_Application application)
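Written out as data, the action sequence and parameters from steps 335 to 339 look like the Python sketch below. The list name `AUTODELETE_ACTIONS` and the `split_parameters` helper are hypothetical; `None` marks actions for which the lab gives no parameter value. Splitting the ProcessMoveDBRecords value on commas shows that it carries eight positional values, two of them empty. The lab does not name these positions, so the sketch only numbers them.

```python
# Actions added to Function1, in order, with the parameter values from the lab.
# None means the lab steps do not list a parameter for that action.
AUTODELETE_ACTIONS = [
    ("SetUser", "admin"),
    ("SetPassword", "admin"),
    ("SetStation", "1"),
    ("SetApplication", "APT"),
    ("SetupOpenApplication", None),
    ("QuerySetStatus", "Job Done"),
    ("ProcessRunSqlQuery", None),
    ("ProcessMoveBatches", r"C:\Completed_Batches"),
    ("ProcessMoveDBRecords", "NENU_Application,,,False,admin,admin,1,True"),
]


def split_parameters(parameter_text):
    """Split a comma-separated parameter string, keeping empty positions."""
    return parameter_text.split(",")


if __name__ == "__main__":
    move_db = dict(AUTODELETE_ACTIONS)["ProcessMoveDBRecords"]
    for position, value in enumerate(split_parameters(move_db), start=1):
        print(position, repr(value))
```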


The parameter list for the ProcessMoveDBRecords action is a little difficult to read as a single line, so here's the screen capture for that Parameter section of Datacap Studio.

__340. Now we will save our changes.

__i. Click on the diskette icon to save your changes.
__ii. Click on the lock icon and select Publish ruleset from the dropdown.


__341. We will bind our newly created ruleset to the Document hierarchy.

Click the lock icon on the Document hierarchy section of Datacap Studio. You will get another message reminding you that you deleted a number of rulesets that were linked to the Document hierarchy.

This time, you'll get the option to remove all those non-existent rulesets. Click Yes to remove those unnecessary references.

__342. Expand the Document hierarchy so that you can see the Open section under NENU_Application (see the picture below).

__i. Select the Open section.
__ii. Select Rule1 from the Rulesets pane.
__iii. Click the Add to DCO button on the bar separating the Document hierarchy from the Rulesets.


__343. Click on the diskette icon to save your changes. Click on the lock icon to unlock the Document hierarchy.

__344. Finally, we'll add a new Task that is specific to our NENU monitoring rules. Select the Task profiles tab on the right side of Datacap Studio.

Click on the lock icon to lock the tasks for editing.

__345. Click on the Add task icon.


__346. A window appears allowing you to select typical tasks. We will not use any of these tasks.

Select the Custom task at the bottom of the window, enter a name of AutoDelete, and click OK.

__347. We will link our ruleset to this task.

__i. Select the AutoDelete ruleset.
__ii. Select the AutoDelete task.
__iii. Click the Add ruleset to profile button.

__348. Click the icons to Save and Unlock the task profiles.


5.3 Testing NENU

We will use the APT application to test our configuration. It was recommended that you run through the APT lab before doing anything else, since APT is such an easy way to test Taskmaster capabilities. You'll need a completed APT batch. If a completed batch doesn't exist, go to the APT (Accounts Payable) lab and run through the first section to bring a batch to completion.

__349. Start the APT Client.

__350. Log on with a userid/password of admin.

Click OK.

__351. Make sure you have a job that has completed APT processing.

The status should read Job done.


__352. Now we will use NENU Manager to manually start the batch monitoring. Remember that this could also be scheduled to run periodically using the Windows Task Scheduler.

Click Start -> All Programs -> Datacap -> Taskmaster Client -> NENU Manager.

__353. NENU Manager will start.

Click Create to create a new monitor setting.

__354. Expand the RRS Application settings, if they are not already expanded.

Change the lib parameter to NENU_Application
Change the tprofile parameter to AutoDelete
Click the Save button.


__355. Now we will run the monitoring task.

Click the Run Profile button. A confirmation message appears when the task is complete.

Click OK.

__356. Return to your APT Client.

Make sure the Job Monitor is the active pane and press F5 to refresh it. The completed job should be gone from the Job Monitor.

__357. Open Windows Explorer and navigate to C:\Completed_Batches.

The batch directory has been moved to this archive location.

__358. Recall that we also moved the database records for this batch to the NENU_Application. Open the client for that application to see the records.


Click Start -> All Programs -> Datacap -> Applications -> NENU_Application -> NENU_Application Client.

__359. Log on with a userid/password of admin.

__360. Look at the Job Monitor section of the client.

Note that the database records have moved to this application (your screen may look a little different based on the number of batches you have archived).

Congratulations! You've completed the NENU Monitoring lab. Now you should have a better idea of how to use the NENU tool to monitor batches and take actions on them.
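As a recap, the monitoring pass configured in this lab (find jobs whose status is Job Done, then move their batch directories to the archive location) can be pictured with a short Python sketch. This is an illustration only and not how NENU is implemented: the function name, the `batch` and `status` keys, the `Pending` status, and the temporary directories are all hypothetical stand-ins.

```python
import shutil
import tempfile
from pathlib import Path

def archive_done_batches(jobs, batches_dir, archive_dir):
    """Move the directory of every job whose status is 'Job Done'.

    jobs is a list of dicts with hypothetical keys 'batch' and 'status'.
    Returns the names of the batches that were moved.
    """
    moved = []
    for job in jobs:
        if job["status"] != "Job Done":
            continue  # leave unfinished batches where they are
        source = Path(batches_dir) / job["batch"]
        shutil.move(str(source), str(Path(archive_dir) / job["batch"]))
        moved.append(job["batch"])
    return moved

# Demonstration with throwaway directories standing in for the real ones.
with tempfile.TemporaryDirectory() as root:
    batches = Path(root) / "batches"
    archive = Path(root) / "Completed_Batches"
    for name in ("batch_a", "batch_b"):
        (batches / name).mkdir(parents=True)
    archive.mkdir()

    jobs = [
        {"batch": "batch_a", "status": "Job Done"},
        {"batch": "batch_b", "status": "Pending"},
    ]
    moved = archive_done_batches(jobs, batches, archive)
    archived = sorted(p.name for p in archive.iterdir())
    remaining = sorted(p.name for p in batches.iterdir())

print(moved, archived, remaining)
```

In the lab itself the database records are moved as well (the ProcessMoveDBRecords action); the sketch leaves that part out.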


Lab 6 Integrating IBM Datacap Taskmaster with the Filenet P8 ECM Repository

6.1 Overview

In this lab, we will see how easy it is to take a Datacap Taskmaster application and update it so that content is stored in the Filenet P8 content repository. We will start with the existing 1040EZ application that comes automatically installed with Datacap.

6.2 Updating the Application Configuration in Datacap Studio

__1. Open Datacap Studio by clicking Start -> All Programs -> Datacap -> Datacap Studio -> Datacap Studio

__2. Select the 1040EZ application and click Next.


__3. Enter admin for the userid/password.

__4. The main Datacap Studio page is opened. There are three sections: the Document Hierarchy on the left, the Actions Library/Task Profiles on the right, and the Rulesets in the middle.

__5. We are going to start by adding a new Ruleset. This new ruleset will define all the actions needed to log on to a P8 repository and upload documents to it.

__i. Select the 1040EZ overall rulesets.
__ii. Click the Add ruleset icon.


__6. We will need two rules in our new ruleset. One will act at the batch level and log in to the repository (since we only need to log in once per batch). The other will act at the page level and upload each document to the repository. So we will need to add a second rule to the new ruleset.

Right-click on the new ruleset and select Add Rule.

__7. Now that we have the necessary number of rulesets, rules, and functions, let's rename them to something more meaningful.

Single-click on the name Ruleset1 so that you can rename it (similar to how you would do this in Windows Explorer). Change the name of the ruleset to Export to P8.
Single-click on the name Rule1 and change the name to Batch Level Rule.
Single-click on the name Rule2 and change the name to Page Level Rule.
Under Batch Level Rule (formerly Rule1), single-click on the name Function1 and change the name to Login to P8.
Under Page Level Rule (formerly Rule2), single-click on the name Function1 and change the name to Upload Tax Form.

Your ruleset should now look like the following:


__8. Click on the Actions Library tab on the right side of the screen.

Locate and expand the FileNetP8 actions. These are all the actions that Taskmaster can perform on a P8 repository.

__9. Now we'll add the action to our Login rule that sets the URL for the P8 repository. This is the URL for the P8 web services interface.

__i. On the Rulesets pane, select the Login to P8 function.
__ii. On the Actions Library pane, select the FNP8_SetURL action.
__iii. Click the Add to function button on the bar separating the Rulesets and Actions Library panes.


__10. Repeat the above process to add the following actions to the Login to P8 function:

__i. FNP8_SetTargetObjectID
__ii. FNP8_SetTargetClassID
__iii. FNP8_Login

Your rule should now look like the following:

__11. Now set the parameter values for the actions.

Select the FNP8_SetURL action in your rule. On the right side of the screen (under the Actions Library) is where you can set the parameter values. Enter http://hqdemo1:9080/wsi/FNCEWS35DIME as the URL value. Press Enter to confirm your change to the parameter. The action should now look like this:


__12. Follow the above process to change the parameters for the next three actions.

__i. Change the parameter for FNP8_SetTargetObjectID to ECM
__ii. Change the parameter for FNP8_SetTargetClassID to Objectstore
__iii. Change the parameter for FNP8_Login to administrator,filenet
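To double-check what was typed, the four Login to P8 parameter values can be written down as plain data and pulled apart. The Python below is a sketch for checking the values only; it does not call Taskmaster or P8, and the variable names are our own.

```python
from urllib.parse import urlsplit

# Actions and parameter values from steps 9 to 12, in the order they were added.
login_to_p8 = [
    ("FNP8_SetURL", "http://hqdemo1:9080/wsi/FNCEWS35DIME"),
    ("FNP8_SetTargetObjectID", "ECM"),
    ("FNP8_SetTargetClassID", "Objectstore"),
    ("FNP8_Login", "administrator,filenet"),
]

params = dict(login_to_p8)

# The URL carries the host, the port, and the web services path.
url = urlsplit(params["FNP8_SetURL"])

# FNP8_Login takes the userid and password as one comma-separated value.
user, password = params["FNP8_Login"].split(",", 1)

print(url.hostname, url.port, url.path)
print(user)
```

A mistyped port or a missing comma in the login value is much easier to spot in this form than in the single-line parameter fields.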

So what we've done here is specify the web services interface URL for P8, say that we are going to store content in the ECM objectstore, and then log in to P8 with userid administrator and password filenet.

__13. Now we'll add the actions for the Upload Tax Form function.

__i. Select the Upload Tax Form function.
__ii. Select the FNP8_SetDocClassID action from the Actions Library.
__iii. Click the Add to function button.

__14. Use the same process to add the following additional actions to the function:

FNP8_SetDocTitle
FNP8_SetProperty
FNP8_Upload


Then change the parameter values as follows:

Change the FNP8_SetDocClassID parameter to TaxForm
Change the FNP8_SetDocTitle parameter to 1040EZ
Change the FNP8_SetProperty parameter to SSN,@TaxpayerSSN

Your rule should now look like this:

What the Upload Tax Form function will do is upload the scanned image to P8, using a document class of TaxForm and setting the SSN property to whatever was OCR'd in the TaxpayerSSN field.

__15. We'll save our changes.

__i. Click the diskette icon to save our changes.
__ii. Click the lock icon and select Publish ruleset to make our changes available.

__16. Now let's associate, or bind, the rules to the appropriate levels of the document hierarchy.

First we'll lock the Document Hierarchy for editing. Click on the lock icon.


__17. Expand the 1040EZ document hierarchy so that you can see the global rules that get executed when a new batch is opened.

__i. Click on global in the document hierarchy.
__ii. Select the Batch Level Rule from your Rulesets pane.
__iii. Click on the Add to DCO button.

We've bound the batch level rule (which simply logs into the correct object store within P8) to the main 1040EZ batch. This rule will be executed once whenever a batch of 1040EZ docs is scanned.

__18. The document hierarchy should now be updated to look like this:


__19. Now let's take care of binding the rule that does the actual upload of docs. We could bind this at either the document or page level because the 1040EZ is a single-page document. Binding at the page level is a little easier, so for the purposes of this lab, we'll do it that way.

__i. Expand the document hierarchy for the Page_1040EZ page and select the global open.
__ii. Select the Page Level Rule from the Rulesets pane.
__iii. Click the Add to DCO button.
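The effect of the two bindings can be pictured with a small Python simulation: the batch-level rule runs once per batch, and the page-level rule runs once for every page. This is a sketch of the idea only, not Taskmaster's execution engine; the function name, the dictionary layout, and the SSN values are made up for illustration.

```python
def run_export(pages):
    """Simulate one batch and return the ordered list of calls made."""
    log = []
    # Batch Level Rule: bound at the batch level, so it runs once.
    log.append("login")
    for page in pages:
        # Page Level Rule: bound at the page level, so it runs per page.
        ssn = page["TaxpayerSSN"]  # stands in for the @TaxpayerSSN reference
        log.append(f"upload class=TaxForm title=1040EZ SSN={ssn}")
    return log

# Two made-up pages; the SSN values are placeholders, not real data.
batch = [{"TaxpayerSSN": "000-00-0001"}, {"TaxpayerSSN": "000-00-0002"}]
calls = run_export(batch)

for call in calls:
    print(call)
```

Binding the login at the page level instead would still work in this picture, but it would log in again for every page, which is why the lab keeps it at the batch level.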

__20. The document hierarchy for the 1040EZ page should now look like this:


__21. Now we'll save our changes to the document hierarchy.

__i. Click the diskette icon to save our changes.
__ii. Click the lock icon to unlock the hierarchy.

__22. Finally, we need to update one of the task profiles. Tasks are elements of the capture workflow, and they're executed in an orchestrated fashion (as opposed to a simply sequential one). Tasks can be repeated or executed conditionally.

Click on the Task Profiles tab and click the lock icon to lock the profiles for editing.


__23. Expand the Export task. This task is executed after all the documents have been scanned, processed, and manually verified. Right now, the only rules being executed are those that export OCR'd fields to a database.

__i. Select the Export to P8 ruleset from the Rulesets pane.
__ii. Select the Export task in the Task Profiles.
__iii. Click the Add ruleset to profile button.

__24. Save your changes to the task profile.

__i. Click the diskette icon to save your changes.
__ii. Click the lock icon to unlock the profiles.


6.3 Testing the updated application

We've completed making the necessary changes to the application. Now we can test our changes.

__25. Start the 1040EZ client by clicking Start -> All Programs -> Datacap -> Applications -> 1040EZ -> 1040EZ Client

__26. Log in with a userid/password of admin.


__27. The client for the 1040EZ application opens.

__28. Start the virtual scanning process by clicking on the VScan icon.

Select the Main Job

Click OK to start the virtual scan.


A progress window will be displayed.

When the VScan is over, you'll get a status message.

Click the Stop button to stop the scanning process.

__29. Now we will run the background processes, which do things like document integrity checking, image enhancement, page classification, and OCR. Click the Background icon.

A status message will appear indicating that Page Identification is running


You'll get a completion message.

Click the Stop button. Another status message will appear indicating that recognition is running.

You'll get a completion message.

Click the Stop button.

__30. Now we need to manually verify our results.


Click on the Verify/Fixup icon and the Verification client will start.

You can tab through the results to see how the OCR did. Fields with a yellow background indicate that a low-confidence read occurred. Fields with a blue or teal background were recognized with high confidence. Click the Next Problem button on the taskbar to go to the next problem document.

There are no other documents in the batch that had recognition issues, so you'll be asked if you want to close the batch.

Select Yes.


A completion message will be displayed.

Click OK.

__31. Finally, we will run the Export task. This is where our new rulesets will get executed.

Click the Export icon. A status window appears.


If we have made our changes correctly, the completion message will indicate that the task finished successfully.

Click the Stop button.

__32. The last thing we need to do is see if the documents were uploaded to P8 properly. Start the Filenet P8 Workplace XT client using the icon on the desktop.

__33. Log in to Workplace XT using a userid/password combination of administrator/filenet


__34. Click on the Search icon, located in the top left corner of the main Workplace XT screen.

Then select Advanced Search.

__35. Click on the Class link on the Advanced Search screen.

This will result in a drop down list of possible document classes.


Select the Tax Form document class and click OK.

__36. Change the search criteria to search for documents where the Document title starts with 1040EZ

Click the Search button.

__37. If we did our work correctly, we'll see that a 1040EZ was uploaded with the current date and time.


Right click on the document and select Properties.

You should see that the SSN field was updated with the OCR'd value.

Congratulations! You've completed the process of integrating Taskmaster with Filenet P8.


Lab 7 IBM Datacap Taskmaster and Email Integration

7.1 Overview of Datacap Connector for Email and Electronic Documents

The IBM Datacap Taskmaster Capture Connector for Email and Electronic Documents is an optional component of the IBM Datacap Taskmaster product. This connector allows for easy capture of emails and/or electronic attachments, including Microsoft Word, Excel, PDF, and zipped files. It lets customers capture a wide variety of electronic content with a unified approach: the same classification, data extraction, quality control, and export capabilities can be used across a wide spectrum of content.

7.2 Lab Overview

In this lab, we will explore how to integrate email systems with Datacap Taskmaster. We will modify the APT (Accounts Payable) application for invoice processing. In the out-of-the-box application, invoices are scanned in or imported from a file system. In this lab, you will modify the application configuration to monitor a Microsoft Outlook mailbox for inbound emails.

Many customers use mailboxes as an alternate method of receiving content. Customer correspondence can be directed to custserv@xyzcorp.com, or invoices can be sent to AcctPayable@abcEnterprises.com. Emails will be retrieved from the mailbox and attached invoices will be processed as though they were scanned input. Once processed, emails are filed into other mailbox folders for later archival. Problem emails are directed to a different folder where an administrator can review them.

For simplicity's sake, we will assume that all attachments are received as TIFF images. In a production environment, the attachments could be in a variety of formats: Word, PDF, and JPG are possible alternatives. Datacap Taskmaster is fully capable of handling these document types with the Datacap Connector for Email and Electronic Documents.

7.3 Getting Started

__1. There is a folder called Start ECM on the desktop of your VMware image.

Open this folder.


__2. There are several utilities in this folder.

Open the folder called Email Exchange.

__3. Double-click on the program called step 1 Start Exchange. This will start the MS Exchange server.

A window will open showing the progress of the server startup. Wait for the window to disappear; this means that Exchange has started.

__4. Now we'll open up Datacap Studio so we can make a copy of the APT (accounts payable) application.

Click Start -> All Programs -> Datacap Studio -> Datacap Studio


__5. We are not going to log on to any of the existing applications.

Click on the Close button.

__6. We will use Datacap Studio's Application Wizard to copy an existing application.

Click on the Application Wizard (as shown by the red arrow above), located in the top right hand corner of Datacap Studio.


__7. The Application Wizard starts.

Click Next.

__8. Select the option to Copy an existing RRS application.

Click Next.


__9. Select the APT application from the drop down list. Select the box to rename the copy. Enter APT_Email as the name of the new application.

Click Next.

__10. Click Finish.


__11. Wait until the Summary window is displayed.

Click Close.

__12. Now we will connect to the copy of the APT application that we created.

Click on the Connection icon in the top right corner of Datacap Studio (as shown by the arrow above).


__13. The list of applications is displayed.

Select the APT_Email application and click Next.

__14. Enter admin for both the userid and password.


__15. There are three sections to Datacap Studio. The leftmost section is called the Document Hierarchy, and it describes the structure of the batch. The rightmost section is where the Actions Library and Task Profiles are located. The Actions Library is a reusable library of commonly used capture functions. The Task Profile is the list of tasks that make up the capture workflow. In the middle are the Rulesets, which are groupings of actions and rules. This is where the heart of an application's configuration is located.

Expand the VScan ruleset so we can examine what is happening here.

VScan stands for virtual scan, and it works by scanning a directory for images. It sets the directory it's going to scan, sets the maximum number of images in a directory, says that it will accept multi-page TIFFs, and then starts scanning the directory for TIFF images to ingest.

__16. Click on the Task Profiles tab on the right side of Datacap Studio and expand the VScan task.

The only ruleset that is being executed by this task is VScan. We will replace the VScan ruleset with a new ruleset that will monitor the Exchange mailbox.
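For comparison with the mailbox monitor built in the following steps, the kind of directory scan that VScan performs can be sketched in a few lines of Python. This is only an illustration of the idea (look in one directory, accept TIFF files, stop at a maximum count); the function name and the way the limit is applied are our assumptions, not the VScan implementation.

```python
import tempfile
from pathlib import Path

def collect_images(directory, max_images):
    """Return up to max_images TIFF files from directory, ordered by name."""
    tiffs = sorted(
        (p for p in Path(directory).iterdir()
         if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}),
        key=lambda p: p.name,
    )
    return tiffs[:max_images]

# Demonstration in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.tif", "b.TIF", "c.pdf", "d.tif"):
        (Path(tmp) / name).touch()
    batch = [p.name for p in collect_images(tmp, max_images=2)]

print(batch)
```

The email ruleset replaces the directory with a mailbox and the file extension check with an attachment type check, but the overall shape (find candidates, filter by type, cap the batch size) stays the same.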


__17. In the Rulesets section of Datacap Studio, right click on APT_Email.

Select Add Ruleset.

__18. A new ruleset will be added to the bottom of the page.

__19. You can rename any of these elements by single-clicking on it (similar to how you'd rename a file in Windows Explorer).

Rename the ruleset to Email Scan.
Rename the rule to Monitor Mailbox.
Leave the name of the function as Function1.


__20. Now we will start adding actions to our new rule. Select the Actions Library tab on the right side of Datacap Studio.

Locate and expand the Ewsmail set of actions. These are all the reusable actions that relate to Exchange eMail.


__21. The ex_Types action defines the allowable attachment types that Taskmaster will process. The default is PDF, but you can set it to any type that you want to process. We will add this action to our function.

__i. Select the ex_Types action from the Actions Library.
__ii. Select Function1 from the Rulesets pane.
__iii. Click on the Add to function button, which is on the bar that separates the Rulesets pane from the Actions Library pane.

__22. The action is added to the function.

Make sure the ex_Types action is selected.

__23. Just under the Actions Library is where you can enter parameter values for the action. For our lab, we are going to limit the types of allowable attachments to TIFFs. You would monitor for a much wider scope of attachment types in a production environment.

Enter tif as the parameter value and press Enter.


__24. Now your action is updated with the appropriate attachment type.

__25. We will add another action to our function. The action is ex_done_folder.

After Taskmaster is finished processing an email, it will file the email in the folder specified by the ex_done_folder action.

__i. Select ex_done_folder from the Actions Library.
__ii. Select Function1 from the Rulesets pane.
__iii. Click the Add to function button.

__26. The function is updated with the action.


__27. There is an Exchange folder called Processed. We want to file processed emails in there.

Select the ex_done_folder action from the Rulesets pane and enter the parameter value of Processed

__28. The function should now look like this:

__29. There are six more actions that need to be added to the function. Use the process you've used previously in this lab to add the following actions to Function1 and assign the appropriate parameter values:

ex_problem_folder (parameter value is Problems). This is where emails that Taskmaster cannot process are filed.
ex_wait_time (parameter value is 10). This is the time, in seconds, to wait for a full batch of emails.
ex_ews_version (parameter value is 1). This specifies the exact version of Exchange we are using (1 means Exchange 2007 SP1).
ex_max_docs (parameter value is 20). This is the maximum number of emails in a single batch.
ex_login (parameter values are https://hqdemo1dom.filenet.com/EWS/Exchange.asmx, datacap_Service@hqdemo1dom.filenet.com, filenet). These are the URL for the Exchange server and the userid/password used to log on to the server.
ex_scan (no parameter required). This is the action that performs the actual mailbox scan.
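How these settings work together can be sketched in Python. The sketch below is not the Ewsmail implementation: the function name, the `subject` and `attachments` keys, and the rule that an email without an acceptable attachment counts as a problem email are assumptions made for illustration, and the waiting behavior of ex_wait_time is not modeled.

```python
# Parameter values taken from the steps above.
settings = {
    "ex_Types": "tif",
    "ex_done_folder": "Processed",
    "ex_problem_folder": "Problems",
    "ex_wait_time": 10,  # seconds to wait for a full batch (not modeled here)
    "ex_max_docs": 20,   # maximum emails in one batch
}

def scan_mailbox(inbox, settings):
    """Build one batch from the inbox and decide where each email is filed.

    inbox is a list of dicts with hypothetical keys 'subject' and
    'attachments' (a list of file names).
    """
    suffix = "." + settings["ex_Types"].lower()
    batch, filed = [], {}
    for email in inbox[: settings["ex_max_docs"]]:
        usable = [a for a in email["attachments"] if a.lower().endswith(suffix)]
        if usable:
            batch.extend(usable)
            filed[email["subject"]] = settings["ex_done_folder"]
        else:
            # Assumption: no acceptable attachment means a problem email.
            filed[email["subject"]] = settings["ex_problem_folder"]
    return batch, filed

# Made-up inbox contents for the demonstration.
inbox = [
    {"subject": "Invoices", "attachments": ["Invoice_0001.tif", "Invoice_0002.tif"]},
    {"subject": "Hello", "attachments": ["notes.docx"]},
]
batch, filed = scan_mailbox(inbox, settings)
print(batch)
print(filed)
```

Keeping the settings in one place like this also makes it easy to see what would change in production, for example a wider list of attachment types.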


__30. Save the new ruleset and make it available for processing.

__i. Select the Email Scan ruleset you created.
__ii. Click on the diskette icon to save your changes.
__iii. Click on the lock icon and select Publish ruleset from the dropdown.


__31. Now we will update the VScan task to use our new ruleset.

__i. Select the VScan task from the Task Profiles tab.
__ii. Click on the lock icon to lock the task for editing.

__32. Delete the VScan ruleset from the VScan task.

__i. Select the VScan ruleset (make sure it's the ruleset and not the task).
__ii. Click the delete icon.


__33. Now add the ruleset to the VScan profile.

__i. Select the Email Scan ruleset you created.
__ii. Select the VScan profile from the Task Profiles tab.
__iii. Click the Add ruleset to profile button.

__34. Make sure that the VScan task has been updated correctly. It should look like this:

We need to save the changes you made to the Task Profile. Click on the diskette icon to save the changes. Click on the lock icon to unlock the task.

__35. Lastly, we need to bind the new rule to the appropriate part of the Document Hierarchy.

Select the top of the Document Hierarchy and click on the lock icon to lock it for editing.


__36. The rule to monitor the mailbox for emails is bound at the batch level. That is because this is the rule that actually creates the batch.

Expand the Document Hierarchy as shown above.

__i. Select the (global) part of the hierarchy.
__ii. Select the Monitor Mailbox rule that you created.
__iii. Click on the Add to DCO button that is on the bar separating the Document Hierarchy from the Rulesets.

Click on the diskette icon to save your changes. Click on the lock icon to unlock the hierarchy. We're done making changes in Datacap Studio! Now to test our configuration.

__37. Start the Outlook client.

Click on the Outlook icon on the taskbar.


__38. We will log on as the Administrator and send an email with an invoice attached to it.

Select Administrator from the Profile Name pulldown and click OK.

__39. The Outlook client opens with you logged on as the Administrator.

Click on the icon to compose a new email.

__40. We will compose our email.

Send the email to datacap_Service@hqdemo1dom.filenet.com. This is the owner of the mailbox that we are monitoring.


Click on the icon to attach files to the email (as indicated by the red arrow).

__41. Navigate to My Documents.

Select Invoice_0001.tif and Invoice_0002.tif (you can select multiple documents by holding down the Shift key). Click Insert.

__42. Make sure that the invoices are attached. Add a subject line if you like (it's not mandatory).

Click the button to Send the email. Close the Outlook client.


__43. Now we'll log back on to the Outlook client and see what's in the datacap_Service mailbox. Start the Outlook client by clicking on its icon on the taskbar.

Select datacap_Service from the profile dropdown and click OK.

__44. You will be prompted to enter a password. Type in filenet and click OK. Note: The reason you weren't asked to enter the Administrator's password is that you're already logged on to Windows as Administrator.

__45. The email you sent should be in the inbox.

Note that in the Inbox are two subfolders called Problems and Processed. Close the Outlook client.


__46. Find the icon for the APT_Email Client on the desktop.

Double click on the icon to start the Taskmaster Client.

__47. Log on with userid and password admin.

__48. Start the mailbox scanning task by clicking on the Scan icon from the Operations window.

__49. Select the Demo job.

Click OK.


__50. The VScan task, which now runs the Email Scan ruleset, will start running. You will get a status window.

You'll get a completion message when the task is complete (remember that the task will wait a full 10 seconds to see if any additional emails are received, based on the value you used for the ex_wait_time action).

Click Stop.


__51. Refresh the Job Monitor window by clicking on it (to make it active) and pressing F5. You should see your job listed there, pending the Batch Profiler step.

Double click on the Background icon to start the background processor, which will take care of the Batch Profiler step. As with VScan, you'll get a status window and a completion message.

__52. Refresh the Job Monitor by pressing F5 after the background process has completed. The job should now be waiting for verification.


__53. Double click on the Verify icon to view the processing results.

So we see that Taskmaster successfully extracted the attachment from the email and processed it just as if it were a scanned document.

__54. Click the Next Problem icon (as shown above by the red arrow) to move to the next problem document. View the results and click the Next Problem icon again.

The batch will be complete, so click Yes to finish with the batch.


__55. A completion message appears.

Click Stop.

__56. Let's take one more look at the datacap_Service mailbox. Start the Outlook Client by clicking on the icon on the taskbar.

Select datacap_Service from the dropdown and click OK.

__57. Enter filenet for the password.


__58. Note that the inbox is now empty.

__59. Open the Processed folder.

The email that was in the Inbox has been moved to the Processed folder, as we configured Taskmaster to do. Close the Outlook client.

Congratulations! You've completed the Email Integration lab. Now you have a better understanding of how easy it is to integrate email systems with Datacap Taskmaster.


Lab 8 Batch Splitting

8.1 Lab Overview

One of the key advantages of Datacap Taskmaster is its fluid workflow. Many capture solutions force you into a fixed processing sequence. Problems with a single document in a given batch mean that the entire batch is held back until the issue is resolved. In such solutions there isn't any way to do conditional processing, for example, taking additional validation steps if recognition confidence levels aren't optimal, or having special problematic documents go to an administrator for additional research.

In this lab, we will update a very simple sample application to illustrate the batch splitting and workflow capabilities of Datacap Taskmaster. As it stands, the application does some basic check processing. The amount of the check is read, and low-confidence reads on the check amount are sent to a verification client. Let's say that there is a requirement for checks over a certain amount ($1,000.00 in this example) to go to a supervisor for review. We'll add that logic to our application.
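The routing requirement itself is simple to state in code. The Python below is a minimal sketch of the decision only, assuming the check amount has already been recognized; the function name, the dictionary layout, and the sample amounts are made up for illustration and are not part of the SplitBatch application.

```python
from decimal import Decimal

REVIEW_THRESHOLD = Decimal("1000.00")  # supervisor-review limit from the lab text

def split_batch(checks, threshold=REVIEW_THRESHOLD):
    """Partition checks into (regular, supervisor) lists by amount."""
    regular, supervisor = [], []
    for check in checks:
        # "Over" the threshold is read as strictly greater than.
        target = supervisor if check["amount"] > threshold else regular
        target.append(check)
    return regular, supervisor

# Made-up sample amounts, for illustration only.
checks = [
    {"id": 1, "amount": Decimal("1250.00")},
    {"id": 2, "amount": Decimal("85.10")},
    {"id": 3, "amount": Decimal("1000.00")},
    {"id": 4, "amount": Decimal("412.37")},
]
regular, supervisor = split_batch(checks)
print([c["id"] for c in regular], [c["id"] for c in supervisor])
```

Decimal is used instead of float so that an amount of exactly 1000.00 compares cleanly against the threshold. In the lab, the same decision is expressed with Taskmaster rules and the oversized checks are split off into their own batch.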

8.2 The Sample Check Processing Application

Let's take a look at the basic application that we've created for our lab.

__1. Start the sample client. We have called the application SplitBatch. Open Windows Explorer and navigate to C:\Datacap\SplitBatch. Locate the link for SplitBatch Client and double click on it to start the client application.

__2. Log on with userid and password admin.


__3. The main client window will be displayed.

__4. Create a batch by clicking on the VScan icon in the Operations window.

A status window will appear

Which will be followed by a completion message

Click Stop to end the scanning process.


__5. Now run Page Identification by clicking on the Page ID icon in the Operations window.

A status window will appear

Followed by a completion message.

Click Stop to end the page identification task.

__6. Now we will run the Rulerunner task. This task performs the character recognition, among other activities.

Double click on the Rulerunner icon. Another status window appears.


And then another completion message is displayed.

Click Stop.

__7. Finally, we will run the Verification client to see our processing results.

Double click on the Verify/FixUp icon in the Operations window and the Verification user interface is displayed.

This is a very simple sample application. We are only recognizing the check amount here. Also, the client is configured to only show problem images. Anything that was recognized with a high confidence is not displayed for verification. Click the Next Problem icon (it's the blue arrow with the red question mark above it on the toolbar). It's indicated on the above screen capture by the red arrow in the top left corner.
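As a rough illustration of what the Next Problem icon does, the following Python sketch skips checks that were read with high confidence and stops only on problem documents. The `Check` class, the `low_confidence` flag, and the `next_problem` function are hypothetical names invented for this illustration; they are not part of the Taskmaster client, and the position of the problem check in the batch is assumed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Check:
    name: str
    low_confidence: bool  # True when the amount was read with low confidence


def next_problem(batch: List[Check], start: int) -> Optional[int]:
    """Return the index of the first problem check at or after start, else None."""
    for index in range(start, len(batch)):
        if batch[index].low_confidence:
            return index
    return None


# Four checks, one of them flagged (which one is assumed for this example).
batch = [
    Check("check-1", False),
    Check("check-2", True),
    Check("check-3", False),
    Check("check-4", False),
]

first = next_problem(batch, 0)           # the only check shown for verification
second = next_problem(batch, first + 1)  # nothing left to verify
print(first, second)
```

In this model, a result of `None` corresponds to the point where the client reports that there are no more problem documents.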


A message window is displayed saying that there are no more problem documents. Click No so that we can look at the other documents that were processed in the batch. Press Ctrl+Shift+P to go to the previous document. Do this until you get to the beginning of the batch, when you'll get the following message:

Click Cancel. Note the check image that is at the beginning of the batch.

We have one check in the batch which is for an amount greater than $1000.00. We are going to update the application so that checks greater than $1000.00 go to a special processing queue. Press Ctrl+N to go through the four checks again so that when you get to the end, you'll get the message:

Click Yes and you'll get the completion message:


Click Stop to end the verification process.

8.3 Updating the application

Now that we've seen how the sample application works, let's modify it to add in our exception handling logic.

__8. Start Datacap Studio by clicking Start -> All Programs -> Datacap -> Datacap Studio -> Datacap Studio.

Select the SplitBatch application and click Next.


Log on with a userid and password of admin.

The main Datacap Studio page is shown. See how Datacap Studio is essentially split into three sections: there's the Document Hierarchy on the left, the Actions Library and Task Profiles on the right, and the Rulesets in the middle. The Rulesets are where our processing logic is defined.

__9. We could create a separate ruleset for our logic, or we can just add some rules to an existing one. Either option is fine. For our lab, we'll add some rules to an existing ruleset: the Routing ruleset.

__i. Select the Routing ruleset.
__ii. Click on the lock icon to lock the ruleset for editing.


__10. We will add a new rule to this ruleset. Right click the Routing ruleset and select Add Rule.

__11. Change the name of the newly created rule and the function underneath it by single clicking on the name of the rule (or function), then typing in the new name. This is similar to the way you would rename objects using Windows Explorer.

Change the name of the new rule to Split Based on Amount and change the name of the function to Check Amount. This function will check the value of the field to which it is bound. If the field is greater than or equal to $1000.00, then we will set a flag for the system to place that document in a separate batch.

__12. Click on the Actions Library tab on the right hand side. Locate the Validations actions and expand them. Locate the action called IsFieldGreaterOrEqual.


__13. We will add this action to our rule.

__i. Click on the IsFieldGreaterOrEqual action in the Actions Library pane.
__ii. Click on the Check Amount function in the Rulesets pane.
__iii. Click on the Add to function button, which is on the bar separating the two panes.

__14. Now locate the DCO actions set in the Actions Library pane. Expand it and locate the SetPageStatus action.

__i. Click on the SetPageStatus action in the Actions Library pane.
__ii. Click on the Check Amount function in the Rulesets pane.
__iii. Click on the Add to function button.


__15. Finally, we'll add one more action to our function. Locate and expand the rrunner set of actions. Find the rrSet action (don't confuse this with the rr_Set action, which is an older action that has been replaced by rrSet).

__i. Click on the rrSet action in the Actions Library pane.
__ii. Click on the Check Amount function in the Rulesets pane.
__iii. Click on the Add to function button.

Your rule should now look like this:

__16. Now we will set the parameter values for these actions. Click on the IsFieldGreaterOrEqual action in the Rulesets pane.


Click on the Parameter field on the right side of the screen, under the Actions Library pane. Enter a value of 1000.00. Press Enter. Your action should now appear as follows:

__17. Repeat this process for the other two actions. Set the parameter value for SetPageStatus to 1. There are two parameter values for rrSet: the first parameter should be Yes, and the second parameter should be @D.Split.
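The three actions configured in the Check Amount function can be modeled in Python to make the control flow easier to follow. This is only a sketch under stated assumptions, not Datacap's implementation: the Python function names mirror the action names used in this lab, but the `Context`, `Page`, and `Document` classes are hypothetical, the amount is assumed to be already parsed into a number, and the sketch assumes that a function stops at the first action that returns false, so the later actions run only for checks that meet the threshold.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Page:
    status: int = 0  # in this lab, 1 marks a page as a problem page


@dataclass
class Document:
    variables: Dict[str, str] = field(default_factory=dict)


@dataclass
class Context:
    field_value: float  # assumed: the recognized amount, already numeric
    page: Page
    document: Document


Action = Callable[[Context], bool]


def is_field_greater_or_equal(threshold: float) -> Action:
    return lambda ctx: ctx.field_value >= threshold


def set_page_status(status: int) -> Action:
    def run(ctx: Context) -> bool:
        ctx.page.status = status
        return True
    return run


def rr_set(value: str, target: str) -> Action:
    def run(ctx: Context) -> bool:
        prefix = "@D."  # this sketch handles document-level targets only
        if not target.startswith(prefix):
            return False
        ctx.document.variables[target[len(prefix):]] = value
        return True
    return run


def run_function(actions: List[Action], ctx: Context) -> bool:
    # all() stops evaluating at the first action that returns False.
    return all(action(ctx) for action in actions)


check_amount = [
    is_field_greater_or_equal(1000.00),
    set_page_status(1),
    rr_set("Yes", "@D.Split"),
]

large = Context(1210.20, Page(), Document())
small = Context(250.00, Page(), Document())  # hypothetical amount
run_function(check_amount, large)
run_function(check_amount, small)
print(large.page.status, large.document.variables)
print(small.page.status, small.document.variables)
```

The large check ends up with page status 1 and a `Split` variable of `Yes`, while the small check is left untouched because the first action fails.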

What will happen is that the field value will be examined. If it is greater than or equal to $1000.00, the status for the page will be set to 1 (which means that there is a problem with the page) and a document level variable called Split will be set to Yes.

__18. We will now add a second rule to the Routing ruleset. Right click on the Routing ruleset and select Add Rule. Change the name of the rule to Split Batch. Change the name of the function in the new rule to Perform Split. It should now look like the following:


__19. Locate the Split action in the Actions Library.

Select the SplitBatch action and add it to the Perform Split function, using the same process that you used to update the Check Amount function. Change the parameter value for the SplitBatch action to be @D.Split.
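The effect of SplitBatch on a finished batch can be pictured as a simple partition on the document variable. The Python sketch below is a hedged model rather than the action's real implementation: `split_batch` is a hypothetical helper, documents are represented as plain dictionaries, and the check identifiers are invented for the example.

```python
from typing import Dict, List, Tuple

Document = Dict[str, str]


def split_batch(
    documents: List[Document], variable: str = "Split"
) -> Tuple[List[Document], List[Document]]:
    """Partition documents: those whose variable equals "Yes" go to the second list."""
    stay: List[Document] = []
    moved: List[Document] = []
    for doc in documents:
        target = moved if doc.get(variable) == "Yes" else stay
        target.append(doc)
    return stay, moved


# Four checks; only the first one carries the Split flag.
documents = [
    {"id": "check-1", "Split": "Yes"},
    {"id": "check-2"},
    {"id": "check-3"},
    {"id": "check-4"},
]

main_queue, supervisor_queue = split_batch(documents)
print(len(main_queue), len(supervisor_queue))
```

With one flagged check out of four, three documents stay in the original batch and one moves to the new batch.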

This rule will be bound at the batch level. When the batch is finished, the SplitBatch action will cause all the documents in the batch to be examined. Any document whose Split variable is set to Yes will be sent to a separate queue.

__20. Save the changes made to the rulesets.

__i. Click the diskette icon to save your changes.
__ii. Click the lock icon and select Publish ruleset.


__21. Now we will bind these newly created rules to the proper parts of the document hierarchy.

Click the lock icon on the Document Hierarchy pane so that it is locked for editing. Expand the Document Hierarchy as shown below so that you can see the actions executed when the Amount field is opened.

__i. Select the Split Based on Amount rule from the Rulesets pane.
__ii. Select the Open part of the Document Hierarchy associated with the Amount field.
__iii. Click on the Add to DCO button on the bar separating the Rulesets and Document Hierarchy panes.


The hierarchy of the Amount field will now look like this:

__22. Expand the very bottom of the hierarchy so that you see the actions executed when the batch is closed.

__i. Select the Split Batch rule from the Rulesets pane.
__ii. Select the Close part of the hierarchy associated with the batch.
__iii. Click on the Add to DCO button.


Your document hierarchy should look like the above screen capture.

__23. Save your changes to the document hierarchy.

__i. Click the diskette icon to save your changes.
__ii. Click on the lock icon to unlock the document hierarchy.

8.4 Testing the updated application

__24. Return to the SplitBatch client and create a new batch. Use the same process you used earlier in the lab, clicking on the icons for:

__a. VScan
__b. Page ID
__c. Rulerunner


__25. Do NOT click the Verify icon just yet. After Rulerunner is finished, make the Job Monitor the active window and press F5 to refresh it.

Note that the one batch that you created has now been split into two. Some documents are in the Main verification queue (as usual) and some documents have gone to the Supervisor verification queue.

__26. Click on the Supervisor Verify icon in the Operations pane.

One document is in this batch.

It is the check whose amount is $1,210.20. Click the Next Problem icon (indicated above by the red arrow).


There aren't any more documents in the batch, so click Yes. A completion message is displayed.

Click the Stop button.

__27. Select the normal Verify/FixUp icon from the Operations pane.

There are only three documents in the batch, and the only one that is displayed is the one with low confidence characters.

Click the Next Problem icon to move to the next document in the batch. There shouldn't be any further documents to process.


Click Yes to finish the batch. The completion message is displayed.

Click Stop.

Congratulations! You've completed the Batch Splitting lab. You should now understand some of the powerful workflow capabilities of Datacap Taskmaster. You are no longer tied to a fixed, sequential process. Instead, you can use logic to determine which tasks you want to perform, and which users will handle manual processing tasks.


Appendix A. Notices
This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. All references to fictitious companies or individuals are used for illustration purposes only.

COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.


Appendix B. Trademarks and copyrights


The following terms are trademarks of International Business Machines Corporation in the United States, other countries, or both:

IBM, Cube Views, Informix, Rational, System z, AIX, DB2, Lotus, Redbooks, Tivoli, CICS, developerWorks, Lotus Workflow, Red Brick, WebSphere, ClearCase, DRDA, MQSeries, RequisitePro, Workplace, ClearQuest, IMS, OmniFind, System i, System p, Cloudscape, IMS/ESA

Adobe, Acrobat, Portable Document Format (PDF), and PostScript are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, other countries, or both.

Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.

Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. See Java Guidelines

Microsoft, Windows, Windows NT, and the Windows logo are registered trademarks of Microsoft Corporation in the United States, other countries, or both.

Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

ITIL is a registered trademark and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.

IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.

Other company, product and service names may be trademarks or service marks of others.


Copyright IBM Corporation 2011. The information contained in these materials is provided for informational purposes only, and is provided AS IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, these materials. Nothing contained in these materials is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. References in these materials to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. This information is based on current IBM product plans and strategy, which are subject to change by IBM without notice. Product release dates and/or capabilities referenced in these materials may change at any time at IBM's sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way.

IBM, the IBM logo and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at ibm.com/legal/copytrade.shtml

Other company, product and service names may be trademarks or service marks of others.
