
Chapter 1.9 Handling of Data in Information Systems

1.9 (a) Manual and Automatic methods of Data Entry

All computer systems need data to be input to them; otherwise they have nothing to
process. The methods of collecting the data can be divided into two types: automatic and
manual data collection.

Automatic Data Collection


The most obvious type of automatic data collection is in a control system, where the
computer collects its view of the outside world from sensors that give information about
the physical environment. The data collection done by a sensor is continuous, but the
data is read only at fixed intervals. The processor does not want to know the
temperature in the room all the time, but perhaps every 5 minutes, which gives the previous
decision long enough to have had some effect. The use of only some of the available
data is known as sampling.

Many sensors that measure physical values are analogue sensors, while the data required
by the processor needs to be digital. Analogue data is physical data that creates a signal
consisting of a continuously changing voltage (for example, a circuit built around a
thermistor produces a voltage that changes as the temperature being measured changes).
This signal must be changed into the stream of 0s and 1s that the computer can recognise.
This is done by an analogue to digital converter.

When data is collected off line, often by sensors in remote locations, and then stored until
ready for input to the system at a time that is convenient to the system, it is known as data
logging. A typical data logger will be in the form of a tape recorder, on which the data is
stored until a set of data has been collected and the data can be entered into the system in
one go. Obviously, this would not be suitable data input for a system which was controlling
the central heating in a house, but a remote weather station on a mountain top where
different readings are taken every 10 minutes and then radioed back to the weather centre
once every 24 hours would need just such a device to store the data until it was required.
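
The sampling and analogue to digital conversion described above can be sketched in a few
lines of Python. This is only an illustration, not a real sensor driver: the function names,
the 0 to 5 volt range, the 8-bit resolution and the list of voltages are all assumptions
made for the example.

```python
def analogue_to_digital(voltage, v_max=5.0, bits=8):
    """Convert a voltage in the range 0..v_max to an integer code.

    v_max and bits are assumed values chosen for illustration only.
    """
    levels = 2 ** bits - 1                    # highest code, 255 for 8 bits
    voltage = min(max(voltage, 0.0), v_max)   # clamp to the converter's range
    return round(voltage / v_max * levels)


def sample(readings, every):
    """Keep only one reading in every 'every' from a continuous stream."""
    return readings[::every]


# Hypothetical voltages produced by a sensor, one per minute.
voltages = [1.00, 1.02, 1.05, 1.10, 1.20, 1.25, 1.30, 1.32, 1.35, 1.40]

# The processor only looks at one reading in every five (sampling).
sampled = sample(voltages, every=5)

# Each sampled voltage is converted to a pattern of 0s and 1s.
for v in sampled:
    code = analogue_to_digital(v)
    print(f"{v:.2f} V -> {code:3d} -> {code:08b}")
```

With more bits the converter can distinguish smaller changes in voltage, and with a shorter
sampling interval fewer of the available readings are discarded.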
A less obvious form of automatic data collection is the barcode used in a supermarket. The
code is represented as a series of dark coloured bars on a light background so that the data
can be input to the machine without any further preparation.
Automatic data collection can be considered to be any data collection that does the two
stages of data collection and data input to the system without going through the
intermediate phase of data preparation to make it suitable for computer use. Another good
example is the school register which is taken by making marks on a sheet of paper and
that can then be read directly into the computer with no human intervention by an optical
mark reader (OMR). An OMR reads information by translating the position of the mark
on the paper into a meaning, so that two marks side by side on the paper mean different
things because of where they are rather than what they look like.
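
The idea that an OMR takes its meaning from the position of a mark can be illustrated with
a short Python sketch. The column layout (present, absent, late) and the function name are
assumptions invented for this example; a real register sheet may be laid out differently.

```python
# Hypothetical layout of one row on a register sheet: each column
# position has a fixed meaning. These meanings are assumptions.
COLUMN_MEANINGS = ["present", "absent", "late"]


def read_row(marks):
    """Translate one row of detected marks into a meaning.

    marks is a list of booleans, one per column, True where a mark
    was detected. Exactly one mark is expected in each row.
    """
    positions = [i for i, marked in enumerate(marks) if marked]
    if len(positions) != 1:
        return "unreadable"          # no mark, or more than one mark
    return COLUMN_MEANINGS[positions[0]]


# Identical marks mean different things in different positions.
print(read_row([True, False, False]))
print(read_row([False, True, False]))
print(read_row([True, True, False]))
```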
Other forms of automatic data entry are voice recognition, which is rather
unreliable but is an attempt by the computer to understand human communication, and
magnetic stripes. These are seen on the back of credit cards and bank cards.
The stripe contains information about the owner of the card in a form that the computer
can use directly. Another form of data input used by banks is the magnetic ink characters
that are printed on the bottom of cheques before being sent to the account holder. The
magnetic ink is particularly easy for the computer to read and contains enough
information to identify the bank and the account at that bank. All of this can be done with
no further human intervention after the original printing of the cheque book. However,
the data that is written on the cheque by the customer (who it is made out to and for how
much) is not ready for input and hence requires some human intervention to make it
useable.

Manual data entry


The most obvious method of manual data entry is the form that has been designed to collect
data which then needs to be input to the computer. An operator reads the data on the form
and then types it into the computer via a keyboard. An extra stage has been added here: the
data has had to be typed in. In other words, the original data was not in a form acceptable
to the computer. Computer systems are available that will read individual characters and
input them without the data having to be transcribed. This would count as automatic data
collection (it is known as optical character recognition (OCR)).

Questions on this part of the syllabus will ask you to suggest suitable input methods for
particular situations, and to give advantages and disadvantages of particular forms of
data input in different situations.
1.9 (b) Methods of Image Capture

Scanner
A scanner is a device that shines a strong light at a source document and then reads the
intensity of the reflected light. The surface of the document is divided into small
rectangles, or pixels, and the light intensity of each pixel is measured and then reported
to the computer as a bit map. Scanners can be of different sizes. A typical example is an
A4 sized flat bed scanner, where the document is placed on a sheet of glass and then scanned
line by line. Another is a hand held scanner, which can be rolled across the image a number
of times, collecting a band each time. These bands of image can then be matched up by the
software to produce the complete document.
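
The way a scanner turns measured light intensities into a bit map can be sketched as
follows. This is a simplified illustration under stated assumptions: the readings are taken
to run from 0 (dark ink) to 255 (white paper), the threshold of 128 is arbitrary, and each
pixel is reduced to a single bit, whereas a real scanner would normally store several bits
per pixel.

```python
def to_bitmap(intensities, threshold=128):
    """Turn a grid of reflected-light readings into a 1-bit bit map.

    Readings at or above the threshold are treated as paper (0) and
    readings below it as ink (1). The threshold is an assumed value.
    """
    return [[0 if value >= threshold else 1 for value in row]
            for row in intensities]


# Hypothetical readings for a document 4 pixels wide and 3 pixels high.
readings = [
    [250, 240,  30, 245],
    [248,  25,  20, 250],
    [251, 249,  35, 247],
]

# Show the bit map with '#' for ink and '.' for paper.
for row in to_bitmap(readings):
    print("".join("#" if bit else "." for bit in row))
```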

Video capture card


A video picture is made up of a series of images which are changed approximately 25
times per second in order to fool the brain into thinking that the images are moving. A
video capture card is an interface board which fits into one of the expansion slots in a
computer and allows the processor to store the values of the screen pixels for a specific
picture. In other words, it allows the action to be frozen. A typical example of the use of
a video capture card is the market stall that uses a video camera to film a customer and
then selects one image to print onto a T shirt.

Digital camera
A digital camera works in a similar way to a film camera but does not store the image on
film. Instead, the image is stored electronically, enabling the user to download it into a
computer, manipulate it and print it out if desired.

Each of these image capture systems results in an electronic image being stored in the
computer system. Image manipulation software can then be used to alter or edit the image
in any way that is required. While this allows the user to use their imagination and to tidy
up pictures or crop them to miss out unwanted parts of the image, it also allows
unscrupulous people to produce pictures with very little foundation in reality. It used to
be said that “The camera never lies”. This is certainly no longer true: witness the film
Forrest Gump.
1.9 (c) Validation and Verification

When data is input to a computer system it is only valuable data if it is correct. If the data
is in error in any way then no amount of care in the programming will make up for the
erroneous data and the results produced can be expected to be unreliable. There are three
types of error that can occur with the data on entry. The first is that the data, while
reasonable, is wrong. If your birthday is written down on a data capture form as 18th of
November 1983, it will (except in very rare cases) be wrong. It can be typed into the
computer with the utmost care as 181183, it can be checked by the computer to make
sure that is a sensible date, and will then be accepted as your date of birth despite the fact
that it is wrong. There is no reason for the computer to suspect that it may be wrong;
quite simply, when you filled out the original form you made a mistake. The second type
of error is when the operator typing in the data hits the wrong key and types in 181193, or
the equivalent. In this case an error has been made that can be spotted if a
suitable check is made on the input. This type of data checking is called a verification
check. The third type of error is when something is typed in which simply is not sensible.
If the computer knows that there are only 12 months in a year then it will know that
181383 must be wrong because it is not sensible to be born in the thirteenth month.
Checks on the sensibility of the data are called validation checks.
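
The difference between the three types of error can be seen in a small Python sketch of a
validation check on a date typed in as six digits (day, month, year). The function name is
an assumption made for this example. Note that the check accepts both 181183 and 181193,
because both are sensible dates: validation alone cannot tell which one is correct.

```python
from datetime import datetime


def is_valid_date(text):
    """Return True if text is six digits that form a real date (DDMMYY)."""
    if len(text) != 6 or not text.isdigit():
        return False
    try:
        datetime.strptime(text, "%d%m%y")   # raises ValueError for an impossible date
    except ValueError:
        return False
    return True


for entry in ["181183", "181193", "181383"]:
    result = "accepted" if is_valid_date(entry) else "rejected"
    print(entry, result)
```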

Faulty data
There is very little that can be done about faulty data except to let the owner of the data
check it visually on a regular basis. The personal information kept on the school
administration system about you and your family may well be printed off at regular
intervals so that your parents can check to ensure that the stored information is still
correct.

Verification
Verification means checking the input data with the original data to make sure that there
have been no transcription errors. The standard way to do this is to input the data twice to
the computer system. The computer then checks the two sets of data (which should be the
same) and if there is a difference between the two sets of data the computer knows that
one of the inputs is wrong. It won’t know which one is wrong but it can now ask the
operator to check that particular input.
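
Verification by double entry can be sketched in Python as a comparison of the two sets of
keyed data. The function name and the sample dates are assumptions made for illustration.

```python
def verify(first_entry, second_entry):
    """Compare two keyings of the same batch of data.

    Returns the positions at which the two entries differ. An empty
    list means no difference was found.
    """
    if len(first_entry) != len(second_entry):
        raise ValueError("both entries must contain the same number of items")
    return [i for i, (a, b) in enumerate(zip(first_entry, second_entry))
            if a != b]


# Hypothetical batch of dates, keyed in twice by an operator.
first = ["181183", "050679", "230490"]
second = ["181193", "050679", "230490"]

mismatches = verify(first, second)
for i in mismatches:
    # The computer cannot tell which keying is wrong, only that they differ.
    print(f"Item {i + 1}: {first[i]} and {second[i]} differ, check the original")
```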

Validation
The first thing is to dispel a common misinterpretation of validation. Section 1.6.f
mentioned the checking of data, specifically the use of parity bits to check data. This is
NOT validation. Parity bits and echoing back are techniques that are used to check that
data has been transmitted properly within a computer system (e.g. from the disk drive to
the processor). Validation checks are used to check the input of data to the system in the
first place.
Validation is a check on DATA INPUT to the system by comparing the data input with a
set of rules that the computer has been told the data must follow. If the data does not
match up with the rules then there must be an error. There are many different types of
validation check that can be used to check input in different applications.
1. Range check. A mathematics exam is out of 100. A simple validation rule that the
computer can apply to any data that is input is that the mark must be between 0 and 100
inclusive. Consequently, a mark of 101 would be rejected by this check as being outside
the acceptable range.
2. Character check. A person’s name will consist of letters of the alphabet and sometimes
a hyphen or apostrophe. This rule can be applied to input of a person’s name so that
dav2d will immediately be rejected as unacceptable.
3. Format check. A particular application is set up to accept a national insurance number.
Each person has a unique national insurance number, but they all have the same format of
characters: 2 letters followed by 6 digits followed by a single letter. If the computer
knows this rule then it knows what the format of an NI number is and would reject
ABC12345Z because it is in the wrong format and so breaks the rule.
4. Length check. An NI number has 9 characters. If more or fewer than 9 characters are
keyed in then the data cannot be accurate.
5. Existence check. A bar code is read at a supermarket check out till. The code is sent to
the main computer, which will search for that code on the stock file. The stock file
contains details of all items held in stock, so a code that is not found there would mean
that the item does not exist. Since the item obviously does exist, the code must have been
wrongly read.
6. Check digit. When the code is read on the item at the supermarket, it consists of
numbers. One number is special: it is called the check digit. If the other numbers have
some arithmetic done to them using a simple algorithm, the answer should be this special
digit. When the code is read at the check out till, if the arithmetic does not give the
check digit then the code must have been read wrongly. The beeping sound that is normally
heard at the till signals that the code has passed this check.
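
The six checks above can each be written as a short rule in Python. This is a sketch under
stated assumptions: the function names and the contents of the stock file are invented for
the example, and the check digit calculation shown is the one used by 13-digit EAN bar
codes, chosen as one example of the kind of simple algorithm the text describes.

```python
import re

# Hypothetical stock file; the codes and descriptions are made up.
STOCK_FILE = {"5012345678900": "Baked beans", "4006381333931": "Pen"}


def range_check(mark):
    """A mark must be between 0 and 100 inclusive."""
    return 0 <= mark <= 100


def character_check(name):
    """A name may contain only letters, hyphens and apostrophes."""
    return re.fullmatch(r"[A-Za-z'-]+", name) is not None


def format_check(ni_number):
    """An NI number is 2 letters, then 6 digits, then 1 letter."""
    return re.fullmatch(r"[A-Z]{2}[0-9]{6}[A-Z]", ni_number) is not None


def length_check(ni_number):
    """An NI number has exactly 9 characters."""
    return len(ni_number) == 9


def existence_check(code):
    """The bar code must be present on the stock file."""
    return code in STOCK_FILE


def check_digit_ok(code):
    """EAN-13 rule (an assumed example): weight the first 12 digits
    1, 3, 1, 3, ... and compare the result with the 13th digit."""
    if len(code) != 13 or not code.isdigit():
        return False
    digits = [int(c) for c in code]
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


print(range_check(101))               # outside 0 to 100
print(character_check("dav2d"))       # contains a digit
print(format_check("ABC12345Z"))      # 3 letters, 5 digits, 1 letter
print(length_check("ABC12345Z"))      # 9 characters, so this check passes
print(existence_check("4006381333931"))
print(check_digit_ok("4006381333831"))  # one digit misread
```

ABC12345Z passes the length check but fails the format check, which is why several
validation checks are usually applied to the same item of data.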
1.9 (d) Output Formats

When data has been processed by a computer system it is necessary to report the results
of the processing. There are a number of different ways that the results can be reported to
the user.

Graphs
Graphs show trends very clearly. Different types of graph can illustrate different
characteristics, and when two variables need to be compared, a visual representation can
be very useful. However, the choice of scales is very important, because a badly chosen
scale can give a very misleading picture. Also, specific values are not easily read from a
graph; indeed, in a continuous distribution, it is simply not possible to take reliable
readings to any great degree of accuracy.

Reports
A report is a hard copy printout of the values of variables. This has the advantage of
producing the actual figures according to the values specified by the user. However,
skill may be needed to interpret the significance of the figures, and figures presented
without any context are often of little value.

Interactive presentations
The previous forms have relied on the format of the report being decided without the
luxury of being able to see what the figures look like in the first place. If the system
allows the user to decide the type and range of output required during the run then there
is some positive user involvement leading to an interactive presentation where the user
can adjust the output to suit their needs.

Sound
Many applications do not lend themselves to a standard, visual, printout. Sound can be
used for output from some systems. Obvious examples would be voice synthesis for
reporting to blind people and an alarm system to protect property against burglars.

Video
Video is a visually satisfying form of output that takes large amounts of memory to
produce because the nature of the medium requires large quantities of pictures to produce
the feel of continuous motion. Video is useful for demonstration of techniques where
there is little value in pages of instruction if a simple video can illustrate something
better.

Images
‘Images’ can refer to any of the forms of output mentioned when they are shown on a
monitor screen, as opposed to the hard copy produced from a printer.

Animations
Animations provide a good stimulus for an audience and can lead from one slide to another
in a slide based presentation. Animation takes considerably less processing power than
other forms of motion, unless the image being animated is complex. Animation is used so
often that it can come across as a boring technique that has just been added for ‘gloss’.
1.9 (e) Output According to Target Audience

Imagine an intensive care ward at a hospital. There are six beds, each with a patient who
is being monitored by a computer. The outputs are available for a variety of users. There
is a nurse at a desk at one end of the ward. The nurse has other duties, but is expected to
make the rounds of the patients to check on their progress at regular intervals. Doctors
come round the ward twice a day to check on the patients and make any adjustments to
their medication.
If a patient is sensed by the computer system to have suffered a relapse while the nurse is
sitting at the desk, a sensible output would be sound: some sort of alarm to alert the
nurse to the fact that something is wrong. This may be accompanied by a flashing light, or
some other device, to draw attention quickly to the patient who needs it.

When the nurse goes around the patients to make a visual check of their conditions, it is
not necessary to know exact figures for heart rate or blood sugar. A quick glance at a
screen showing a scrolling graph of the patient’s vital signs over the last 20 minutes will
be perfectly adequate. If the graph looks in any way abnormal, it may be necessary to get a
printout of the actual values of the variables for that patient to determine what action,
if any, needs to be taken.

The doctor may well want to see a printout of all the variable values for the last twenty
four hours, particularly if something is happening to the patient which is difficult to
understand, because such historical data can hold the clue to present symptoms. The doctor
may change the medication or the parameters within which the patient can be considered to
be stable. This will involve the nurse resetting values on the scales of the graphical
output, or even resetting the parameters for setting off the audible alarm, which means
that the nurse is using an interactive presentation with the system.

Once a week the nurse takes a first aid class at the local sixth form college. There are
too many students for a one to one presentation all the time, so the college computer
system has been loaded with demonstration software showing an animation of the technique
for artificial respiration.
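
The audible alarm and the resetting of its parameters described above can be sketched in
Python. Everything in this sketch is an assumption made for illustration: the function
name, the variable name and the limit values are invented and are not medical guidance.

```python
# Hypothetical limits within which a patient is considered stable.
# The numbers are invented for the example.
limits = {"heart_rate": (50, 120)}


def needs_alarm(variable, value, limits):
    """Return True if a reading is outside the limits set for it."""
    low, high = limits[variable]
    return not (low <= value <= high)


print(needs_alarm("heart_rate", 72, limits))    # inside the limits
print(needs_alarm("heart_rate", 135, limits))   # outside the limits

# The doctor changes the parameters and the nurse resets them on the
# system, which is the interactive part of the presentation.
limits["heart_rate"] = (50, 140)
print(needs_alarm("heart_rate", 135, limits))   # now inside the limits
```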

When considering output always consider the importance of timeliness and relevance.
Data tends to have a limited life span, which can be different for the same data in
different situations. The data on heart rate from 3 hours ago is not going to be of
importance to the nurse looking after the patient, but it may be of great value to the
doctor in providing a clue as to the reason for a sudden change in condition. Some data is
not relevant to particular situations, however up to date it is. The colour of the patient’s
eyes has no bearing on their physical state and consequently should not be considered
relevant to this example, although it may well be in other circumstances.
Example Questions

1. a) State two methods of data entry used by banks in their cheque system. (2)
b) Explain why banks find the use of your two examples suitable for this application. (4)
2. A small stall is to be opened, as part of a fairground, where the customer can have
their likeness printed on to the front of a sweatshirt. Describe two possible methods
of capturing the image to be printed. (4)
3. A mail order firm receives orders from customers on paper order forms. These are
keyed into the computer system by operators. The data that is to be keyed in
includes the 5 digit article number, the name of the customer and the date that the
order has been received.
a) Explain how the data input would be verified. (3)
b) Describe three different validation routines that could be performed on the data. (6)
4. A reaction vessel in a chemical plant is monitored, along with many others, by a
computer system using a number of sensors of different types. Describe three
different types of output that would be used by such a system, stating why such a
use would be necessary. (6)
5. Explain what is meant by the timeliness and relevance of data. (2)