
4345 Assembly Language

Strings and Arrays


Dr. Esam Al_Qaralleh CE Department Princess Sumaya University for Technology

String Instructions
A string is a sequence of bytes, words, or doublewords stored in consecutive memory locations; it can be up to 64 KB long.
String instructions have at most two operands: one is the source string and the other is the destination string.
The source string must be located in the data segment, and SI points to its current element (DS:SI).
The destination string must be located in the extra segment, and DI points to its current element (ES:DI).

Source String (DS:SI)

    Address      Byte   Char
    0510:0000    53     S
    0510:0001    48     H
    0510:0002    4F     O
    0510:0003    50     P
    0510:0004    50     P
    0510:0005    45     E
    0510:0006    52     R

Destination String (ES:DI)

    Address      Byte   Char
    02A8:2000    53     S
    02A8:2001    48     H
    02A8:2002    4F     O
    02A8:2003    50     P
    02A8:2004    50     P
    02A8:2005    49     I
    02A8:2006    4E     N

String Primitive Instructions


MOVSB, MOVSW, and MOVSD
CMPSB, CMPSW, and CMPSD
SCASB, SCASW, and SCASD
STOSB, STOSW, and STOSD
LODSB, LODSW, and LODSD

MOVSB, MOVSW, and MOVSD

(1 of 2)

The MOVSB, MOVSW, and MOVSD instructions copy data from the memory location pointed to by DS:SI to the memory location pointed to by ES:DI.

    .data
    source DWORD 0FFFFFFFFh
    target DWORD ?
    .code
    mov si,OFFSET source
    mov di,OFFSET target
    movsd

MOVSB, MOVSW, and MOVSD

(2 of 2)

SI and DI are automatically incremented or decremented:


MOVSB increments/decrements by 1
MOVSW increments/decrements by 2
MOVSD increments/decrements by 4

Direction Flag
Direction Flag (DF) is used to control the way SI and DI are adjusted during the execution of a string instruction
When DF = 0, SI and DI auto-increment during execution; when DF = 1, SI and DI auto-decrement.
Instruction to set DF: STD
Instruction to clear DF: CLD
Example:

    CLD
    MOV CX, 5
    REP MOVSB


At the beginning of execution, DS=0510H and SI=0000H

Source String (DS:SI); the last column shows where SI points as CX counts down:

    Address      Byte   Char   SI points here when
    0510:0000    53     S      CX=5
    0510:0001    48     H      CX=4
    0510:0002    4F     O      CX=3
    0510:0003    50     P      CX=2
    0510:0004    50     P      CX=1
    0510:0005    45     E      CX=0
    0510:0006    52     R

Repeat Prefix Instructions


REP String Instruction
The REP prefix makes the microprocessor execute the string instruction repeatedly until CX reaches 0. CX is decremented by one each time the string instruction executes. For example:

    MOV CX, 5
    REP MOVSB

With these two instructions, the microprocessor executes MOVSB 5 times.
Execution flow of REP MOVSB:

    While (CX != 0)
    {
        CX = CX - 1;
        MOVSB;
    }

OR

    Check_CX:
    If CX != 0 Then
        CX = CX - 1;
        MOVSB;
        goto Check_CX;
    end if
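The execution flow above, together with the direction flag, can be modelled in C. This is only a sketch under stated assumptions: the `StringRegs` struct and the function names are hypothetical, and a single flat byte array stands in for both the data segment and the extra segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register set for this sketch; not part of the slides. */
typedef struct {
    uint16_t si;  /* offset of the current source byte */
    uint16_t di;  /* offset of the current destination byte */
    uint16_t cx;  /* repeat counter */
    int df;       /* direction flag: 0 after CLD, 1 after STD */
} StringRegs;

/* One MOVSB: copy a byte, then step SI and DI by +1 (DF=0) or -1 (DF=1). */
void movsb(uint8_t *mem, StringRegs *r)
{
    mem[r->di] = mem[r->si];
    if (r->df == 0) {
        r->si++;
        r->di++;
    } else {
        r->si--;
        r->di--;
    }
}

/* REP MOVSB: repeat MOVSB, decrementing CX each time, until CX is 0. */
void rep_movsb(uint8_t *mem, StringRegs *r)
{
    assert(mem != NULL);
    while (r->cx != 0) {
        movsb(mem, r);
        r->cx--;
    }
}
```

Starting from SI = 0, DI = 8, CX = 5 and DF = 0 with "SHOPPER" at offset 0, the model copies "SHOPP" to offsets 8 through 12 and leaves CX = 0, SI = 5 and DI = 13.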

Using a Repeat Prefix


Example: Copy 20 doublewords from source to target

    .data
    source DWORD 20 DUP(?)
    target DWORD 20 DUP(?)
    .code
    cld                       ; direction = forward
    mov cx,LENGTHOF source    ; set REP counter
    mov si,OFFSET source
    mov di,OFFSET target
    rep movsd

String Instructions
MOVSB (MOVSW) Example
    MOV AX, 0510H
    MOV DS, AX
    MOV SI, 0
    MOV AX, 0300H
    MOV ES, AX
    MOV DI, 100H
    CLD
    MOV CX, 5
    REP MOVSB

Source String (DS:SI)

    Address      Byte   Char
    0510:0000    53     S
    0510:0001    48     H
    0510:0002    4F     O
    0510:0003    50     P
    0510:0004    50     P
    0510:0005    45     E
    0510:0006    52     R

Destination String (ES:DI) starts at 0300:0100

Repeat Prefix Instructions


REPZ String Instruction
Repeat the execution of the string instruction until CX=0 or zero flag is clear

REPNZ String Instruction


Repeat the execution of the string instruction until CX=0 or zero flag is set

REPE String Instruction


Repeat the execution of the string instruction until CX=0 or zero flag is clear

REPNE String Instruction


Repeat the execution of the string instruction until CX=0 or zero flag is set

Your turn . . .
Use MOVSD to delete the first element of the following double-word array. All subsequent array values must be moved one position forward toward the beginning of the array:
    array DWORD 1,1,2,3,4,5,6,7,8,9,10

Solution:

    .data
    array DWORD 1,1,2,3,4,5,6,7,8,9,10
    .code
    cld
    mov cx,(LENGTHOF array) - 1
    mov si,OFFSET array+4
    mov di,OFFSET array
    rep movsd

CMPSB, CMPSW, and CMPSD


The CMPSB, CMPSW, and CMPSD instructions each compare a memory operand pointed to by DS:SI to a memory operand pointed to by ES:DI.
CMPSB compares bytes
CMPSW compares words
CMPSD compares doublewords

A repeat prefix is often used:

REPE (REPZ)
REPNE (REPNZ)

String Instructions
CMPSB (CMPSW) Example

Assume:

    ES = 02A8H   DI = 2000H
    DS = 0510H   SI = 0000H

Source String (DS:SI)

    Address      Byte   Char
    0510:0000    53     S
    0510:0001    48     H
    0510:0002    4F     O
    0510:0003    50     P
    0510:0004    50     P
    0510:0005    45     E
    0510:0006    52     R

Destination String (ES:DI)

    Address      Byte   Char
    02A8:2000    53     S
    02A8:2001    48     H
    02A8:2002    4F     O
    02A8:2003    50     P
    02A8:2004    50     P
    02A8:2005    49     I
    02A8:2006    4E     N

    CLD
    MOV CX, 9
    REPZ CMPSB

What is the value of CX after the execution?
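One way to check the answer is to model REPZ CMPSB in C. The sketch below assumes DF = 0 and uses offsets relative to the start of each string; the `CompareRegs` struct and the function name are hypothetical. The model stops as soon as a compared pair differs or CX reaches 0, which is the behaviour described for REPE/REPZ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register set for this sketch. */
typedef struct {
    uint16_t si;  /* index of the current source byte */
    uint16_t di;  /* index of the current destination byte */
    uint16_t cx;  /* repeat counter */
    int zf;       /* zero flag: 1 when the last pair compared equal */
} CompareRegs;

/* REPE CMPSB with DF = 0: compare byte pairs while they are equal. */
void repe_cmpsb(const uint8_t *src, const uint8_t *dst, CompareRegs *r)
{
    assert(src != NULL && dst != NULL);
    while (r->cx != 0) {
        r->zf = (src[r->si] == dst[r->di]);  /* CMPSB sets ZF on equality */
        r->si++;
        r->di++;
        r->cx--;
        if (!r->zf)
            break;                           /* REPE stops at a mismatch */
    }
}
```

For "SHOPPER" against "SHOPPIN" with CX = 9, five pairs match and the sixth ('E' against 'I') differs, so six comparisons run and the model finishes with CX = 3, ZF = 0, and both indexes advanced by 6.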

Comparing Arrays
Use a REPE (repeat while equal) prefix to compare corresponding elements of two arrays.
    .data
    source DWORD COUNT DUP(?)
    target DWORD COUNT DUP(?)
    .code
    mov ecx,COUNT             ; repetition count
    mov esi,OFFSET source
    mov edi,OFFSET target
    cld                       ; direction = forward
    repe cmpsd                ; repeat while equal

Example: Comparing Two Strings

(1 of 3)

This program compares two strings (source and destination). It displays a message indicating whether the lexical value of the source string is less than that of the destination string.

    .data
    source BYTE "MARTIN "
    dest   BYTE "MARTINEZ"
    str1   BYTE "Source is smaller",0dh,0ah,0
    str2   BYTE "Source is not smaller",0dh,0ah,0

Screen output:

    Source is smaller

Example: Comparing Two Strings

(2 of 3)

    .code
    main PROC
        cld                       ; direction = forward
        mov si,OFFSET source
        mov di,OFFSET dest
        mov cx,LENGTHOF source
        repe cmpsb
        jb source_smaller
        mov dx,OFFSET str2        ; "source is not smaller"
        jmp done
    source_smaller:
        mov dx,OFFSET str1        ; "source is smaller"
    done:
        call WriteString
        exit
    main ENDP
    END main

Example: Comparing Two Strings

(3 of 3)

The following diagram shows the final values of SI and DI after comparing the strings:
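Before: SI points to the first character of source ('M') and DI points to the first character of dest ('M').
After: the comparison stops at the first pair that differs (the space in source against the 'E' in dest), so SI and DI each point one position past that pair; DI points to the 'Z' in dest.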

Your turn . . .
Modify the String Comparison program from the previous two slides. Prompt the user for both the source and destination strings. Sample output:
    Input first string: ABCDEFG
    Input second string: ABCDDG
    The first string is not smaller.

SCASB, SCASW, and SCASD


The SCASB, SCASW, and SCASD instructions compare a value in AL/AX/EAX to a byte, word, or doubleword, respectively, addressed by ES:DI.
Useful types of searches:
Search for a specific element in a long string or array.
Search for the first element that does not match a given value.

SCASB Example
Search for the letter 'F' in a string named alpha:
    .data
    alpha BYTE "ABCDEFGH",0
    .code
    mov di,OFFSET alpha
    mov al,'F'                ; search for 'F'
    mov cx,LENGTHOF alpha
    cld
    repne scasb               ; repeat while not equal
    jnz quit
    dec di                    ; DI points to 'F'

What is the purpose of the JNZ instruction?
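A C model of the search helps answer this. The sketch below assumes DF = 0; `ScanRegs`, `repne_scasb` and `find_byte` are hypothetical names, and the model starts with ZF = 0, whereas the real instruction leaves ZF unchanged when CX is already 0. The scan ends either on a match (ZF = 1) or because CX ran out (ZF = 0), so the result has to be decided from ZF, which is what JNZ does before DI is backed up by one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register set for this sketch. */
typedef struct {
    uint16_t di;  /* index of the next byte to scan */
    uint16_t cx;  /* repeat counter */
    int zf;       /* zero flag: 1 when the last byte matched AL */
} ScanRegs;

/* REPNE SCASB with DF = 0: scan while bytes differ from AL. */
void repne_scasb(const uint8_t *mem, uint8_t al, ScanRegs *r)
{
    assert(mem != NULL);
    while (r->cx != 0) {
        r->zf = (mem[r->di] == al);  /* SCASB compares AL with the byte at DI */
        r->di++;
        r->cx--;
        if (r->zf)
            break;                   /* REPNE stops at the first match */
    }
}

/* Returns the index of the first match, or -1 when nothing matched. */
long find_byte(const uint8_t *mem, uint16_t count, uint8_t al)
{
    ScanRegs r = { 0, count, 0 };
    repne_scasb(mem, al, &r);
    if (!r.zf)
        return -1;                   /* the "jnz quit" path: not found */
    return (long)r.di - 1;           /* the "dec di" step: back up to the match */
}
```

Scanning the nine bytes of "ABCDEFGH",0 finds 'F' at index 5. A match on the last byte also leaves CX = 0, so testing CX instead of ZF would misreport that case as a failure.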

STOSB, STOSW, and STOSD


The STOSB, STOSW, and STOSD instructions store the contents of AL/AX/EAX, respectively, in memory at the offset pointed to by ES:DI.
Example: fill an array with 0FFh

    .data
    Count = 100
    string1 BYTE Count DUP(?)
    .code
    mov al,0FFh               ; value to be stored
    mov di,OFFSET string1     ; ES:DI points to target
    mov cx,Count              ; character count
    cld                       ; direction = forward
    rep stosb                 ; fill with contents of AL

LODSB, LODSW, and LODSD


LODSB, LODSW, and LODSD load a byte, word, or doubleword from memory at DS:SI into AL/AX/EAX, respectively.
Example:

    .data
    array BYTE 1,2,3,4,5,6,7,8,9
    .code
    mov si,OFFSET array
    mov cx,LENGTHOF array
    cld
    L1: lodsb                 ; load byte into AL
    or al,30h                 ; convert to ASCII
    call WriteChar            ; display it
    loop L1

Array Multiplication Example


Multiply each element of a doubleword array by a constant value.
    .data
    array DWORD 1,2,3,4,5,6,7,8,9,10
    multiplier DWORD 10
    .code
    cld                       ; direction = forward
    mov si,OFFSET array       ; source index
    mov di,si                 ; destination index
    mov cx,LENGTHOF array     ; loop counter
    L1: lodsd                 ; copy [SI] into EAX
    mul multiplier            ; multiply by a value
    stosd                     ; store EAX at [DI]
    loop L1

Arrays

Arrays
One-Dimensional Arrays

An array declaration in a HLL (such as C)

    int test_marks[10];

specifies a lot of information about the array:

Name of the array (test_marks)
Number of elements (10)
Element size (2 bytes, assuming a 16-bit int)
Interpretation of each element (int, i.e., signed integer)
Index range (0 to 9 in C)

You get very little help in assembly language!

Arrays (contd)
In assembly language, declaration such as
test_marks DW 10 DUP (?)

only assigns name and allocates storage space.


You, as the assembly language programmer, have to access the array elements properly, taking into account the element size and the range of subscripts.

Accessing an array element requires its displacement or offset relative to the start of the array in bytes

Arrays (contd)
To compute displacement, we need to know how the array is laid out
Simple for 1-D arrays

Assuming C style subscripts


displacement = subscript * element size in bytes

Multidimensional Arrays
We focus on two-dimensional arrays
Our discussion can be generalized to higher dimensions

A 5x3 array can be declared in C as


int class_marks[5][3];

Two dimensional arrays can be stored in one of two ways:


Row-major order
    The array is stored row by row
    Most HLLs, including C and Pascal, use this method

Column-major order
    The array is stored column by column
    FORTRAN uses this method


Multidimensional Arrays (contd)


Why do we need to know the underlying storage representation?
In a HLL, we really don't need to know
In assembly language, we need this information because we have to calculate the displacement of the element to be accessed

In assembly language,
class_marks DW 5*3 DUP (?)

allocates 30 bytes of storage
There is no support for using row and column subscripts
We need to translate these subscripts into a displacement value

Multidimensional Arrays (contd)


Assuming C language subscript convention, we can express displacement of an element in a 2-D array at row i and column j as
displacement = (i * COLUMNS + j) * ELEMENT_SIZE

where
COLUMNS = number of columns in the array
ELEMENT_SIZE = element size in bytes

Example: the displacement of the class_marks[3][1] element is (3*3 + 1) * 2 = 20
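The row-major formula can be written as a small C helper and checked against the class_marks example. The function names are hypothetical, and the column-major variant is included only for contrast with the FORTRAN-style layout; it is a sketch, not something taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major: whole rows are stored one after another. */
size_t row_major_displacement(size_t i, size_t j,
                              size_t columns, size_t element_size)
{
    assert(j < columns);
    return (i * columns + j) * element_size;
}

/* Column-major: whole columns are stored one after another. */
size_t column_major_displacement(size_t i, size_t j,
                                 size_t rows, size_t element_size)
{
    assert(i < rows);
    return (j * rows + i) * element_size;
}
```

For the 5x3 word array, `row_major_displacement(3, 1, 3, 2)` gives 20, matching the example, while the same element under column-major storage would sit at `column_major_displacement(3, 1, 5, 2)`, which is 16.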

Examples

Reverse an array
Reverse an array of N byte elements. BX holds the number of elements N, and SI points to the array.

          MOV DI, SI        ; DI = SI
          DEC BX            ; BX = N - 1
          ADD DI, BX        ; DI points to the last element
          INC BX            ; BX = N
          SHR BX, 1         ; BX = N/2
          JZ DONE           ; nothing to exchange when N < 2
    SWAP: MOV AL, [SI]      ; exchange [SI] and [DI] through AL
          XCHG AL, [DI]
          MOV [SI], AL
          INC SI            ; move toward the middle from both ends
          DEC DI
          DEC BX
          JNZ SWAP
    DONE:

TWO-DIMENSIONAL ARRAY EXAMPLE


Suppose A is a 5x7 word array:
1) clear row 3
2) clear column 4

1) The starting address of row 3 is A + [(3-1) x 7] x 2 = A + 28

           MOV BX, 28        ; BX indexes row 3
           XOR SI, SI        ; SI will index the columns
           MOV CX, 7         ; number of elements
    CLEAR: MOV A[BX][SI], 0
           ADD SI, 2
           DEC CX
           JNZ CLEAR

TWO-DIMENSIONAL ARRAY EXAMPLE


Suppose A is a 5x7 word array:
1) clear row 3
2) clear column 4

2) The starting address of column 4 is A + (4-1) x 2 = A + 6

           XOR BX, BX        ; BX indexes rows
           MOV SI, 6         ; SI will index column 4
           MOV CX, 5         ; number of elements
    CLEAR: MOV A[BX][SI], 0
           ADD BX, 14        ; go to next row (7 elements * 2 bytes)
           DEC CX
           JNZ CLEAR

String Copy Example


Copy string1 into string2
    String1 DB 'HELLO'
    String2 DB 5 DUP(0)

    LEA SI, STRING1
    LEA DI, STRING2
    MOV CX, 5
    CLD
    REP MOVSB

String Copy Example


Copy string1 into string2 in reverse order
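    String1 DB 'HELLO'
    String2 DB 5 DUP(0)

          LEA SI, STRING1+4   ; SI points to the last character of String1
          LEA DI, STRING2     ; DI points to the first byte of String2
          STD                 ; DF = 1, so MOVSB decrements SI and DI
          MOV CX, 5
    MOVE: MOVSB
          ADD DI, 2           ; net effect: DI advances by 1
          DEC CX
          JNZ MOVE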
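The loop built from MOVSB followed by ADD DI,2 reverses the string only when the direction flag is set (STD): MOVSB then decrements both SI and DI, and adding 2 leaves DI one byte further on. The C sketch below models that loop for either flag setting; the function name is hypothetical, and a flat byte array stands in for memory with String1 at offset 0 and String2 at offset 5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models: MOVE: MOVSB / ADD DI,2 / DEC CX / JNZ MOVE */
void movsb_add2_loop(uint8_t *mem, uint16_t si, uint16_t di,
                     uint16_t cx, int df)
{
    assert(mem != NULL);
    while (cx != 0) {
        mem[di] = mem[si];  /* MOVSB copies one byte */
        if (df) {           /* DF = 1: both indexes step backward */
            si--;
            di--;
        } else {            /* DF = 0: both indexes step forward */
            si++;
            di++;
        }
        di += 2;            /* ADD DI,2 */
        cx--;               /* DEC CX */
    }
}
```

With DF = 1 the model writes "OLLEH" to offsets 5 through 9. With DF = 0 it reads past the end of the source and writes every third byte, so the copy is not a reversal.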

Two-Dimensional Table Example


Imagine a table with three rows and five columns. The data can be arranged in any format on the page:
    table BYTE 10h, 20h, 30h, 40h, 50h
          BYTE 60h, 70h, 80h, 90h, 0A0h
          BYTE 0B0h, 0C0h, 0D0h, 0E0h, 0F0h
    NumCols = 5

Alternative format:

    table BYTE 10h,20h,30h,40h,50h,60h,70h,
               80h,90h,0A0h,
               0B0h,0C0h,0D0h,
               0E0h,0F0h
    NumCols = 5

Two-Dimensional Table Example


The following code loads the table element stored in row 1, column 2:
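    RowNumber = 1
    ColumnNumber = 2
    mov bx,NumCols * RowNumber
    mov si,ColumnNumber
    mov al,table[bx + si]

[Figure: the table bytes 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0 laid out in memory. table is at offset 150, table[bx] is at offset 155 (start of row 1), and table[bx + si] is at offset 157 (the byte 80h).]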
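The same lookup can be sketched in C, with local variables named after the registers. `NUM_ROWS`, `NUM_COLS` and `table_at` are hypothetical names introduced for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_ROWS = 3, NUM_COLS = 5 };

/* The table from the slide, stored row by row. */
static const uint8_t table[NUM_ROWS * NUM_COLS] = {
    0x10, 0x20, 0x30, 0x40, 0x50,
    0x60, 0x70, 0x80, 0x90, 0xA0,
    0xB0, 0xC0, 0xD0, 0xE0, 0xF0
};

/* Base-index lookup: the row offset plays the role of BX, the column of SI. */
uint8_t table_at(size_t row, size_t col)
{
    assert(row < (size_t)NUM_ROWS && col < (size_t)NUM_COLS);
    size_t bx = (size_t)NUM_COLS * row;  /* offset of the start of the row */
    size_t si = col;                     /* offset within the row */
    return table[bx + si];
}
```

`table_at(1, 2)` returns 80h, the byte that the assembly code loads into AL.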

Searching and Sorting Integer Arrays


Bubble Sort
A simple sorting algorithm that works well for small arrays

Binary Search
A simple searching algorithm that works well for large arrays of values that have been placed in either ascending or descending order

Bubble Sort

Each pair of adjacent values is compared, and exchanged if the values are not ordered correctly:

One Pass (Bubble Sort)

    3 1 7 5 2 9 4 3    start
    1 3 7 5 2 9 4 3    3 and 1 exchanged
    1 3 7 5 2 9 4 3    3 and 7 already in order
    1 3 5 7 2 9 4 3    7 and 5 exchanged
    1 3 5 2 7 9 4 3    7 and 2 exchanged
    1 3 5 2 7 9 4 3    7 and 9 already in order
    1 3 5 2 7 4 9 3    9 and 4 exchanged
    1 3 5 2 7 4 3 9    9 and 3 exchanged

Bubble Sort Pseudocode


N = array size, cx1 = outer loop counter, cx2 = inner loop counter:
    cx1 = N - 1
    while( cx1 > 0 )
    {
        si = addr(array)
        cx2 = cx1
        while( cx2 > 0 )
        {
            if( array[si] > array[si+4] )
                exchange( array[si], array[si+4] )
            add si,4
            dec cx2
        }
        dec cx1
    }

Bubble Sort Implementation


    BubbleSort PROC USES eax ecx esi,
        pArray:PTR DWORD,         ; pointer to array
        Count:DWORD               ; array size

        mov ecx,Count
        dec ecx                   ; decrement count by 1
    L1: push ecx                  ; save outer loop count
        mov esi,pArray            ; point to first value
    L2: mov eax,[esi]             ; get array value
        cmp [esi+4],eax           ; compare a pair of values
        jge L3                    ; if [esi] <= [esi+4], skip
        xchg eax,[esi+4]          ; else exchange the pair
        mov [esi],eax
    L3: add esi,4                 ; move the pointer forward
        loop L2                   ; inner loop
        pop ecx                   ; retrieve outer loop count
        loop L1                   ; else repeat outer loop
    L4: ret
    BubbleSort ENDP
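For comparison, here is a C sketch of the same algorithm, sorting in ascending order as the one-pass illustration does. The function names are hypothetical, and the guard for arrays with fewer than two elements is an addition that the assembly version does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pass: compare `pairs` adjacent pairs, exchanging those out of order. */
void bubble_pass(int32_t *array, size_t pairs)
{
    assert(array != NULL);
    for (size_t i = 0; i < pairs; i++) {
        if (array[i] > array[i + 1]) {
            int32_t tmp = array[i];
            array[i] = array[i + 1];
            array[i + 1] = tmp;
        }
    }
}

/* Full sort: each pass needs one fewer comparison than the previous one. */
void bubble_sort(int32_t *array, size_t n)
{
    if (n < 2)
        return;  /* nothing to sort */
    for (size_t cx1 = n - 1; cx1 > 0; cx1--)
        bubble_pass(array, cx1);
}
```

A single `bubble_pass` over 3 1 7 5 2 9 4 3 with 7 pairs leaves 1 3 5 2 7 4 3 9, with the largest value moved to the end; `bubble_sort` then finishes with 1 2 3 3 4 5 7 9.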

Binary Search
Searching algorithm, well suited to large ordered data sets
Divide-and-conquer strategy
Each "guess" divides the list in half
Classified as an O(log n) algorithm:
The number of comparisons grows with the logarithm of the array size, so doubling the number of elements adds only about one more comparison.

Binary Search Estimates
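Such estimates can be computed rather than memorized. The sketch below counts how many times a binary search loop can run in the worst case, assuming each iteration discards at least half of the remaining elements; the function name is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Worst-case number of loop iterations for a binary search over
   `count` sorted elements: halve the range until nothing is left. */
unsigned worst_case_probes(size_t count)
{
    unsigned probes = 0;
    while (count > 0) {
        count /= 2;  /* each probe leaves at most half of the elements */
        probes++;
    }
    assert(probes <= sizeof(size_t) * CHAR_BIT);  /* sanity bound */
    return probes;
}
```

For example, 64 elements need at most 7 iterations, 1,024 need at most 11, and 1,000,000 need at most 20.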

Binary Search Pseudocode


    int BinSearch( int values[], const int searchVal, int count )
    {
        int first = 0;
        int last = count - 1;
        while( first <= last )
        {
            int mid = (last + first) / 2;
            if( values[mid] < searchVal )
                first = mid + 1;
            else if( values[mid] > searchVal )
                last = mid - 1;
            else
                return mid;    // success
        }
        return -1;             // not found
    }

Binary Search Implementation (1 of 3)


    BinarySearch PROC uses ebx edx esi edi,
        pArray:PTR DWORD,         ; pointer to array
        Count:DWORD,              ; array size
        searchVal:DWORD           ; search value
    LOCAL first:DWORD,            ; first position
        last:DWORD,               ; last position
        mid:DWORD                 ; midpoint

        mov first,0               ; first = 0
        mov eax,Count             ; last = (count - 1)
        dec eax
        mov last,eax
        mov edi,searchVal         ; EDI = searchVal
        mov ebx,pArray            ; EBX points to the array

    L1:                           ; while first <= last
        mov eax,first
        cmp eax,last
        jg L5                     ; exit search

Binary Search Implementation (2 of 3)


    ; mid = (last + first) / 2
        mov eax,last
        add eax,first
        shr eax,1
        mov mid,eax

    ; EDX = values[mid]
        mov esi,mid
        shl esi,2                 ; scale mid value by 4
        mov edx,[ebx+esi]         ; EDX = values[mid] (base-index addressing)

    ; if ( EDX < searchval(EDI) )
    ;     first = mid + 1;
        cmp edx,edi
        jge L2
        mov eax,mid               ; first = mid + 1
        inc eax
        mov first,eax
        jmp L4                    ; continue the loop

Binary Search Implementation (3 of 3)


    ; else if( EDX > searchVal(EDI) )
    ;     last = mid - 1;
    L2: cmp edx,edi               ; (could be removed)
        jle L3
        mov eax,mid               ; last = mid - 1
        dec eax
        mov last,eax
        jmp L4                    ; continue the loop

    ; else return mid
    L3: mov eax,mid               ; value found
        jmp L9                    ; return (mid)

    L4: jmp L1                    ; continue the loop
    L5: mov eax,-1                ; search failed
    L9: ret
    BinarySearch ENDP

Summary
String primitives are optimized for efficiency
Strings and arrays are essentially the same
Keep code inside loops simple
Use base-index operands with two-dimensional arrays
Avoid the bubble sort for large arrays
Use binary search for large sequentially ordered arrays
