Theory:
LEX
Lex is a computer program that generates lexical analyzers ("scanners" or "lexers"). Lex is
commonly used with the YACC parser generator. Lex, originally written by Mike Lesk and
Eric Schmidt, became the standard lexical analyzer generator on many Unix systems, and a
tool exhibiting its behaviour is specified as part of the POSIX standard.
Lex is a program generator designed for lexical processing of character input streams. It
accepts a high-level, problem oriented specification for character string matching, and
produces a program in a general purpose language which recognizes regular expressions. The
regular expressions are specified by the user in the source specifications given to Lex. The
Lex written code recognizes these expressions in an input stream and partitions the input
stream into strings matching the expressions. The Lex source file associates the regular
expressions and the program fragments. As each expression appears in the input to the
program written by Lex, the corresponding fragment is executed.
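The pattern/fragment association can be illustrated with a small hand-written scanner. This is only a sketch of the idea, not code produced by Lex: the function `scan`, the struct `Counts`, and the three patterns are hypothetical names and rules chosen for illustration. Each branch plays the role of one rule, and the counter increment stands in for the user's program fragment.

```c
#include <ctype.h>

/* Hypothetical result record: one counter per "rule". */
typedef struct {
    int identifiers;   /* matches of [A-Za-z_][A-Za-z0-9_]* */
    int integers;      /* matches of [0-9]+ */
    int others;        /* any other non-blank character */
} Counts;

/* Partition the input into matching strings and run the "action" for each. */
void scan(const char *p, Counts *c)
{
    c->identifiers = c->integers = c->others = 0;
    while (*p != '\0') {
        unsigned char ch = (unsigned char)*p;
        if (isalpha(ch) || ch == '_') {
            /* consume the longest identifier starting here */
            while (isalnum((unsigned char)*p) || *p == '_') p++;
            c->identifiers++;              /* action for the identifier rule */
        } else if (isdigit(ch)) {
            /* consume the longest run of digits */
            while (isdigit((unsigned char)*p)) p++;
            c->integers++;                 /* action for the integer rule */
        } else if (isspace(ch)) {
            p++;                           /* white space: no action */
        } else {
            p++;
            c->others++;                   /* action for the catch-all rule */
        }
    }
}
```

Calling `scan("x1 = 42 + y;", &c)` splits the text into the identifiers `x1` and `y`, the integer `42`, and three other characters. A Lex-generated scanner reaches the same kind of partition from a table-driven automaton built from the regular expressions rather than from hand-written branches.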
Lex is not a complete language, but rather a generator representing a new language feature
which can be added to different programming languages, called "host languages." Just as
general purpose languages can produce code to run on different computer hardware, Lex can
write code in different host languages. The host language is used for the output code
generated by Lex and also for the program fragments added by the user. Compatible run-time
libraries for the different host languages are also provided. This makes Lex adaptable to
different environments and different users. Each application may be directed to the
combination of hardware and host language appropriate to the task, the user's background,
and the properties of local implementations. At present, the only supported host language is
C. Lex itself exists on UNIX, GCOS, and OS/370; but the code generated by Lex may be
taken anywhere the appropriate compilers exist.
Lex reads an input stream specifying the lexical analyzer and outputs source code
implementing the lexer in C. A Lex source file has three sections:
The definition section defines macros and imports header files written in C. It is also
possible to write any C code here, which will be copied verbatim into the generated
source file.
The rules section associates regular expression patterns with C statements. When the
lexer sees text in the input matching a given pattern, it will execute the associated C
code.
The C code section contains C statements and functions that are copied verbatim to
the generated source file. These statements presumably contain code called by the
rules in the rules section. In large programs it is more convenient to place this code in
a separate file linked in at compile time.
Lex and parser generators, such as YACC or Bison, are commonly used together. Parser
generators use a formal grammar to parse an input stream, something which Lex cannot do
using simple regular expressions (Lex is limited to simple finite state automata).
It is typically preferable to have a (YACC-generated, say) parser be fed a token-stream as
input, rather than having it consume the input character-stream directly. Lex is often used to
produce such a token-stream.
Conclusion: Hence we studied LEX.
YACC
YACC stands for "Yet Another Compiler Compiler". It is a tool that translates a grammar
describing a language into a parser for that language. YACC provides a general tool for
imposing structure on the input to a computer program.
The YACC user prepares a specification of the input process; this includes rules describing
the input structure, code to be invoked when these rules are recognized, and a low-level
routine to do the basic input. YACC then generates a function to control the input process.
This function, called a parser, calls the user-supplied low-level input routine (the lexical
analyzer) to pick up the basic items (called tokens) from the input stream. These tokens are
organized according to the input structure rules, called grammar rules; when one of these
rules has been recognized, then user code supplied for this rule, an action, is invoked; actions
have the ability to return values and make use of the values of other actions.
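The interplay between the lexical analyzer, the grammar rules, and the actions can be sketched by hand for one tiny grammar. The code below is an illustration under stated assumptions, not the table-driven LALR parser that YACC generates: the grammar (`expr : NUMBER | expr '+' NUMBER`), the token code 258, the symbol code `EXPR`, and the function `parse_sum` are all chosen for this example. The names `yylex` and `yylval` follow the usual YACC convention for the scanner routine and the variable that carries a token's value.

```c
#include <ctype.h>
#include <stdlib.h>

enum { NUMBER = 258, EXPR = 259 };   /* hypothetical codes above the char range */

static const char *input;            /* text being scanned */
static int yylval;                   /* value of the most recent NUMBER token */

/* Low-level input routine: returns the next token code, 0 at end of input. */
static int yylex(void)
{
    while (*input == ' ') input++;
    if (*input == '\0') return 0;
    if (isdigit((unsigned char)*input)) {
        char *end;
        yylval = (int)strtol(input, &end, 10);
        input = end;
        return NUMBER;
    }
    return (unsigned char)*input++;  /* single-character token such as '+' */
}

/* Hand-coded shift-reduce loop; returns 0 on success, -1 on a syntax error. */
int parse_sum(const char *text, int *result)
{
    int sym[3];   /* symbol stack: this grammar never needs more than 3 entries */
    int val[3];   /* parallel stack of semantic values */
    int top = 0;
    int tok;

    input = text;
    tok = yylex();
    for (;;) {
        if (top == 1 && sym[0] == NUMBER) {
            sym[0] = EXPR;                 /* reduce: expr : NUMBER  { $$ = $1; } */
        } else if (top == 3 && sym[2] == NUMBER) {
            val[0] = val[0] + val[2];      /* reduce: expr : expr '+' NUMBER  { $$ = $1 + $3; } */
            top = 1;
        } else if (top == 1 && tok == 0) {
            *result = val[0];              /* accept: a single expr and no more input */
            return 0;
        } else if ((top == 0 && tok == NUMBER) ||
                   (top == 1 && tok == '+') ||
                   (top == 2 && tok == NUMBER)) {
            sym[top] = tok;                /* shift the token and its value */
            val[top] = yylval;
            top++;
            tok = yylex();
        } else {
            return -1;                     /* no rule applies: syntax error */
        }
    }
}
```

With `int r;`, the call `parse_sum("1 + 2 + 39", &r)` returns 0 and stores 42 in `r`. The second reduction shows an action using the values produced by earlier actions, which is the role `$1` and `$3` play in a real YACC rule.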
YACC is a parser generator. It is to parsers what Lex is to scanners. YACC is probably the
most common of the LALR tools.
A YACC grammar is written in a notation similar to Backus-Naur Form (BNF), and a YACC file has the suffix .y.
There are four steps involved in creating a compiler with YACC:
1. Specify the grammar:
a. Write the grammar in a .y file.
b. Write a lexical analyzer to process input and pass tokens to the parser.
c. Write a function that starts parsing by calling yyparse().
d. Write error handling routines.
2. Generate a parser by running YACC over the grammar file.
3. Compile the code produced by YACC as well as any other relevant source files.
4. Link the object files to appropriate libraries for the executable parser.
Definitions section: All code between %{ and %} is copied to the C file. The
definitions section is where we configure various parser features such as defining
token codes, establishing operator precedence and associativity, and setting up the
global variable used to communicate between the scanner and the parser.
Rules section: the required productions section is where we specify the grammar
rules.
Additional C Code: It is used for ordinary C code that we want copied verbatim to the
generated C file.
Applications of YACC
YACC has been extensively used in numerous practical applications, including lint, the
Portable C Compiler, and a system for typesetting mathematics.
JavaCC
JavaCC (Java Compiler Compiler) is an open source parser generator and lexical
analyzer generator for the Java programming language. JavaCC is similar to yacc in that it
generates a parser from a formal grammar written in EBNF (Extended Backus-Naur
Form) notation, except the output is Java source code. Unlike yacc, however, JavaCC
generates top-down parsers, which limits it to the LL(k) class of grammars (in particular, left
recursion cannot be used). JavaCC also generates lexical analyzers in a fashion similar to lex.
The tree builder that accompanies it, JJTree, constructs its trees from the bottom up. JavaCC
is licensed under a BSD license.
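The left-recursion restriction is independent of the output language, so it can be sketched in C. A top-down parser for the left-recursive rule `diff -> diff '-' DIGIT | DIGIT` would call itself before consuming any input and never terminate. The usual fix is to rewrite the rule in the EBNF form `diff -> DIGIT ( '-' DIGIT )*`, which a top-down parser can handle with a loop. The grammar and the function `parse_diff` are assumptions made for this example and are not taken from JavaCC.

```c
#include <ctype.h>

/* Top-down parser for the rewritten rule  diff -> DIGIT ( '-' DIGIT )*
   Returns 0 and stores the value on success, -1 on a syntax error. */
int parse_diff(const char *s, int *result)
{
    int value;

    if (!isdigit((unsigned char)*s)) return -1;
    value = *s++ - '0';                  /* first DIGIT */

    while (*s == '-') {                  /* zero or more  '-' DIGIT  */
        s++;
        if (!isdigit((unsigned char)*s)) return -1;
        value -= *s++ - '0';             /* accumulate left to right: (9-3)-2 */
    }

    if (*s != '\0') return -1;           /* trailing input is an error */
    *result = value;
    return 0;
}
```

Accumulating inside the loop keeps subtraction left-associative, so `"9-3-2"` evaluates to 4, the same result the left-recursive rule describes.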
JavaCC is among the most popular parser generators for use with Java applications. A parser
generator is a tool that reads a grammar specification and converts it to a Java program that
can recognize matches to the grammar. In addition to the parser generator itself, JavaCC
provides other standard capabilities related to parser generation such as tree building.
JavaCC works with any Java VM version 1.2 or higher. It has been certified to be 100% pure
Java. JavaCC has been tested on many different platforms without any special porting
requirements. A Java compiler is a compiler for the Java programming language. The most
common form of output from a Java compiler is Java class files containing platform-neutral
Java bytecode. There are also compilers that generate optimized native machine code for a
particular hardware/operating system combination.
Tokens in the grammar file follow the same conventions as for the Java programming
language. Hence identifiers, strings, characters etc. used in the grammars are the same as Java
identifiers, Java strings, Java characters etc.
White space in the grammar files also follows the same conventions as for the Java
programming language. This includes the syntax for comments. Most comments present in
the grammar files are copied into the generated parser/lexical analyzer. Grammar files are
pre-processed for Unicode escapes just as Java files are.
List of software built using JavaCC:
Apache Derby
BeanShell
FreeMarker
PMD
Debugger
A debugger or debugging tool is a computer program that is used to test and debug other
programs (the "target" program). The code to be examined might alternatively be running on
an instruction set simulator (ISS), a technique that allows great power in its ability to halt
when specific conditions are encountered but which will typically be somewhat slower than
executing the code directly on the appropriate (or the same) processor. Some debuggers offer
two modes of operation (full or partial simulation) to limit this impact.
When the program "crashes" or reaches a preset condition, the debugger typically shows the
location in the original code if it is a source-level debugger or symbolic debugger, commonly
now seen in integrated development environments.
If it is a low-level debugger or a machine-language debugger it shows the line in the
disassembly (unless it also has online access to the original source code and can display the
appropriate section of code from the assembly or compilation).
Features of Debuggers:
Typically, debuggers also offer more sophisticated functions such as running a program step
by step (single-stepping or program animation), stopping (breaking) (pausing the program to
examine the current state) at some event or specified instruction by means of a breakpoint,
and tracking the values of variables. Some debuggers have the ability to modify program
state while it is running. It may also be possible to continue execution at a different location
in the program to bypass a crash or logical error.
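Single-stepping and breakpoints can be sketched on a toy instruction set simulator. Everything here is hypothetical and invented for illustration: the three-instruction machine, the `Machine` and `Instr` types, and the functions `step` and `run_to_breakpoint`. A real debugger controls a real process or processor rather than an interpreter loop like this one.

```c
#include <stddef.h>

typedef enum { OP_ADD, OP_DEC_JNZ, OP_HALT } Op;

typedef struct { Op op; int arg; } Instr;

typedef struct {
    size_t pc;     /* program counter: index of the next instruction */
    int acc;       /* accumulator, the "variable" being tracked */
    int counter;   /* loop counter used by OP_DEC_JNZ */
    int halted;    /* set once OP_HALT has executed */
} Machine;

/* Execute exactly one instruction (single-stepping). */
void step(Machine *m, const Instr *prog)
{
    Instr in = prog[m->pc];
    switch (in.op) {
    case OP_ADD:
        m->acc += in.arg;
        m->pc++;
        break;
    case OP_DEC_JNZ:   /* decrement counter; jump to arg while it is non-zero */
        m->counter--;
        if (m->counter != 0) m->pc = (size_t)in.arg;
        else m->pc++;
        break;
    case OP_HALT:
        m->halted = 1;
        break;
    }
}

/* Run until the program halts or pc reaches the breakpoint address.
   At least one instruction always executes, so that continuing from a
   breakpoint makes progress instead of stopping again immediately. */
void run_to_breakpoint(Machine *m, const Instr *prog, size_t breakpoint)
{
    do {
        step(m, prog);
    } while (!m->halted && m->pc != breakpoint);
}
```

For the program `{OP_ADD, 5}, {OP_DEC_JNZ, 0}, {OP_HALT, 0}` with `counter` set to 3 and a breakpoint at address 1, each call stops after one pass of the loop, and `acc` can be inspected (5, 10, 15) or modified between calls. That inspect-and-continue cycle is what a debugger's breakpoint and variable-tracking features provide.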
The same functionality which makes a debugger useful for eliminating bugs allows it to be
used as a software cracking tool. It often also makes it useful as a general verification
tool, test coverage and performance analyzer, especially if instruction path lengths are shown.
Some debuggers operate on a single specific language while others can handle multiple
languages transparently. For example if the main target program is written in COBOL but
calls assembly language subroutines and PL/1 subroutines, the debugger may have to
dynamically switch modes to accommodate the changes in language as they occur.
Some debuggers also incorporate memory protection to avoid storage violations such
as buffer overflow. This may be extremely important in transaction processing environments
where memory is dynamically allocated from memory 'pools' on a task by task basis.
Most modern microprocessors have at least one of these features in their CPU design to make
debugging easier:
An instruction set that meets the Popek and Goldberg virtualization
requirements makes it easier to write debugger software that runs on the same CPU as the
software being debugged; such a CPU can execute the inner loops of the program under
test at full speed, and still remain under debugger control.
Hardware support for code and data breakpoints, such as address comparators and
data value comparators or, with considerably more work involved, page fault hardware.
JTAG access to hardware debug interfaces such as those on ARM
architecture processors or using the Nexus command set. Processors used in embedded
systems typically have extensive JTAG debug support.
Some of the most capable and popular debuggers implement only a simple command line
interface (CLI), often to maximize portability and minimize resource consumption.
Developers typically consider debugging via a graphical user interface (GUI) easier and more
productive. This is the reason for visual front-ends that allow users to monitor and control
subservient CLI-only debuggers via a graphical user interface. Some GUI debugger front-ends
are designed to be compatible with a variety of CLI-only debuggers, while others are targeted
at one specific debugger.
List of some well-known debuggers:
1. Turbo Debugger
2.
3.
4. LLDB
5.
6. Valgrind
7. WinDbg
jdb
There are many ways to start a jdb session. The most frequently used way is to
have jdb launch a new Java Virtual Machine (VM) with the main class of the application to
be debugged. This is done by substituting the command jdb for java in the command line. For
example, if your application's main class is MyClass, you use the following command to
debug it under jdb:
C:\> jdb MyClass
When started this way, jdb invokes a second Java VM with any specified parameters, loads
the specified class, and stops the VM before executing that class's first instruction.
Turbo Debugger
Turbo Debugger (TD) is a machine-level debugger for MS-DOS executables, intended
mainly for debugging Borland Turbo Pascal (TP), and later Turbo C (TC) programs, sold
by Borland. This tool was a full-screen debugger displaying both TP or TC source and
GNU Debugger
The GNU Debugger, usually called just GDB and named gdb as an executable file, is the
standard debugger for the GNU operating system. However, its use is not strictly limited to
the GNU operating system; it is a portable debugger that runs on many Unix-like systems and
works for many programming languages.
Features
GDB offers extensive facilities for tracing and altering the execution of computer programs.
The user can monitor and modify the values of programs' internal variables, and even
call functions independently of the program's normal behaviour.
GDB supports a large number of target processors and runs on a wide range of machines.
GDB is still actively developed. Version 7.0 added support for Python scripting and for
"reversible debugging", which allows a debugging session to step backward, much like
rewinding a crashed program to see what happened.
Remote debugging
GDB offers a 'remote' mode often used when debugging embedded systems. Remote
operation is when GDB runs on one machine and the program being debugged runs on
another. GDB can communicate with the remote 'stub', which understands the GDB protocol, over a
serial line or TCP/IP. A stub program can be created by linking to the appropriate stub files
provided with GDB, which implement the target side of the communication
protocol. Alternatively, gdbserver can be used to remotely debug the program without
needing to change it in any way.
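The GDB remote protocol exchanges framed packets of the form `$<payload>#<checksum>`, where the checksum is the sum of the payload bytes modulo 256, written as two hexadecimal digits. The sketch below shows only that framing step. The function name `rsp_frame` is hypothetical, and the sketch assumes a plain text payload: it does not handle the escaping the real protocol requires for special characters, acknowledgements, or the transport itself.

```c
#include <stdio.h>
#include <string.h>

/* Frame payload as $<payload>#<two hex digit checksum>.
   Returns 0 on success, -1 if the output buffer is too small. */
int rsp_frame(const char *payload, char *out, size_t outsize)
{
    size_t len = strlen(payload);
    unsigned char sum = 0;
    size_t i;

    /* '$' + payload + '#' + 2 hex digits + terminating '\0' */
    if (outsize < len + 5)
        return -1;

    /* checksum: sum of payload bytes modulo 256 */
    for (i = 0; i < len; i++)
        sum = (unsigned char)(sum + (unsigned char)payload[i]);

    snprintf(out, outsize, "$%s#%02x", payload, (unsigned)sum);
    return 0;
}
```

For example, the one-letter payload `g` (a request to read the registers) is framed as `$g#67`, because the character `g` has the value 0x67, and the reply `OK` is framed as `$OK#9a`.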
The same mode is also used by KGDB for debugging a running Linux kernel on the source
level with gdb. With KGDB, kernel developers can debug a kernel in much the same way as
they debug application programs. It makes it possible to place breakpoints in kernel code,
step through the code and observe variables. On architectures where hardware debugging
registers are available, watchpoints can be set which trigger breakpoints when specified
memory addresses are executed or accessed. KGDB requires an additional machine which is
connected to the machine to be debugged using a serial cable or Ethernet. On FreeBSD, it is
also possible to debug using FireWire direct memory access (DMA).
The debugger does not contain its own graphical user interface, and defaults to a command-line
interface. Several front-ends have been built for it, such as Xxgdb, Data Display
Debugger (DDD), Nemiver, KDbg, Xcode debugger, GDBtk/Insight and the HP Wildebeest
Debugger GUI (WDB GUI). IDEs such as GNAT Programming Studio (GPS), KDevelop, Qt
Creator, MonoDevelop, Eclipse, NetBeans and Visual Studio (see VS AddIn Gallery) can
interface with GDB. GNU Emacs has a "GUD mode" and several tools for Vim exist. These
offer facilities similar to debuggers found in IDEs.
Some other debugging tools have been designed to work with GDB, such as memory
leak detectors.
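Examples of GDB commands:
run -v (run the loaded program with the argument -v)
bt (print a backtrace of the call stack)
info registers (dump all registers)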