Programming Languages

Fernando Magno Quintão Pereira


What are programming languages?
Why do they exist?
How were computers programmed before
programming languages existed?
The Tower of Babel
Between 3,000 and 6,000 languages
are spoken on our planet.
About 200 languages have more than
one million speakers.
How do you describe a language?
What elements are present in the
description of a language?
Computers also talk
What is the language spoken
by computers like?
What symbols does it use?
Which words?
What would the grammar of this
electronic language look like?
Shall we speak zero-one-ese?
Computers have very simple vocal cords:
either they emit a sound or they do not.
Can there be a language with only
two symbols?
Why only two symbols?
Dialects of zero-one-ese
There are many different languages of
zeros and ones, just as there are many
different languages that use Latin
characters: English, Portuguese,
Spanish, etc.
Who can give me examples of different
zero-one-ese dialects?
"The book is on the table"
Each instruction in zero-one-ese has a
name, called the opcode, and operands.
Instructions change the state of the computer.
What kinds of instructions could exist?
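To make the idea of an opcode plus operands concrete, here is a minimal Python sketch of a toy machine. The 8-bit instruction format, the three opcode names and the four registers are assumptions invented for this illustration; they do not describe any real processor.

```python
# Hypothetical 8-bit instruction format (an assumption for illustration):
#   bits 7-6: opcode, bits 5-4: register number, bits 3-0: immediate value
OPCODES = {0b00: "LOAD", 0b01: "ADD", 0b10: "SUB"}

def decode(instruction):
    """Split one 8-bit instruction into (opcode name, register, immediate)."""
    opcode = (instruction >> 6) & 0b11
    register = (instruction >> 4) & 0b11
    immediate = instruction & 0b1111
    return OPCODES[opcode], register, immediate

def execute(program, registers=None):
    """Run each instruction in order; every one changes the machine state."""
    registers = list(registers or [0, 0, 0, 0])
    for instruction in program:
        name, reg, imm = decode(instruction)
        if name == "LOAD":
            registers[reg] = imm
        elif name == "ADD":
            registers[reg] += imm
        elif name == "SUB":
            registers[reg] -= imm
    return registers

# LOAD r1, 5 ; ADD r1, 3 ; SUB r1, 1
program = [0b00010101, 0b01010011, 0b10010001]
print(execute(program))  # [0, 7, 0, 0]
```

The "state of the computer" here is just the list of register values, and each instruction is a rule for producing the next state from the current one.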
Speaking zero-one-ese must be easy, right?
But it is not.
In the old days, programming
computers was very hard.
What is the problem with zero-one-ese?
Does anyone here know punched cards?
How can zero-one-ese be made
easier to use?
And then came the Goddess
Words are easier to remember than
sequences of zeros and ones.
For example, which instruction is
easier to read:
mov $0x61, %al
or
10110000 01100001?
What does this program do?
movl $5, %eax
movl $1, %edx
.L4:
imull %eax, %edx
decl %eax
cmpl $0, %eax
jg .L4
What does this program do?
movl $5, %eax       # put 5 in eax
movl $1, %edx       # put 1 in edx
.L4:
imull %eax, %edx    # multiply eax by edx and put the result in edx
decl %eax           # subtract 1 from eax
cmpl $0, %eax       # test whether eax is 0
jg .L4              # if not, repeat from .L4
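One way to answer the question is to replay the loop by hand. The sketch below mirrors the registers with two Python variables, assuming the loop is meant to repeat while eax is greater than zero; the function name is made up for this illustration.

```python
def run_loop(eax=5, edx=1):
    """Mimic the assembly loop with two variables named after the registers."""
    while True:
        edx = eax * edx    # imull %eax, %edx
        eax = eax - 1      # decl %eax
        if not eax > 0:    # compare with 0; jg .L4 repeats while eax > 0
            break
    return edx

print(run_loop())  # 120, that is 5 * 4 * 3 * 2 * 1
```

The program computes the factorial of the value initially placed in eax.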
The Assembler
People spoke assembly, but
computers still spoke zero-one-ese.
A translator was needed.
What should a translator of this
kind be able to do?
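As a rough illustration of what such a translator does, the following Python sketch assembles exactly one kind of instruction: moving an 8-bit constant into an 8-bit x86 register. It relies on the x86 encoding in which that instruction is the byte 0xB0 plus the register number, followed by the constant. The function name and the restriction to four registers are simplifications made for this example; a real assembler handles far more.

```python
# Register numbers used by the x86 "move 8-bit immediate into register" encoding.
REGISTERS = {"al": 0, "cl": 1, "dl": 2, "bl": 3}

def assemble_mov(line):
    """Translate 'mov $<value>, %<reg>' (AT&T syntax) into its two-byte encoding."""
    mnemonic, operands = line.split(None, 1)
    if mnemonic != "mov":
        raise ValueError("this sketch only understands mov")
    source, destination = [part.strip() for part in operands.split(",")]
    value = int(source.lstrip("$"), 0)           # accepts 97 or 0x61
    register = REGISTERS[destination.lstrip("%")]
    return bytes([0xB0 + register, value & 0xFF])

code = assemble_mov("mov $0x61, %al")
print(" ".join(f"{byte:08b}" for byte in code))  # 10110000 01100001
```

The assembler's job is mostly bookkeeping: look up the mnemonic, look up the operands, and emit the corresponding bit pattern.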
The Goddess was not enough
Programming in assembly was still hard.
Programmers wanted computers to be able
to speak languages even closer to human
languages.
What were the first programming languages?
Who were the parents of these languages?
Fortran arrives
John Backus was too lazy to write
programs in assembly.
IBM, 1953/54
Programming became about 20
times easier.
But people were still reluctant.
Why?
Example of a program in Fortran
Fortran:
nfact = 1
do i = 1, 5
  nfact = nfact*i
enddo
Assembly:
movl $5, %eax
movl $1, %edx
.L4:
imull %eax, %edx
decl %eax
cmpl $0, %eax
jg .L4
What novelties came with Fortran?
And Lisp arrives
1958, Massachusetts Institute of Technology
Professor John McCarthy.
A simple notation, based on mathematical
functions.
Lots of parentheses,
and lists.
Example of a program in Lisp
(defun factorial (n)
(if (<= n 1)
1
(* n (factorial (- n 1)))))
Fortran, for comparison with the Lisp version above:
nfact = 1
do i = 1, n
  nfact = nfact*i
enddo
And when, in the 1970s, the Soviets
got hold of the last 300 lines of the
American missile system...
Recursion!
ALGOL - an all-star team
A standard for algorithms was needed.
A committee was formed in 1958.
John Backus
C. A. R. Hoare
John McCarthy, etc.
Out of this committee came ALGOL 58.
Perhaps the most influential
programming language.
ALGOL - example
integer procedure Factorial(m); integer m;
begin
  integer F;
  F := if m=1 then 1 else m*Factorial(m-1);
  Factorial := F
end;
Have you ever seen anything like this?
And COBOL
COBOL was made for business:
accountants, economists, etc.
What should such a language look like?
1959: COBOL was created by a committee.
Industry, government and academia
Still used in many companies,
even in BH (Belo Horizonte)!
Examples of programs in COBOL
ADD YEARS TO AGE.
MULTIPLY PRICE BY QUANTITY GIVING COST.
SUBTRACT DISCOUNT FROM COST GIVING FINAL-COST.
How many programming languages are there?
Which languages are the most popular?
How many are there?
The publisher O'Reilly says there are
2,500 documented programming languages.
Wikipedia documents 630.
There are many.
But why so many?
Different purposes
Fortran was used for scientific computing.
Lisp was used in the theory of computation.
COBOL was made for commercial applications.
Algol is an academic language.
And the other languages we know?
Which languages are the popular ones?
Data taken from www.tiobe.com
Java: 18.71
C: 16.89
PHP: 10.39
Google Code: C, Java, C++, PHP
Craigslist: PHP, C, SQL
What other measures are there?
Does anyone here speak Javanese?
According to many criteria, Java is the
most popular language.
What is Java for?
How did this language come about?
What is so special about it?
An example of Javanese:
public class Fact {
  public static void main(String a[]) {
    int n = 5;
    int fact = 1;
    while (n > 1) {
      fact *= n;
      n--;
    }
    System.out.println(fact);
  }
}
It's A, it's B, it's C.
C appeared in 1972 and was, for many
years, the most popular programming
language.
Why does C have this name?
What do we do with C?
Why was C so popular?
What are the problems with C?
C had great influence.
Speaking C.
#include <stdio.h>

int main() {
  int n = 5;
  int fact = 1;
  while (n > 1) {
    fact *= n;
    n--;
  }
  printf("%d\n", fact);  /* prints 5! */
  return 0;
}
Has anyone seen this before?
C had great influence.
int n = 5;
int fact = 1;
while (n > 1) {
  fact *= n;
  n--;
}
Figure 1. Web application architecture.
effectively under a heavy load of requests. Finally, some runtime
techniques [23, 24] require a modified runtime system, which
constitutes a practical limitation in terms of deployment and upgrading.
Static analyses to find SQLCIVs have also been proposed, but
none of them runs without user intervention and can guarantee the
absence of SQLCIVs. String analysis-based techniques [3, 20] use
formal languages to characterize conservatively the set of values a
string variable may assume at runtime. They do not track the source
of string values, so they require a specification, in the form of a
regular expression, for each query-generating point or hotspot in
the program, a tedious and error-prone task that few programmers
are willing to do. Static taint analyses [12, 18, 31] track the
flow of tainted (i.e., untrusted) values through a program and
require that no tainted values flow into hotspots. Because they use
a binary classification for data (tainted or untainted), they classify
functions as either being sanitizers (i.e., all return values are
untainted) or being security irrelevant. Because the policy that these
techniques check is context-agnostic, it cannot guarantee the
absence of SQLCIVs without being overly conservative. For example,
if the escape_quotes function (which precedes quotes with
an escaping character so that they will be interpreted as character
literals and not as string delimiters) is considered a sanitizer, an
SQLCIV exists but would not be found in an application that
constructs a query using escaped input to supply an expected numeric
value, which need not be delimited by quotes. Additionally, static
taint analyses for PHP typically require user assistance to resolve
dynamic includes (a construct in which the name of the included
file is generated dynamically).
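The point about escaped input in a numeric context can be shown in a few lines. The Python sketch below is only an illustration: this `escape_quotes` is a stand-in written for the example, and the `accounts` table and `build_query` helper are hypothetical names.

```python
def escape_quotes(value):
    """Stand-in sanitizer: precede each single quote with a backslash."""
    return value.replace("'", "\\'")

def build_query(user_input):
    """Build a query that expects a number, so the value is not quoted."""
    return "SELECT * FROM accounts WHERE id=" + escape_quotes(user_input)

print(build_query("42"))          # SELECT * FROM accounts WHERE id=42
print(build_query("42 OR 1=1"))   # SELECT * FROM accounts WHERE id=42 OR 1=1
```

The second input contains no quotes at all, so the escaping leaves it unchanged and the attacker-supplied condition becomes part of the query.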
1.2 Our Approach
We propose a sound, automated static analysis algorithm to
overcome the limitations described above. It is grammar-based; we
model string values as context free grammars (CFGs) and string
operations as language transducers following Minamide [20]. This
string analysis-based approach tracks the effects of string
operations and retains the structure of the values that flow into hotspots
(i.e., where query construction occurs). If all of each string in the
language of a nonterminal comes from a source that can be
influenced by a user, we label the nonterminal with one of two labels.
We assign a "direct" label if a user can influence the source
directly (as with GET parameters) and an "indirect" label if a user can
influence the source indirectly (as with data returned by a database
query). Such labeling tracks the source of string values. We use
a syntax-based definition of SQL injection attacks [25], which
requires that input from a user be syntactically isolated within a
generated query. This policy does not need user-provided
specifications. Finally, we check policy conformance by first abstracting the
labeled subgrammars out of the generated CFG to find their
contexts. We then use regular language containment and context free
language derivability [28] to check that each subgrammar derives
only syntactically isolated expressions.
We have implemented this analysis for PHP, and applied it to
several real-world web applications. Our tool scales to large code
bases: it successfully analyzes the largest PHP web application
...
01 isset ($_GET['userid']) ?
02   $userid = $_GET['userid'] : $userid = '';
03 if ($USER['groupid'] != 1)
04 {
05   // permission denied
06   unp_msg($gp_permserror);
07   exit;
08 }
09 if ($userid == '')
10 {
11   unp_msg($gp_invalidrequest);
12   exit;
13 }
14 if (!eregi('[0-9]+', $userid))
15 {
16   unp_msg('You entered an invalid user ID.');
17   exit;
18 }
19 $getuser = $DB->query("SELECT * FROM `unp_user`"
20   ."WHERE userid='$userid'");
21 if (!$DB->is_single_row($getuser))
22 {
23   unp_msg('You entered an invalid user ID.');
24   exit;
25 }
...
Figure 2. Example code with an SQLCIV.
previously analyzed in the literature (about 100K loc). It discovered
many vulnerabilities, some previously unknown and some based on
insufficient filtering, and generated few false positives.
2. Overview
In order to motivate our analysis, we first present the policy that
defines SQLCIVs, and then give an overview of how our analysis
checks web applications against that policy.
2.1 SQL Command Injection Vulnerabilities
This section illustrates SQLCIVs and formally defines them.
2.1.1 Example Vulnerability
Figure 2 shows a code fragment excerpted from Utopia News Pro,
a real-world news management system written in PHP; we will
use this code to illustrate the key points of our algorithm. This
code authenticates users to perform sensitive operations, such as
managing user accounts and editing news sources. Initially, the
variable $userid gets assigned data from a GET parameter, which
a user can easily set to arbitrary values. The code then performs two
checks on the value of $userid before incorporating it into an SQL
query. The query should return a single row for a legitimate user,
and no rows otherwise. From line 14 it is clear that the programmer
intends $userid to be numeric, and from line 20 it is clear that
the programmer intends that $userid evaluate to a single value
in the SQL query for comparison to the userid column. However,
because the regular expression on line 14 lacks anchors (^ and $
for the beginning and end of the string, respectively), any value for
$userid that has at least one numeric character will be included
into the generated query. If a user sets the GET parameter to
"1'; DROP TABLE unp_user; --", this code will send to the database
the following query:
SELECT * FROM `unp_user` WHERE userid='1';
DROP TABLE unp_user; --'
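The effect of the missing anchors can be reproduced in a few lines of Python. This is only an analogy to the PHP check on line 14, using Python's `re` module rather than `eregi`; the two function names are made up for the illustration.

```python
import re

payload = "1'; DROP TABLE unp_user; --"

def unanchored_check(userid):
    """Same spirit as line 14: accept if a run of digits appears anywhere."""
    return re.search(r"[0-9]+", userid) is not None

def anchored_check(userid):
    """Accept only if the whole string is one run of digits."""
    return re.fullmatch(r"[0-9]+", userid) is not None

for check in (unanchored_check, anchored_check):
    print(check.__name__, check("42"), check(payload))
# unanchored_check True True
# anchored_check True False
```

Both checks accept a legitimate numeric ID, but only the anchored one rejects the injection payload.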
The Internet breathes PHP
Has anyone here programmed in PHP?
What does the name mean?
What should a language for web
development look like?
An example of PHP-ese:
$id = $_GET['user'];
if ($id == '') {
  echo "Invalid user: $id";
} else {
  // The query is built by pasting $id into the string, unchecked.
  $getuser = $DB->query("SELECT * FROM `table` WHERE id=$id");
  echo $getuser;
}
Did anyone notice a little bit of C in there?
What is the type of the variable $id?
Computers speak zero-one-ese; we speak
programming languages. Who translates
between them?
And how is this translation done?
Compilers are bridges
The first compiler was probably
Grace Hopper's A-0 (1952).
Different programming languages
have different compilers.
But the same compiler can also
compile different languages.
Anatomy of a compiler
Source languages (Fortran, COBOL, Lisp, ...) -> Front End -> Optimizer -> Back End -> Target machines (ARM, x86, PowerPC, ...)
Virtual machines
A virtual machine is hardware
implemented in software.
Why is that interesting?
Which languages run on
virtual machines?
Is a translator still needed?
Sometimes everything is interpreted
An interpreter does not produce machine code.
Instead, it reads the code of the source program
and interprets each command it finds.
What are the advantages of an interpreter?
Which languages are interpreted?
Is there any language that
necessarily has to be interpreted?
Are these things efficient?
We do just-in-time
Some languages are compiled while
they are being interpreted.
JavaScript, for example.
And where does the efficiency come from?
Is it possible to do better than
a traditional compiler?
Is there a programming language more
"powerful" than all the others?
If so, which language is it?
But how do we measure this "power"?
Easy or Hard
1. Find the shortest road network that
connects all the cities of Minas Gerais.
2. Find the shortest route that passes through
all the cities without repeating any.
3. Given a program P that solves (2),
check whether the first thing P
prints is Nova Era.
We must be humble
The Turing machine is a theoretical model that
defines all the problems that are computable.
State, tape, head, symbols, instructions.
If there is no solution on a
Turing machine, then there
really is no way...
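The five ingredients listed above (state, tape, head, symbols, instructions) fit in a short simulator. The Python sketch below is one possible encoding, chosen for brevity; the rule format, the state names "start" and "halt", and the bit-flipping machine are all assumptions made for this example.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Simulate a one-tape Turing machine.

    rules maps (state, symbol) -> (new_state, symbol_to_write, move),
    where move is -1 (left) or +1 (right). The machine stops in state "halt".
    """
    cells = dict(enumerate(tape))    # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        state, cells[head], move = rules[(state, symbol)]
        head += move
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# Example machine: flip every bit, then halt at the first blank.
flip_bits = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt", "_", +1),
}
print(run_turing_machine(flip_bits, "10110"))  # 01001
```

The `max_steps` limit exists only because a simulator cannot know in advance whether a given machine will ever halt.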
Turing-Complete Languages
If a language is equivalent to the Turing
machine, then it is Turing-complete.
Almost every programming language is Turing-complete.
But there are languages that are not. Any
examples?
Brain-fuc*
A very large array containing numbers.
Eight commands:
> moves one position to the right
< moves one position to the left
+ adds one to the current position (CP)
- subtracts one from the CP
. prints the content of the CP
, reads input and stores it in the CP
[ jumps to the command after the matching ] if the CP is zero
] jumps back to the command after the matching [ if the CP is not zero
What do these programs do?
[-] or [ > + < - ]
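A small interpreter makes it easy to try the two programs. This Python sketch follows the eight commands listed above; the 16-cell default array, the absence of cell wrap-around, and the choice to ignore any other character (such as spaces) are simplifying assumptions.

```python
def run_brainfuck(program, cells=None, input_text=""):
    """Interpret a program and return (cells, output)."""
    cells = list(cells or [0] * 16)    # small array, enough for the examples
    output, inputs = [], iter(input_text)
    # Pre-compute the position of each bracket's partner.
    stack, match = [], {}
    for pos, command in enumerate(program):
        if command == "[":
            stack.append(pos)
        elif command == "]":
            start = stack.pop()
            match[start], match[pos] = pos, start
    pointer = pc = 0
    while pc < len(program):
        command = program[pc]
        if command == ">":
            pointer += 1
        elif command == "<":
            pointer -= 1
        elif command == "+":
            cells[pointer] += 1
        elif command == "-":
            cells[pointer] -= 1
        elif command == ".":
            output.append(chr(cells[pointer]))
        elif command == ",":
            cells[pointer] = ord(next(inputs, "\0"))
        elif command == "[" and cells[pointer] == 0:
            pc = match[pc]             # skip the loop body
        elif command == "]" and cells[pointer] != 0:
            pc = match[pc]             # go back to the start of the loop
        pc += 1
    return cells, "".join(output)

print(run_brainfuck("[-]", cells=[3, 0])[0])          # [0, 0]
print(run_brainfuck("[ > + < - ]", cells=[3, 4])[0])  # [0, 7]
```

The first program clears the current cell; the second adds the current cell to its right-hand neighbour and leaves the current cell at zero.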
All these languages we have seen: Java,
PHP, C, Fortran, COBOL, Algol, etc. They
are very much alike: variables, loops,
commands. Is there really no
other paradigm?
Imperative and Declarative Languages
Imperative languages:
The program instructs how to change the state of the
machine.
Variables, loops, sequences of commands.
Side effects. Can a function return
different values given the same parameters?
Declarative languages:
The program describes a truth.
No side effects.
Loops via recursive function calls.
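The contrast can be sketched in Python, which allows both styles. The two functions below are invented for this illustration: the first depends on state outside itself, the second depends only on its argument and loops through recursion.

```python
counter = 0

def next_ticket(prefix):
    """Imperative style: reads and updates state outside the function."""
    global counter
    counter += 1
    return f"{prefix}-{counter}"

def total(numbers):
    """Declarative style: no assignment, the loop is a recursive call."""
    if not numbers:
        return 0
    return numbers[0] + total(numbers[1:])

print(next_ticket("A"), next_ticket("A"))  # A-1 A-2: same argument, different results
print(total([1, 2, 3]), total([1, 2, 3]))  # 6 6: same argument, same result
```

So yes: once side effects are allowed, a function can return different values for equal parameters.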
SML
The program is a set of functions.
Programs are proofs by induction.
The main data structures are lists and
tuples.
fun sum [] = 0
  | sum (h::t) = h + sum t
fun filter [] _ = []
  | filter (h::t) f =
      if (f h)
      then h :: (filter t f)
      else (filter t f)
Sorting
fun leq a b = a <= b
fun grt a b = a > b
fun filter _ nil = nil
| filter f (h::t) =
if f h then h :: filter f t else filter f t
fun qsort nil = nil
| qsort (h::t) =
(qsort (filter (grt h) t))
@ [h] @
(qsort (filter (leq h) t))
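For readers more used to imperative languages, the same quicksort can be written in Python while keeping its shape: split the tail around the head, sort each part recursively, and concatenate. This is a direct transliteration made for comparison, not an efficient sort.

```python
def qsort(items):
    """Quicksort in the same shape as the SML version above."""
    if not items:
        return []
    head, tail = items[0], items[1:]
    smaller = [x for x in tail if head > x]    # filter (grt h) t
    larger = [x for x in tail if head <= x]    # filter (leq h) t
    return qsort(smaller) + [head] + qsort(larger)

print(qsort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

No variable is ever updated; each call only builds new lists from old ones.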
Prolog
The program is a set of constraints:
If A is true, and A -> B is true, then B is
true.
parent(kim, holly).
parent(margaret, kim).
parent(margaret, kent).
parent(esther, margaret).
parent(herbert, margaret).
parent(herbert, jean).
bisavo(GGP, GGC) :-
parent(GGP, GP), parent(GP, P), parent(P, GGC).
ancestor(X, Y) :- parent(X, Y).
ancestor(X, Y) :- parent(Z, Y), ancestor(X, Z).
What will
bisavo(X, Y) produce?
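The answer can be checked by chaining the parent facts by hand, or with a few lines of Python that mimic the rule. The sketch below copies the facts from the slide and joins three parent links in a row; it returns the pairs in sorted order, which need not be the order in which a Prolog system would report them.

```python
PARENT = [
    ("kim", "holly"), ("margaret", "kim"), ("margaret", "kent"),
    ("esther", "margaret"), ("herbert", "margaret"), ("herbert", "jean"),
]

def bisavo():
    """All (great-grandparent, great-grandchild) pairs: three chained parent facts."""
    return sorted(
        (ggp, ggc)
        for ggp, gp in PARENT
        for gp2, p in PARENT if gp2 == gp
        for p2, ggc in PARENT if p2 == p
    )

print(bisavo())  # [('esther', 'holly'), ('herbert', 'holly')]
```

Only the chains through margaret and kim are three links long, so holly is the only great-grandchild.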
An NP-complete problem
sum([],0).
sum([Head|Tail],X) :-
sum(Tail,TailSum),
X is Head + TailSum.
subList([], []).
subList([H|T], [H|R]) :- subList(T, R).
subList([_|T], R) :- subList(T, R).
intSum(L, N, S) :- subList(L, S), sum(S, N).
Given a list L of integers, is there a
sublist S whose sum is N?
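The Prolog program answers this question by trying sublists one after another. The same brute-force search can be sketched in Python; the function name and the sample numbers are made up for the illustration, and the running time grows exponentially with the length of the list.

```python
from itertools import combinations

def int_sum(numbers, target):
    """Return one sublist of numbers whose elements add up to target, or None."""
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == target:
                return list(subset)
    return None

print(int_sum([3, 34, 4, 12, 5, 2], 9))   # [4, 5]
print(int_sum([3, 34, 4, 12, 5, 2], 30))  # None
```

Checking a proposed sublist is quick, but no known algorithm finds one quickly in every case, which is what makes the problem NP-complete.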
