
..

Postgresql
,

2011
Creative Commons Attribution-Noncommercial 2.5

(, )
:
PostgreSQL, (Sad Spirit), borz_off@cs.msu.su, http://www.phpclub.ru/detail/store/pdf/postgresql-performance.pdf
PostgreSQL Slony-I, Eugene Kuzin, eugene@kuzin.net, http://www.kuzin.net/work/sloniki-privet.html
Londiste, Sergey Konoplev, gray.ru@gmail.com, http://gray-hemp.blogspot.com/2010/04/londiste.html
pgpool-II, Dmitry Stasyuk, http://undenied.ru/2009/03/04/uchebnoe-rukovodstvo-po-pgpool-ii/
PostgreSQL PL/Proxy, dmitry.chirkin@gmail.com, http://habrahabr.ru/blogs/postgresql/45475/
Hadoop, wordpress@insight-it.ru, http://www.insight-it.ru/masshtabiruemost/hadoop/
Up and Running with HadoopDB, Padraig O'Sullivan, http://posulliv.github.com/2010/05/10/hadoopdb-mysql.html
PostgreSQL: Skype, http://postgresmen.ru/articles/view/25
Streaming Replication, http://wiki.postgresql.org/wiki/Streaming_Replication
Den Golotyuk, http://highload.com.ua/index.php/2009/05/06/-/

Contents

2
    2.5 pgtune
    2.6
        pgFouine
3
    3.3
        constraint_exclusion
4
    4.2 Slony-I
    4.3 Londiste
    4.4 Streaming Replication
    4.5 Bucardo
    4.6 RubyRep
5
    5.2 PL/Proxy
    HadoopDB
6 PgPool-II
    6.5 Master-slave
7
    7.2 PgBouncer
    7.3 PgPool-II vs PgBouncer
8 PostgreSQL
    8.2 Pgmemcache
9
    9.2 PostGIS
    9.3 PostPic
    9.4 OpenFTS
    9.5 PL/Proxy
    9.6 Texcaller
    9.7 Pgmemcache
    9.8 Prefix
    9.9 pgSphere
10 PostgreSQL
    10.2 SQL
11
12 (Performance Snippets)
,

,
.

PostgreSQL.
PostgreSQL,
. ,
.

2


,
.

2.1

, ,
. ,
.
;
.
,
.
, ,
?
, ?
,
.

PostgreSQL. ,
PostgreSQL
, , FAQ.
postgresql-performance,
. ,
, .
: .



PostgreSQL ,

.
. :
, PostgreSQL,

. (
) , .
PostgreSQL, ,
. , , ..
PostgreSQL :

;
, ;
;
, (,
).


PostgreSQL,
, .
.
7.1 ,
.
7.2 :
VACUUM, ;
ANALYZE,
,
;
.
7.4 (
IN/NOT IN).
8.0 ,
, CHECKPOINT VACUUM .
8.1
, MIN() MAX(),
pg_autovacuum (),
.
8.2 SQL ,
.

8.3 , SQL/XML
,
.
8.4 ,
, ,
EXISTS/NOT EXISTS .
9.0 , VACUUM/VACUUM FULL
, .
,
8.4.


, ,
,
. ,
,
.
,
, ,
? , ,

.
,
, .
, , :
.
(
).
(, ,
).

.
(.
3.4).

2.2

,
.
postgresql.conf
.


: shared_buffers
PostgreSQL
. ,
,
, .
,
. ,
, .
, ,
.

,
, .
:
, PostgreSQL,
PostgreSQL ,
.
, PostgreSQL
, ,
. , ,
,
.
, shared_buffers,
, ,
,
.
8
2 . ,
, , ,
,
. , ,
. ,
.
:
4 (512)
256512 : 1632
(20484096)
14 : 64256
(819232768)

.
ipcs (, free
vmstat). 1,2 2
, . ,
,
. ,
.
PostgreSQL , :
http://developer.postgresql.org/docs/postgres/kernel-resources.html
, :
Laptop, Celeron processor, 384 RAM, 25 : 12
Athlon server, 1 RAM,
10 : 200
Quad PIII server, 4 RAM, 40 , 150 ,
: 1
Quad Xeon server, 8 RAM, 200 , 300 ,
: 2
: work_mem
sort_mem, ,
,
, .
, work_mem (
).
: (
, , ,
shared_buffers)
, .
,
.
, .
work_mem postgresql.conf.
1 . 1024.
24%
. -
work_mem, , ,
512 2048 . ,

work_mem 500 .
, , ,
, . ,
14 32128 MB.

VACUUM:
maintenance_work_mem
PostgreSQL 7.x vacuum_mem.
, VACUUM, ANALYZE,
CREATE INDEX, .
, ,
.
50 75%
, , 32 256 .
, work_mem.
. , 14
128512 MB.
Free Space Map: VACUUM FULL
(
PostgreSQL) :
, , ,
, ( );
( UPDATE DELETE)
1 ( ).
, PostgreSQL VACUUM (
3.1.1).
7.2 VACUUM .
7.2, VACUUM ,
SELECT, INSERT, UPDATE DELETE .
VACUUM FULL.

, , , ,
.
:
max_fsm_relations
,
.
VACUUM. max_fsm_relations

( ).
max_fsm_pages
,
,
.
, ,

VACUUM. , VACUUM
VERBOSE ANALYZE ,
. max_fsm_pages ,
.

FSM, VACUUM
, VACUUM FULL,
.
! 8.4 fsm ,
Free Space Map , .

temp_buffers
, .
16 .
max_prepared_transactions
(PREPARE
TRANSACTION). 5.
vacuum_cost_delay
,
, ,
I/O VACUUM, .
, vacuum_cost_delay
0. 50 200 .
vacuum_cost_page_hit
vacuum_cost_page_limit. VACUUM,
.
(Jan Wieck) , delay 200, page_hit
6 100 VACUUM 80%,
.
max_stack_depth
,
, . ,
, .
24 MB.
max_files_per_process
,
. ,
Too many open files.


PostgreSQL :
( )
, ,

.

:
. ,
:
( checkpoint_segments, 3)
,
( checkpoint_timeout, , 300).
,
, .
:
checkpoint_segments
,
2 .
- .

(checkpoint_segments).
( 16 )
.
, ,

.
12 256 ,
(warning) ,
, . , ,
(checkpoint_segments * 2 + 1) * 16 , ,
. ,
32, 1 .
.
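The disk-space rule quoted above, (checkpoint_segments * 2 + 1) * 16 MB, is easy to check with a few lines of arithmetic. The sketch below is only an illustration: the function name `wal_disk_mb` is invented for this example, and the 16 MB segment size is the default mentioned in the text.

```python
# Illustrative helper, not part of PostgreSQL: upper bound on WAL disk
# usage from the rule (checkpoint_segments * 2 + 1) * 16 MB.
SEGMENT_MB = 16  # default WAL segment size, as stated in the text


def wal_disk_mb(checkpoint_segments: int) -> int:
    """Return the approximate maximum WAL footprint in megabytes."""
    if checkpoint_segments < 1:
        raise ValueError("checkpoint_segments must be at least 1")
    return (checkpoint_segments * 2 + 1) * SEGMENT_MB


if __name__ == "__main__":
    for segments in (3, 8, 32):
        print(segments, "segments ->", wal_disk_mb(segments), "MB")
```

For checkpoint_segments = 32 the bound is 1040 MB, which agrees with the figure of about 1 GB given above.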
checkpoint_warning ( ):
, .
,
,
.
fsync

off fsync.
,
. : ,
, !
,

.
.

commit_delay ( , 0 ) commit_siblings (5 )

.
commit_siblings ,
commit_delay.
, ,
.
, .
wal_sync_method
,
. fsync=off, .
:
open_datasync open()
O_DSYNC
fdatasync fdatasync() commit
fsync_writethrough fsync() commit,

fsync fsync() commit
open_sync open() O_SYNC
.
, .
full_page_writes
off, fsync=off. ,
on, PostgreSQL
. ,
,
. ,
.
,
. full_page_writes
,
(
checkpoint_interval).
wal_buffers
SHARED MEMORY
3 . 256512 ,
. ,
14 2561024 .




. 3 ,
:
default_statistics_target
, ANALYZE (. 3.1.2).
,
, .
ALTER TABLE . . . SET STATISTICS.
effective_cache_size
PostgreSQL
,
4 .
1,5 , shared_buffers
32 , effective_cache_size 800 .
700 , PostgreSQL ,

merge joins. effective_cache_size 200
,
, .

,

effective_cache_size
2/3 ;
RAM
, .
random_page_cost
,
.
3.0, 2.5
2.0.
, .

. , ,
(sequential scans)
(index scans), . ,
,
, .
.
random_page_cost 2.0;
, random_page_cost ,
.


PostgreSQL ,
.
,
,
. ,
true/false:
track_counts . ,
autovacuum . ,
( autovacuum).
track_functions
.
track_activities
.
. ,

, ,
.
, ,
.
, :
,
(
autovacuum ).

2.3

,
.
, , ,
.
PostgreSQL
, ,
. ,

,
.

, , ,
noatime5 .



, .
6 ,
.
, ,
,

, .
:
(!).
pg_clog pg_xlog,
, .
.
.
,
, ,

, , :

, , ,
.

2.4

.
PostgreSQL
.
RAM ;

shared_buffers = 1/8 RAM (up to 1/4);
work_mem = 1/20 RAM;
maintenance_work_mem = 1/4 RAM;
max_fsm_relations * 1.5;
max_fsm_pages = max_fsm_relations * 2000;
fsync = true;
wal_sync_method = fdatasync;
commit_delay = 10 to 100;
commit_siblings = 5 to 10;
effective_cache_size = 0.9 of the cached value reported by free;
random_page_cost = 2 cpu, 4 ;
cpu_tuple_cost = 0.001 cpu, 0.01 ;
cpu_index_tuple_cost = 0.0005 cpu, 0.005 ;
autovacuum = on;
autovacuum_vacuum_threshold = 1800;
autovacuum_analyze_threshold = 900;
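The memory fractions in the list above can be turned into starting values mechanically. The following is a minimal sketch, assuming only the three fractions stated in the list (1/8, 1/20 and 1/4 of RAM); the function name `starting_values` is hypothetical, and the results are a first guess to be tuned against a real workload, not a recommendation.

```python
# Hypothetical helper: applies the rule-of-thumb fractions from the list
# above to the amount of RAM. work_mem is allocated per sort operation,
# so servers with many connections usually need a much smaller value.
def starting_values(ram_mb: int) -> dict:
    """Return first-guess settings (in MB) for a server with ram_mb of RAM."""
    if ram_mb <= 0:
        raise ValueError("ram_mb must be positive")
    return {
        "shared_buffers": f"{ram_mb // 8}MB",
        "work_mem": f"{ram_mb // 20}MB",
        "maintenance_work_mem": f"{ram_mb // 4}MB",
    }


if __name__ == "__main__":
    for name, value in starting_values(2048).items():
        print(f"{name} = {value}")
```

Note that the sample configurations that follow use far smaller work_mem values together with max_connections = 500, which illustrates why the 1/20 fraction is only an upper starting point.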


(1), 2

maintenance_work_mem = 128MB
effective_cache_size = 512MB
work_mem = 640kB
wal_buffers = 1536kB
shared_buffers = 128MB
max_connections = 500


Web
, 2

maintenance_work_mem = 128MB
checkpoint_completion_target = 0.7
effective_cache_size = 1536MB
work_mem = 4MB
wal_buffers = 4MB
checkpoint_segments = 8
shared_buffers = 512MB
max_connections = 500

Web
, 8

maintenance_work_mem = 512MB
checkpoint_completion_target = 0.7
effective_cache_size = 6GB
work_mem = 16MB
wal_buffers = 4MB
checkpoint_segments = 8
shared_buffers = 2GB
max_connections = 500

2.5 : pgtune

PostgreSQL Gregory Smith


pgtune7
.
Linux . ,
. :
Listing 2.1: Pgtune
pgtune -i $PGDATA/postgresql.conf \
    -o $PGDATA/postgresql.conf.pgtune

-i, --input-config postgresql.conf, -o,
--output-config postgresql.conf.
.
-M, --memory ,
. , pgtune
.
7 http://pgtune.projects.postgresql.org/
-T, --type . : DW, OLTP, Web,
Mixed, Desktop.
-c, --connections .
, .
, pgtune
PostgreSQL. ,
, ,

.

2.6


:
1. , .
:
a) .
.
b) , .
2. .
3. .
4. .


,
.
( cron)
.
ANALYZE
.

.
VACUUM ANALYZE.
, ,
,
ANALYZE.
.

REINDEX
REINDEX .
:
;
.
. , ,
. PostgreSQL
,
. ,
.
- ,
REINDEX. :
REINDEX, VACUUM FULL, ,
, .


,
.
, , ,
. . ,
, :
, ,
. , ,
.
.
,
.
,
, , ,
, .
EXPLAIN [ANALYZE]
EXPLAIN [] , PostgreSQL
. EXPLAIN ANALYZE []
8 ,
.
, .
:
8

EXPLAIN ANALYZE DELETE . . .

(seq scan).

(nested loop).
EXPLAIN ANALYZE:
?
,
.
,
. , ,
, - ,
,
. 80%
,
.
EXPLAIN ANALYZE
,
. ,
SET enable_seqscan=false;
,
, , .
postgresql.conf!
,
!


. :
pg_stat_user_tables

, ,
,
, .
pg_stat_user_indexes
,
, ,
(
, ,
).
pg_statio_user_tables
,
, , (.
2.1.1),
, , TOAST.
,
(

).
.
, , , ,
PRIMARY KEY UNIQUE.
.
,
, .
PostgreSQL

/ , ,
. , , foo foo_name,
foo_name = ,
.
CREATE INDEX foo_name_first_idx
ON foo ((lower(substr(foo_name, 1, 1))));

SELECT * FROM foo
WHERE lower(substr(foo_name, 1, 1)) = ;
.
(partial indexes)
WHERE. , ,
scheta uplocheno boolean. ,
uplocheno = false , uplocheno = true,
.
CREATE INDEX scheta_neuplocheno ON scheta (id)
WHERE NOT uplocheno;

SELECT * FROM scheta WHERE NOT uplocheno AND ...;
, ,
WHERE, .


PostgreSQL
, PostgreSQL ,
.
,
, 9 . ,
,
;
.
,
: ,
.


,
,
. ,

, .
SELECT count(*) FROM < >
count() : ,
,
.
(
!)
, ,
,
.

Listing 2.2: SQL
SELECT count(*) FROM foo;

foo,
.
, , .
- :
RULE PostgreSQL SQL, ,
,
1. , 10 ,
,
ANALYZE:
Listing 2.3: SQL
SELECT reltuples FROM pg_class WHERE relname = 'foo';

2. ,
, ,
. ,

. ,

.
3. ,
(cron).
DISTINCT
DISTINCT .
GROUP BY DISTINCT. GROUP BY
, ,
DISTINCT.
Listing 2.4: DISTINCT
postgres=# select count(*) from (select distinct i from g) a;
 count
-------
 19125
(1 row)

Time: 580,553 ms

postgres=# select count(*) from (select distinct i from g) a;
 count
-------
 19125
(1 row)

Time: 36,281 ms

Listing 2.5: GROUP BY
postgres=# select count(*) from (select i from g group by i) a;
 count
-------
 19125
(1 row)

Time: 26,562 ms

postgres=# select count(*) from (select i from g group by i) a;
 count
-------
 19125
(1 row)

Time: 25,270 ms

10 10000 , 50000 !

pgFouine
pgFouine11 log- PostgreSQL,
log- PostgreSQL. pgFouine
, . pgFouine PHP
,
GNU General Public License.
, log-
.
pgFouine PostgreSQL
log-:
syslog
Listing 2.6: pgFouine
log_destination = 'syslog'
redirect_stderr = off
silent_mode = on

, n :
Listing 2.7: pgFouine
log_min_duration_statement = n
log_duration = off
log_statement = 'none'

log_min_duration_statement
0. , -1.
11 http://pgfouine.projects.postgresql.org/
pgFouine .
HTML- :
Listing 2.8: pgFouine
pgfouine.php -file your/log/file.log > your-report.html

10
:
Listing 2.9: pgFouine
pgfouine.php -file your/log/file.log -top 10 -format text

, ,
http://pgfouine.projects.postgresql.org.

2.7

, PostgreSQL .
,

.
.


- ,

.
, ,

.

3.1

(partitioning, )
(, ) .
, .
(
).

(.. , ).
(
5. . . 10 ) 40. . . 50% ,
( ), , .

,
( ) ,
( ). ,
, .
SELECT * FROM articles ORDER BY id DESC LIMIT 10
,
.
, :
(, ,
) .

(DROP TABLE ,
DELETE).
.

3.2

PostgreSQL
:
(range)
,
, .
, .
(list)
.
,
:
, .
.
,
.
,
.
,
. ,
. :
Listing 3.1:
CHECK ( outletID BETWEEN 100 AND 200 )
CHECK ( outletID BETWEEN 200 AND 300 )

,
200.
( ),
.
,
.
, constraint_exclusion postgresql.conf. ,
.


3.3

. ,
,
.
. , , ()
(). ,
3 .
, . ,
, , ,
( ).
.
.

, :
Listing 3.2:
CREATE TABLE my_logs (
    id          SERIAL PRIMARY KEY,
    user_id     INT NOT NULL,
    logdate     TIMESTAMP NOT NULL,
    data        TEXT,
    some_state  INT
);

,
.
.
my_logs,
. ():
Listing 3.3:
CREATE TABLE my_logs2010m10 (
    CHECK ( logdate >= DATE '2010-10-01' AND logdate < DATE '2010-11-01' )
) INHERITS (my_logs);
CREATE TABLE my_logs2010m11 (
    CHECK ( logdate >= DATE '2010-11-01' AND logdate < DATE '2010-12-01' )
) INHERITS (my_logs);
CREATE TABLE my_logs2010m12 (
    CHECK ( logdate >= DATE '2010-12-01' AND logdate < DATE '2011-01-01' )
) INHERITS (my_logs);
CREATE TABLE my_logs2011m01 (
    CHECK ( logdate >= DATE '2011-01-01' AND logdate < DATE '2011-02-01' )
) INHERITS (my_logs);

my_logs2010m10, my_logs2010m11
.., ( ).
CHECK ,
( ,
!).
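Writing the child tables by hand makes it easy to mistype a boundary date so that ranges overlap or leave a gap. As an illustration only, the sketch below generates the same kind of statement for any month; the helper names `next_month` and `partition_ddl` are invented for this example, and the column name `logdate` and the `my_logsYYYYmMM` naming scheme are taken from the listing above.

```python
from datetime import date


def next_month(first_day: date) -> date:
    """Return the first day of the month that follows first_day."""
    if first_day.month == 12:
        return date(first_day.year + 1, 1, 1)
    return date(first_day.year, first_day.month + 1, 1)


def partition_ddl(parent: str, year: int, month: int) -> str:
    """Build the CREATE TABLE statement for one monthly child of parent."""
    start = date(year, month, 1)
    end = next_month(start)  # exclusive upper bound, so ranges never overlap
    child = f"{parent}{year}m{month:02d}"
    return (
        f"CREATE TABLE {child} (\n"
        f"    CHECK ( logdate >= DATE '{start.isoformat()}' "
        f"AND logdate < DATE '{end.isoformat()}' )\n"
        f") INHERITS ({parent});"
    )


if __name__ == "__main__":
    print(partition_ddl("my_logs", 2011, 1))
```

Because the upper bound of each month is computed from its lower bound, adjacent partitions always share exactly one boundary date.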
logdate,
:
Listing 3.4:
CREATE INDEX my_logs2010m10_logdate ON my_logs2010m10 (logdate);
CREATE INDEX my_logs2010m11_logdate ON my_logs2010m11 (logdate);
CREATE INDEX my_logs2010m12_logdate ON my_logs2010m12 (logdate);
CREATE INDEX my_logs2011m01_logdate ON my_logs2011m01 (logdate);

,
.
Listing 3.5:
CREATE OR REPLACE FUNCTION my_logs_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.logdate >= DATE '2010-10-01' AND
         NEW.logdate < DATE '2010-11-01' ) THEN
        INSERT INTO my_logs2010m10 VALUES (NEW.*);
    ELSIF ( NEW.logdate >= DATE '2010-11-01' AND
            NEW.logdate < DATE '2010-12-01' ) THEN
        INSERT INTO my_logs2010m11 VALUES (NEW.*);
    ELSIF ( NEW.logdate >= DATE '2010-12-01' AND
            NEW.logdate < DATE '2011-01-01' ) THEN
        INSERT INTO my_logs2010m12 VALUES (NEW.*);
    ELSIF ( NEW.logdate >= DATE '2011-01-01' AND
            NEW.logdate < DATE '2011-02-01' ) THEN
        INSERT INTO my_logs2011m01 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'Date out of range. Fix the my_logs_insert_trigger() function!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

: logdate,
.
.
:
Listing 3.6:
CREATE TRIGGER insert_my_logs_trigger
    BEFORE INSERT ON my_logs
    FOR EACH ROW EXECUTE PROCEDURE my_logs_insert_trigger();

my_logs:
Listing 3.7:
INSERT INTO my_logs (user_id, logdate, data, some_state) VALUES (1, '2010-10-30', '30.10.2010 data', 1);
INSERT INTO my_logs (user_id, logdate, data, some_state) VALUES (2, '2010-11-10', '10.11.2010 data2', 1);
INSERT INTO my_logs (user_id, logdate, data, some_state) VALUES (1, '2010-12-15', '15.12.2010 data3', 1);

:
Listing 3.8:
partitioning_test=# SELECT * FROM ONLY my_logs;
 id | user_id | logdate | data | some_state
----+---------+---------+------+------------
(0 rows)

.
:
Listing 3.9:
partitioning_test=# SELECT * FROM my_logs;
 id | user_id |       logdate       |       data       | some_state
----+---------+---------------------+------------------+------------
  1 |       1 | 2010-10-30 00:00:00 | 30.10.2010 data  |          1
  2 |       2 | 2010-11-10 00:00:00 | 10.11.2010 data2 |          1
  3 |       1 | 2010-12-15 00:00:00 | 15.12.2010 data3 |          1
(3 rows)

. ,
:
Listing 3.10:
partitioning_test=# SELECT * FROM my_logs2010m10;
 id | user_id |       logdate       |      data       | some_state
----+---------+---------------------+-----------------+------------
  1 |       1 | 2010-10-30 00:00:00 | 30.10.2010 data |          1
(1 row)

partitioning_test=# SELECT * FROM my_logs2010m11;
 id | user_id |       logdate       |       data       | some_state
----+---------+---------------------+------------------+------------
  2 |       2 | 2010-11-10 00:00:00 | 10.11.2010 data2 |          1
(1 row)

! .
my_logs :
Listing 3.11:
partitioning_test=# SELECT * FROM my_logs WHERE user_id = 2;
 id | user_id |       logdate       |       data       | some_state
----+---------+---------------------+------------------+------------
  2 |       2 | 2010-11-10 00:00:00 | 10.11.2010 data2 |          1
(1 row)

partitioning_test=# SELECT * FROM my_logs WHERE data LIKE '%0.1%';
 id | user_id |       logdate       |       data       | some_state
----+---------+---------------------+------------------+------------
  1 |       1 | 2010-10-30 00:00:00 | 30.10.2010 data  |          1
  2 |       2 | 2010-11-10 00:00:00 | 10.11.2010 data2 |          1
(2 rows)



.
. ,
2008 , 10 . :
Listing 3.12:
DROP TABLE my_logs2008m10;

DROP TABLE ,
DELETE. ,
, ,
,
:
Listing 3.13:
ALTER TABLE my_logs2008m10 NO INHERIT my_logs;

,
.

constraint_exclusion

constraint_exclusion ,
. ,
:
Listing 3.14: constraint_exclusion OFF
partitioning_test=# SET constraint_exclusion = off;
partitioning_test=# EXPLAIN SELECT * FROM my_logs WHERE logdate > '2010-12-01';

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Result  (cost=6.81..104.66 rows=1650 width=52)
   ->  Append  (cost=6.81..104.66 rows=1650 width=52)
         ->  Bitmap Heap Scan on my_logs  (cost=6.81..20.93 rows=330 width=52)
               Recheck Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
               ->  Bitmap Index Scan on my_logs_logdate  (cost=0.00..6.73 rows=330 width=0)
                     Index Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
         ->  Bitmap Heap Scan on my_logs2010m10 my_logs  (cost=6.81..20.93 rows=330 width=52)
               Recheck Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
               ->  Bitmap Index Scan on my_logs2010m10_logdate  (cost=0.00..6.73 rows=330 width=0)
                     Index Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
         ->  Bitmap Heap Scan on my_logs2010m11 my_logs  (cost=6.81..20.93 rows=330 width=52)
               Recheck Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
               ->  Bitmap Index Scan on my_logs2010m11_logdate  (cost=0.00..6.73 rows=330 width=0)
                     Index Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
         ->  Bitmap Heap Scan on my_logs2010m12 my_logs  (cost=6.81..20.93 rows=330 width=52)
               Recheck Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
               ->  Bitmap Index Scan on my_logs2010m12_logdate  (cost=0.00..6.73 rows=330 width=0)
                     Index Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
         ->  Bitmap Heap Scan on my_logs2011m01 my_logs  (cost=6.81..20.93 rows=330 width=52)
               Recheck Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
               ->  Bitmap Index Scan on my_logs2011m01_logdate  (cost=0.00..6.73 rows=330 width=0)
                     Index Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
(22 rows)

EXPLAIN,
, ,
logdate > 2010-12-01 ,
, . constraint_exclusion:
Listing 3.15: constraint_exclusion ON
partitioning_test=# SET constraint_exclusion = on;
SET
partitioning_test=# EXPLAIN SELECT * FROM my_logs WHERE logdate > '2010-12-01';

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Result  (cost=6.81..41.87 rows=660 width=52)
   ->  Append  (cost=6.81..41.87 rows=660 width=52)
         ->  Bitmap Heap Scan on my_logs  (cost=6.81..20.93 rows=330 width=52)
               Recheck Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
               ->  Bitmap Index Scan on my_logs_logdate  (cost=0.00..6.73 rows=330 width=0)
                     Index Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
         ->  Bitmap Heap Scan on my_logs2010m12 my_logs  (cost=6.81..20.93 rows=330 width=52)
               Recheck Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
               ->  Bitmap Index Scan on my_logs2010m12_logdate  (cost=0.00..6.73 rows=330 width=0)
                     Index Cond: (logdate > '2010-12-01 00:00:00'::timestamp without time zone)
(10 rows)

, ,
, . constraint_exclusion
, ,
CHECK , ,
. 8.4 PostgreSQL constraint_exclusion on, off partition. (
) constraint_exclusion on, off, partition, CHECK
.
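The idea behind constraint_exclusion can be modelled in a few lines: a child table needs to be scanned only if the range in its CHECK constraint can overlap the range requested in WHERE. The sketch below is a simplified model and not the planner's actual algorithm; the names `PARTITIONS` and `partitions_to_scan` are invented for this example, and the ranges mirror three of the monthly child tables defined earlier.

```python
from datetime import date

# Child tables with the half-open [start, end) ranges from their CHECK
# constraints. The parent table has no CHECK and is always scanned.
PARTITIONS = {
    "my_logs2010m10": (date(2010, 10, 1), date(2010, 11, 1)),
    "my_logs2010m11": (date(2010, 11, 1), date(2010, 12, 1)),
    "my_logs2010m12": (date(2010, 12, 1), date(2011, 1, 1)),
}


def partitions_to_scan(lower: date, upper: date) -> list:
    """Return children whose range overlaps the query range [lower, upper)."""
    return [
        name
        for name, (start, end) in PARTITIONS.items()
        if start < upper and lower < end  # standard interval-overlap test
    ]


if __name__ == "__main__":
    # WHERE logdate >= '2010-11-15' AND logdate < '2010-12-10'
    print(partitions_to_scan(date(2010, 11, 15), date(2010, 12, 10)))
```

In this model a query for mid-November to early December touches only the November and December tables; the October table is skipped because its range ends before the query range begins.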

3.4


.
,
. ,
, ( )
50%
.


,
.
,
.

4.1

(. replication)
(, ).
,
. ,
, .
.
, ,

. ,
.
(,
). ,
,
- ( ,
, ).

, .
, , ,

( ,
).

,
.
,
,
.
,
( ). ,
, ,
, ,
.
(, , ).
,
, ( )
. ,
, . ,
,
X, B ,
, Y X.
Y, Y (, -
), B Y ,
, ,
. ,
Y , X .

. , ,
,
( )
.

: ,
.
, ,
.
,
.
,
. ,
,
,
.
,
.
PostgreSQL , ,
.
(, ). :

Slony-I1 Master-Slave , (cascading)
(failover). Slony-I PostgreSQL INSERT/ DELETE/UPDATE
.
PGCluster2 Multi-Master .
, .
pgpool-I/II3 PostgreSQL (
II ). :
( ,
stand-by );
online-;
pooling ;
;
SELECT- postgresql-;

.
Bucardo4 , MultiMaster Master-Slave ,
.
Londiste5 Master-Slave .
Skytools6 . , Slony-I.
Mammoth Replicator7 Multi-Master .
Postgres-R8 Multi-Master .
RubyRep9 Ruby, Multi-Master ,
PostgreSQL MySQL.
, , ,
PostgreSQL.

4.2

Slony-I

Slony ,
PostgreSQL . Slony
Postgre INSERT/ DELETE/UPDATE
.
1 http://www.slony.info/
2 http://pgfoundry.org/projects/pgcluster/
3 http://pgpool.projects.postgresql.org/
4 http://bucardo.org/
5 http://skytools.projects.postgresql.org/doc/londiste.ref.html
6 http://pgfoundry.org/projects/skytools/
7 http://www.commandprompt.com/products/mammothreplicator/
8 http://www.postgres-r.org/
9 http://www.rubyrep.org/
Slony
, slony
slonik. slonik-,
slon .
, slon , .
slonik-e
slonik stdin.
slonik-a ,
, , slonik
syntax error, .
. .

Ubuntu :
Listing 4.1:
sudo aptitude install slony1-bin

customers
( , ).

: customers
master_host: customers_master.com
slave_host_1: customers_slave.com
cluster name ( ): customers_rep

master-
Postgres,
Slony. , ,
slony.
Listing 4.2: master-
pgsql@customers_master$ createuser -a -d slony
pgsql@customers_master$ psql -d template1 -c "alter \
user slony with password 'slony_user_password';"


slony, slon.
, ( slon)
.
slave-
,
Internet ( ),
PostgreSQL -,
. , :
Listing 4.3: slave-
anyuser@customers_slave$ psql -d customers \
-h customers_master.com -U slony

- ( , ).
- , firewalla, pg_hba.conf, $PGDATA.
slave- PostgreSQL.
, Postgres up and ready,
- ,
(
postmaster):
Listing 4.4: slave-
pgsql@customers_slave$ rm -rf $PGDATA
pgsql@customers_slave$ mkdir $PGDATA
pgsql@customers_slave$ initdb -E UTF8 -D $PGDATA
pgsql@customers_slave$ createuser -a -d slony
pgsql@customers_slave$ psql -d template1 -c "alter \
user slony with password 'slony_user_password';"

postmaster.
!
. !
Listing 4.5: slave-
pgsql@customers_slave$ createuser -a -d customers_owner
pgsql@customers_slave$ psql -d template1 -c "alter \
user customers_owner with password 'customers_owner_password';"

customers_master,
-h customers_slave,
slave.
slave, master, Slony.
plpgsql slave
slony.
(slony_user_password).
:
Listing 4.6: plpgsql slave
slony@customers_master$ createdb -O customers_owner \
-h customers_slave.com customers
slony@customers_master$ createlang -d customers \
-h customers_slave.com plpgsql

! , replication set
primary key. -
, primary
key ALTER TABLE ADD PRIMARY KEY.
primary key ,
serial (ALTER TABLE ADD COLUMN),
. table add
key slonik-a.
. slave:
Listing 4.7: plpgsql slave
slony@customers_master$ pg_dump -s customers | \
psql -U slony -h customers_slave.com customers

pg_dump -s .
pg_dump -s customers , psql -U
slony -h customers_slave.com customers (slony_user_pass).
: -
Slony ( make install), sl_*,
. , :
(
5)
:-) slave :
Listing 4.8: plpgsql slave
slonik <<EOF
cluster name = customers_slave;
node Y admin conninfo = 'dbname=customers host=customers_master.com port=5432 user=slony password=slony_user_pass';
uninstall node (id = Y);
echo 'okay';
EOF

Y . . : , cluster
name - , T1 (default).
uninstall.
( ),
uninstall ( master slave).

PgSQL
, - ,
.
- :
Listing 4.9:
#!/bin/sh

CLUSTER=customers_rep

DBNAME1=customers
DBNAME2=customers

HOST1=customers_master.com
HOST2=customers_slave.com

PORT1=5432
PORT2=5432

SLONY_USER=slony

slonik <<EOF
cluster name = $CLUSTER;
node 1 admin conninfo = 'dbname=$DBNAME1 host=$HOST1 port=$PORT1 user=slony password=slony_user_password';
node 2 admin conninfo = 'dbname=$DBNAME2 host=$HOST2 port=$PORT2 user=slony password=slony_user_password';
init cluster ( id = 1, comment = 'Customers DB replication cluster' );

echo 'Create set';

create set ( id = 1, origin = 1, comment = 'Customers DB replication set' );

echo 'Adding tables to the subscription set';

echo ' Adding table public.customers_sales...';
set add table ( set id = 1, origin = 1, id = 4, fully qualified name = 'public.customers_sales', comment = 'Table public.customers_sales' );
echo ' done';

echo ' Adding table public.customers_something...';
set add table ( set id = 1, origin = 1, id = 5, fully qualified name = 'public.customers_something', comment = 'Table public.customers_something' );
echo ' done';

echo 'done adding';
store node ( id = 2, comment = 'Node 2, $HOST2' );
echo 'stored node';
store path ( server = 1, client = 2, conninfo = 'dbname=$DBNAME1 host=$HOST1 port=$PORT1 user=slony password=slony_user_password' );
echo 'stored path';
store path ( server = 2, client = 1, conninfo = 'dbname=$DBNAME2 host=$HOST2 port=$PORT2 user=slony password=slony_user_password' );

store listen ( origin = 1, provider = 1, receiver = 2 );
store listen ( origin = 2, provider = 2, receiver = 1 );
EOF

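Here we initialized the cluster, created a replication set, and added two tables to it. Important: every table that must be replicated has to be listed, the table id within the set must be unique, and each table must have a primary key.
Important: the replication set is remembered once and for all. To add a node to the replication scheme you do not need to initialize the set again.
Important: when a table is added to or removed from the set, all nodes must be resubscribed, that is, run unsubscribe and then subscribe again.
Subscribing the slave node to the replication set
The script: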
Listing 4.10: Subscribing the slave node to the replication set
#!/bin/sh

CLUSTER=customers_rep

DBNAME1=customers
DBNAME2=customers

HOST1=customers_master.com
HOST2=customers_slave.com

PORT1=5432
PORT2=5432

SLONY_USER=slony

slonik <<EOF
cluster name = $CLUSTER;
node 1 admin conninfo = 'dbname=$DBNAME1 host=$HOST1 port=$PORT1 user=slony password=slony_user_password';
node 2 admin conninfo = 'dbname=$DBNAME2 host=$HOST2 port=$PORT2 user=slony password=slony_user_password';

echo 'subscribing';
subscribe set ( id = 1, provider = 1, receiver = 2, forward = no );
EOF


Now the replication daemon must be started on both nodes.
Listing 4.11:
slony@customers_master$ slon customers_rep \
    "dbname=customers user=slony"

Listing 4.12:
slony@customers_slave$ slon customers_rep \
    "dbname=customers user=slony"

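The slon daemons now exchange messages and start transferring data. The initial load is done with the COPY command, and the slave DB is fully locked for its duration.
On average, data on the slave system becomes current within about 10 seconds. slon copes well with network and database connection problems and needs little attention overall.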



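Carry out the steps of 4.2.1 and 4.2.2. The new node has id = 3. It runs on the host customers_slave3.com, can reach the master server over the network, and the master can connect to its PgSQL.
After duplicating the structure (section 4.2.2), do the following:
Listing 4.13:
slonik <<EOF
cluster name = customers_slave;
node 3 admin conninfo = 'dbname=customers host=customers_slave3.com port=5432 user=slony password=slony_user_pass';
uninstall node ( id = 3 );
echo 'okay';
EOF

This removes the schema, triggers, and procedures that were duplicated together with the tables and the database structure.
There is no need to initialize the cluster. Instead, record the information about the new node: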
Listing 4.14:
#!/bin/sh

CLUSTER=customers_rep

DBNAME1=customers
DBNAME3=customers

HOST1=customers_master.com
HOST3=customers_slave3.com

PORT1=5432
PORT2=5432

SLONY_USER=slony

slonik <<EOF
cluster name = $CLUSTER;
node 1 admin conninfo = 'dbname=$DBNAME1 host=$HOST1 port=$PORT1 user=slony password=slony_user_pass';
node 3 admin conninfo = 'dbname=$DBNAME3 host=$HOST3 port=$PORT2 user=slony password=slony_user_pass';

echo 'done adding';

store node ( id = 3, comment = 'Node 3, $HOST3' );
echo 'stored node';
store path ( server = 1, client = 3, conninfo = 'dbname=$DBNAME1 host=$HOST1 port=$PORT1 user=slony password=slony_user_pass' );
echo 'stored path';
store path ( server = 3, client = 1, conninfo = 'dbname=$DBNAME3 host=$HOST3 port=$PORT2 user=slony password=slony_user_pass' );

echo 'again';
store listen ( origin = 1, provider = 1, receiver = 3 );
store listen ( origin = 3, provider = 3, receiver = 1 );
EOF
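The registration steps in the script above follow a fixed pattern: one store node, a store path in each direction, and a store listen in each direction. As a minimal sketch (the helper name `slonik_add_node` and its parameters are hypothetical, not part of Slony-I), the statements could be generated like this, reusing only the slonik syntax shown in the listing:

```python
def slonik_add_node(node_id, origin_id, node_conninfo, origin_conninfo, comment):
    """Return the slonik statements that register a new node and wire it to the origin.

    Hypothetical helper: it only formats text in the same shape as the listing above.
    """
    return "\n".join([
        f"store node ( id = {node_id}, comment = '{comment}' );",
        # The new node reaches the origin through the origin's conninfo ...
        f"store path ( server = {origin_id}, client = {node_id}, conninfo = '{origin_conninfo}' );",
        # ... and the origin reaches the new node through the node's conninfo.
        f"store path ( server = {node_id}, client = {origin_id}, conninfo = '{node_conninfo}' );",
        f"store listen ( origin = {origin_id}, provider = {origin_id}, receiver = {node_id} );",
        f"store listen ( origin = {node_id}, provider = {node_id}, receiver = {origin_id} );",
    ])


if __name__ == "__main__":
    print(slonik_add_node(
        3, 1,
        "dbname=customers host=customers_slave3.com port=5432 user=slony",
        "dbname=customers host=customers_master.com port=5432 user=slony",
        "Node 3",
    ))
```

The output is meant to be pasted between the `slonik <<EOF` preamble and `EOF`; passwords are left out of the example conninfo strings on purpose.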

The new node gets id 3 because node 2 already exists and is running.
Subscribe the new node 3 to the replication set:
Listing 4.15:
#!/bin/sh

CLUSTER=customers_rep

DBNAME1=customers
DBNAME3=customers

HOST1=customers_master.com
HOST3=customers_slave3.com

PORT1=5432
PORT2=5432

SLONY_USER=slony

slonik <<EOF
cluster name = $CLUSTER;
node 1 admin conninfo = 'dbname=$DBNAME1 host=$HOST1 port=$PORT1 user=slony password=slony_user_pass';
node 3 admin conninfo = 'dbname=$DBNAME3 host=$HOST3 port=$PORT2 user=slony password=slony_user_pass';

echo 'subscribing';
subscribe set ( id = 1, provider = 1, receiver = 3, forward = no );
EOF

Now start slon on the new node, the same way as on the others.
slon on the master does not need to be restarted.
Listing 4.16:
slony@customers_slave3$ slon customers_rep \
    "dbname=customers user=slony"



From time to time, adding a new machine to the cluster fails as follows: the new node starts up and works, while the existing nodes drop out with diagnostics like these:
Listing 4.17:
%slon customers_rep "dbname=customers user=slony_user"
CONFIG main: slon version 1.0.5 starting up
CONFIG main: local node id = 3
CONFIG main: loading current cluster configuration
CONFIG storeNode: no_id=1 no_comment=CustomersDB replication cluster
CONFIG storeNode: no_id=2 no_comment=Node 2, node2.example.com
CONFIG storeNode: no_id=4 no_comment=Node 4, node4.example.com
CONFIG storePath: pa_server=1 pa_client=3
    pa_conninfo="dbname=customers host=mainhost.com port=5432
    user=slony_user password=slony_user_pass" pa_connretry=10
CONFIG storeListen: li_origin=1 li_receiver=3 li_provider=1
CONFIG storeSet: set_id=1 set_origin=1
    set_comment=CustomersDB replication set
WARN remoteWorker_wakeup: node 1 - no worker thread
CONFIG storeSubscribe: sub_set=1 sub_provider=1 sub_forward=f
WARN remoteWorker_wakeup: node 1 - no worker thread
CONFIG enableSubscription: sub_set=1
WARN remoteWorker_wakeup: node 1 - no worker thread
CONFIG main: configuration complete - starting threads
CONFIG enableNode: no_id=1
CONFIG enableNode: no_id=2
CONFIG enableNode: no_id=4
ERROR remoteWorkerThread_1: "begin transaction; set transaction isolation level
    serializable; lock table "_customers_rep".sl_config_lock; select
    "_customers_rep".enableSubscription(1, 1, 4);
    notify "_customers_rep_Event"; notify "_customers_rep_Confirm";
    insert into "_customers_rep".sl_event (ev_origin, ev_seqno,
    ev_timestamp, ev_minxid, ev_maxxid, ev_xip,
    ev_type, ev_data1, ev_data2, ev_data3, ev_data4) values
    (1, 219440, 2005-05-05 18:52:42.708351, 52501283, 52501292,
    52501283, ENABLE_SUBSCRIPTION, 1, 1, 4, f);
    insert into "_customers_rep".sl_confirm (con_origin, con_received,
    con_seqno, con_timestamp) values (1, 3, 219440,
    CURRENT_TIMESTAMP); commit transaction;"
    PGRES_FATAL_ERROR ERROR: insert or update on table
    "sl_subscribe" violates foreign key
    constraint "sl_subscribe-sl_path-ref"
DETAIL: Key (sub_provider,sub_receiver)=(1,4)
    is not present in table "sl_path".
INFO remoteListenThread_1: disconnecting from
    dbname=customers host=mainhost.com
    port=5432 user=slony_user password=slony_user_pass
%

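This means that the service table _<cluster name>.sl_path, for example _customers_rep.sl_path, on the already existing nodes has no information about the new node. In this case the new node has id 4, and the pair (1,4) is missing from sl_path.
This appears to be a Slony bug. How to avoid it, and the manual intervention that follows, is not yet clear.
To fix it, run roughly the following query on each of the existing nodes (it adds the path, here (1,4)):
Listing 4.18:
slony_user@masterhost$ psql -d customers -h _every_one_of_slaves -U slony
customers=# insert into _customers_rep.sl_path
    values (1, 4, 'dbname=customers host=mainhost.com port=5432 user=slony_user password=slony_user_password', 10);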

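If difficulties arise, or simply out of curiosity, you can look at the service tables and their contents. They are not normally visible and live in the namespace _<cluster name>, for example _customers_rep.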


Over time the load on the master server grows, and the list of active backends shows constant SELECTs from the slaves. pg_stat_activity shows queries like this one:
Listing 4.19:
select ev_origin, ev_seqno, ev_timestamp, ev_minxid, ev_maxxid, ev_xip,
       ev_type, ev_data1, ev_data2, ev_data3, ev_data4, ev_data5, ev_data6,
       ev_data7, ev_data8
from "_customers_rep".sl_event e
where (e.ev_origin = 2 and e.ev_seqno > 336996) or
      (e.ev_origin = 3 and e.ev_seqno > 1712871) or
      (e.ev_origin = 4 and e.ev_seqno > 721285) or
      (e.ev_origin = 5 and e.ev_seqno > 807715) or
      (e.ev_origin = 1 and e.ev_seqno > 3544763) or
      (e.ev_origin = 6 and e.ev_seqno > 2529445) or
      (e.ev_origin = 7 and e.ev_seqno > 2512532) or
      (e.ev_origin = 8 and e.ev_seqno > 2500418) or
      (e.ev_origin = 10 and e.ev_seqno > 1692318)
order by e.ev_origin, e.ev_seqno;

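Remember that _customers_rep is the schema name from this example; yours will be different.
For some reason the sl_event table grows over time and slows these queries down to an unacceptable degree. Delete the records that are no longer needed:
Listing 4.20:
delete from _customers_rep.sl_event where
    ev_timestamp < NOW() - '1 DAY'::interval;

Performance should return to its original level. It may also make sense to clean the _customers_rep.sl_log_* tables, where the asterisk stands for a natural number, apparently one per replication set, so _customers_rep.sl_log_1 certainly exists.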

4.3 Londiste

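Londiste is a replication engine written in Python. Its guiding principles are reliability and ease of use, and as a result it offers less functionality than Slony-I. As its transport mechanism Londiste uses the PgQ queue (a description of this more than interesting project is beyond the scope of this chapter, since it is of interest to low-level database programmers rather than to end users, that is, PostgreSQL administrators).

Drawbacks: no support for cascading replication, failover, or switchover (all of this is promised for version 3, see footnote 10).

The servers configured here run Linux, specifically Ubuntu Server. On other operating systems (except Windows) the procedure should differ little, and running PostgreSQL clusters under Windows is, to put it mildly, unwise.
Londiste is part of Skytools, so that is the package to install. On Debian or Ubuntu, skytools can be found in the package repository and installed with a single command: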
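Listing 4.21:
sudo aptitude install skytools

It is still better to download the latest version of the package from the official site, http://pgfoundry.org/projects/skytools. At the time of writing the latest version was 2.1.11. So, let us begin: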
Listing 4.22:
$wget http://pgfoundry.org/frs/download.php/2561/skytools-2.1.11.tar.gz
$tar zxvf skytools-2.1.11.tar.gz
$cd skytools-2.1.11/
# tools needed to build the deb package
$sudo aptitude install build-essential autoconf \
    automake autotools-dev dh-make \
    debhelper devscripts fakeroot xutils lintian pbuilder \
    python-dev yada
# header files for postgresql 8.4.x
$sudo aptitude install postgresql-server-dev-8.4
# python-psycopg is required by Londiste
$sudo aptitude install python-psycopg2
# build the deb package for
# postgresql 8.4.x (for 8.3.x use "make deb83")
$sudo make deb84
$cd ../
# install skytools
$dpkg -i skytools-modules-8.4_2.1.11_i386.deb skytools_2.1.11_i386.deb

10 http://skytools.projects.postgresql.org/skytools-3.0/doc/skytools3.html

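On other systems Skytools can be built with these commands:
Listing 4.23:
./configure
make
make install

Next, check that everything was installed correctly:
Listing 4.24:
$londiste.py -V
Skytools version 2.1.11
$pgqadm.py -V
Skytools version 2.1.11

If your output looks similar, everything is installed correctly and you can move on to configuration.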

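Notation:
host1 is the master;
host2 is the slave.

Ticker setup
Londiste needs a ticker to work with the master database. The ticker can run on another machine, but it is of course better to run it on the same machine as the master database. For this we create a dedicated ticker config (let it be /etc/skytools/db1-ticker.ini):
Listing 4.25: ticker configuration
[pgqadm]
# job name
job_name = db1-ticker

# master database
db = dbname=P host=host1

# delay between maintenance runs
# (queue rotation, etc.), in seconds
maint_delay = 600

# delay between activity checks
# (new batches of data), in seconds
loop_delay = 0.1

# log and pid files
logfile = /var/log/%(job_name)s.log
pidfile = /var/pid/%(job_name)s.pid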

Now install the service code (SQL) and start the ticker as a daemon for the database. This is done with the pgqadm.py utility:
Listing 4.26: Installing and starting the ticker
pgqadm.py /etc/skytools/db1-ticker.ini install
pgqadm.py /etc/skytools/db1-ticker.ini ticker -d

Check that the log (/var/log/db1-ticker.log, as set by logfile above) looks normal. At this stage it should contain only occasional entries (about once a minute).
To stop the ticker, use this command:
Listing 4.27: Stopping the ticker
pgqadm.py /etc/skytools/db1-ticker.ini ticker -s

Or, to kill the ticker:
Listing 4.28: Killing the ticker
pgqadm.py /etc/skytools/db1-ticker.ini ticker -k


Londiste cannot transfer the database schema, so the slave must already have the same schema as the master.

For each replicated database, create a configuration file (let it be /etc/skytools/db1-londiste.ini):
Listing 4.29:
[londiste]
# job name
job_name = db1-londiste

# master database
provider_db = dbname=db1 port=5432 host=host1
# slave database
subscriber_db = dbname=db1 host=host2

# Queue name.
# It is used as an SQL identifier, so do not
# use dots or spaces.
# IMPORTANT! If replication to another slave is already
# running, use the same queue name.
pgq_queue_name = db1-londiste-queue

# log and pid files
logfile = /var/log/%(job_name)s.log
pidfile = /var/run/%(job_name)s.pid

# log rotation
log_size = 5242880
log_count = 3

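Installing Londiste into the databases
Now install the service SQL for each configuration created in the previous step.
Install the code on the master side:
Listing 4.30: Londiste, master side
londiste.py /etc/skytools/db1-londiste.ini provider install

And likewise on the slave side:
Listing 4.31: Londiste, slave side
londiste.py /etc/skytools/db1-londiste.ini subscriber install

After this step the replication queues exist on the master.
Starting the Londiste processes
For each replicated database run:
Listing 4.32:
londiste.py /etc/skytools/db1-londiste.ini replay -d

This starts the replication queue listeners, but since we have not yet said which tables to replicate, they run idle for now.
Make sure the log (/var/log/db1-londiste.log) contains no errors.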

For each configuration, specify what to replicate from the master:
Listing 4.33:
londiste.py /etc/skytools/db1-londiste.ini provider add --all

And what to receive on the slave:
Listing 4.34:
londiste.py /etc/skytools/db1-londiste.ini subscriber add --all

This example uses the special parameter --all, which means all tables; instead you can list specific tables if you do not want to replicate everything.
Sequences must be added as well, again for every configuration. For the master:
Listing 4.35:
londiste.py /etc/skytools/db1-londiste.ini provider add-seq --all

For the slave:
Listing 4.36:
londiste.py /etc/skytools/db1-londiste.ini subscriber add-seq --all

Likewise, you can list only the sequences you need instead of --all.

Everything needed is now done. Londiste starts the so-called bulk copy process, which loads (using COPY) the data present when the tables were added onto the slave and then switches to normal replication.
Watch the log for errors:
Listing 4.37:
less /var/log/db1-londiste.log

If all is well, look at the replication state. Data is already synchronized for the tables whose state is shown as "ok".
Listing 4.38:
londiste.py /etc/skytools/db1-londiste.ini subscriber tables

Table State
public.table1 ok
public.table2 ok
public.table3 in-copy
public.table4 -
public.table5 -
public.table6 -
...

For convenience, here is a trick that sends an e-mail when the initial copy has finished (replace the address with your own):
Listing 4.39:
(
while [ $(
python londiste.py /etc/skytools/db1-londiste.ini subscriber tables |
tail -n+2 | awk '{print $2}' | grep -v ok | wc -l) -ne 0 ];
do sleep 60; done; echo '' | mail -s 'Replication done EOM' user@domain.com
) &
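The shell loop above counts the rows of the `subscriber tables` report whose state is not "ok". The same check can be sketched in Python; this is a minimal illustration that assumes the report has the two-column "Table State" layout shown in the listing, and the function name `pending_tables` is hypothetical:

```python
def pending_tables(report):
    """Return (table, state) pairs from a 'subscriber tables' report whose state is not 'ok'.

    Assumes the first line is the 'Table State' header, as in the listing above.
    """
    pending = []
    for line in report.splitlines()[1:]:   # skip the header, like `tail -n+2`
        fields = line.split()
        if not fields or fields[0] == "...":
            continue                        # ignore blank lines and the ellipsis row
        state = fields[1] if len(fields) > 1 else ""
        if state != "ok":
            pending.append((fields[0], state))
    return pending


if __name__ == "__main__":
    sample = "Table State\npublic.table1 ok\npublic.table3 in-copy\n"
    print(pending_tables(sample))
```

An empty result means the initial copy has finished for every listed table.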



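Adding all tables of the master to the slave can be done with this command:
Listing 4.40:
londiste.py <ini> provider tables | xargs londiste.py <ini> subscriber add

Checking the state of the slaves
This query on the master returns information about each queue and slave.
Listing 4.41:
SELECT queue_name, consumer_name, lag, last_seen
FROM pgq.get_consumer_info();

The lag column shows how far the slave is behind the master, and last_seen shows the time since the slave's last request. With the default configuration this value should not exceed 60 seconds.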
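The 60-second rule for last_seen can be turned into a simple check. The sketch below assumes the rows of the query above have already been fetched into Python with the interval columns mapped to `datetime.timedelta` (as common PostgreSQL drivers do); the function name `stale_consumers` and its threshold parameter are hypothetical:

```python
from datetime import timedelta


def stale_consumers(rows, max_last_seen=timedelta(seconds=60)):
    """Return (queue_name, consumer_name) for consumers not seen within max_last_seen.

    rows: iterable of (queue_name, consumer_name, lag, last_seen) tuples,
    where lag and last_seen are timedelta values.
    """
    return [
        (queue, consumer)
        for queue, consumer, _lag, last_seen in rows
        if last_seen > max_last_seen
    ]


if __name__ == "__main__":
    rows = [
        ("db1-londiste-queue", "db1-londiste", timedelta(seconds=2), timedelta(seconds=1)),
        ("db1-londiste-queue", "old-consumer", timedelta(hours=3), timedelta(hours=3)),
    ]
    print(stale_consumers(rows))
```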

When working with Londiste you may need to remove all of your settings in order to start over. For PGQ, to stop data from accumulating, use the following API:
Listing 4.42:
SELECT pgq.unregister_consumer('queue_name', 'consumer_name');

Or use pgqadm.py:
Listing 4.43:
pgqadm.py <ticker.ini> unregister queue_name consumer_name


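Adding a column to a table. Proceed in this order:
1. Add the column on all slaves.
2. BEGIN; (on the master)
3. Add the column on the master.
4. SELECT londiste.provider_refresh_trigger('queue_name', 'tablename');
5. COMMIT;

Removing a column from a table:
1. BEGIN; (on the master)
2. Drop the column on the master.
3. SELECT londiste.provider_refresh_trigger('queue_name', 'tablename');
4. COMMIT;
5. Watch lag until londiste has passed the point where the column was dropped.
6. Drop the column on all slaves.

The trick is to drop the column on the slaves only once the queue no longer holds any events that reference it.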


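Londiste eats CPU and lag keeps growing
This happens, for example, if the ticker was not restarted after a failure, or after one large UPDATE or DELETE in a single transaction, where every event of that statement now has to be applied on the slaves.
The following query counts how many events fall between the values of sub_last_tick and sub_next_tick in pgq.subscription.
Listing 4.44:
SELECT count(*)
FROM pgq.event_1,
     (SELECT tick_snapshot
      FROM pgq.tick
      WHERE tick_id BETWEEN 5715138 AND 5715139
     ) as t(snapshots)
WHERE txid_visible_in_snapshot(ev_txid, snapshots);

In our case this was more than 5 million 400 thousand events, which is a lot. The more events Londiste has to process, the more memory it needs. Londiste can be told not to load all events at once. It is enough to add the following setting to the ticker's INI config:
Listing 4.45:
pgq_lazy_fetch = 500

Londiste will now take at most 500 events per fetch. The rest are picked up by the following fetches.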
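The effect of a bounded fetch size can be illustrated independently of Londiste. The sketch below is not Londiste's implementation; it only shows the general idea of consuming a large event stream in chunks of at most 500 items so that memory use stays bounded (the name `batches` is hypothetical):

```python
from itertools import islice


def batches(events, size=500):
    """Yield lists of at most `size` events, without materializing the whole stream."""
    iterator = iter(events)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # 1200 events are processed as three chunks.
    print([len(chunk) for chunk in batches(range(1200))])
```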

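4.4 Streaming Replication

Streaming replication (Streaming Replication, SR) makes it possible to continuously ship and apply WAL (xlog) records to standby servers in order to maintain an exact copy of the current server. This functionality has been available in PostgreSQL since version 9 (replication out of the box!). This type of replication is simple and reliable and will most likely become the standard replication method for most high-load applications that use PostgreSQL.
Distinguishing features:
replication of the entire PostgreSQL instance;
an asynchronous replication mechanism;
simple setup;
the master database can serve a large number of slaves because the load is minimal.
Drawbacks:
it is not possible to replicate only one particular database of a PostgreSQL instance;
the mechanism is asynchronous, so the slave lags behind the master (but, unlike with other replication methods, the lag is very short and may amount to a single transaction, depending on network speed, database load, and the Hot Standby settings).

PostgreSQL version 9 or later is required. At the time of writing, version 9.0.1 was available. All work is done, as usual, on Linux.

Let the master server be masterdb (192.168.0.10) and the slave be slavedb (192.168.0.20).

First, allow a particular user to log in over ssh without a password. Let it be the postgres user. If there is no suitable user, create one with these commands: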
Listing 4.46: Creating the userssh user
$sudo groupadd userssh
$sudo useradd -m -g userssh -d /home/userssh -s /bin/bash \
    -c "user ssh allow" userssh

Next, run the commands as that user (here, postgres):
Listing 4.47: Switching to the postgres user
su postgres

Generate an RSA key so that authentication works without a password:
Listing 4.48: Generating the RSA key
postgres@localhost ~ $ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/var/lib/postgresql/.ssh/id_rsa):
Created directory /var/lib/postgresql/.ssh.
Your identification has been saved in /var/lib/postgresql/.ssh/id_rsa.
Your public key has been saved in /var/lib/postgresql/.ssh/id_rsa.pub.
The key fingerprint is:
16:08:27:97:21:39:b5:7b:86:e1:46:97:bf:12:3d:76 postgres@localhost

Add it to the list of authorized keys:
Listing 4.49:
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

This should be more than enough. To check that the connection works, simply run:
Listing 4.50: Testing ssh
ssh localhost

Remember to start sshd first:
Listing 4.51: Starting sshd
/etc/init.d/sshd start

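After this has succeeded, copy $HOME/.ssh to slavedb. We should now be able to log in over ssh without a password from the master to the slave and from the slave to the master.
Also edit pg_hba.conf on the master and on the slave so that they can reach each other without a password (trust):
Listing 4.52: pg_hba.conf on the master
host    replication    all    192.168.0.20/32    trust

Listing 4.53: pg_hba.conf on the slave
host    replication    all    192.168.0.10/32    trust

Remember to restart postgresql on both servers afterwards.

First configure masterdb. Set the following replication parameters in postgresql.conf: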
Listing 4.54:
# To enable read-only queries on a standby server, wal_level must be set to
# "hot_standby". But you can choose "archive" if you never connect to the
# server in standby mode.
wal_level = hot_standby

# Set the maximum number of concurrent connections from the standby servers.
max_wal_senders = 5

# To prevent the primary server from removing the WAL segments required for
# the standby server before shipping them, set the minimum number of segments
# retained in the pg_xlog directory. At least wal_keep_segments should be
# larger than the number of segments generated between the beginning of
# online-backup and the startup of streaming replication. If you enable WAL
# archiving to an archive directory accessible from the standby, this may
# not be necessary.
wal_keep_segments = 32

# Enable WAL archiving on the primary to an archive directory accessible from
# the standby. If wal_keep_segments is a high enough number to retain the WAL
# segments required for the standby server, this may not be necessary.
archive_mode    = on
archive_command = 'cp %p /path_to/archive/%f'

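Taking these in order:
wal_level = hot_standby: the server writes WAL as in archive mode, adding the information needed to reconstruct running transactions (you can also set archive, but then the server cannot act as a slave when needed).
max_wal_senders = 5: the maximum number of slaves.
wal_keep_segments = 32: the minimum number of WAL segment files kept in the pg_xlog directory.
archive_mode = on: allows WAL segments to be saved to the storage given by archive_command, in this case the directory /path_to/archive/.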
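As a quick worked example of what wal_keep_segments = 32 means for disk space: with PostgreSQL's default WAL segment size of 16 MB, the setting keeps at least 32 × 16 MB = 512 MB in pg_xlog. The sketch below does this arithmetic; the 16 MB value is the default build-time segment size and is an assumption about your installation, and the function name is hypothetical:

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default segment size; assumes a default build


def retained_wal_bytes(wal_keep_segments, segment_bytes=WAL_SEGMENT_BYTES):
    """Minimum amount of WAL (in bytes) kept in pg_xlog for the given setting."""
    return wal_keep_segments * segment_bytes


if __name__ == "__main__":
    megabytes = retained_wal_bytes(32) // (1024 * 1024)
    print(f"wal_keep_segments = 32 keeps at least {megabytes} MB of WAL")
```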
After changing the parameters, restart the postgresql server.
Now let us move on to slavedb.
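First we need an exact copy of masterdb on slavedb. We transfer the data with an online backup.
Log in to the masterdb server and run in the console:
Listing 4.55:
psql -c "SELECT pg_start_backup('label', true)"

Now the data must be transferred from the master to the slave. Run on the master:
Listing 4.56:
rsync -C -a --delete -e ssh --exclude postgresql.conf --exclude postmaster.pid \
    --exclude postmaster.opts --exclude pg_log --exclude pg_xlog \
    --exclude recovery.conf master_db_datadir/ slavedb_host:slave_db_datadir/

where
master_db_datadir is the postgresql data directory on masterdb;
slave_db_datadir is the postgresql data directory on slavedb;
slavedb_host is the slavedb host (in our case 192.168.0.20).
After the data has been copied from the master to the slave, stop the online backup. Run on the master:
Listing 4.57:
psql -c "SELECT pg_stop_backup()"

Set the same values in postgresql.conf as on the master (so that the slave can replace the master if it fails). Also set one additional parameter:
Listing 4.58:
hot_standby = on

Attention! If wal_level = archive was set on the master, leave this parameter at its default (hot_standby = off).
Next, on slavedb, create the file recovery.conf in the PostgreSQL data directory with this content: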
Listing 4.59: recovery.conf
# Specifies whether to start the server as a standby. In streaming replication,
# this parameter must be set to on.
standby_mode = 'on'

# Specifies a connection string which is used for the standby server to connect
# with the primary.
primary_conninfo = 'host=192.168.0.10 port=5432 user=postgres'

# Specifies a trigger file whose presence should cause streaming replication to
# end (i.e., failover).
trigger_file = '/path_to/trigger'

# Specifies a command to load archive segments from the WAL archive. If
# wal_keep_segments is a high enough number to retain the WAL segments
# required for the standby server, this may not be necessary. But
# a large workload can cause segments to be recycled before the standby
# is fully synchronized, requiring you to start again from a new base backup.
restore_command = 'scp masterdb_host:/path_to/archive/%f "%p"'

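where
standby_mode = 'on' tells the server to run as a slave;
primary_conninfo holds the settings for the slave's connection to the master;
trigger_file names the trigger file whose presence stops replication;
restore_command is the command used to restore WAL segments. In our case they are copied with scp from masterdb (masterdb_host is the masterdb host).
Now PostgreSQL can be started on slavedb.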


The lag of the slaves behind the master can now be checked with these commands:
Listing 4.60:
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (masterdb)
 pg_current_xlog_location
--------------------------
 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (slavedb)
 pg_last_xlog_receive_location
-------------------------------
 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (slavedb)
 pg_last_xlog_replay_location
------------------------------
 0/2000000
(1 row)
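The locations printed above have the form high/low, two hexadecimal numbers. To turn a pair of such values into an approximate lag in bytes, they can be converted to integers and subtracted. The sketch below is a minimal illustration with hypothetical function names; it treats the location as a plain 64-bit position (high part shifted by 32 bits), which is an assumption: on old releases such as 9.0 the last segment of each high part is skipped, so the result can be slightly overestimated when the high parts differ.

```python
def xlog_to_bytes(location):
    """Convert an xlog location such as '0/2000000' to an integer byte position.

    Assumes a plain 64-bit layout: (high << 32) + low.
    """
    high, low = location.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def replication_lag_bytes(master_location, slave_location):
    """Approximate number of WAL bytes the slave is behind the master."""
    return xlog_to_bytes(master_location) - xlog_to_bytes(slave_location)


if __name__ == "__main__":
    # Values as printed in the listing above: master and slave are in sync.
    print(replication_lag_bytes("0/2000000", "0/2000000"))
```

Comparing the master's current location with the slave's receive location shows the transfer lag, while comparing it with the replay location shows how far behind the applied data is.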

Replication can also be checked with the ps utility:
Listing 4.61:
[masterdb] $ ps -ef | grep sender
postgres  6879  6831  0 10:31 ?  00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000

[slavedb] $ ps -ef | grep receiver
postgres  6878  6872  1 10:31 ?  00:00:01 postgres: wal receiver process streaming 0/2000000

Now test the replication. Run on the master:
Listing 4.62:
$psql test_db
test_db=# create table test3(id int not null primary key, name varchar(20));
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "test3_pkey" for table "test3"
CREATE TABLE
test_db=# insert into test3(id, name) values(1, 'test1');
INSERT 0 1
test_db=#

Then check on the slave:
Listing 4.63:
$psql test_db
test_db=# select * from test3;
 id | name
----+-------
  1 | test1
(1 row)

As we can see, the table and its data were copied from the master to the slave.

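Switching to the slave when the master fails: create the trigger file (trigger_file) on the slave, which then becomes the master.

Stopping replication on the slave: create the trigger file (trigger_file) on the slave.

Restarting replication after a failure: repeat the slave setup steps. Note that the master does not need to be stopped for this.

Restarting replication after a slave failure: restart PostgreSQL on the slave once the failure has been fixed.

Resynchronizing the slave: this may be needed, for example, after a long disconnection from the master. Stop PostgreSQL on the slave and repeat the slave setup steps.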

4.5 Bucardo

Bucardo is asynchronous master-master or master-slave replication for PostgreSQL, written in Perl. The system is very flexible and supports several kinds of synchronization and conflict handling.

Installation is shown for Ubuntu Server. First install the DBIx::Safe Perl module.
Listing 4.64:
sudo aptitude install libdbix-safe-perl

On other systems it can be installed from source (footnote 11):
Listing 4.65:
tar xvfz DBIx-Safe-1.2.5.tar.gz
cd DBIx-Safe-1.2.5
perl Makefile.PL
make && make test && sudo make install

11 http://search.cpan.org/CPAN/authors/id/T/TU/TURNSTEP/

Now install Bucardo itself. Download it (footnote 12) and install:
Listing 4.66:
tar xvfz Bucardo-4.4.0.tar.gz
cd Bucardo-4.4.0
perl Makefile.PL
make
sudo make install

Bucardo needs pl/perlu language support in PostgreSQL.
Listing 4.67:
sudo aptitude install postgresql-plperl-8.4

Bucardo initialization is started with this command:
Listing 4.68: Installing Bucardo
bucardo_ctl install

Bucardo shows the PostgreSQL connection settings, which can be changed:
Listing 4.69: Bucardo installation prompt
This will install the bucardo database into an existing Postgres cluster.
Postgres must have been compiled with Perl support,
and you must connect as a superuser

We will create a new superuser named bucardo,
and make it the owner of a new database named bucardo

Current connection settings:
1. Host:          <none>
2. Port:          5432
3. User:          postgres
4. Database:      postgres
5. PID directory: /var/run/bucardo

12 http://bucardo.org/wiki/Bucardo#Obtaining_Bucardo

Once you have changed the settings as required and confirmed the installation, Bucardo creates the user bucardo and the database bucardo. This user must be allowed to log in through the Unix socket, so it is best to grant that in pg_hba.conf beforehand.

Now configure the databases that Bucardo will work with. Let them be master_db and slave_db. First set up the master:
Listing 4.70:
bucardo_ctl add db master_db name=master
bucardo_ctl add all tables herd=all_tables
bucardo_ctl add all sequences herd=all_tables

The first command registers the database and gives it the name master (in real life master_db and slave_db often have the same name, and Bucardo must be able to tell them apart). The second and third commands tell Bucardo to replicate all tables and sequences, combining them into the group all_tables.
Next add slave_db:
Listing 4.71:
bucardo_ctl add db slave_db name=replica port=6543 host=slave_host

In Bucardo this database is named replica.

Now set up synchronization between the two databases. For master-slave this is done with:
Listing 4.72:
bucardo_ctl add sync delta type=pushdelta source=all_tables targetdb=replica

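This command installs the Bucardo triggers in PostgreSQL. The parameters are:
type
    The synchronization type. There are 3 kinds:
    Fullcopy. Full copy.
    Pushdelta. Master-slave replication.
    Swap. Master-master replication. In this mode you must tell Bucardo how to resolve synchronization conflicts. To do so, set the standard_conflict field in the goat table (which holds the tables and sequences); the value may differ between tables and sequences:
        source: on conflict, copy the data from source (master_db in our case).
        target: on conflict, copy the data from target (slave_db in our case).
        skip: the conflicting row is simply not replicated. Not recommended.
        random: each database has an equal chance that its change is used to resolve the conflict.
        latest: the row that was modified last wins.
        abort: synchronization is aborted.
source
    The synchronization source.
targetdb
    The database to replicate into.
For master-master:
Listing 4.73:
bucardo_ctl add sync delta type=swap source=all_tables targetdb=replica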
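The standard_conflict values listed above amount to a small decision rule. The following sketch is not Bucardo's code; it is a hypothetical illustration of what each strategy means, assuming each conflicting row is represented as a dict with a comparable 'modified' timestamp:

```python
import random


def resolve_conflict(strategy, source_row, target_row, rng=random):
    """Return the row that wins a conflict, or None when the row is skipped.

    Hypothetical model of the strategies described above; rows are dicts
    with a comparable 'modified' key.
    """
    if strategy == "source":
        return source_row
    if strategy == "target":
        return target_row
    if strategy == "skip":
        return None                      # the conflicting row is not replicated
    if strategy == "random":
        return rng.choice([source_row, target_row])
    if strategy == "latest":
        return max(source_row, target_row, key=lambda row: row["modified"])
    if strategy == "abort":
        raise RuntimeError("synchronization aborted because of a conflict")
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    src = {"value": "from master", "modified": 10}
    dst = {"value": "from replica", "modified": 20}
    print(resolve_conflict("latest", src, dst)["value"])
```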

Starting and stopping replication
To start replication:
Listing 4.74:
bucardo_ctl start

To stop replication:
Listing 4.75:
bucardo_ctl stop

Viewing configuration values:
Listing 4.76:
bucardo_ctl show all

Changing configuration values:
Listing 4.77:
bucardo_ctl set name=value

For example:
Listing 4.78:
bucardo_ctl set syslog_facility=LOG_LOCAL3

Reloading the configuration:
Listing 4.79:
bucardo_ctl reload_config

A fuller list of commands: http://bucardo.org/wiki/Bucardo_ctl

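4.6 RubyRep

RubyRep is an asynchronous replication engine written in Ruby. Its guiding principles are ease of use and independence from the database. It supports both master-master and master-slave replication and works with PostgreSQL and MySQL.
Distinguishing features:
two-way comparison and synchronization of databases;
simple installation.
Drawbacks:
works with only two databases for MySQL;
synchronization is slow;
with large data volumes it consumes a lot of CPU and memory.

RubyRep supports two kinds of installation: through standard Ruby or through JRuby. The JRuby variant is recommended because it performs better.
Installing the JRuby version
Java (version 1.6) must be installed beforehand.
1. Download the latest JRuby rubyrep from Rubyforge (footnote 13).
2. Unpack it.
3. Done.
Installing the standard Ruby version
1. Install Ruby and Rubygems.
2. Install the database drivers.
For MySQL:
Listing 4.80:
sudo gem install mysql

For PostgreSQL:
Listing 4.81:
sudo gem install postgres

3. Install rubyrep:
Listing 4.82:
sudo gem install rubyrep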


Create a configuration file with this command:
Listing 4.83:
rubyrep generate myrubyrep.conf

The generate command wrote a sample configuration to myrubyrep.conf:
Listing 4.84:
RR::Initializer::run do |config|
  config.left = {
    :adapter  => 'postgresql', # or 'mysql'
    :database => 'SCOTT',
    :username => 'scott',
    :password => 'tiger',
    :host     => '172.16.1.1'
  }

  config.right = {
    :adapter  => 'postgresql',
    :database => 'SCOTT',
    :username => 'scott',
    :password => 'tiger',
    :host     => '172.16.1.2'
  }

  config.include_tables 'dept'
  config.include_tables /^e/ # regexp matches all tables starting with e
  # config.include_tables /./ # regexp matches all tables
end

13 http://rubyforge.org/frs/?group_id=7932, ZIP archive

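The settings are easy to follow. The databases are called left and right. config.include_tables selects which tables to include in replication (regular expressions are supported).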
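The selection rule described above (a table is included if it equals one of the given names or matches one of the given regular expressions) can be sketched as follows. This is an illustration in Python, not rubyrep's own code, and the function name `included_tables` is hypothetical:

```python
import re


def included_tables(all_tables, specs):
    """Return the tables selected by specs, keeping the order of all_tables.

    Each spec is either an exact table name (str) or a compiled regular expression.
    """
    selected = []
    for table in all_tables:
        for spec in specs:
            if isinstance(spec, str):
                matched = table == spec
            else:
                matched = spec.search(table) is not None
            if matched:
                selected.append(table)
                break
    return selected


if __name__ == "__main__":
    tables = ["dept", "emp", "salgrade", "employees"]
    # Mirrors the sample config: the name 'dept' plus every table starting with "e".
    print(included_tables(tables, ["dept", re.compile(r"^e")]))
```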

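To compare the databases:
Listing 4.85:
rubyrep scan -c myrubyrep.conf

The output looks something like this:
Listing 4.86:
dept 100% .........................
emp  100% .........................

The dept table is fully synchronized, while emp has a record that is not synchronized.

To synchronize the databases, run:
Listing 4.87:
rubyrep sync -c myrubyrep.conf

You can also restrict synchronization to particular tables:
Listing 4.88:
rubyrep sync -c myrubyrep.conf dept /^e/

The synchronization policy settings control how synchronization conflicts are resolved. See the documentation at http://www.rubyrep.org/configuration.html for details.

To start replication it is enough to run:
Listing 4.89:
rubyrep replicate -c myrubyrep.conf

This command sets up replication on the databases (if it was not set up already) and starts it. To stop replication, simply kill the process. Even while replication is stopped, all changes are still recorded by the rubyrep triggers. After a restart, all changes are replicated automatically.
To remove replication, run:
Listing 4.90:
rubyrep uninstall -c myrubyrep.conf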



When rubyrep is started through Ruby, an error like this may occur:
Listing 4.91:
$rubyrep replicate -c myrubyrep.conf
Verifying RubyRep tables
Checking for and removing rubyrep triggers from unconfigured tables
Verifying rubyrep triggers of configured tables
Starting replication
Exception caught: Thread#join: deadlock 0xb76ee1ac - mutual join(0xb758cfac)

This is a problem with thread handling in Ruby. There are two ways to solve it:
1. Run rubyrep through JRuby (there are no thread problems there).
2. Fix rubyrep with a patch:
Listing 4.92:
--- /Library/Ruby/Gems/1.8/gems/rubyrep-1.1.2/lib/rubyrep/replication_runner.rb 2010-07-16 15:17:16.000000000 -0400
+++ ./replication_runner.rb 2010-07-16 17:38:03.000000000 -0400
@@ -2,6 +2,12 @@

 require 'optparse'
 require 'thread'
+require 'monitor'
+
+class Monitor
+  alias lock mon_enter
+  alias unlock mon_exit
+end

 module RR
   # This class implements the functionality of the 'replicate' command.
@@ -94,7 +100,7 @@
     # Initializes the waiter thread used for replication pauses and processing
     # the process TERM signal.
     def init_waiter
-      @termination_mutex = Mutex.new
+      @termination_mutex = Monitor.new
       @termination_mutex.lock
       @waiter_thread ||= Thread.new {@termination_mutex.lock; self.termination_requested = true}
       %w(TERM INT).each do |signal|

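4.7 Conclusion

Replication is one of the most important parts of large applications that run on PostgreSQL. It helps distribute the load on the database, take background backups from one of the copies without loading the central server, set up a separate server for logging, and so on.
This chapter covered several kinds of PostgreSQL replication. It is not possible to say outright which one is best.
Streaming replication is one of the best options for keeping database clusters identical, but it is available only from PostgreSQL 9.0 onward. Slony-I is a bulky system that is hard to configure, but it offers many features, such as cascading replication, failover, and switchover. Londiste, by contrast, lacks that functionality but is compact and easy to install. Bucardo can act as either master-master or master-slave replication, but it cannot handle very large objects and has no failover or switchover. RubyRep, as a master-master solution, is very easy to install and configure, but pays for this with speed: it is the slowest of all (when synchronizing large volumes of data between tables).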


,
.

5.1

.

.
.
, .
.
,
(, 90% ).
, ( !)
, ().

, , ,
..
.
, ,
.
..
. ? .
, .
, 10
, 10 - .. ..
- , ,
.
, ID
(user_id) .
..
:
,
user_id


:
- =.
, ,
.
, .
=
() .
, ,
( ,
).

.
,
.. .
, ,
, (,
,
). .
, , ,
ID , 100 .

. PostgreSQL
:

Greenplum Database [1]
GridSQL for EnterpriseDB Advanced Server [2]
Sequoia [3]
PL/Proxy [4]
HadoopDB [5] (Shared-nothing clustering)

1 http://www.greenplum.com/index.php?page=greenplum-database
2 http://www.enterprisedb.com/products/gridsql.do
3 http://www.continuent.com/community/lab-projects/sequoia
4 http://plproxy.projects.postgresql.org/doc/tutorial.html
5 http://db.cs.yale.edu/hadoopdb/hadoopdb.html
5.2 PL/Proxy

PL/Proxy -
.
,
, ,
(, ,
, - ).
PL/Proxy ?
. ,
, 26 .
, -,
: , ,
- .

.
PL/Proxy
OLTP . failover-
, -,
.
:
autocommit-

SELECT;

- ;
-,
PgBouncer
( , )
-

1. Download PL/Proxy (footnote 6) and unpack it.
2. Build PL/Proxy with make and make install.
PL/Proxy can also be installed from a package repository. For example, on Ubuntu Server with PostgreSQL 8.4 it is enough to run:
Listing 5.1:
sudo aptitude install postgresql-8.4-plproxy

6 http://pgfoundry.org/projects/plproxy

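The example setup uses 3 PostgreSQL servers. Two of them are node1 and node2; the main one, which proxies queries to the other two, is proxy. For pl/proxy to work correctly, the number of nodes should be a power of two.
The database is called plproxytest and the table in it is users. Let's begin!
First set up node1 and node2. The commands below must be run on each node.
Create the plproxytest database (if it does not exist yet):
Listing 5.2:
CREATE DATABASE plproxytest
    WITH OWNER = postgres
    ENCODING = 'UTF8';

Add the users table:
Listing 5.3:
CREATE TABLE public.users
(
    username character varying(255),
    email character varying(255)
)
WITH (OIDS=FALSE);
ALTER TABLE public.users OWNER TO postgres;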

Now create a function that adds data to the users table:
Listing 5.4:
CREATE OR REPLACE FUNCTION public.insert_user(i_username text,
    i_emailaddress text)
RETURNS integer AS
$BODY$
INSERT INTO public.users (username, email) VALUES ($1, $2);
SELECT 1;
$BODY$
LANGUAGE sql VOLATILE;
ALTER FUNCTION public.insert_user(text, text) OWNER TO postgres;

The nodes are now set up. Next comes the proxy server.
As on every node, the main server (proxy) must have the database:
Listing 5.5:
CREATE DATABASE plproxytest
    WITH OWNER = postgres
    ENCODING = 'UTF8';

Now tell the server that this database is managed with pl/proxy:
Listing 5.6:
CREATE OR REPLACE FUNCTION public.plproxy_call_handler()
RETURNS language_handler AS
'$libdir/plproxy', 'plproxy_call_handler'
LANGUAGE c VOLATILE
COST 1;
ALTER FUNCTION public.plproxy_call_handler()
OWNER TO postgres;
-- language
CREATE LANGUAGE plproxy HANDLER plproxy_call_handler;
CREATE LANGUAGE plpgsql;

In addition, so that the server knows which nodes it has and where they are, 3 service functions must be created that pl/proxy uses in its work. The first function is the configuration of the database cluster. Parameters are given as key-value pairs:
Listing 5.7:
CREATE OR REPLACE FUNCTION public.get_cluster_config
(IN cluster_name text,
 OUT "key" text, OUT val text)
RETURNS SETOF record AS
$BODY$
BEGIN
    -- lets use same config for all clusters
    key := 'connection_lifetime';
    val := 30*60; -- 30m
    RETURN NEXT;
    RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION public.get_cluster_config(text)
OWNER TO postgres;

The second function is important, and its code will need adjusting. It specifies the DSN of each node:
Listing 5.8:
CREATE OR REPLACE FUNCTION
public.get_cluster_partitions(cluster_name text)
RETURNS SETOF text AS
$BODY$
BEGIN
    IF cluster_name = 'usercluster' THEN
        RETURN NEXT 'dbname=plproxytest host=node1 user=postgres';
        RETURN NEXT 'dbname=plproxytest host=node2 user=postgres';
        RETURN;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION public.get_cluster_partitions(text)
OWNER TO postgres;

:
Listing 5.9:
CREATE OR REPLACE FUNCTION
public.get_cluster_version(cluster_name text)
    RETURNS integer AS
$BODY$
BEGIN
    IF cluster_name = 'usercluster' THEN
        RETURN 1;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$BODY$
    LANGUAGE plpgsql VOLATILE
    COST 100;
ALTER FUNCTION public.get_cluster_version(text)
    OWNER TO postgres;


:
Listing 5.10:
CREATE OR REPLACE FUNCTION
public.insert_user(i_username text, i_emailaddress text)
    RETURNS integer AS
$BODY$
    CLUSTER 'usercluster';
    RUN ON hashtext(i_username);
$BODY$
    LANGUAGE plproxy VOLATILE
    COST 100;
ALTER FUNCTION public.insert_user(text, text)
    OWNER TO postgres;

. proxy :
Listing 5.11:
SELECT insert_user('Sven', 'sven@somewhere.com');
SELECT insert_user('Marko', 'marko@somewhere.com');
SELECT insert_user('Steve', 'steve@somewhere.com');
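
The routing done by `RUN ON hashtext(i_username)` can be pictured with a small Python sketch. PL/Proxy expects the number of partitions to be a power of two and picks one by masking the hash value with `count - 1`. The `stable_hash` helper here is a hypothetical stand-in built on `zlib.crc32`; it is not PostgreSQL's `hashtext`, so the partition chosen for a given name will differ from a real cluster.

```python
import zlib

# DSNs as returned by get_cluster_partitions() for 'usercluster'.
PARTITIONS = [
    "dbname=plproxytest host=node1 user=postgres",
    "dbname=plproxytest host=node2 user=postgres",
]


def stable_hash(text: str) -> int:
    """Hypothetical stand-in for hashtext(); NOT the PostgreSQL algorithm."""
    return zlib.crc32(text.encode("utf-8"))


def pick_partition(username: str, partitions=PARTITIONS) -> str:
    """Select a partition by masking the hash with (count - 1)."""
    count = len(partitions)
    if count == 0 or count & (count - 1) != 0:
        raise ValueError("partition count must be a power of two")
    return partitions[stable_hash(username) & (count - 1)]


if __name__ == "__main__":
    for name in ("Sven", "Marko", "Steve"):
        print(name, "->", pick_partition(name))
```

Because the same name always hashes to the same value, a later read for that user is sent to the partition that received the insert.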

. :
Listing 5.12:
CREATE OR REPLACE FUNCTION
public.get_user_email(i_username text)
    RETURNS SETOF text AS
$BODY$
    CLUSTER 'usercluster';
    RUN ON hashtext(i_username);
    SELECT email FROM public.users
    WHERE username = i_username;
$BODY$
    LANGUAGE plproxy VOLATILE
    COST 100
    ROWS 1000;
ALTER FUNCTION public.get_user_email(text)
    OWNER TO postgres;

:
Listing 5.13:
select plproxy.get_user_email('Steve');

,
, users .

?
pl/proxy
. ,
. 16 .
- .
?
Highload++ 2008,

Skype,
opensource.

,
.
, ,
get_cluster_partitions.


5.3 HadoopDB

Hadoop ,
.
,
:
: Hadoop
,
;
:
, .
;
:
,
;
: ,
.
;
: ,
Java,
, JVM.
HDFS

Hadoop Distributed File System.
,
:
,
, ;

;
;
;
:
;
: ,
;
HDFS
Namenode
.
.


Figure 5.1: HDFS


, .
.
Datanode
.

.
Rack, ,
, .

, .
.
HDFS :

. ( ,
; - 64 mb) ,
Datanode.
,
.
, -
,
/trash
.
, .
- , Datanode Namenode .
Namenode
, -
. ,

.

, TCP/IP.
Namenode ClientProtocol,
DatanodeProtocol,
Remote Procedure Call (RPC).
,
DFSShell, DFSAdmin,
, -.
API : Java API, C pipeline, WebDAV
.
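
As a rough illustration of the block storage described above, the following Python sketch splits a file into 64 MB blocks and assigns each block to several Datanodes. The round-robin placement in `place_replicas` is a simplifying assumption made for this sketch; real HDFS placement takes racks into account, and the node names in the usage note are hypothetical.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the block size mentioned above


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE):
    """Return the byte length of every block of a file of the given size."""
    if file_size < 0:
        raise ValueError("file size must be non-negative")
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(block_count: int, datanodes, replication: int):
    """Hypothetical round-robin placement; real HDFS placement is rack-aware."""
    if replication > len(datanodes):
        raise ValueError("not enough datanodes for the replication factor")
    return [
        [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(block_count)
    ]
```

For example, `split_into_blocks(200 * 1024 * 1024)` yields three full blocks and one 8 MB block, and `place_replicas(4, ["dn1", "dn2", "dn3"], 2)` lists two nodes for each of them.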
MapReduce
, Hadoop framework
,
. Job ()
, , :
Map
(
-)
-.
.
Reduce
map
.
, .
, ,
( ).
JobTracker,
TaskTracker. JobTracker
, ,
.
, framework,
map reduce,
.
JobTracker
.

, , ,
.
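
The Map and Reduce steps can be sketched in a few lines of Python. This is a single-process word count that only mirrors the data flow (map, group by key, reduce); it does not use Hadoop, and the function names are chosen for this illustration.

```python
from collections import defaultdict


def map_phase(line: str):
    """Map: emit a (word, 1) pair for every word of an input record."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold all values collected for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In a real cluster the JobTracker distributes the map and reduce calls over TaskTracker nodes; here they simply run in sequence.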
HBase
Hadoop ,
.
Google
BigTable.
HBase
.
, .
HQL ( Hadoop Query Language),
SQL.
.
HBase
,
--,
.
, ,
,
. -
, .
, .
HQL , SQL,

help;, .
SELECT, INSERT, UPDATE, DROP ,
.
HBase Shell, HBase
API :
Java, Jython, REST Thrift.
HadoopDB
HadoopDB Yale Brown
,
MapReduce, .
MapReduce ,
,
. SQL,
MapReduce, .
MapReduce ,
.



Ubuntu Server .
Hadoop
, Hadoop,
,
:
ssh
,
[hadoop]:
Listing 5.14:
$ sudo groupadd hadoop
$ sudo useradd -m -g hadoop -d /home/hadoop -s /bin/bash \
    -c "Hadoop software owner" hadoop

:
Listing 5.15: hadoop
su hadoop

RSA-
:
Listing 5.16: RSA-
hadoop@localhost ~ $ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key
(/home/hadoop/.ssh/id_rsa):
Your identification has been saved in
/home/hadoop/.ssh/id_rsa.
Your public key has been saved in
/home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
7b:5c:cf:79:6b:93:d6:d6:8d:41:e3:a6:9d:04:f9:85
hadoop@localhost

:
Listing 5.17:
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

,
:
Listing 5.18: ssh
ssh localhost

sshd:
Listing 5.19: sshd
/etc/init.d/sshd start

JVM
1.5.0 .
Listing 5.20: JVM
sudo aptitude install openjdk-6-jdk

Hadoop:
Listing 5.21: Hadoop
cd /opt
sudo wget http://www.gtlib.gatech.edu/pub/apache/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
sudo tar zxvf hadoop-0.20.2.tar.gz
sudo ln -s /opt/hadoop-0.20.2 /opt/hadoop
sudo chown -R hadoop:hadoop /opt/hadoop /opt/hadoop-0.20.2
sudo mkdir -p /opt/hadoop-data/tmp-base
sudo chown -R hadoop:hadoop /opt/hadoop-data/

/opt/hadoop/conf/hadoop-env.sh :
Listing 5.22:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF=$HADOOP_HOME/conf
export HADOOP_PATH=$HADOOP_HOME/bin
export HIVE_HOME=/opt/hive
export HIVE_PATH=$HIVE_HOME/bin

export PATH=$HIVE_PATH:$HADOOP_PATH:$PATH

/opt/hadoop/conf/hadoop-site.xml:
Listing 5.23: hadoop
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop-data/tmp-base</value>
  <description>A base for other temporary
    directories</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>localhost:54311</value>
  <description>
    The name of the default file system.
  </description>
</property>

<property>
  <name>hadoopdb.config.file</name>
  <value>HadoopDB.xml</value>
  <description>The name of the HadoopDB
    cluster configuration file</description>
</property>
</configuration>

/opt/hadoop/conf/mapred-site.xml:
Listing 5.24: mapreduce
<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54310</value>
  <description>
    The host and port that the
    MapReduce job tracker runs at.
  </description>
</property>
</configuration>

/opt/hadoop/conf/hdfs-site.xml:
Listing 5.25: hdfs
<configuration>
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>
    Default block replication.
  </description>
</property>
</configuration>

Namenode:
Listing 5.26: Namenode
$ hadoop namenode -format
10/05/07 14:24:12 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = hadoop1/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
10/05/07 14:24:12 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
10/05/07 14:24:12 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/07 14:24:12 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/07 14:24:12 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/05/07 14:24:12 INFO common.Storage: Storage directory /opt/hadoop-data/tmp-base/dfs/name has been successfully formatted.
10/05/07 14:24:12 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop1/127.0.1.1
************************************************************/

. Hadoop:
Listing 5.27: Hadoop
$ start-all.sh
starting namenode, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-namenode-hadoop1.out
localhost: starting datanode, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-datanode-hadoop1.out
localhost: starting secondarynamenode, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-hadoop1.out
starting jobtracker, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-jobtracker-hadoop1.out
localhost: starting tasktracker, logging to /opt/hadoop/bin/../logs/hadoop-hadoop-tasktracker-hadoop1.out

Hadoop stop-all.sh.
HadoopDB Hive
HadoopDB7 hadoopdb.jar $HADOOP_HOME/lib:
Listing 5.28: HadoopDB
$ cp hadoopdb.jar $HADOOP_HOME/lib


7 http://sourceforge.net/projects/hadoopdb/files/

PostgreSQL JDBC . 8
$HADOOP_HOME/lib.
Hive HadoopDB SQL .
HDFS Hive:
Listing 5.29: HadoopDB
hadoop fs -mkdir /tmp
hadoop fs -mkdir /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse

HadoopDB SMS_dist. :
Listing 5.30: HadoopDB
tar zxvf SMS_dist.tar.gz
sudo mv dist /opt/hive
sudo chown -R hadoop:hadoop hive

Hadoop,
Hive :
Listing 5.31: HadoopDB
$ hive
Hive history file=/tmp/hadoop/hive_job_log_hadoop_201005081717_1990651345.txt
hive>

hive> quit;

. :
Listing 5.32:
svn co http://graffiti.cs.brown.edu/svn/benchmarks/
cd benchmarks/datagen/teragen

benchmarks/datagen/teragen/teragen.pl :
Listing 5.33:
use strict;
use warnings;

my $CUR_HOSTNAME = `hostname -s`;
chomp($CUR_HOSTNAME);

my $NUM_OF_RECORDS_1TB   = 10000000000;
my $NUM_OF_RECORDS_535MB = 100;
my $BASE_OUTPUT_DIR      = "/data";
my $PATTERN_STRING       = "XYZ";
my $PATTERN_FREQUENCY    = 108299;
my $TERAGEN_JAR          = "teragen.jar";
my $HADOOP_COMMAND       = $ENV{'HADOOP_HOME'}."/bin/hadoop";

my %files = ( "535MB" => 1,
);
system("$HADOOP_COMMAND fs -rmr $BASE_OUTPUT_DIR");
foreach my $target (keys %files) {
    my $output_dir = $BASE_OUTPUT_DIR."/SortGrep$target";
    my $num_of_maps = $files{$target};
    my $num_of_records = ($target eq "535MB" ?
        $NUM_OF_RECORDS_535MB : $NUM_OF_RECORDS_1TB);
    print "Generating $num_of_maps files in '$output_dir'\n";

    ##
    ## EXEC: hadoop jar teragen.jar 10000000000
    ## /data/SortGrep/ XYZ 108299 100
    ##
    my @args = ( $num_of_records,
                 $output_dir,
                 $PATTERN_STRING,
                 $PATTERN_FREQUENCY,
                 $num_of_maps );
    my $cmd = "$HADOOP_COMMAND jar $TERAGEN_JAR ".join(" ", @args);
    print "$cmd\n";
    system($cmd) == 0 || die("ERROR: $!");
} # FOR
exit(0);

8 http://jdbc.postgresql.org/download.html

Perl ,
HDFS.
, HDFS.
.
, ,
HDFS, :
Listing 5.34:
$ hadoop fs -get /data/SortGrep535MB/part-00000 my_file
$ psql
psql> CREATE DATABASE grep0;
psql> \c grep0
psql> CREATE TABLE grep (
    >     key1 character varying(255),
    >     field character varying(255)
    > );
COPY grep FROM 'my_file' WITH DELIMITER '|';

HadoopDB. HadoopDB
Catalog.properties. :
Listing 5.35:
#Properties for Catalog Generation
##################################
nodes_file=machines.txt
relations_unchunked=grep, EntireRankings
relations_chunked=Rankings, UserVisits
catalog_file=HadoopDB.xml
##
#DB Connection Parameters
##
port=5432
username=postgres
password=password
driver=org.postgresql.Driver
url_prefix=jdbc\:postgresql\://
##
#Chunking properties
##
chunks_per_node=0
unchunked_db_prefix=grep
chunked_db_prefix=cdb
##
#Replication Properties
##
dump_script_prefix=/root/dump_
replication_script_prefix=/root/load_replica_
dump_file_u_prefix=/mnt/dump_udb
dump_file_c_prefix=/mnt/dump_cdb
##
#Cluster Connection
##
ssh_key=id_rsa

machines.txt localhost (
). HadoopDB HDFS:
Listing 5.36:
java -cp $HADOOP_HOME/lib/hadoopdb.jar \
    edu.yale.cs.hadoopdb.catalog.SimpleCatalogGenerator \
    Catalog.properties
hadoop dfs -put HadoopDB.xml HadoopDB.xml

:
Listing 5.37:
java -cp hadoopdb.jar \
    edu.yale.cs.hadoopdb.catalog.SimpleRandomReplicationFactorTwo \
    Catalog.properties

HadoopDB.xml,
.
HDFS:
Listing 5.38:
hadoop dfs -rmr HadoopDB.xml
hadoop dfs -put HadoopDB.xml HadoopDB.xml

true hadoopdb.config.replication HADOOP_HOME/conf/hadoop-site.xml.


HadoopDB.
, HDFS:
Listing 5.39:
java -cp $CLASSPATH:hadoopdb.jar \
    edu.yale.cs.hadoopdb.benchmark.GrepTaskDB \
    -pattern %wo% -output padraig -hadoop.config.file HadoopDB.xml

:
Listing 5.40:
$ java -cp $CLASSPATH:hadoopdb.jar edu.yale.cs.hadoopdb.benchmark.GrepTaskDB \
> -pattern %wo% -output padraig -hadoop.config.file HadoopDB.xml
14.08.2010 19:08:48 edu.yale.cs.hadoopdb.exec.DBJobBase initConf
INFO: SELECT key1, field FROM grep WHERE field LIKE '%%wo%%';
14.08.2010 19:08:48 org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
14.08.2010 19:08:48 org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14.08.2010 19:08:48 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0001
14.08.2010 19:08:48 edu.yale.cs.hadoopdb.connector.AbstractDBRecordReader getConnection
INFO: Data locality failed for leopgsql
14.08.2010 19:08:48 edu.yale.cs.hadoopdb.connector.AbstractDBRecordReader getConnection
INFO: Task from leopgsql is connecting to chunk 0 on host localhost with db url jdbc:postgresql://localhost:5434/grep0
14.08.2010 19:08:48 org.apache.hadoop.mapred.MapTask runOldMapper
INFO: numReduceTasks: 0
14.08.2010 19:08:48 edu.yale.cs.hadoopdb.connector.AbstractDBRecordReader close
INFO: DB times (ms): connection = 104, query execution = 20, row retrieval = 79
14.08.2010 19:08:48 edu.yale.cs.hadoopdb.connector.AbstractDBRecordReader close
INFO: Rows retrieved = 3
14.08.2010 19:08:48 org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
14.08.2010 19:08:48 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
INFO:
14.08.2010 19:08:48 org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0001_m_000000_0 is allowed to commit now
14.08.2010 19:08:48 org.apache.hadoop.mapred.FileOutputCommitter commitTask
INFO: Saved output of task attempt_local_0001_m_000000_0 to file:/home/leo/padraig
14.08.2010 19:08:48 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
INFO:
14.08.2010 19:08:48 org.apache.hadoop.mapred.Task sendDone
INFO: Task attempt_local_0001_m_000000_0 done.
14.08.2010 19:08:49 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 100% reduce 0%
14.08.2010 19:08:49 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0001
14.08.2010 19:08:49 org.apache.hadoop.mapred.Counters log
INFO: Counters: 6
14.08.2010 19:08:49 org.apache.hadoop.mapred.Counters log
INFO:   FileSystemCounters
14.08.2010 19:08:49 org.apache.hadoop.mapred.Counters log
INFO:     FILE_BYTES_READ=141370
14.08.2010 19:08:49 org.apache.hadoop.mapred.Counters log
INFO:     FILE_BYTES_WRITTEN=153336
14.08.2010 19:08:49 org.apache.hadoop.mapred.Counters log
INFO:   MapReduce Framework
14.08.2010 19:08:49 org.apache.hadoop.mapred.Counters log
INFO:     Map input records=3
14.08.2010 19:08:49 org.apache.hadoop.mapred.Counters log
INFO:     Spilled Records=0
14.08.2010 19:08:49 org.apache.hadoop.mapred.Counters log
INFO:     Map input bytes=3
14.08.2010 19:08:49 org.apache.hadoop.mapred.Counters log
INFO:     Map output records=3
14.08.2010 19:08:49 edu.yale.cs.hadoopdb.exec.DBJobBase run
INFO:
JOB TIME : 1828 ms.

HDFS, padraig:
Listing 5.41:
$ cd padraig
$ cat part-00000
some data

PostgreSQL:
Listing 5.42:
psql> select * from grep where field like '%wo%';
+------+-------+
| key1 | field |
+------+-------+
some data

1 rows in set (0.00 sec)
psql>

. .
. PostgreSQL:
Listing 5.43:
psql> INSERT into grep (key1, field) VALUES('I am live!', 'Maybe');
psql> INSERT into grep (key1, field) VALUES('I am live!', 'Maybewqe');
psql> INSERT into grep (key1, field) VALUES('I am live!', 'Maybewqesad');
psql> INSERT into grep (key1, field) VALUES(':)', 'May cool string!');

HadoopDB:
Listing 5.44:
$ java -cp $CLASSPATH:hadoopdb.jar edu.yale.cs.hadoopdb.benchmark.GrepTaskDB \
> -pattern %May% -output padraig -hadoopdb.config.file /opt/hadoop/conf/HadoopDB.xml
padraig
01.11.2010 23:14:45 edu.yale.cs.hadoopdb.exec.DBJobBase initConf
INFO: SELECT key1, field FROM grep WHERE field LIKE '%%May%%';
01.11.2010 23:14:46 org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
01.11.2010 23:14:46 org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
01.11.2010 23:14:46 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0001
01.11.2010 23:14:46 edu.yale.cs.hadoopdb.connector.AbstractDBRecordReader getConnection
INFO: Data locality failed for leopgsql
01.11.2010 23:14:46 edu.yale.cs.hadoopdb.connector.AbstractDBRecordReader getConnection
INFO: Task from leopgsql is connecting to chunk 0 on host localhost with db url jdbc:postgresql://localhost:5434/grep0
01.11.2010 23:14:47 org.apache.hadoop.mapred.MapTask runOldMapper
INFO: numReduceTasks: 0
01.11.2010 23:14:47 edu.yale.cs.hadoopdb.connector.AbstractDBRecordReader close
INFO: DB times (ms): connection = 181, query execution = 22, row retrieval = 96
01.11.2010 23:14:47 edu.yale.cs.hadoopdb.connector.AbstractDBRecordReader close
INFO: Rows retrieved = 4
01.11.2010 23:14:47 org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
01.11.2010 23:14:47 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
INFO:
01.11.2010 23:14:47 org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0001_m_000000_0 is allowed to commit now
01.11.2010 23:14:47 org.apache.hadoop.mapred.FileOutputCommitter commitTask
INFO: Saved output of task attempt_local_0001_m_000000_0 to file:/home/hadoop/padraig
01.11.2010 23:14:47 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
INFO:
01.11.2010 23:14:47 org.apache.hadoop.mapred.Task sendDone
INFO: Task attempt_local_0001_m_000000_0 done.
01.11.2010 23:14:47 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 100% reduce 0%
01.11.2010 23:14:47 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0001
01.11.2010 23:14:47 org.apache.hadoop.mapred.Counters log
INFO: Counters: 6
01.11.2010 23:14:47 org.apache.hadoop.mapred.Counters log
INFO:   FileSystemCounters
01.11.2010 23:14:47 org.apache.hadoop.mapred.Counters log
INFO:     FILE_BYTES_READ=141345
01.11.2010 23:14:47 org.apache.hadoop.mapred.Counters log
INFO:     FILE_BYTES_WRITTEN=153291
01.11.2010 23:14:47 org.apache.hadoop.mapred.Counters log
INFO:   MapReduce Framework
01.11.2010 23:14:47 org.apache.hadoop.mapred.Counters log
INFO:     Map input records=4
01.11.2010 23:14:47 org.apache.hadoop.mapred.Counters log
INFO:     Spilled Records=0
01.11.2010 23:14:47 org.apache.hadoop.mapred.Counters log
INFO:     Map input bytes=4
01.11.2010 23:14:47 org.apache.hadoop.mapred.Counters log
INFO:     Map output records=4
01.11.2010 23:14:47 edu.yale.cs.hadoopdb.exec.DBJobBase run
INFO:
JOB TIME : 2332 ms.

May.
. :
Listing 5.45:
$ cd padraig
$ cat part-00000
I am live! Maybe
I am live! Maybewqe
I am live! Maybewqesad
:) May cool string!

PostgreSQL
. HadoopDB
PostgreSQL,
PostgreSQL, shared-nothing .
HadoopDB
http://hadoopdb.sourceforge.net/guide/quick_start_guide.html.

, Hive,
HadoopDB. ,
c Hadoop.
Hadoop HadoopDB.
HadoopDB Hadoop. ,

.
HadoopDB
,

, Hadoop. HadoopDB,
, ,
, PostgreSQL
, , PostgreSQL .
, Hadoop Hive
.
HadoopDB
Hadoop ,
,
, ,
MapReduce. HadoopDB Hadoop
(
) HadoopDB
.

5.4

.
PostgreSQL ,
, . ,
, .


6 PgPool-II

.

6.1

pgpool-II , PostgreSQL
PostgreSQL. :

pgpool-II PostgreSQL

(.. , , ).

.

pgpool-II PostgreSQL.

2 ,

.

, SELECT
. pgpool-II

PostgreSQL SELECT
,
.
PostgreSQL.

.


PostgreSQL, .
, ,
. pgpool-II ,

.


,
.

.
pgpool-II PostgreSQL
. ,
() pgpool-II PostgreSQL,
() pgpool-II . pgpool-II , , ,
, pgpool-II
.
http://pgpool.projects.postgresql.org/pgpool-II/doc/tutorial-en.html.

6.2


pgpool-II
.

pgpool-II
pgpool-II . ,
, .
Listing 6.1: pgpool-II
./configure
make
make install

configure
.
configure -,
, , . pgpool-II -
/usr/local.
make , make install
.
.
: pgpool-II libpq
PostgreSQL 7.4 (3 ).
configure ,
libpq 3 .
Listing 6.2: pgpool-II
configure: error: libpq is not installed or libpq is old

3 , - ,
libpq, , configure.
configure libpq /usr/local/pgsql. PostgreSQL /usr/local/pgsql --with-pgsql
--with-pgsql-includedir --with-pgsql-libdir
configure.
Linux pgpool-II
. Ubuntu Linux, , :
Listing 6.3: pgpool-II
sudo aptitude install pgpool2


pgpool-II pgpool.conf.
: = . pgpool-II pgpool.conf.sample.
pgpool.conf,
.
Listing 6.4:
cp /usr/local/etc/pgpool.conf.sample /usr/local/etc/pgpool.conf

pgpool-II localhost 9999.


,
listen_addresses *.
Listing 6.5:
listen_addresses = 'localhost'
port = 9999

- .
Ubuntu Linux /etc/pgpool.conf.

PCP
pgpool-II
, pgpool-II .. .
PCP, .
PostgreSQL. pcp.conf.
,
(:). .
md5.
Listing 6.6: PCP
postgres:e8a48653851e28c69d0506508fb27fc5

pgpool-II pcp.conf.sample.
pcp.conf .
Listing 6.7: PCP
$ cp /usr/local/etc/pcp.conf.sample /usr/local/etc/pcp.conf

Ubuntu Linux /etc/pcp.conf.


md5
pg_md5,
pgpool-II. pg_md5
md5 .
, postgres
pg_md5 md5 .
Listing 6.8: PCP
$ /usr/bin/pg_md5 postgres
e8a48653851e28c69d0506508fb27fc5
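
A `username:md5` entry for pcp.conf can also be produced without pg_md5. The sketch below assumes that the stored value is the plain MD5 hex digest of the password, which is what the listings above suggest; `pcp_conf_line` is a hypothetical helper name.

```python
import hashlib


def pcp_conf_line(username: str, password: str) -> str:
    """Build a 'username:md5hex' line in the format used by pcp.conf."""
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    return f"{username}:{digest}"


if __name__ == "__main__":
    # Prints a line that can be compared with the pg_md5 output.
    print(pcp_conf_line("postgres", "postgres"))
```

If the assumption holds, the digest printed for the password `postgres` should match the value returned by pg_md5.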

PCP , pgpool.conf
pcp_port.
- pcp_port
9898 .
Listing 6.9: PCP
pcp_port = 9898



PostgreSQL pgpool-II. pgpool-II
. ,
.
,
pgpool-II.

pgpool-II 5432, 5433, 5434 .
pgpool-II pgpool.conf
.
Listing 6.10:
backend_hostname0 = 'localhost'
backend_port0 = 5432
backend_weight0 = 1
backend_hostname1 = 'localhost'
backend_port1 = 5433
backend_weight1 = 1
backend_hostname2 = 'localhost'
backend_port2 = 5434
backend_weight2 = 1

backend_hostname, backend_port, backend_weight


,
.
0 (.. 0,
1, 2).
backend_weight 1, SELECT .
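
How `backend_weight` values translate into a share of SELECT queries can be sketched as a proportional random choice. This illustrates the ratio idea only; it is not pgpool-II's actual selection code, and `pick_backend` is a name invented for the sketch.

```python
import random


def pick_backend(weights, rng=random):
    """Return a backend index with probability proportional to its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one backend needs a positive weight")
    point = rng.random() * total
    upto = 0.0
    for index, weight in enumerate(weights):
        upto += weight
        if point < upto:
            return index
    return len(weights) - 1  # guard against floating-point rounding


if __name__ == "__main__":
    # Three backends with weight 1 each, as in the configuration above.
    hits = [0, 0, 0]
    for _ in range(3000):
        hits[pick_backend([1, 1, 1])] += 1
    print(hits)
```

With equal weights each backend receives roughly a third of the calls; a backend with weight 0 is never chosen.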

/ pgpool-II
pgpool-II .
Listing 6.11:
pgpool

, ,
pgpool .
pgpool, -n
pgpool. pgpool-II -
.
Listing 6.12:
pgpool -n &

,
, , .
Listing 6.13:
pgpool -n -d > /tmp/pgpool.log 2>&1 &

-d .

/tmp/pgpool.log. ,
,
. , , cronolog.
Listing 6.14:
pgpool -n 2>&1 | /usr/sbin/cronolog \
    --hardlink=/var/log/pgsql/pgpool.log \
    '/var/log/pgsql/%Y-%m-%d-pgpool.log' &

pgpool-II, .
Listing 6.15:
pgpool stop

- , pgpool-II
.
pgpool-II , .
Listing 6.16:
pgpool -m fast stop

6.3


.
,
6.2. !,
.

pgbench.



true replication_mode pgpool.conf.

Listing 6.17:
replication_mode = true

replication_mode true, pgpool-II


.
load_balance_mode true, pgpool-II
SELECT .
Listing 6.18:
load_balance_mode = true

replication_mode load_balance_mode.


, pgpool.conf, pgpool-II
. /
pgpool-II.
pgpool.conf pgpool-II,
.
, .
bench_replication.
. createdb pgpool-II
.
Listing 6.19:
createdb -p 9999 bench_replication

pgbench -i. -i
.
Listing 6.20:
pgbench -i -p 9999 bench_replication


, pgbench -i.
,
.

branches    1
tellers     10
accounts    100000
history     0

shell.
branches, tellers, accounts history
(5432, 5433, 5434).
Listing 6.21:
for port in 5432 5433 5434; do
    echo $port
    for table_name in branches tellers accounts history; do
        echo $table_name
        psql -c "SELECT count(*) FROM $table_name" -p \
            $port bench_replication
    done
done

6.4


.
( partitioning . ).
,
.
pgpool-II
, (System
Database) ( SystemDB).
SystemDB ,
. SystemDB
dblink.
,
6.2. !,
.
pgbench.



parallel_mode true pgpool.conf.
Listing 6.22:
parallel_mode = true

parallel_mode true
. pgpool-II SystemDB
.

SystemDB dblink pgpool-II. , listen_addresses
pgpool-II .
Listing 6.23:
listen_addresses = '*'

: ,
, ,
. , -
,
bench_replication, 6.3.
.
Listing 6.24:
replication_mode = true
load_balance_mode = false

Listing 6.25:
replication_mode = false
load_balance_mode = true

parallel_mode load_balance_mode
true, listen_addresses *, replication_mode false.

SystemDB
, .
, dblink
, .
dist_def . ,
, pgpool-II
.
SystemDB 5432.
SystemDB
Listing 6.26: SystemDB
system_db_hostname = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = ''

, pgpool.conf.
pgpool pgpool pgpool.
Listing 6.27: SystemDB
createuser -p 5432 pgpool
createdb -p 5432 -O pgpool pgpool

dblink
dblink pgpool. dblink
contrib PostgreSQL.
dblink .
Listing 6.28: dblink
USE_PGXS=1 make -C contrib/dblink
USE_PGXS=1 make -C contrib/dblink install

dblink
dblink pgpool. PostgreSQL
/usr/local/pgsql, dblink.sql ( )
/usr/local/pgsql/share/contrib.
dblink.
Listing 6.29: dblink
psql -f /usr/local/pgsql/share/contrib/dblink.sql -p 5432 pgpool

dist_def
dist_def,
. pgpool-II
, system_db.sql
/usr/local/share/system_db.sql (
- /usr/local).
system_db.sql ,
dist_def.
dist_def.
Listing 6.30: dist_def
$ psql -f /usr/local/share/system_db.sql -p 5432 -U pgpool pgpool

system_db.sql, dist_def,
pgpool_catalog. system_db_schema
, ,
system_db.sql.
dist_def .
.
Listing 6.31: dist_def
CREATE TABLE pgpool_catalog.dist_def (
    dbname text,
    schema_name text,
    table_name text,
    col_name text NOT NULL CHECK (col_name = ANY (col_list)),

    col_list text[] NOT NULL,
    type_list text[] NOT NULL,
    dist_def_func text NOT NULL,

    PRIMARY KEY (dbname, schema_name, table_name)
);

, dist_def, .
(col_name, dist_def_func)
- (dbname, schema_name, table_name,
col_list, type_list)

.
col_name. dist_def_func
, col_name
,
, .
- .

, -,
.
replicate_def
,
SQL, dist_def
, ,
,
replicate_def. replicate_def
system_db.sql dist_def. replicate_def
.
Listing 6.32: replicate_def
CREATE TABLE pgpool_catalog.replicate_def (
    dbname text,
    schema_name text,
    table_name text,
    col_list text[] NOT NULL,
    type_list text[] NOT NULL,
    PRIMARY KEY (dbname, schema_name, table_name)
);



, pgbench, .
pgbench -i -s 3 (..
3).
bench_parallel.
sample pgpool-II
dist_def_pgbench.sql.
pgbench.
pgpool-II.
Listing 6.33:
psql -f sample/dist_def_pgbench.sql -p 5432 pgpool

dist_def_pgbench.sql.
dist_def_pgbench.sql
dist_def. accounts.
- aid.
Listing 6.34:
INSERT INTO pgpool_catalog.dist_def VALUES (
    'bench_parallel',
    'public',
    'accounts',
    'aid',
    ARRAY['aid', 'bid', 'abalance', 'filler'],
    ARRAY['integer', 'integer', 'integer',
          'character(84)'],
    'pgpool_catalog.dist_def_accounts'
);


accounts. ,
.
SQL (, PL/pgSQL, PL/Tcl, ..).
accounts
3, aid 1 300000.


.
SQL- .
Listing 6.35:
CREATE OR REPLACE FUNCTION
pgpool_catalog.dist_def_branches(anyelement)
RETURNS integer AS $$
    SELECT CASE WHEN $1 > 0 AND $1 <= 1 THEN 0
                WHEN $1 > 1 AND $1 <= 2 THEN 1
                ELSE 2
    END;
$$ LANGUAGE sql;
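
The same range-based idea, applied to the `aid` column of accounts, might look as follows in Python. The 100000-row boundaries are an assumption made for this sketch (300000 rows spread evenly over 3 nodes); they do not come from the listing, and `dist_def_accounts` here is only a model of a distribution function, not the SQL shipped with pgpool-II.

```python
def dist_def_accounts(aid: int) -> int:
    """Map an account id to a node number (0, 1 or 2) by value range."""
    if 0 < aid <= 100_000:
        return 0
    if 100_000 < aid <= 200_000:
        return 1
    return 2  # everything else, mirroring the ELSE branch of the CASE


if __name__ == "__main__":
    for aid in (1, 100_000, 100_001, 200_000, 200_001, 300_000):
        print(aid, "-> node", dist_def_accounts(aid))
```

The function has to be deterministic: pgpool-II relies on it to send both the INSERT and any later lookup for a given key to the same node.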



.
pgbench
branches tellers.
, accounts
, branches tellers.
Listing 6.36:
INSERT INTO pgpool_catalog.replicate_def VALUES (
    'bench_parallel',
    'public',
    'branches',
    ARRAY['bid', 'bbalance', 'filler'],
    ARRAY['integer', 'integer', 'character(88)']
);

INSERT INTO pgpool_catalog.replicate_def VALUES (
    'bench_parallel',
    'public',
    'tellers',
    ARRAY['tid', 'bid', 'tbalance', 'filler'],
    ARRAY['integer', 'integer', 'integer', 'character(84)']
);

Replicate_def_pgbench.sql
sample. psql ,
, , .
Listing 6.37:
psql -f sample/replicate_def_pgbench.sql -p 5432 pgpool



, pgpool.conf, pgpool-II
. /
pgpool-II.
pgpool.conf pgpool-II,
.
, .
bench_parallel.
. createdb pgpool-II
.
Listing 6.38:
createdb -p 9999 bench_parallel

pgbench -i -s 3. -i
. -s
.
Listing 6.39:
pgbench -i -s 3 -p 9999 bench_parallel


.

SELECT pgpool-II
. bench_parallel
.

Table       Rows
branches    3
tellers     30
accounts    300000
history     0

pgpool-II shell.
accounts
5432, 5433, 5434 9999.
Listing 6.40:
for port in 5432 5433 5434 9999; do
    echo "$port"
    psql -c "SELECT min(aid), max(aid) FROM accounts" \
        -p "$port" bench_parallel
done


6.5

Master-slave

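pgpool-II can be used together with other master/slave replication
software (for example Slony-I or Londiste). To enable this mode, set
master_slave_mode and load_balance_mode to true. pgpool-II then sends
INSERT/UPDATE/DELETE queries to the Master DB (the first node in the list)
and balances SELECT queries across the nodes where possible.
DDL and DML statements for temporary tables can be executed only on the
master. To force a SELECT to run on the master, put a
/*NO LOAD BALANCE*/ comment before the SELECT.
In Master/Slave mode replication_mode must be set to false and
master_slave_mode to true.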
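The routing rule of master/slave mode can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: node 0 stands for the master, the query type is detected by a naive prefix check, and the function name choose_node is hypothetical. The real pgpool-II decision also considers weights, open transactions and other conditions that are not modelled here.

```python
import random

NO_LOAD_BALANCE = "/*NO LOAD BALANCE*/"


def choose_node(query: str, node_count: int, rng: random.Random) -> int:
    """Pick a backend node index for a query; node 0 plays the master."""
    text = query.lstrip().upper()
    if text.startswith(NO_LOAD_BALANCE):
        return 0  # the comment forces the SELECT onto the master
    if text.startswith("SELECT"):
        return rng.randrange(node_count)  # reads may go to any node
    return 0  # INSERT/UPDATE/DELETE and everything else go to the master


if __name__ == "__main__":
    rng = random.Random(42)
    for q in ("INSERT INTO t VALUES (1)",
              "SELECT * FROM t",
              "/*NO LOAD BALANCE*/ SELECT * FROM t"):
        print(choose_node(q, 3, rng), q)
```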

Streaming Replication ( )
master-slave ,
, pgpoolII. PostgreSQL, pgpool-II
( ),
( ). ,
, (
recovery.conf, trigger_file), PostgreSQL
.
:
Listing 6.41: PostgreSQL
#! /bin/sh
# Failover command for streaming replication.
# This script assumes that DB node 0 is primary, and 1 is standby.
#
# If standby goes down, does nothing. If primary goes down, create a
# trigger file so that standby takes over as primary node.
#
# Arguments: $1: failed node id. $2: new master hostname. $3: path to
# trigger file.

failed_node=$1
new_master=$2
trigger_file=$3

# Do nothing if standby goes down.
if [ "$failed_node" = 1 ]; then
    exit 0;
fi

# Create trigger file.
/usr/bin/ssh -T "$new_master" /bin/touch "$trigger_file"

exit 0;

: ,
.
failover_stream.sh pgpool.conf :
Listing 6.42:
failover_command = '/path_to_script/failover_stream.sh %d %H /tmp/trigger_file'

/tmp/trigger_file , recovery.conf.
, ,
.

6.6

pgpool-II, ,
pgpool. .
,
.
.
pgpool,
. :

CHECKPOINT;
;
, ;
CHECKPOINT;
;
postmaster ( pgpool_remote_start);
;


:
backend_data_directory
PostgreSQL .
recovery_user
PostgreSQL.
recovery_password
PostgreSQL.
recovery_1st_stage_command
.

- . ,
recovery_1st_stage_command = some_script, pgpool-II
$PGDATA/some_script. , pgpool-II
recovery_1st_stage.
recovery_2nd_stage_command
.

- . ,
recovery_2nd_stage_command = some_script, pgpool-II
$PGDATA/some_script. , pgpool-II
recovery_2nd_stage.
, pgpool-II ,
.

Streaming Replication ( )
master-slave ,
PostgreSQL.
, .
PostgreSQL
pgpool-II ( ).
:
recovery_user. postgres.
Listing 6.43: recovery_user
recovery_user = 'postgres'

recovery_password recovery_user
.
Listing 6.44: recovery_password
recovery_password = 'some_password'

recovery_1st_stage_command.
basebackup.sh ($PGDATA),
. :
Listing 6.45: basebackup.sh
#! /bin/sh
# Recovery script for streaming replication.
# This script assumes that DB node 0 is primary, and 1 is standby.
#
datadir=$1
desthost=$2
destdir=$3

psql -c "SELECT pg_start_backup('Streaming Replication', true)" postgres

rsync -C -a --delete -e ssh --exclude postgresql.conf --exclude postmaster.pid \
    --exclude postmaster.opts --exclude pg_log --exclude pg_xlog \
    --exclude recovery.conf $datadir/ $desthost:$destdir/

ssh -T localhost mv $destdir/recovery.done $destdir/recovery.conf

psql -c "SELECT pg_stop_backup()" postgres

,
rsync .
SSH , recovery_user
.
:
Listing 6.46: recovery_1st_stage_command
recovery_1st_stage_command = 'basebackup.sh'

recovery_2nd_stage_command .
, ,
basebackup.sh,
WAL .
C SQL
.
Listing 6.47: C SQL
$ cd pgpool-II-x.x.x/sql/pgpool-recovery
$ make
$ make install
$ psql -f pgpool-recovery.sql template1

. pcp_recovery_node
.


6.7

PgPool-II ,
PostgreSQL.


7


,
.
.
?

7.1

( )
, ,

PostgreSQL. Windows,
. -,
.
, :
PgBouncer
Pgpool

7.2

PgBouncer

PostgreSQL Skype.
.
Session Pooling. .
;

.
Transaction Pooling.
. PgBouncer ,
, .


Statement Pooling. .
.
,
.
(prepared statements) .
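The difference between the three pooling modes comes down to the moment at which a server connection is handed back to the pool. The following Python sketch models that moment for a stream of client events. It is a simplified illustration, not PgBouncer code: the event names and the function release_points are assumptions made for the example.

```python
from typing import List


def release_points(mode: str, events: List[str]) -> List[int]:
    """Return indexes of the events after which the server connection
    goes back to the pool, for the given pooling mode."""
    released = []
    in_transaction = False
    for i, event in enumerate(events):
        if mode == "session":
            # The connection stays with the client until it disconnects.
            if event == "disconnect":
                released.append(i)
        elif mode == "transaction":
            if event == "begin":
                in_transaction = True
            elif event in ("commit", "rollback"):
                in_transaction = False
                released.append(i)
            elif event == "query" and not in_transaction:
                released.append(i)  # an autocommit statement is its own transaction
        elif mode == "statement":
            # Multi-statement transactions are not possible in this mode.
            if event in ("begin", "commit", "rollback"):
                raise ValueError("multi-statement transactions are not allowed")
            if event == "query":
                released.append(i)
        else:
            raise ValueError("unknown mode: " + mode)
    return released


if __name__ == "__main__":
    events = ["connect", "begin", "query", "query", "commit", "query", "disconnect"]
    print("session:    ", release_points("session", events))
    print("transaction:", release_points("transaction", events))
    print("statement:  ", release_points("statement", ["connect", "query", "query", "disconnect"]))
```

The earlier the connection is released, the fewer server connections are needed for the same number of clients, at the cost of losing session-level features such as prepared statements.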
PgBouncer :
( 2 );
;
.
:
Listing 7.1: PgBouncer
pgbouncer [-d][-R][-v][-u user] <pgbouncer.ini>

:
Listing 7.2: PgBouncer
[databases]
template1 = host=127.0.0.1 port=5432 dbname=template1
[pgbouncer]
listen_port = 6543
listen_addr = 127.0.0.1
auth_type = md5
auth_file = userlist.txt
logfile = pgbouncer.log
pidfile = pgbouncer.pid
admin_users = someuser

Contents of userlist.txt: "someuser" "same_password_as_in_server"
pgbouncer:
Listing 7.3: PgBouncer
psql -h 127.0.0.1 -p 6543 pgbouncer


SHOW.

7.3

PgPool-II vs PgBouncer

. PgBouncer ,
PgPool-II. ,
PgPool-II ( ),
PgBouncer.


PgBouncer , PgPool-II
PgBouncer
PgBouncer (
)
PgBouncer PgPool-II .


8
PostgreSQL
- ,
- .

8.1

,
, .
SELECT
PostgreSQL.
,
, , .
SQL ,
, , . PostgreSQL
. ? -,
. ?
(MVCC MultiVersion Concurrency Control)
,
, ,

. ,
,
. ,
, . -,
, , ,
.
( ),

.

, ,
.
PostgreSQL:
Pgmemcache ( memcached)
Pgpool-II (query cache)

8.2

Pgmemcache

Memcached1 ,
.

.
-.

.
,
memcached
.
Pgmemcache2 PostgreSQL API libmemcached
memcached. PostgreSQL , , memcached. , .

2.0.4 pgmemcache3 . Pgmemcache PostgreSQL


8.4 ( 9.0 ), Ubuntu
Server 10.10. Pgmemcache ,
PostgreSQL PGXS ( ,
Linux PGXS). memcached libmemcached
0.38.
,
Pgmemcache:

:
Listing 8.1:
$ make
$ sudo make install

1 http://memcached.org/
2 http://pgfoundry.org/projects/pgmemcache/
3 http://pgfoundry.org/frs/download.php/2672/pgmemcache_2.0.4.tar.bz2

deb ( Debian, Ubuntu)


, Debian Ubuntu,
deb
PostgreSQL:
Listing 8.2: deb
$ sudo apt-get install libmemcached-dev postgresql-server-dev-8.4 libpq-dev devscripts yada flex bison
$ make deb84
# install the deb package
$ sudo dpkg -i ../postgresql-pgmemcache-8.4*.deb

2.0.4 yada deb


:
Listing 8.3: deb
Cannot recognize source name in 'debian/changelog' at /usr/bin/yada line 145, <CHANGELOG> line 1.
make: *** [deb84] Error 9


debian/changelog , :
Listing 8.4: deb
$PostgreSQL: pgmemcache/debian/changelog,v 1.2 2010/05/05 19:56:50 ormod Exp $ <-- delete this line
pgmemcache (2.0.4) unstable; urgency=low

  * v2.0.4

, deb
.

Pgmemcache
( Pgmemcache)
:
Listing 8.5:
% psql [mydbname] [pguser]
[mydbname]=# BEGIN;
[mydbname]=# \i /usr/local/postgresql/share/contrib/pgmemcache.sql
# On Debian: \i /usr/share/postgresql/8.4/contrib/pgmemcache.sql
[mydbname]=# COMMIT;

memcached memcache_server_add
. . memcached
PostgreSQL.
, postgresql.conf :
pgmemcache shared_preload_libraries (
pgmemcache PostgreSQL)
pgmemcache custom_variable_classes (
pgmemcache)
pgmemcache.default_servers, host:port
(port - ) . :
Listing 8.6: default_servers
pgmemcache.default_servers = '127.0.0.1, 192.168.0.20:11211' # two memcached servers

pgmemcache
pgmemcache.default_behavior.
libmemcached. :
Listing 8.7: pgmemcache
pgmemcache.default_behavior = 'BINARY_PROTOCOL:1'

PostgreSQL
memcached.

pgmemcache,
memcached :
.
memcached :
Listing 8.8: memcache_stats
pgmemcache=# SELECT memcache_stats();
      memcache_stats
---------------------------

 Server: 127.0.0.1 (11211)
 pid: 1116
 uptime: 70
 time: 1289598098
 version: 1.4.5
 pointer_size: 32
 rusage_user: 0.0
 rusage_system: 0.24001
 curr_items: 0
 total_items: 0
 bytes: 0
 curr_connections: 5
 total_connections: 7
 connection_structures: 6
 cmd_get: 0
 cmd_set: 0
 get_hits: 0
 get_misses: 0
 evictions: 0
 bytes_read: 20
 bytes_written: 782
 limit_maxbytes: 67108864
 threads: 4

(1 row)

Table 8.1: pgmemcache functions

memcache_server_add(hostname:port::TEXT)
memcache_server_add(hostname::TEXT)
    Adds a memcached server to the list of available servers. If the port
    is not specified, the default port 11211 is used.

memcache_add(key::TEXT, value::TEXT, expire::TIMESTAMPTZ)
memcache_add(key::TEXT, value::TEXT, expire::INTERVAL)
memcache_add(key::TEXT, value::TEXT)
    Adds a key to the cache if the key does not already exist.

newval = memcache_decr(key::TEXT, decrement::INT4)
newval = memcache_decr(key::TEXT)
    If the key exists and holds an integer, atomically decrements it by the
    given value (by one by default). Returns the value after the decrement.

memcache_delete(key::TEXT, hold_timer::INTERVAL)
memcache_delete(key::TEXT)
    Deletes the given key. If a hold timer is given, a key with the same
    name can be added again only after the timer expires.

memcache_flush_all()
    Removes all data on all memcached servers.

value = memcache_get(key::TEXT)
    Fetches a key from the cache. Returns NULL if the key does not exist,
    otherwise returns the value as text.

memcache_get_multi(keys::TEXT[])
memcache_get_multi(keys::BYTEA[])
    Fetches an array of keys from the cache. Returns a list of records in
    the form key=value.

newval = memcache_incr(key::TEXT, increment::INT4)
newval = memcache_incr(key::TEXT)
    If the key exists and holds an integer, atomically increments it by the
    given value (by one by default). Returns the value after the increment.

memcache_replace(key::TEXT, value::TEXT, expire::TIMESTAMPTZ)
memcache_replace(key::TEXT, value::TEXT, expire::INTERVAL)
memcache_replace(key::TEXT, value::TEXT)
    Replaces the value of an existing key.

memcache_set(key::TEXT, value::TEXT, expire::TIMESTAMPTZ)
memcache_set(key::TEXT, value::TEXT, expire::INTERVAL)
memcache_set(key::TEXT, value::TEXT)
    Creates the key with the given value. If the key already exists, its
    value is replaced.

stats = memcache_stats()
    Returns statistics for all memcached servers.

memcached :
Listing 8.9:
pgmemcache=# SELECT memcache_add('some_key', 'test_value');
 memcache_add
--------------
 t
(1 row)

pgmemcache=# SELECT memcache_get('some_key');
 memcache_get
--------------
 test_value
(1 row)

memcached (
):
Listing 8.10:
pgmemcache=# SELECT memcache_add('some_seq', '10');
 memcache_add
--------------
 t
(1 row)

pgmemcache=# SELECT memcache_incr('some_seq');
 memcache_incr
---------------
            11
(1 row)

pgmemcache=# SELECT memcache_incr('some_seq');
 memcache_incr
---------------
            12
(1 row)

pgmemcache=# SELECT memcache_incr('some_seq', 10);
 memcache_incr
---------------
            22
(1 row)

pgmemcache=# SELECT memcache_decr('some_seq');
 memcache_decr
---------------
            21
(1 row)

pgmemcache=# SELECT memcache_decr('some_seq');
 memcache_decr
---------------
            20
(1 row)

pgmemcache=# SELECT memcache_decr('some_seq', 6);
 memcache_decr
---------------
            14
(1 row)

pgmemcache , ,
.
,
memcached ( ),
, . :
Listing 8.11:
CREATE OR REPLACE FUNCTION auth_passwd_upd() RETURNS TRIGGER AS $$
BEGIN
    IF OLD.passwd != NEW.passwd THEN
        PERFORM memcache_set('user_id_' || NEW.user_id || '_password', NEW.passwd);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

:
Listing 8.12:
CREATE TRIGGER auth_passwd_upd_trg AFTER UPDATE ON passwd
    FOR EACH ROW EXECUTE PROCEDURE auth_passwd_upd();

(!!!)
.
:
Listing 8.13:
CREATE OR REPLACE FUNCTION auth_passwd_upd() RETURNS TRIGGER AS $$
BEGIN
    -- In a DELETE trigger only OLD is available, so handle it separately.
    IF TG_OP = 'DELETE' THEN
        PERFORM memcache_delete('user_id_' || OLD.user_id || '_password');
        RETURN OLD;
    ELSIF OLD.passwd != NEW.passwd THEN
        PERFORM memcache_delete('user_id_' || NEW.user_id || '_password');
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

:
Listing 8.14:
CREATE TRIGGER auth_passwd_del_trg AFTER DELETE ON passwd
    FOR EACH ROW EXECUTE PROCEDURE auth_passwd_upd();

, memcached
( ) .

PostgreSQL Pgmemcache
memcached ,
.
memcached,
SQL ,
PostgreSQL.


9.1

PostgreSQL
.
.

9.2

PostGIS

License: Open Source
Link: http://www.postgis.org/
PostGIS PostgreSQL. PostGIS PostgreSQL

(), , ESRI SDE Oracle.
PostGIS OpenGIS " SQL"
.

9.3

PostPic

License: Open Source
Link: http://github.com/drotiro/postpic
PostPic PostgreSQL,
, PostGIS
. image,
(, ,
..) (, , ).

9.4

OpenFTS

License: Open Source
Link: http://openfts.sourceforge.net/
OpenFTS (Open Source Full Text Search engine)
PostgreSQL ,
.
,
.

9.5

PL/Proxy

License: Open Source
Link: http://pgfoundry.org/projects/plproxy/
PL/Proxy -
.
5.2 .

9.6

Texcaller

License: Open Source
Link: http://www.profv.de/texcaller/
Texcaller TeX,
. C, ,
, TeX. TeX
NULL,
. , ,
NOTICEs.

9.7

Pgmemcache

License: Open Source
Link: http://pgfoundry.org/projects/pgmemcache/
Pgmemcache PostgreSQL API libmemcached
memcached. PostgreSQL , , memcached. 8.2 .

9.8

Prefix

License: Open Source
Link: http://pgfoundry.org/projects/prefix
Prefix (prefix @> text). Prefix
,
/
.

9.9

pgSphere

License: Open Source
Link: http://pgsphere.projects.postgresql.org/
pgSphere PostgreSQL ,
.
( PostGIS)
.

9.10

PostgreSQL
. PostgreSQL
, , ,
.


10
PostgreSQL

,
, ,


-

, .

10.1

.
, ,
, - .
PostgreSQL . !
, ,
, PostgreSQL
.
, ,
(DELETE, DROP),
( ).

PostgreSQL:
SQL ;
;
;
.


10.2

SQL

SQL.

, . PostgreSQL
pg_dump.
pg_dump:
Listing 10.1: pg_dump
pg_dump dbname > outfile

:
Listing 10.2:
psql dbname < infile

dbname .
, ,
( ,
). ,
,
:
Listing 10.3:
psql --set ON_ERROR_STOP=on dbname < infile

,
:
Listing 10.4:
pg_dump -h host1 dbname | psql -h host2 dbname

ANALYZE,
.
, , ,
?
PostgreSQL pg_dumpall. pg_dumpall
PostgreSQL:
Listing 10.5: PostgreSQL
pg_dumpall > outfile

:
Listing 10.6: PostgreSQL
psql -f infile postgres


SQL

,
pg_dump. , pg_dump
. Unix,
. :
.
, GZIP:
Listing 10.7: PostgreSQL
pg_dump dbname | gzip > filename.gz

:
Listing 10.8: PostgreSQL
gunzip -c filename.gz | psql dbname

Listing 10.9: PostgreSQL


cat filename.gz | gunzip | psql dbname

split.
split ,
.
, 1 :
Listing 10.10: PostgreSQL
pg_dump dbname | split -b 1m - filename

:
Listing 10.11: PostgreSQL
cat filename* | psql dbname

pg_dump
PostgreSQL Zlib,
.
GZIP,
:
Listing 10.12: PostgreSQL
pg_dump -Fc dbname > filename

psql ,
pg_restore:
Listing 10.13: PostgreSQL
pg_restore -d dbname filename

, split
.

10.3


, PostgreSQL
. :
Listing 10.14: PostgreSQL
tar -cf backup.tar /usr/local/pgsql/data

, ,
, , SQL :
PostgreSQL , ,
(PostgreSQL
, ). ,

PostgreSQL.
.
, (snapshot)
( PostgreSQL). PostgreSQL
. , , ,
,
. PostgreSQL
, ,
WAL. ,
( WAL ). ,
PostgreSQL
,
(!!!).
,
.
rsync.
rsync PostgreSQL (PostgreSQL
). PostgreSQL
rsync. rsync ,
,
, .

.

10.4

PostgreSQL (Write Ahead


Log, WAL) pg_xlog ,
. .

PostgreSQL: ,
. ,

:
WAL .
,
,
WAL .
,
, :
.

( ,
).
.
WAL
, ,
PostgreSQL
( ).
,
. ,
WAL .

.
WAL
pg_xlog. postgresql.conf:
Listing 10.15:
archive_mode = on # enable archiving
archive_command = 'cp -v %p /data/pgsql/archives/%f'
archive_timeout = 300 # timeout to close buffers

( )
. rsync.
, ,
.
Listing 10.16: WAL
rsync -avz --delete prod1:/data/pgsql/archives/ \
    /data/pgsql/archives/ > /dev/null

, pg_xlog
PostgreSQL ( ).
PostgreSQL recovery.conf
:
Listing 10.17: recovery.conf
restore_command = 'cp /data/pgsql/archives/%f "%p"'

PostgreSQL
, (,
, ).
http://www.postgresql.org/docs/9.0/static
archiving.html.

10.5

, ,
, .
,
PostgreSQL (, ).


11

PostgreSQL
,
(),

.
-
,
, .

11.1

,

. -
( ,
). ,
SQL , ,
PostgreSQL , .
,
. :
?

( ..),
, ,
. , ,
.



, - - , .
, :
;
;
,
, ( Twitter Facebook ).
, .

11.2

,
, .
, .. ,
,
.


PgPool-II v.3 + Postgresql v.9 Streaming Replication
,
1 . :


(failover)


.
pgpool-II

2 .
PgPool-II v.3 + Postgresql Slony
, Slony. :
(failover)


.
pgpool-II
Postgresql 9
1 http://pgpool.projects.postgresql.org/contrib_docs/simple_sr_setting/index.html
2 http://www.slideshare.net/leopard_me/postgresql-5845499


11.3

,
( Google Analytics). (
).



. PgQ.
PgQ , PostgreSQL.
Skype. Londiste ( 4.3). :
PostgreSQL
,

PgQ , ,

(batches)
API SQL

RabbitMQ. RabbitMQ ,
(Message Oriented Middleware)
AMQP (Advanced Message Queuing Protocol). RabbitMQ
Mozilla Public License. RabbitMQ Open
Telecom Platform,
Erlang.


12

(Performance Snippets)



.
. -
.
, ,
.
(House M.D.),
1 1

12.1

PostgreSQL,
,
( ).
,
PostgreSQL.
,
PostgreSQL , .
,
1 .
1 https://github.com/le0pard/postgresql_book/issues


12.2



( ). PostgreSQL >= 8.1.
Listing 12.1: . SQL
SELECT nspname || '.' || relname AS "relation",
    pg_size_pretty(pg_relation_size(C.oid)) AS "size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  ORDER BY pg_relation_size(C.oid) DESC
  LIMIT 20;

: https://gist.github.com/910674
:
Listing 12.2: .
       relation        |    size
-----------------------+------------
 public.accounts       | 326 MB
 public.accounts_pkey  | 44 MB
 public.history        | 592 kB
 public.tellers_pkey   | 16 kB
 public.branches_pkey  | 16 kB
 public.tellers        | 16 kB
 public.branches       | 8192 bytes



. PostgreSQL >= 8.1.
Listing 12.3: . SQL
SELECT nspname || '.' || relname AS "relation",
    pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
    AND C.relkind <> 'i'
    AND nspname !~ '^pg_toast'
  ORDER BY pg_total_relation_size(C.oid) DESC
  LIMIT 20;

: https://gist.github.com/910696
:
Listing 12.4: .
            relation            | total_size
--------------------------------+------------
 public.actions                 | 4249 MB
 public.product_history_records | 197 MB
 public.product_updates         | 52 MB
 public.import_products         | 34 MB
 public.products                | 29 MB
 public.visits                  | 25 MB

count

. ,
count.
Listing 12.5: count. SQL
CREATE LANGUAGE plpgsql;
CREATE FUNCTION count_estimate(query text) RETURNS integer AS $$
DECLARE
    rec   record;
    rows  integer;
BEGIN
    FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
        rows := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
        EXIT WHEN rows IS NOT NULL;
    END LOOP;

    RETURN rows;
END;
$$ LANGUAGE plpgsql VOLATILE STRICT;

-- Testing:

CREATE TABLE foo (r double precision);
INSERT INTO foo SELECT random() FROM generate_series(1, 1000);
ANALYZE foo;

# SELECT count(*) FROM foo WHERE r < 0.1;
 count
-------
    92
(1 row)

# SELECT count_estimate('SELECT * FROM foo WHERE r < 0.1');
 count_estimate
----------------
             94
(1 row)

: https://gist.github.com/910728
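The core of count_estimate is the extraction of the planner's row estimate from the first EXPLAIN line that contains one. The Python sketch below reproduces only that parsing step on already fetched plan lines. The sample plan text is made up for the illustration (only rows=94 echoes the output above), and the function name estimate_from_plan is hypothetical.

```python
import re
from typing import Iterable, Optional

# Same idea as the ' rows=([[:digit:]]+)' pattern in the PL/pgSQL function.
ROWS_PATTERN = re.compile(r" rows=(\d+)")


def estimate_from_plan(plan_lines: Iterable[str]) -> Optional[int]:
    """Return the first planner row estimate found in EXPLAIN output lines."""
    for line in plan_lines:
        match = ROWS_PATTERN.search(line)
        if match:
            return int(match.group(1))
    return None  # no estimate found


if __name__ == "__main__":
    # Hypothetical EXPLAIN output, for illustration only.
    sample_plan = [
        "Seq Scan on foo  (cost=0.00..17.50 rows=94 width=8)",
        "  Filter: (r < 0.1::double precision)",
    ]
    print(estimate_from_plan(sample_plan))
```

Because the top plan node is printed first, stopping at the first match returns the estimate for the whole query rather than for one of its child nodes.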

-
-
( DEFAULT).
Listing 12.6: - . SQL
CREATE OR REPLACE FUNCTION ret_def(text, text, text) RETURNS text AS $$
  SELECT
    COLUMNS.column_default::text
  FROM
    information_schema.COLUMNS
  WHERE table_name = $2
    AND table_schema = $1
    AND column_name = $3
-- STABLE rather than IMMUTABLE: the result depends on the catalog contents
$$ LANGUAGE sql STABLE;

: https://gist.github.com/910749
:
Listing 12.7: - .
# SELECT ret_def('schema', 'table', 'column');

SELECT ret_def('public', 'image_files', 'id');
                 ret_def
-----------------------------------------
 nextval('image_files_id_seq'::regclass)
(1 row)

SELECT ret_def('public', 'schema_migrations', 'version');
 ret_def
---------

(1 row)



(random)
( ).
Listing 12.8: . SQL
CREATE OR REPLACE FUNCTION random(numeric, numeric)
RETURNS numeric AS
$$
    SELECT ($1 + ($2 - $1) * random())::numeric;
$$ LANGUAGE sql VOLATILE;

: https://gist.github.com/910763
:
Listing 12.9: .
SELECT random(1,10)::int, random(1,10);
 random |      random
--------+------------------
      6 | 5.11675184825435
(1 row)

SELECT random(1,10)::int, random(1,10);
 random |      random
--------+------------------
      7 | 1.37060070643201
(1 row)



, .

, ,
,
.
IBM 1960 .

,
.
http://en.wikipedia.org/wiki/Luhn_algorithm
SQL. ,
.

Listing 12.10: . SQL
CREATE OR REPLACE FUNCTION luhn_verify(int8) RETURNS BOOLEAN AS $$
-- Take the sum of the
-- doubled digits and the even-numbered undoubled digits, and see if
-- the sum is evenly divisible by ten.
SELECT
    -- Doubled digits might in turn be two digits. In that case,
    -- we must add each digit individually rather than adding the
    -- doubled digit value to the sum. Ie if the original digit was
    -- 6 the doubled result was 12 and we must add 1+2 to the
    -- sum rather than 12.
    MOD(SUM(doubled_digit / INT8 '10' + doubled_digit % INT8 '10'), 10) = 0
FROM
-- Double odd-numbered digits (counting left with
-- least significant as zero). If the doubled digits end up
-- having values
-- > 10 (ie they're two digits), add their digits together.
(SELECT
    -- Extract digit n counting left from least significant
    -- as zero
    MOD(($1::int8 / (10^n)::int8), 10::int8)
    -- Double odd-numbered digits
    * (MOD(n, 2) + 1)
    AS doubled_digit
    FROM generate_series(0, CEIL(LOG($1))::INTEGER - 1) AS n
) AS doubled_digits;

$$ LANGUAGE SQL
IMMUTABLE
STRICT;

COMMENT ON FUNCTION luhn_verify(int8) IS 'Return true iff the last digit of the
input is a correct check digit for the rest of the input according to Luhn''s
algorithm.';

CREATE OR REPLACE FUNCTION luhn_generate_checkdigit(int8) RETURNS int8 AS $$
SELECT
    -- Add the digits, doubling even-numbered digits (counting left
    -- with least significant as zero). Subtract the remainder of
    -- dividing the sum by 10 from 10, and take the remainder
    -- of dividing that by 10 in turn.
    ((INT8 '10' - SUM(doubled_digit / INT8 '10' + doubled_digit % INT8 '10') %
        INT8 '10') % INT8 '10')::INT8
FROM (SELECT
    -- Extract digit n counting left from least significant
    -- as zero
    MOD(($1::int8 / (10^n)::int8), 10::int8)
    -- double even-numbered digits
    * (2 - MOD(n, 2))
    AS doubled_digit
    FROM generate_series(0, CEIL(LOG($1))::INTEGER - 1) AS n
) AS doubled_digits;

$$ LANGUAGE SQL
IMMUTABLE
STRICT;

COMMENT ON FUNCTION luhn_generate_checkdigit(int8) IS 'For the input
value, generate a check digit according to Luhn''s algorithm';

CREATE OR REPLACE FUNCTION luhn_generate(int8) RETURNS int8 AS $$
SELECT 10 * $1 + luhn_generate_checkdigit($1);
$$ LANGUAGE SQL
IMMUTABLE
STRICT;

COMMENT ON FUNCTION luhn_generate(int8) IS 'Append a check digit generated
according to Luhn''s algorithm to the input value. The input value must be no
greater than (maxbigint/10).';

CREATE OR REPLACE FUNCTION luhn_strip(int8) RETURNS int8 AS $$
SELECT $1 / 10;
$$ LANGUAGE SQL
IMMUTABLE
STRICT;

COMMENT ON FUNCTION luhn_strip(int8) IS 'Strip the least significant digit from
the input value. Intended for use when stripping the check digit from a number
including a Luhn''s algorithm check digit.';

: https://gist.github.com/910793
:
Listing 12.11: .
SELECT luhn_verify(49927398716);
 luhn_verify
-------------
 t
(1 row)

SELECT luhn_verify(49927398714);
 luhn_verify
-------------
 f
(1 row)
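For readers who find the set-based SQL version hard to follow, the same algorithm can be written as a plain loop. The Python sketch below mirrors the function names of the SQL listing, but it is an independent illustration that works on the decimal string of the number rather than on arithmetic with powers of ten.

```python
def luhn_verify(number: int) -> bool:
    """Return True if the last digit is a valid Luhn check digit."""
    total = 0
    # Walk the digits from the least significant one (position 0).
    for position, char in enumerate(reversed(str(number))):
        digit = int(char)
        if position % 2 == 1:
            doubled = digit * 2
            # A two-digit result contributes the sum of its digits.
            total += doubled // 10 + doubled % 10
        else:
            total += digit
    return total % 10 == 0


def luhn_generate_checkdigit(payload: int) -> int:
    """Return the check digit that makes payload followed by it valid."""
    total = 0
    # The check digit will take position 0, so the payload digits shift by
    # one and the even positions of the payload are the ones doubled.
    for position, char in enumerate(reversed(str(payload))):
        digit = int(char)
        if position % 2 == 0:
            doubled = digit * 2
            total += doubled // 10 + doubled % 10
        else:
            total += digit
    return (10 - total % 10) % 10


def luhn_generate(payload: int) -> int:
    """Append the Luhn check digit to payload."""
    return 10 * payload + luhn_generate_checkdigit(payload)


if __name__ == "__main__":
    print(luhn_verify(49927398716))   # the valid number from the listing above
    print(luhn_verify(49927398714))   # the invalid one
    print(luhn_generate(4992739871))
```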



IN.
,
. :
: (2,6,4,10,25,7,9)
.. 2 2 2 6 6 4 4
Listing 12.12: . SQL

SELECT foo.* FROM foo
JOIN (SELECT id.val, row_number() over() FROM (VALUES(3),(2),(6),(1),(4)) AS
id(val)) AS id
ON (foo.catalog_id = id.val) ORDER BY row_number;

VALUES(3),(2),(6),(1),(4)
foo ,
foo.catalog_id ( foo.catalog_id
IN(3,2,6,1,4))
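The trick is to give every value in the list its position and then sort by that position. The same idea in Python, as a small sketch: the helper name order_by_id_list and the row layout are assumptions for the example, and the id list is assumed to contain no repeated values.

```python
def order_by_id_list(rows, ids, key="catalog_id"):
    """Keep only rows whose key is in ids and return them in the order of ids."""
    # Equivalent of row_number() over the VALUES list.
    position = {value: index for index, value in enumerate(ids)}
    matching = [row for row in rows if row[key] in position]
    return sorted(matching, key=lambda row: position[row[key]])


if __name__ == "__main__":
    foo = [{"catalog_id": i, "name": "item %d" % i} for i in range(1, 8)]
    for row in order_by_id_list(foo, [3, 2, 6, 1, 4]):
        print(row["catalog_id"], row["name"])
```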

,
, (. quine) (
),
.
SQL PostgreSQL.
Listing 12.13:
select a || ' from (select ' || quote_literal(a) || b || ', ' ||
quote_literal(b) || '::text as b) as quine' from
(select 'select a || '' from (select '' || quote_literal(a) || b
|| '', '' || quote_literal(b) || ''::text as b) as
quine'::text as a, '::text as a'::text as b) as quine;

: https://gist.github.com/972335

LIKE
web2.0 .
LIKE some%, some ,
. ,
( ) .
LIKE bla%
text_pattern_ops ( varchar_pattern_ops varchar).
.
Listing 12.14: LIKE
prefix_test=# create table tags (
prefix_test(#   tag        text primary key,
prefix_test(#   name       text not null,
prefix_test(#   shortname  text,
prefix_test(#   status     char default 'S',
prefix_test(#
prefix_test(#   check( status in ('S', 'R') )
prefix_test(# );
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "tags_pkey" for table "tags"
CREATE TABLE
prefix_test=# CREATE INDEX i_tag ON tags USING btree(lower(tag) text_pattern_ops);
CREATE INDEX

prefix_test=# create table invalid_tags (
prefix_test(#   tag        text primary key,
prefix_test(#   name       text not null,
prefix_test(#   shortname  text,
prefix_test(#   status     char default 'S',
prefix_test(#
prefix_test(#   check( status in ('S', 'R') )
prefix_test(# );
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "invalid_tags_pkey" for table "invalid_tags"
CREATE TABLE

prefix_test=# select count(*) from tags;
 count
-------
 11966
(1 row)

prefix_test=# select count(*) from invalid_tags;
 count
-------
 11966
(1 row)

-- TEST STANDARD LIKE

# EXPLAIN ANALYZE select * from invalid_tags where lower(tag) LIKE lower('0146%');
                              QUERY PLAN
-----------------------------------------------------------------------
 Seq Scan on invalid_tags  (cost=0.00..265.49 rows=60 width=26) (actual time=0.359..20.695 rows=1 loops=1)
   Filter: (lower(tag) ~~ '0146%'::text)
 Total runtime: 20.803 ms
(3 rows)

# EXPLAIN ANALYZE select * from invalid_tags where lower(tag) LIKE lower('0146%');
                              QUERY PLAN
-----------------------------------------------------------------------
 Seq Scan on invalid_tags  (cost=0.00..265.49 rows=60 width=26) (actual time=0.549..19.503 rows=1 loops=1)
   Filter: (lower(tag) ~~ '0146%'::text)
 Total runtime: 19.550 ms
(3 rows)

-- TEST VARIANT WITH text_pattern_ops

# EXPLAIN ANALYZE select * from tags where lower(tag) LIKE lower('0146%');
                              QUERY PLAN
-----------------------------------------------------------------------
 Bitmap Heap Scan on tags  (cost=5.49..97.75 rows=121 width=26) (actual time=0.054..0.057 rows=1 loops=1)
   Filter: (lower(tag) ~~ '0146%'::text)
   ->  Bitmap Index Scan on i_tag  (cost=0.00..5.46 rows=120 width=0) (actual time=0.032..0.032 rows=1 loops=1)
         Index Cond: ((lower(tag) ~>=~ '0146'::text) AND (lower(tag) ~<~ '0147'::text))
 Total runtime: 0.119 ms
(5 rows)

# EXPLAIN ANALYZE select * from tags where lower(tag) LIKE lower('0146%');
                              QUERY PLAN
-----------------------------------------------------------------------
 Bitmap Heap Scan on tags  (cost=5.49..97.75 rows=121 width=26) (actual time=0.025..0.025 rows=1 loops=1)
   Filter: (lower(tag) ~~ '0146%'::text)
   ->  Bitmap Index Scan on i_tag  (cost=0.00..5.46 rows=120 width=0) (actual time=0.016..0.016 rows=1 loops=1)
         Index Cond: ((lower(tag) ~>=~ '0146'::text) AND (lower(tag) ~<~ '0147'::text))
 Total runtime: 0.050 ms
(5 rows)

: https://gist.github.com/972713
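The Index Cond line in the plan shows how the planner uses the index: the pattern '0146%' is turned into the range from '0146' (inclusive) to '0147' (exclusive), which a btree can scan directly. The Python sketch below imitates this on a sorted list with binary search. It is an analogy only: the helper names are hypothetical, and it relies on plain character-by-character comparison, which is the kind of ordering text_pattern_ops provides.

```python
import bisect


def prefix_range(prefix: str):
    """Return (lower, upper) such that lower <= s < upper holds exactly
    for the strings s that start with prefix."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    # Assumption: the last character is not the highest possible code point.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


def prefix_search(sorted_values, prefix):
    """Find all values with the given prefix using two binary searches."""
    lower, upper = prefix_range(prefix)
    start = bisect.bisect_left(sorted_values, lower)
    stop = bisect.bisect_left(sorted_values, upper)
    return sorted_values[start:stop]


if __name__ == "__main__":
    tags = sorted(["0145x", "0146", "0146abc", "0147", "abc"])
    print(prefix_range("0146"))
    print(prefix_search(tags, "0146"))
```

A pattern that starts with a wildcard, such as '%0146', has no fixed lower bound, which is why this kind of index helps only prefix searches.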
