
S.NO  Command
1
$ sqoop import (generic-args) (import-args)
OR $ sqoop-import (generic-args) (import-args)

2
$ sqoop import --connect <db-URL> --username <db-username> \
--password <db-password> --table <table name> \
--target-dir <new or existing directory in HDFS>

3
$ sqoop import --connect <db-URL> --username <db-username> \
--password <db-password> --table <table name> \
--warehouse-dir <parent directory in HDFS>

4
$ sqoop import --connect <db-URL> --username <db-username> \
--password <db-password> --table <table name> \
--where <condition>

5
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
-P \
--table <table name>

6
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--password-file <file-path> \
--table <table name>

7
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--incremental <mode> \
--check-column <column name> \
--last-value <last check column value>

8
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--as-sequencefile

9
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--as-avrodatafile

10
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--compress

11
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--direct

12
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--num-mappers <number>

13
$ sqoop import-all-tables (generic-args) (import-args)
OR $ sqoop-import-all-tables (generic-args) (import-args)

14
$ sqoop import-all-tables \
--connect <db-URL> \
--username <db-username> \
--exclude-tables <table names>

15
$ sqoop job (generic-args) (job-args) [-- [subtool-name] (subtool-args)]
OR $ sqoop-job (generic-args) (job-args) [-- [subtool-name] (subtool-args)]

16
$ sqoop job --list

17
$ sqoop job --show <jobId>

18
$ sqoop job --delete <jobId>

19
$ sqoop job --exec <jobId>


20
$ sqoop export (generic-args) (export-args)
OR $ sqoop-export (generic-args) (export-args)

21
$ sqoop export \
--connect <db-URL> \
--username <db-username> \
--password <db-password> \
--table <table name> \
--export-dir <existing directory in HDFS> \
--batch

22
OR $ sqoop export \
-Dsqoop.export.records.per.statement=10 \
--connect <db-URL> \
--username <db-username> \
--password <db-password> \
--table <table name> \
--export-dir <existing directory in HDFS>

23
OR $ sqoop export \
-Dsqoop.export.statements.per.transaction=<number> \
--connect <db-URL> \
--username <db-username> \
--password <db-password> \
--table <table name> \
--export-dir <existing directory in HDFS>

24
$ sqoop export \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--columns <column-names>

Explanation
1. This syntax is used to import an individual table's data from an RDBMS into HDFS, into the default directory.

2. Syntax to specify the target directory on HDFS where Sqoop should import the table data.

3. Syntax to specify the parent directory on HDFS under which Sqoop should import the table data.

4. Syntax used to transfer only a subset of the rows, based on conditions given in a WHERE clause.

5. The -P option instructs Sqoop to read the password from standard input.

6. The --password-file parameter loads the password from a specified file on the HDFS cluster.
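
A minimal sketch of how this can be used (the file name .sqoop.pwd and the HDFS path /user/hadoop/ are assumptions for illustration, reusing the userdb/emp example below):
# write the password to a file without a trailing newline and put it on HDFS
$ echo -n "sqoop" > .sqoop.pwd
$ hadoop fs -put .sqoop.pwd /user/hadoop/.sqoop.pwd
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password-file /user/hadoop/.sqoop.pwd \
--table emp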

7. Used to import only newly added rows of a table. Requires the --incremental, --check-column, and --last-value options to perform the incremental import.

8. Used to store the output in a binary format, i.e., SequenceFile format.

9. Used to store the output in a binary format, i.e., Avro data file format.
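
As a sketch (reusing the userdb/emp example below; the target directory is an assumption), the file format is chosen simply by adding the corresponding flag:
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--as-sequencefile \
--target-dir /input/emp_seq
# use --as-avrodatafile instead of --as-sequencefile to get Avro data files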


10. With the --compress option, output files are compressed using the GZip codec by default, and all files end up with a .gz extension.

11. The --direct option improves performance when importing bulk data into HDFS and reduces the burden on the database server.

12. Sqoop by default uses four concurrent map tasks to transfer data to Hadoop. The --num-mappers parameter gives the flexibility to change the number of map tasks used per job.

13. Used to import the data of all tables at the same time from the RDBMS to HDFS. Each table's data is stored in a separate directory, and the directory name is the same as the table name.

14. The --exclude-tables parameter is used to skip a few tables while importing all tables' data from the RDBMS to HDFS.

15. Syntax for creating a Sqoop job.

16. Used to verify the list of saved Sqoop jobs.

17. Displays information about a saved job.


18. Deletes an existing job.
19. Executes a saved job.
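
For example, with the myjob job created in the example section below:
$ sqoop job --exec myjob    # runs the saved import again
$ sqoop job --delete myjob  # removes the saved job definition
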
20. Exports data back from HDFS to the RDBMS. The target table must already exist in the target database; otherwise it has to be created manually before running the export.

21. We can enable JDBC batching using the --batch parameter, which is used to insert more than one row at a time.
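
A minimal sketch (reusing the exportdb/employee export example below, with --batch added):
$ sqoop export \
--connect jdbc:mysql://localhost/exportdb \
--username root \
--password sqoop \
--table employee \
--export-dir /emp/emp_data \
--batch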

22. The property sqoop.export.records.per.statement specifies the number of records used in each INSERT statement.

23. The property sqoop.export.statements.per.transaction specifies how many INSERT statements are executed per transaction.
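
The two properties can also be combined; a sketch (the value 100 is an arbitrary assumption), keeping in mind that -D arguments come right after the tool name, before the tool-specific options:
$ sqoop export \
-Dsqoop.export.records.per.statement=100 \
-Dsqoop.export.statements.per.transaction=100 \
--connect jdbc:mysql://localhost/exportdb \
--username root \
--password sqoop \
--table employee \
--export-dir /emp/emp_data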

24. The --columns parameter specifies which columns (and in what order) are present in the Hadoop data being exported.
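
A sketch (assuming the HDFS data holds only the EmpId and EmpName columns of the target employee table from the example below):
$ sqoop export \
--connect jdbc:mysql://localhost/exportdb \
--username root \
--password sqoop \
--table employee \
--columns "EmpId,EmpName" \
--export-dir /emp/emp_data
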
Example / Result
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp
Result: the output will be a comma-separated CSV file; we can view it using the command hadoop fs -cat /emp/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp \
--target-dir /input/empresult
Result: we can view the output file using the command hadoop fs -cat /input/empresult/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp \
--warehouse-dir /input
Result: the table data is written under /input/emp; we can view the output file using the command hadoop fs -cat /input/emp/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp \
--where "City ='Hyderabad'" \
--target-dir /input/empresult/wherequery
Result: we can view the output file using the command hadoop fs -cat /input/empresult/wherequery/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
-P \
--table emp
Result: Sqoop will prompt the user with "Enter Password:"; type the password and press the Enter key.

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--incremental append \
--check-column id \
--last-value 3
Result: the output will be a comma-separated CSV file; we can view it using the command hadoop fs -cat /emp/part-m-*
This example uses the BZip2 codec instead of GZip:
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--compression-codec org.apache.hadoop.io.compress.BZip2Codec
Result: output files on HDFS will end up having the .bz2 extension.

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--direct
Result: the output will be a comma-separated CSV file; we can view it using the command hadoop fs -cat /emp/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--num-mappers 6
Result: the output will be a comma-separated CSV file; we can view it using the command hadoop fs -cat /emp/part-m-*

$ sqoop import-all-tables \
--connect jdbc:mysql://localhost/userdb \
--username root
Result: the output contains the data of all tables present in the database userdb. Each table's data is stored in a separate directory named after the table in HDFS; we can view them using the command hadoop fs -ls

$ sqoop import-all-tables \
--connect jdbc:mysql://localhost/userdb \
--username root \
--exclude-tables empContact,empSalary

$ sqoop job --create myjob \
-- import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp
Result: this command creates a job that imports data from the emp table in the userdb database into HDFS.

$ sqoop job --list
Result: the following is the output:
Available jobs: myjob
$ sqoop job --show myjob
Result: the output displays the job's tool and its options:
Job: myjob
Tool: import
Options:
----------------------------
direct.import = true
codegen.input.delimiters.record = 0
hdfs.append.dir = false
userdb.table = emp
...
incremental.last.value = 3

$ sqoop export \
--connect jdbc:mysql://localhost/exportdb \
--username root \
--password sqoop \
--table employee \
--export-dir /emp/emp_data
Result: the employee data is available in the emp_data file in the emp/ directory in HDFS. The command used to verify the employee table content on the mysql command line is select * from employee;
Table Content

Emp
EmpId  EmpName  City
1      Narsi    Khammam
2      Kiran    Hyderabad
3      Ananya   Hyderabad
Newly added rows:
4      abc      xyz
5      def      xxx

EmpContact
EmpId PhoneNo
1 5645357577
2 7653836599
3 5735339083
