
S.NO  Command
1
$ sqoop import (generic-args) (import-args)
OR $ sqoop-import (generic-args) (import-args)

2
$ sqoop import --connect <db-URL> --username <db-username> \
--password <db-password> --table <table name> \
--target-dir <new or existing directory in HDFS>

3
$ sqoop import --connect <db-URL> --username <db-username> \
--password <db-password> --table <table name> \
--warehouse-dir <parent directory in HDFS>

4
$ sqoop import --connect <db-URL> --username <db-username> \
--password <db-password> --table <table name> \
--where <condition>

5
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
-P \
--table <table name>

6
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--password-file <file-path> \
--table <table name>

7
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--incremental <mode> \
--check-column <column name> \
--last-value <last check column value>

8
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--as-sequencefile

9
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--as-avrodatafile

10
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--compress

11
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--direct

12
$ sqoop import \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--num-mappers <number>

13
$ sqoop import-all-tables (generic-args) (import-args)
OR $ sqoop-import-all-tables (generic-args) (import-args)

14
$ sqoop import-all-tables \
--connect <db-URL> \
--username <db-username> \
--exclude-tables <table names>

15
$ sqoop job (generic-args) (job-args) [-- [subtool-name] (subtool-args)]
OR $ sqoop-job (generic-args) (job-args) [-- [subtool-name] (subtool-args)]

16
$ sqoop job --list

17
$ sqoop job --show <jobId>

18
$ sqoop job --delete <jobId>

19
$ sqoop job --exec <jobId>


20
$ sqoop export (generic-args) (export-args)
OR $ sqoop-export (generic-args) (export-args)

21
$ sqoop export \
--connect <db-URL> \
--username <db-username> \
--password <db-password> \
--table <table name> \
--export-dir <existing directory in HDFS> \
--batch

22
OR $ sqoop export \
-Dsqoop.export.records.per.statement=10 \
--connect <db-URL> \
--username <db-username> \
--password <db-password> \
--table <table name> \
--export-dir <existing directory in HDFS>

23
OR $ sqoop export \
-Dsqoop.export.statements.per.transaction=<number> \
--connect <db-URL> \
--username <db-username> \
--password <db-password> \
--table <table name> \
--export-dir <existing directory in HDFS>

24
$ sqoop export \
--connect <db-URL> \
--username <db-username> \
--table <table name> \
--columns <column-names>

Explanation
1. This syntax is used to import an individual table's data from an RDBMS into HDFS, into the default directory.

2. Syntax to specify the target directory on HDFS where Sqoop should import the table data.

3. Syntax to specify the parent directory on HDFS under which Sqoop should import the table data.

4. Syntax used to transfer only a subset of the rows, based on conditions given in a WHERE clause.

5. The -P option instructs Sqoop to read the password from standard input.

6. The --password-file parameter loads the password from a specified file on the HDFS cluster.
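
A minimal sketch of how this can be used (the file name .sqoop.pwd and the HDFS path /user/hadoop/ are assumptions for illustration, reusing the userdb/emp example below):
# write the password to a file without a trailing newline and put it on HDFS
$ echo -n "sqoop" > .sqoop.pwd
$ hadoop fs -put .sqoop.pwd /user/hadoop/.sqoop.pwd
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password-file /user/hadoop/.sqoop.pwd \
--table emp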

7. Used to import only newly added rows of a table. Requires the --incremental, --check-column, and --last-value options to perform the incremental import.

8. Used to store the output in a binary format, i.e., SequenceFile format.

9. Used to store the output in a binary format, i.e., Avro data file format.
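
As a sketch (reusing the userdb/emp example below; the target directory is an assumption), the file format is chosen simply by adding the corresponding flag:
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--as-sequencefile \
--target-dir /input/emp_seq
# use --as-avrodatafile instead of --as-sequencefile to get Avro data files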


10. With the --compress option, output files are compressed using the GZip codec by default, and all files end up with a .gz extension.

11. The --direct option improves performance when importing bulk data into HDFS and reduces the burden on the database server.

12. Sqoop by default uses four concurrent map tasks to transfer data to Hadoop. The --num-mappers parameter gives the flexibility to change the number of map tasks used per job.

13. Used to import the data of all tables at the same time from the RDBMS to HDFS. Each table's data is stored in a separate directory, and the directory name is the same as the table name.

14. The --exclude-tables parameter is used to skip a few tables while importing all tables' data from the RDBMS to HDFS.

15. Syntax for creating a Sqoop job.

16. Used to verify the list of saved Sqoop jobs.

17. Displays information about a saved job.


18. Deletes an existing job.
19. Executes a saved job.
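
For example, with the myjob job created in the example section below:
$ sqoop job --exec myjob    # runs the saved import again
$ sqoop job --delete myjob  # removes the saved job definition
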
20. Exports data back from HDFS to the RDBMS. The target table must already exist in the target database; otherwise it has to be created manually before running the export.

21. We can enable JDBC batching using the --batch parameter, which is used to insert more than one row at a time.
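
A minimal sketch (reusing the exportdb/employee export example below, with --batch added):
$ sqoop export \
--connect jdbc:mysql://localhost/exportdb \
--username root \
--password sqoop \
--table employee \
--export-dir /emp/emp_data \
--batch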

22. The property sqoop.export.records.per.statement specifies the number of records used in each INSERT statement.

23. The property sqoop.export.statements.per.transaction specifies how many INSERT statements are executed per transaction.
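
The two properties can also be combined; a sketch (the value 100 is an arbitrary assumption), keeping in mind that -D arguments come right after the tool name, before the tool-specific options:
$ sqoop export \
-Dsqoop.export.records.per.statement=100 \
-Dsqoop.export.statements.per.transaction=100 \
--connect jdbc:mysql://localhost/exportdb \
--username root \
--password sqoop \
--table employee \
--export-dir /emp/emp_data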

24. The --columns parameter specifies which columns (and in what order) are present in the Hadoop data being exported.
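
A sketch (assuming the HDFS data holds only the EmpId and EmpName columns of the target employee table from the example below):
$ sqoop export \
--connect jdbc:mysql://localhost/exportdb \
--username root \
--password sqoop \
--table employee \
--columns "EmpId,EmpName" \
--export-dir /emp/emp_data
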
Example / Result
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp
Result: the output will be a comma-separated CSV file; we can view it using the command hadoop fs -cat /emp/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp \
--target-dir /input/empresult
Result: we can view the output file using the command hadoop fs -cat /input/empresult/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp \
--warehouse-dir /input
Result: the table data is written under /input/emp; we can view the output file using the command hadoop fs -cat /input/emp/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp \
--where "City ='Hyderabad'" \
--target-dir /input/empresult/wherequery
Result: we can view the output file using the command hadoop fs -cat /input/empresult/wherequery/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
-P \
--table emp
Result: Sqoop will prompt the user with "Enter Password:"; type the password and press the Enter key.

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--incremental append \
--check-column id \
--last-value 3
Result: the output will be a comma-separated CSV file; we can view it using the command hadoop fs -cat /emp/part-m-*
This example uses the BZip2 codec instead of GZip:
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--compression-codec org.apache.hadoop.io.compress.BZip2Codec
Result: output files on HDFS will end up having the .bz2 extension.

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--direct
Result: the output will be a comma-separated CSV file; we can view it using the command hadoop fs -cat /emp/part-m-*

$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
--num-mappers 6
Result: the output will be a comma-separated CSV file; we can view it using the command hadoop fs -cat /emp/part-m-*

$ sqoop import-all-tables \
--connect jdbc:mysql://localhost/userdb \
--username root
Result: the output contains the data of all tables present in the database userdb. Each table's data is stored in a separate directory named after the table in HDFS; we can view them using the command hadoop fs -ls

$ sqoop import-all-tables \
--connect jdbc:mysql://localhost/userdb \
--username root \
--exclude-tables empContact,empSalary

$ sqoop job --create myjob \
-- import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password sqoop \
--table emp
Result: this command creates a job that imports data from the emp table in the userdb database into HDFS.

$ sqoop job --list
Result: the following is the output:
Available jobs: myjob
$ sqoop job --show myjob
Result: the output displays the job's tool and its options:
Job: myjob
Tool: import
Options:
----------------------------
direct.import = true
codegen.input.delimiters.record = 0
hdfs.append.dir = false
userdb.table = emp
...
incremental.last.value = 3

$ sqoop export \
--connect jdbc:mysql://localhost/exportdb \
--username root \
--password sqoop \
--table employee \
--export-dir /emp/emp_data
Result: the employee data is available in the emp_data file in the emp/ directory in HDFS. The command used to verify the employee table content on the mysql command line is select * from employee;
Table Content

Emp
EmpId  EmpName  City
1      Narsi    Khammam
2      Kiran    Hyderabad
3      Ananya   Hyderabad
Newly added rows:
4      abc      xyz
5      def      xxx

EmpContact
EmpId PhoneNo
1 5645357577
2 7653836599
3 5735339083
