Showing posts from November, 2013

Export more as 10k records with Sqoop

If you want to export more as 10k records out of a RDBMS via sqoop, you have some settings and properties you can tweak.

Parameter --num-mapper
Number of simultaneous connections that will be opened against database. Sqoop will use that many processes to export data (each process will export slice of the data). Here you have to take care about the max open connections to your RDBMS, since this can overwhelm the RDBMS easily.

Enabling batch mode on the JDBC driver. Here you use the JDBC batching mode, which queues the queries and deliver batched results
Number of rows that will be created for single insert statement, e.g. INSERT INTO xxx VALUES (), (), (), ... Here you have to know which VALUES you want to catch, but this

Number of insert statements per single transaction. e.g BEGIN; INSERT, INSERT, .... COMMIT
You can specify the properties (even both at the same time) in the HADOOP_ARGS secti…