Sqoop is a very popular tool to import data from RDMS e.g: Oracle, MySQL, DB2 to HDFS or you can export data from HDFS to RDMS.
Problem definition: In Sqoop import/export we need to provide the RDBMS credentials. How do we secure the password to restrict the access of unauthorized users?
Solution:
1. Using hadoop credential-provider API.
Here are the steps:
hadoop credential create oracle.john -provider jceks://hdfs/user/test/sqoop/pass.jceks
-- Enter the RDMS password
After that, you will get the success message as follows -
oracle.john -provider has been successfully created.
Provider jceks://hdfs/user/test/sqoop/pass.jceks has been updated.
Now in Sqoop import command, you need to pass the password as follows:
-Dhadoop.security.credential.provider.path=jceks://hdfs/user/test/sqoop/pass.jceks \
--connect jdbc:oracle:thin:@server:1521:xyz \
--username use_name \
--password-alias oracle.john \
--table db_name.table_name \
--mapreduce-job-name table_name_LOAD \
--delete-target-dir \
--fields-terminated-by '\001' \
--null-string "" \
--null-non-string "" \
--hive-drop-import-delims \
--escaped-by "\\" \
--split-by "col_name" \
--num-mappers 1
2. Using Option file.
Before starting the sqoop process we can generate option file and after completion of the job we can delete the option file.
Thanks a lot for the read.
Regards,
Ratan Kumar Nath
Email:ratanKrNath@gmail.com
Problem definition: In Sqoop import/export we need to provide the RDBMS credentials. How do we secure the password to restrict the access of unauthorized users?
Solution:
1. Using hadoop credential-provider API.
Here are the steps:
hadoop credential create oracle.john -provider jceks://hdfs/user/test/sqoop/pass.jceks
-- Enter the RDMS password
After that, you will get the success message as follows -
oracle.john -provider has been successfully created.
Provider jceks://hdfs/user/test/sqoop/pass.jceks has been updated.
Now in Sqoop import command, you need to pass the password as follows:
-Dhadoop.security.credential.provider.path=jceks://hdfs/user/test/sqoop/pass.jceks \
--connect jdbc:oracle:thin:@server:1521:xyz \
--username use_name \
--password-alias oracle.john \
--table db_name.table_name \
--mapreduce-job-name table_name_LOAD \
--delete-target-dir \
--fields-terminated-by '\001' \
--null-string "" \
--null-non-string "" \
--hive-drop-import-delims \
--escaped-by "\\" \
--split-by "col_name" \
--num-mappers 1
2. Using Option file.
Before starting the sqoop process we can generate option file and after completion of the job we can delete the option file.
Thanks a lot for the read.
Regards,
Ratan Kumar Nath
Email:ratanKrNath@gmail.com