Wednesday, October 15, 2014

Add Kerberos to Hadoop

Environment: CentOS 6.5; HW 1.3

Install Kerberos:

yum install krb5-server krb5-libs krb5-auth-dialog krb5-workstation
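Before initializing the KDC, the realm has to be defined in /etc/krb5.conf (with a matching realm stanza in /var/kerberos/krb5kdc/kdc.conf). A minimal sketch using the EXAMPLE.COM realm and centos.hw13 host that appear throughout this post; adjust for your own environment:

```ini
# /etc/krb5.conf (minimal sketch; values match the examples in this post)
[libdefaults]
  default_realm = EXAMPLE.COM

[realms]
  EXAMPLE.COM = {
    kdc = centos.hw13
    admin_server = centos.hw13
  }

[domain_realm]
  .hw13 = EXAMPLE.COM
  hw13 = EXAMPLE.COM
```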

Initialize the KDC database (the -s flag creates a stash file so the KDC can start without prompting for the master password):

kdb5_util create -s

Edit the Access Control List (/var/kerberos/krb5kdc/kadm5.acl in RHEL or CentOS and /var/lib/kerberos/krb5kdc/kadm5.acl in SLES ) to define the principals that have admin (modifying) access to the database. A simple example would be a single entry:

*/admin@EXAMPLE.COM *

Create the first admin principal (this must be run as the root user, since kadmin.local accesses the database directly):

[root@centos johnz]# kadmin.local -q "addprinc adminuser/admin"
Authenticating as principal root/admin@EXAMPLE.COM with password.
WARNING: no policy specified for adminuser/admin@EXAMPLE.COM; defaulting to no policy
Enter password for principal "adminuser/admin@EXAMPLE.COM":
Re-enter password for principal "adminuser/admin@EXAMPLE.COM":
Principal "adminuser/admin@EXAMPLE.COM" created.

Start Kerberos:

[root@centos johnz]# service krb5kdc start
Starting Kerberos 5 KDC:                                   [  OK  ]
[root@centos johnz]# service kadmin start
Starting Kerberos 5 Admin Server:                          [  OK  ]

Create a directory to store keytabs:

[root@centos johnz]# mkdir -p /etc/security/keytabs/
[root@centos johnz]# chown root:hadoop /etc/security/keytabs/
[root@centos johnz]# chmod 750 /etc/security/keytabs/

Add service principals for Hadoop:

[root@centos johnz]# kadmin -p adminuser/admin
Authenticating as principal adminuser/admin with password.
Password for adminuser/admin@EXAMPLE.COM:
kadmin:  addprinc -randkey nn/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey dn/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey HTTP/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey jt/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey tt/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey hbase/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey zookeeper/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey hcat/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey oozie/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey hdfs/centos.hw13@EXAMPLE.COM
kadmin:  addprinc -randkey hive/centos.hw13@EXAMPLE.COM
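The eleven addprinc commands above all follow the same pattern, so they can also be generated with a small loop (a sketch; hostname and realm as used in this post) and piped into kadmin:

```shell
host=centos.hw13
realm=EXAMPLE.COM

# print one "addprinc" line per Hadoop service principal
gen_princs() {
  for svc in nn dn HTTP jt tt hbase zookeeper hcat oozie hdfs hive; do
    echo "addprinc -randkey ${svc}/${host}@${realm}"
  done
}

gen_princs
# review the output, then feed it to kadmin, e.g.:
#   gen_princs | kadmin -p adminuser/admin
```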

Export these principals to keytab files:

kadmin:  xst -k /etc/security/keytabs/spnego.service.keytab HTTP/centos.hw13@EXAMPLE.COM
kadmin:  xst -k /etc/security/keytabs/nn.service.keytab nn/centos.hw13@EXAMPLE.COM
kadmin:  xst -k /etc/security/keytabs/dn.service.keytab dn/centos.hw13@EXAMPLE.COM
kadmin:  xst -k /etc/security/keytabs/jt.service.keytab jt/centos.hw13@EXAMPLE.COM
kadmin:  xst -k /etc/security/keytabs/tt.service.keytab tt/centos.hw13@EXAMPLE.COM
kadmin:  xst -k /etc/security/keytabs/hive.service.keytab hive/centos.hw13@EXAMPLE.COM
kadmin:  xst -k /etc/security/keytabs/oozie.service.keytab oozie/centos.hw13@EXAMPLE.COM
kadmin:  xst -k /etc/security/keytabs/hbase.service.keytab hbase/centos.hw13@EXAMPLE.COM
kadmin:  xst -k /etc/security/keytabs/zk.service.keytab zookeeper/centos.hw13@EXAMPLE.COM
kadmin:  xst -k /etc/security/keytabs/hdfs.headless.keytab hdfs/centos.hw13@EXAMPLE.COM
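The keytab exports also follow a pattern, except that three keytab names differ from the principal's service name (HTTP → spnego, zookeeper → zk, hdfs → headless). A sketch that generates the xst commands above:

```shell
host=centos.hw13
realm=EXAMPLE.COM
ktdir=/etc/security/keytabs

# map a service name to its keytab file name (three irregular cases)
keytab_for() {
  case "$1" in
    HTTP)      echo "${ktdir}/spnego.service.keytab" ;;
    zookeeper) echo "${ktdir}/zk.service.keytab" ;;
    hdfs)      echo "${ktdir}/hdfs.headless.keytab" ;;
    *)         echo "${ktdir}/$1.service.keytab" ;;
  esac
}

# print one "xst" line per exported principal
gen_xst() {
  for svc in HTTP nn dn jt tt hive oozie hbase zookeeper hdfs; do
    echo "xst -k $(keytab_for "$svc") ${svc}/${host}@${realm}"
  done
}

gen_xst
# feed into kadmin as before:  gen_xst | kadmin -p adminuser/admin
```

Afterwards, `klist -kt /etc/security/keytabs/nn.service.keytab` should list the exported principal and its key version.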

Configure Hadoop:

1. Add the following to core-site.xml:

  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>RULE:[2:$1@$0](jt@.*EXAMPLE.COM)s/.*/mapred/
RULE:[2:$1@$0](tt@.*EXAMPLE.COM)s/.*/mapred/
RULE:[2:$1@$0](nn@.*EXAMPLE.COM)s/.*/hdfs/
RULE:[2:$1@$0](dn@.*EXAMPLE.COM)s/.*/hdfs/
RULE:[2:$1@$0](hbase@.*EXAMPLE.COM)s/.*/hbase/
RULE:[2:$1@$0](oozie@.*EXAMPLE.COM)s/.*/oozie/
DEFAULT</value>
  </property>
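These rules map the first component of each service principal to the local Unix account its daemon runs as. A toy shell sketch of the mapping the rules express (illustration only; Hadoop evaluates the real rules internally, and DEFAULT handles simple user@REALM principals):

```shell
# toy re-implementation of the auth_to_local rules above (illustration only)
map_principal() {
  local first="${1%%/*}"    # first component of the principal
  first="${first%%@*}"      # also handles single-component principals
  case "$first" in
    jt|tt)  echo mapred ;;
    nn|dn)  echo hdfs   ;;
    hbase)  echo hbase  ;;
    oozie)  echo oozie  ;;
    *)      echo "$first" ;;   # roughly what DEFAULT does for user@REALM
  esac
}

map_principal nn/centos.hw13@EXAMPLE.COM    # hdfs
map_principal tt/centos.hw13@EXAMPLE.COM    # mapred
```

On a real cluster, the effective rules can be checked with `hadoop org.apache.hadoop.security.HadoopKerberosName nn/centos.hw13@EXAMPLE.COM`, which should print the resulting short name.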

2. Add the following to hdfs-site.xml:

<property>
    <name>dfs.block.access.token.enable</name>
    <value>true</value>
  </property>
<property>
        <name>dfs.namenode.kerberos.principal</name>
        <value>nn/centos.hw13@EXAMPLE.COM</value>
        <description>Kerberos principal name for the NameNode.</description>
</property>
<property>
        <name>dfs.secondary.namenode.kerberos.principal</name>
        <value>nn/centos.hw13@EXAMPLE.COM</value>
        <description>Kerberos principal name for the secondary NameNode.
        </description>
</property>
<property>
        <name>dfs.web.authentication.kerberos.principal</name>
        <value>HTTP/centos.hw13@EXAMPLE.COM</value>
        <description> The HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.
The HTTP Kerberos principal MUST start with 'HTTP/' per Kerberos HTTP SPNEGO specification.
        </description>
</property>  
<property>
        <name>dfs.datanode.kerberos.principal</name>
        <value>dn/_HOST@EXAMPLE.COM</value>
        <description>The Kerberos principal that the DataNode runs as. "_HOST" is replaced by the real host name.
        </description>
</property>  

<property>
        <name>dfs.web.authentication.kerberos.keytab</name>
        <value>/etc/security/keytabs/spnego.service.keytab</value>
        <description>The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.
        </description>
</property>  
<property>
        <name>dfs.namenode.keytab.file</name>
        <value>/etc/security/keytabs/nn.service.keytab</value>
        <description>Combined keytab file containing the NameNode service and host principals.
        </description>
</property>
<property>
        <name>dfs.secondary.namenode.keytab.file</name>
        <value>/etc/security/keytabs/nn.service.keytab</value>
        <description>The keytab file for the Secondary NameNode (this single-node setup reuses the NameNode keytab).
        </description>
</property>
<property>
        <name>dfs.datanode.keytab.file</name>
        <value>/etc/security/keytabs/dn.service.keytab</value>
        <description>The filename of the keytab file for the DataNode.
        </description>
</property>

3. Add the following to mapred-site.xml:

<property>
        <name>mapreduce.jobtracker.kerberos.principal</name>
        <value>jt/centos.hw13@EXAMPLE.COM</value>
        <description>Kerberos principal name for the JobTracker   </description>
</property>
<property>
        <name>mapreduce.tasktracker.kerberos.principal</name>  
        <value>tt/centos.hw13@EXAMPLE.COM</value>
        <description>Kerberos principal name for the TaskTracker. "_HOST" is replaced by the host name of the TaskTracker.
        </description>
</property>
<property>
        <name>mapreduce.jobtracker.keytab.file</name>
        <value>/etc/security/keytabs/jt.service.keytab</value>
        <description>The keytab for the JobTracker principal.
        </description>
</property>
<property>
        <name>mapreduce.tasktracker.keytab.file</name>
        <value>/etc/security/keytabs/tt.service.keytab</value>
        <description>The filename of the keytab for the TaskTracker</description>
</property>
<property>
        <name>mapreduce.jobhistory.kerberos.principal</name>
        <!--cluster variant -->
        <value>jt/centos.hw13@EXAMPLE.COM</value>
        <description> Kerberos principal name for JobHistory. This must map to the same user as the JobTracker user (mapred).
        </description>
</property>
<property>  
        <name>mapreduce.jobhistory.keytab.file</name>
        <!--cluster variant -->
        <value>/etc/security/keytabs/jt.service.keytab</value>
        <description>The keytab for the JobHistory principal.
        </description>
</property>  

4. Important: replace the existing local_policy.jar and US_export_policy.jar files (in the $JAVA_HOME/jre/lib/security/ directory) with the JCE Unlimited Strength Jurisdiction Policy files from the following link:

 http://www.oracle.com/technetwork/java/javase/downloads/jce-6-download-429243.html
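After swapping the jars, the active policy can be checked from the shell (a quick check, assuming JAVA_HOME points at the JDK in use; jrunscript ships with the JDK):

```shell
# print the maximum allowed AES key length under the installed policy
"$JAVA_HOME/bin/jrunscript" -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'
# 2147483647 means the unlimited policy is installed;
# 128 means the default (limited) policy is still in place
```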

5. Very important: set the following environment variable before starting the DataNode (a secure DataNode must start as root and then drop privileges to this user):

export HADOOP_SECURE_DN_USER=hdfs

6. Very important: check the permissions of the keytab files. They should be:

[root@centos ~]# ls -l /etc/security/keytabs/
total 40
-rw-------. 1 hdfs   hadoop 400 Oct 13 14:56 dn.service.keytab
-rw-------. 1 hdfs   hadoop 412 Oct 13 14:59 hdfs.headless.keytab
-rw-------. 1 mapred hadoop 400 Oct 13 14:56 jt.service.keytab
-rw-------. 1 hdfs   hadoop 400 Oct 13 14:56 nn.service.keytab
-rw-------. 1 mapred hadoop 400 Oct 13 14:56 tt.service.keytab
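If the ownership or mode differs, it can be corrected as root (user and group names taken from the listing above):

```shell
cd /etc/security/keytabs
# HDFS daemons read the nn/dn/headless keytabs, MapReduce daemons read jt/tt
chown hdfs:hadoop   nn.service.keytab dn.service.keytab hdfs.headless.keytab
chown mapred:hadoop jt.service.keytab tt.service.keytab
# 600 matches the rw------- mode shown above
chmod 600 nn.service.keytab dn.service.keytab hdfs.headless.keytab \
          jt.service.keytab tt.service.keytab
```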

7. Important: start all the services as the root user.

[root@centos security]# /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode
[root@centos security]# /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode
[root@centos security]# /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start secondarynamenode
[root@centos security]# /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start jobtracker
[root@centos security]# /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start tasktracker

Access Hadoop:

[root@centos ~]# kinit -k -t /etc/security/keytabs/hdfs.headless.keytab -p hdfs/centos.hw13@EXAMPLE.COM
[root@centos ~]# hadoop fs -ls /user/hive
Found 1 items
drwxr-xr-x   - hdfs supergroup          0 2014-10-03 17:13 /user/hive/warehouse
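The kinit above caches a ticket-granting ticket that the hadoop client then uses. Two related commands worth knowing:

```shell
klist      # list the cached ticket (should show hdfs/centos.hw13@EXAMPLE.COM)
kdestroy   # discard the ticket; hadoop fs commands will fail again until the next kinit
```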
