[Kerberos] Quick Start and Usage Notes

2023-12-30 16:16:22

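Preface: these notes sat in Evernote for a while. I happened to come across them again and am sharing them for anyone who has been tormented by the so-called "classic dialogue" about Kerberos. The original dialogue is largely fine, but not every reader has the patience for it. If you have studied LDAP, what follows may be easier to understand.

I. Derivation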

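To keep the flow as simple as possible, here is a short derivation:

1. Asymmetric encryption is not a good fit for a cluster, so authentication goes through a central authority. Here that authority is the KDC.
2. Every mechanism exists to protect the authority of that center.
3. The center vouches for the identities of both client and server. The server trusts the KDC (obviously), the client trusts the KDC, and the server and client do not trust each other.
4. The KDC stores the server's key. Only the server and the KDC know it, so they trust each other.
5. The KDC stores the client's key. Only the client and the KDC know it, so they trust each other.
6. The client asks the KDC for a ticket. The KDC sends back a session key encrypted with the client's key, together with a blob the client cannot read: information for the server, encrypted with the server's key so that only the server can open it.
7. The client decrypts its part, keeps the session key, encrypts a short message with the session key, and sends that message plus the unreadable blob to the server.
8. The server decrypts the ticket with its own key and finds information addressed to itself. Because the client could not have opened that blob, the server trusts the client. The ticket also gives the server the session key, which it uses to decrypt the client's message.
9. The client receives the server's reply (the message the client had encrypted with the session key) and trusts the server.
10. Throughout the exchange, neither client nor server learns any of the other's secrets. The only thing that passes between them is that reply.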
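The ten steps above can be walked through in a small simulation. This is only a sketch under loud assumptions: `seal`/`unseal` are a toy cipher built from SHA-256 and HMAC (not the encryption types real Kerberos uses), the `KDC` class and the principal names `alice` and `hdfs` are made up for illustration, and timestamps, the TGT and the TGS are left out entirely.

```python
import hashlib
import hmac
import json
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudo-random byte stream from key and nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, message: dict) -> bytes:
    """Toy symmetric encryption: XOR with a keystream, plus an HMAC tag."""
    plaintext = json.dumps(message).encode()
    nonce = os.urandom(16)
    body = bytes(p ^ s for p, s in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def unseal(key: bytes, blob: bytes) -> dict:
    """Reverse of seal(); raises ValueError when the key does not match."""
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered message")
    plaintext = bytes(b ^ s for b, s in zip(body, _keystream(key, nonce, len(body))))
    return json.loads(plaintext)


class KDC:
    """Hypothetical KDC that shares one long-term key with each principal (steps 4-5)."""

    def __init__(self):
        self.keys = {}

    def register(self, name: str) -> bytes:
        self.keys[name] = os.urandom(32)
        return self.keys[name]

    def issue(self, client: str, server: str):
        # Step 6: one part for the client, one part only the server can open.
        session_key = os.urandom(32).hex()
        for_client = seal(self.keys[client], {"session_key": session_key, "server": server})
        ticket = seal(self.keys[server], {"session_key": session_key, "client": client})
        return for_client, ticket


def run_exchange():
    kdc = KDC()
    client_key = kdc.register("alice")
    server_key = kdc.register("hdfs")
    for_client, ticket = kdc.issue("alice", "hdfs")

    # Step 7: the client recovers the session key but cannot read the ticket.
    session_key = bytes.fromhex(unseal(client_key, for_client)["session_key"])
    try:
        unseal(client_key, ticket)
        client_can_read_ticket = True
    except ValueError:
        client_can_read_ticket = False
    message = seal(session_key, {"client": "alice", "text": "hello"})

    # Step 8: the server opens the ticket and uses the session key inside it.
    ticket_body = unseal(server_key, ticket)
    server_session_key = bytes.fromhex(ticket_body["session_key"])
    said = unseal(server_session_key, message)
    if said["client"] != ticket_body["client"]:
        raise ValueError("message and ticket name different clients")
    reply = seal(server_session_key, {"echo": said["text"]})

    # Step 9: the client checks that the reply echoes its own message.
    return client_can_read_ticket, unseal(session_key, reply)["echo"]
```

Calling `run_exchange()` shows the two properties the derivation relies on: the client cannot open the ticket, and the server can only answer correctly if it really holds the key the KDC has on file for it.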

II. Simplified flow

[Figure: the simplified Kerberos flow]

The credentials actually obtained are as follows:

1. Ticket-granting ticket (TGT)

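A Kerberos ticket has a lifetime. Once that time has passed the ticket expires, and it must be requested again or renewed.

ticket_lifetime: how long a ticket is valid

renew_lifetime: the maximum total time a ticket can be kept alive through renew operations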
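How the two settings interact can be sketched in a few lines. This is a simplified model, not KDC code: the `Ticket` class and the `issue` and `renew` helpers are hypothetical names, and the only rules modelled are that a ticket must still be valid to be renewed and that no renewal can push the end time past the renew limit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Ticket:
    end_time: datetime    # when the ticket stops being valid
    renew_till: datetime  # hard limit that renewals cannot pass


def issue(now: datetime, ticket_lifetime: timedelta, renew_lifetime: timedelta) -> Ticket:
    """Issue a ticket valid for ticket_lifetime, renewable until now + renew_lifetime."""
    return Ticket(end_time=now + ticket_lifetime, renew_till=now + renew_lifetime)


def renew(ticket: Ticket, now: datetime, ticket_lifetime: timedelta) -> Ticket:
    """Extend a still-valid ticket, capped at its renew_till limit."""
    if now >= ticket.end_time:
        raise ValueError("ticket already expired; request a new one")
    new_end = min(now + ticket_lifetime, ticket.renew_till)
    return Ticket(end_time=new_end, renew_till=ticket.renew_till)
```

With a 24 hour ticket_lifetime and a 7 day renew_lifetime, a renewal after 20 hours moves the end time to hour 44, while a renewal late on day 7 is cut off at exactly 7 days after issue.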

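2. Requesting a service ticket

The ticket obtained in this step is the service ticket (Service Ticket, ST); the client presents its TGT to get it.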

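III. Extension: delegation tokens

The motivation is to reduce load on the KDC. Every request to a server needs a service ticket, and service tickets are typically valid for only a few minutes, so in a large, heavily used cluster tens of thousands of KDC requests per minute are possible.

To optimize this, Hadoop Security introduced the delegation token mechanism. After one Kerberos authentication, most later interactions can skip the steps above: client and server use the token for access instead. The token has its own validity period.

Delegation tokens complement Kerberos as a lightweight authentication mechanism. Kerberos is a three-party protocol, whereas delegation tokens involve only two parties.

Authentication with delegation tokens works as follows:

1. The client authenticates to the server through Kerberos (TGT authentication succeeds) and obtains delegation tokens from the server.
2. All later authentication between client and server uses the delegation tokens rather than Kerberos.

--> Note: client and server here do not mean the submitting program and YARN. They are simply the two sides, client and server, that have authenticated through Kerberos.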

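The client can pass its delegation tokens on to other services (for example YARN), so that those services (for example MapReduce tasks) authenticate as the client. In other words, the client "delegates" its credentials to them. A delegation token has an expiration time and must be renewed periodically to stay valid. It cannot be renewed indefinitely, though: a maximum lifetime caps it. A delegation token can also be cancelled before it expires.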
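The three rules in this paragraph (periodic renewal, a maximum lifetime, and cancellation) fit in a small model. The sketch below is an illustration only: `DelegationToken` here is a hypothetical Python class, not Hadoop's token implementation, and times are plain numbers in hours to keep the arithmetic visible.

```python
class DelegationToken:
    """Toy model of a renewable token with a hard maximum lifetime."""

    def __init__(self, issued_at: float, renew_interval: float, max_lifetime: float):
        self.renew_interval = renew_interval
        self.max_date = issued_at + max_lifetime  # renewals can never pass this
        self.expiry = min(issued_at + renew_interval, self.max_date)
        self.cancelled = False

    def is_valid(self, now: float) -> bool:
        return not self.cancelled and now < self.expiry

    def renew(self, now: float) -> float:
        """Push the expiry forward by one interval, capped at max_date."""
        if not self.is_valid(now):
            raise ValueError("token expired or cancelled; authenticate again")
        self.expiry = min(now + self.renew_interval, self.max_date)
        return self.expiry

    def cancel(self) -> None:
        """Invalidate the token immediately, even if it has not expired."""
        self.cancelled = True
```

For example, a token with a 24 hour renew interval and a 30 hour maximum lifetime that is renewed at hour 20 expires at hour 30, not hour 44, and no later renewal can extend it.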

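Each component has its own token renewal interval. For HDFS it is dfs.namenode.delegation.token.renew-interval, 1 day by default; for HBase it is hbase.auth.key.update.interval, also 1 day by default. The renewal is scheduled at 75% of the smallest of these intervals.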
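As a worked example of that rule, the sketch below takes the two defaults named above and computes the delay before the next scheduled renewal. The helper name `next_renewal_delay_ms` is an assumption for illustration; only the property names, the 1 day defaults, and the 75% factor come from the text.

```python
DAY_MS = 24 * 60 * 60 * 1000  # one day in milliseconds


def next_renewal_delay_ms(renew_intervals_ms: dict, fraction: float = 0.75) -> int:
    """Return the given fraction of the smallest renewal interval."""
    return int(min(renew_intervals_ms.values()) * fraction)


intervals = {
    "dfs.namenode.delegation.token.renew-interval": DAY_MS,
    "hbase.auth.key.update.interval": DAY_MS,
}
delay_ms = next_renewal_delay_ms(intervals)
print(delay_ms / 3_600_000)  # 18.0 (hours)
```

With both intervals at their 1 day default, the renewal therefore runs every 18 hours, leaving a 6 hour margin before the tokens would expire.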

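(1) Spark delegation tokens

As noted above, a token with a validity period will eventually expire. For long-running workloads such as Spark Streaming and HBase, that is unacceptable.

When a token is about to expire, Spark does not rely on the server side (HDFS) to keep renewing it. Renewal would work until the token reaches its maximum lifetime, but Spark simply skips token renewal altogether and treats every case the same way. Since the user program cannot be resubmitted either, the only option is to authenticate again:

1. Spark authenticates to HDFS through Kerberos (logging in again for a fresh TGT) and obtains new delegation tokens from HDFS.
2. All later authentication between Spark and HDFS uses the delegation tokens rather than Kerberos.

To keep the token renewal cycle and the ticket cycle from drifting apart, Spark logs in again each time a token is about to expire (it does not renew the TGT) and uses the new TGT to generate new tokens.

The following is quoted from a comment in the source code: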

// HACK:
// HDFS will not issue new delegation tokens, if the Credentials object
// passed in already has tokens for that FS even if the tokens are expired (it really only
// checks if there are tokens for the service, and not if they are valid). So the only real
// way to get new tokens is to make sure a different Credentials object is used each time to
// get new tokens and then the new tokens are copied over the the current user's Credentials.
// So:
// - we login as a different user and get the UGI
// - use that UGI to get the tokens (see doAs block below)
// - copy the tokens over to the current user's credentials (this will overwrite the tokens
// in the current user's Credentials object for this FS).
// The login to KDC happens each time new tokens are required, but this is rare enough to not
// have to worry about (like once every day or so). This makes this code clearer than having
// to login and then relogin every time (the HDFS API may not relogin since we don't use this
// UGI directly for HDFS communication.
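The workaround described in that comment can be imitated with stand-in classes. Everything below is hypothetical: `Credentials` and `FakeFileSystem` are tiny Python mock-ups that only copy the one behaviour the comment describes (no new token is issued when the credentials object already holds one for the service), and the separate Kerberos login is left out.

```python
import itertools


class Credentials:
    """Stand-in for a credentials object: service name -> token."""

    def __init__(self):
        self.tokens = {}


class FakeFileSystem:
    """Stand-in for a file system that hands out delegation tokens."""

    def __init__(self, service: str):
        self.service = service
        self._ids = itertools.count(1)

    def add_delegation_tokens(self, creds: Credentials) -> list:
        # Only checks whether a token for the service exists, not whether it is valid.
        if self.service in creds.tokens:
            return []
        token = f"{self.service}-token-{next(self._ids)}"
        creds.tokens[self.service] = token
        return [token]


def refresh_tokens(fs: FakeFileSystem, current: Credentials) -> None:
    """Get tokens into a fresh Credentials object, then copy them over."""
    fresh = Credentials()
    fs.add_delegation_tokens(fresh)
    current.tokens.update(fresh.tokens)  # overwrites the old token for this service
```

Asking for tokens a second time with the same credentials object returns nothing, while `refresh_tokens` replaces the stored token, which is the point of using a different object on every refresh.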

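(2) Driver and executor

In yarn-client mode the driver starts inside the YARN client process, and it too needs to access business-layer services and cluster components such as HDFS.

The driver keeps its tokens valid by reading the credentials file that the AM refreshes under an HDFS path.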

// SPARK-8851: In yarn-client mode, the AM still does the credentials refresh. The driver
// reads the credentials from HDFS, just like the executors and updates its own credentials cache.
if (conf.contains("spark.yarn.credentials.file")) {
  YarnSparkHadoopUtil.startCredentialUpdater(conf)
}

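The ApplicationMaster periodically renews the tokens, writes them to a file in the relevant HDFS directory for the executors to pick up, and cleans up old files.

Because of how client mode works, the driver can be treated as a special executor, since the ApplicationMaster runs on YARN.

In yarn-cluster mode the driver runs inside the ApplicationMaster's JVM, and the AM handles all of its security-related work.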
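The write-then-read pattern between the AM and the driver or executors can be sketched with a local directory standing in for HDFS. This is an assumption-heavy illustration: the file naming scheme `credentials-NNNNNN`, the `keep` parameter, and both function names are invented here and are not Spark's actual layout.

```python
import tempfile
from pathlib import Path


def write_credentials(directory: Path, version: int, payload: bytes, keep: int = 2) -> Path:
    """Writer side (the AM): publish a new file, then delete old versions."""
    target = directory / f"credentials-{version:06d}"
    tmp = directory / (target.name + ".tmp")
    tmp.write_bytes(payload)
    tmp.rename(target)  # readers never see a half-written file
    files = sorted(directory.glob("credentials-??????"))
    for old in files[:-keep]:
        old.unlink()
    return target


def read_latest(directory: Path, last_seen: int):
    """Reader side (driver or executor): load the newest file if it is new."""
    files = sorted(directory.glob("credentials-??????"))
    if not files:
        return last_seen, None
    latest = files[-1]
    version = int(latest.name.rsplit("-", 1)[1])
    if version <= last_seen:
        return last_seen, None
    return version, latest.read_bytes()


with tempfile.TemporaryDirectory() as d:
    shared = Path(d)
    write_credentials(shared, 1, b"tokens-v1")
    seen, tokens = read_latest(shared, 0)
```

Keeping more than one old file around gives a slow reader time to finish with the previous version before it disappears.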

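(3) UGI

In the code, UGI stands for UserGroupInformation. Its main jobs are:

1. Logging in
2. Starting a background thread that refreshes the TGT
3. Supporting a manual refresh of the TGT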

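IV. Configuration

Realm: a logical network that defines a group of systems belonging to the same master KDC, similar to a domain.

It is configured in krb5.conf.

keytab file:

A keytab file is really just a password file. Changing lifetime-related settings has nothing to do with passwords, so existing keytab files do not need to be regenerated.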

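V. Commands

The KDC (server-side) commands are separate from the client commands.

This matches the split into separate packages, krb5-admin-server, krb5-kdc, and krb5-user, when installing with apt-get.

Of these:

The server-side KDC consists of two commands, kadmind and krb5kdc:

kadmind	starts the administration service and controls permissions
krb5kdc	starts the KDC service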

For installation, see:

(short Chinese guide)

https://www.jianshu.com/p/f84c3668272b

(complete MIT guide)

https://web.mit.edu/kerberos/krb5-latest/doc/admin/install_kdc.html

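Common client commands:

kinit	obtains a ticket, i.e. the TGT from step 1; later exchanges are handled behind the scenes
klist	lists the cached tickets by default
kdestroy	destroys all tickets
kpasswd	changes the password
kadmin	generates keytabs

On Linux, install krb5-user with your package manager; for Windows, an installer is available on the official site.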
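For scripted use, the client commands above are usually driven with a keytab instead of a typed password. The sketch below only builds the command lines and runs them when the binary is present. It assumes the MIT command-line flags (`kinit -k -t` for a keytab login, `klist -k -t` to list keytab entries with timestamps); the helper names, the principal, and the keytab path are placeholders.

```python
import shutil
import subprocess


def kinit_with_keytab(principal: str, keytab: str) -> list:
    """Command line for a non-interactive login with a keytab (MIT kinit flags)."""
    return ["kinit", "-k", "-t", keytab, principal]


def klist_keytab(keytab: str) -> list:
    """Command line that lists the entries stored in a keytab."""
    return ["klist", "-k", "-t", keytab]


def run_if_available(argv: list):
    """Run the command only if its binary is on PATH; otherwise return None."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, capture_output=True, text=True, check=False)


# Placeholder principal and keytab path, for illustration only.
login_cmd = kinit_with_keytab("alice@EXAMPLE.COM", "/path/to/alice.keytab")
```

Passing `login_cmd` to `run_if_available` performs the login on a machine with the client tools installed, and a plain `klist` afterwards shows the cached TGT.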

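VI. krb5 from different vendors

krb5 is available from many vendors. Their implementations are largely the same, but there are some differences, mainly in configuration, that is, the configuration used when talking to the KDC.

Take klist as an example. Most vendors implement this command, so a single machine may have several copies of klist.

In order, they are:

1. The klist built into Windows (compatible with Windows' own krb5 and with MIT krb5):

The ksetup in the same directory reads C:\ProgramData\MIT\Kerberos5\krb5.ini and C:\windows\krb5.ini

2. The one bundled with the JVM, implemented in Java:

The kinit in the same directory reads the path set by java.security.krb5.conf=/dir, or by default C:\windows\krb5.ini

See Java security (JAAS) for details.

3. The MIT version for Windows:

The kinit in the same directory reads C:\ProgramData\MIT\Kerberos5\krb5.ini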

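4. The MIT version for Linux, and other vendors' versions for Linux:

The kinit in the same directory reads /etc/krb5.conf

Many configuration options in the MIT version do not work with other vendors' implementations, but that is not covered here.

5. SunOS:

The kinit in the same directory reads /etc/krb5/krb5.conf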
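When debugging which configuration a given tool actually picks up, it helps to have the list above in one place. The sketch below encodes only the paths named in this section; the implementation labels, the `override` parameter (standing in for a setting such as java.security.krb5.conf), and the function names are assumptions made for illustration.

```python
import os

# Default configuration files per implementation, as listed in this section.
CONFIG_LOCATIONS = {
    "windows-builtin": [r"C:\ProgramData\MIT\Kerberos5\krb5.ini", r"C:\windows\krb5.ini"],
    "jvm": [r"C:\windows\krb5.ini"],
    "mit-windows": [r"C:\ProgramData\MIT\Kerberos5\krb5.ini"],
    "linux": ["/etc/krb5.conf"],
    "sunos": ["/etc/krb5/krb5.conf"],
}


def candidate_configs(implementation: str, override: str = None) -> list:
    """Paths to check, with an explicitly configured path taking priority."""
    paths = list(CONFIG_LOCATIONS[implementation])
    if override:
        paths.insert(0, override)
    return paths


def first_existing(paths: list):
    """Return the first path that exists on this machine, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None
```

For instance, `first_existing(candidate_configs("linux"))` returns /etc/krb5.conf on a machine where that file is present and None otherwise.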

References:

https://www.jianshu.com/p/ae5a3f39a9af

https://www.jianshu.com/p/7fe5351399a8

https://github.com/cloudera/spark/blob/cdh5-1.6.0_5.13.0/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
