Getting started with Kubernetes Auditing

2022-08-10 20:57:00 ghostwritten

tags: Auditing, objects


1. Introduction

Kubernetes is a container orchestration tool that reduces the complexity of deploying and managing containerized applications. It is quickly becoming the industry standard for deploying and managing large applications and microservices, and organizations use it widely to manage their applications in the cloud and on premises.

As a Kubernetes administrator, you must keep a record of the events in your cluster. These records are the source of truth for debugging problems and improving cluster security. Kubernetes auditing records what is done (or what someone tries to do) in your cluster.

In this article, you will learn what Kubernetes auditing is, why it matters, and how to enable audit logs in a Kubernetes cluster.

Logs are naturally the core of audit logging. Throughout the Kubernetes runtime, the system is set up to track performance metrics, metadata, and any important administrative activity. From the point of capture, these logs are "streamed" to their storage destination, where they can be used for retrospective analysis. Each entry carries a timestamp to add context.

During an audit, those with access to the logging history look for anything that stands out or that points to suspicious activity. This may include unexpected logins or a system crash.

An audit log is a set of records containing a chronological list of all requests made to the Kubernetes API. Kubernetes stores the actions generated by each user and by the control plane. These logs are in JSON format and contain information such as the HTTP method, the initiating user, the request that was made, and the Kubernetes component that handled the request. The sources of requests include:

  • The control plane (built-in controllers, the scheduler)
  • Node daemons (the kubelet, kube-proxy, and others)
  • Cluster services (e.g., the cluster autoscaler, kube-state-metrics, CoreDNS, etc.)
  • Users making kubectl requests
  • Applications, controllers, and operators that send requests through a kube client
  • Even the API server itself


Auditing allows cluster administrators to answer the following questions:

  • What happened in Kubernetes?
  • When did it happen?
  • Who initiated it?
  • On what did it happen?
  • Where was it observed, and from where was it initiated?
  • Where was it going?

2. Audit stages

When a person or a component sends a request to the Kubernetes API server, the request goes through one or more stages:

  • RequestReceived - The audit handler has received the request.
  • ResponseStarted - The response headers have been sent, but the response body has not yet been sent.
  • ResponseComplete - The response body has been completed, and no more bytes will be sent.

Each stage of a request generates an event, which is processed according to a policy. The policy specifies whether the event should be recorded as a log entry and, if so, what data the log entry should include.

3. Audit levels

An audit policy defines rules about which events should be recorded and what data they should include. The audit policy object structure is defined in the audit.k8s.io API group. When an event is processed, it is compared against the list of rules in order. The first matching rule sets the audit level of the event. The defined audit levels are:

  • None - Do not log events that match this rule.
  • Metadata - Log request metadata (requesting user, timestamp, resource, verb, etc.) but not the request or response body.
  • Request - Log event metadata and the request body but not the response body. This does not apply to non-resource requests.
  • RequestResponse - Log event metadata, the request body, and the response body. This does not apply to non-resource requests.

You can pass a file with the policy to kube-apiserver using the --audit-policy-file flag. If the flag is omitted, no events are logged. Note that the rules field must be provided in the audit policy file. A policy with no (0) rules is treated as illegal.
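
The first-match behavior described above can be sketched in a few lines of Python. This is a simplified model, not the kube-apiserver implementation: the Rule type, the audit_level function, and the flat resource names are assumptions made for illustration. Real rules also match on API groups, namespaces, user groups, and non-resource URLs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rule:
    """Simplified stand-in for one audit policy rule (hypothetical type)."""
    level: str
    users: List[str] = field(default_factory=list)      # empty means "any user"
    verbs: List[str] = field(default_factory=list)      # empty means "any verb"
    resources: List[str] = field(default_factory=list)  # empty means "any resource"


def _matches(allowed: List[str], value: str) -> bool:
    # An empty list places no restriction on the attribute.
    return not allowed or value in allowed


def audit_level(rules: List[Rule], user: str, verb: str, resource: str) -> Optional[str]:
    """Return the level of the first matching rule, or Python None if no rule matches."""
    for rule in rules:
        if (_matches(rule.users, user)
                and _matches(rule.verbs, verb)
                and _matches(rule.resources, resource)):
            return rule.level
    return None


rules = [
    Rule(level="None", users=["system:kube-proxy"], verbs=["watch"],
         resources=["endpoints", "services"]),
    Rule(level="RequestResponse", resources=["pods"]),
    Rule(level="Metadata"),  # catch-all rule, so it must come last
]

print(audit_level(rules, "alice", "create", "pods"))   # RequestResponse
print(audit_level(rules, "alice", "get", "secrets"))   # Metadata
```

Because evaluation stops at the first match, rule order matters: moving the catch-all rule to the top of the list would log every request at the Metadata level.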

The following is an example audit policy file, audit/audit-policy.yaml:

apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]

  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.

  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

You can use a minimal audit policy file to log all requests at the Metadata level:

# Log all requests at the Metadata level.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata

If you are writing your own audit profile, you can use the audit profile for Google Container-Optimized OS as a starting point. You can check the configure-helper.sh script, which generates the audit policy file. Most of the audit policy file can be seen by reading the script directly.

You can also refer to the Policy configuration reference for details about the fields it defines.

4. Auditing tips

  • In general, write requests such as create, update, and delete are logged at the RequestResponse level.
  • In general, get, list, and watch events are logged at the Metadata level.
  • Some events are treated as special cases. For the exact list of requests treated as special cases, see the policy script. At the time of writing, the special cases are:
    • update and patch requests sent by kubelet, system:node-problem-detector, or system:serviceaccount:kube-system:node-problem-detector to a nodes/status or pods/status resource are logged at the Request level.
    • update and patch requests from any identity in the system:nodes group to a nodes/status or pods/status resource are logged at the Request level.
    • deletecollection requests from system:serviceaccount:kube-system:namespace-controller are logged at the Request level.
    • Requests for a secrets, configmaps, or tokenreviews resource are logged at the Metadata level.
  • Some requests are not logged. For the exact list of requests that are not logged, see the level: None rules in the policy script. At the time of writing, the following requests are not logged:
    • Requests by system:kube-proxy to watch an endpoints, services, or services/status resource.
    • get requests by system:unsecured to a configmaps resource in the kube-system namespace.
    • get requests by kubelet to a nodes or nodes/status resource.
    • get requests by any identity in the system:nodes group to a nodes or nodes/status resource.
    • get and update requests by system:kube-controller-manager, system:kube-scheduler, or system:serviceaccount:endpoint-controller to an endpoints resource in the kube-system namespace.
    • get requests by system:apiserver to a namespaces, namespaces/status, or namespaces/finalize resource.
    • get and list requests by system:kube-controller-manager to any resource in the metrics.k8s.io group.
    • Requests to URLs that match /healthz*, /version, or /swagger*.

5. Audit backends

Audit backends persist audit events to external storage. Out of the box, kube-apiserver provides two backends:

  • Log backend, which writes events to the filesystem
  • Webhook backend, which sends events to an external HTTP API

In all cases, audit events follow a structure defined by the Kubernetes API in the audit.k8s.io API group.

5.1 Log backend

The log backend writes audit events to a file in JSONlines format. You can configure the log audit backend using the following kube-apiserver flags:

  • --audit-log-path: Specifies the log file path that the log backend uses to write audit events. Not specifying this flag disables the log backend. - means standard output.
  • --audit-log-maxage: Defines the maximum number of days to retain old audit log files.
  • --audit-log-maxbackup: Defines the maximum number of audit log files to retain.
  • --audit-log-maxsize: Defines the maximum size (in MB) of the audit log file before it is rotated.

If your cluster's control plane runs kube-apiserver as a Pod, remember to mount the hostPath to the location of the policy file and the log file, so that audit records are persisted. For example:

  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \
  --audit-log-path=/var/log/kubernetes/audit/audit.log

Then mount the volumes:

volumeMounts:
  - mountPath: /etc/kubernetes/audit-policy.yaml
    name: audit
    readOnly: true
  - mountPath: /var/log/kubernetes/audit/
    name: audit-log
    readOnly: false

Finally, configure the hostPath:

volumes:
  - name: audit
    hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File

  - name: audit-log
    hostPath:
      path: /var/log/kubernetes/audit/
      type: DirectoryOrCreate
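
Because the log backend writes one JSON object per line, the file can be processed with nothing more than a JSON parser. The following is a minimal sketch; the helper names read_audit_events and count_by_verb are hypothetical, and the sample lines are made-up stand-ins for the contents of a real audit log.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def read_audit_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one audit event per non-empty line of JSON Lines input."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by_verb(lines: Iterable[str]) -> Counter:
    """Count events per verb; events without a verb are counted as "unknown"."""
    return Counter(event.get("verb", "unknown") for event in read_audit_events(lines))


# Made-up entries standing in for the contents of an audit log file.
sample = [
    '{"kind": "Event", "verb": "get", "requestURI": "/api/v1/namespaces/default/pods/nginx"}',
    '{"kind": "Event", "verb": "watch", "requestURI": "/api/v1/services"}',
    '',
    '{"kind": "Event", "verb": "get", "requestURI": "/version"}',
]
print(count_by_verb(sample))  # Counter({'get': 2, 'watch': 1})
```

Both helpers accept any iterable of lines, so an open file handle for the path given to --audit-log-path can be passed in place of the sample list.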

5.2 Webhook backend

The webhook audit backend sends audit events to a remote web API, which is assumed to be a form of the Kubernetes API, including its means of authentication. You can configure the webhook audit backend using the following kube-apiserver flags:

  • --audit-webhook-config-file: Specifies the path to a file with the webhook configuration. The webhook configuration is effectively a specialized kubeconfig.

  • --audit-webhook-initial-backoff: Specifies the amount of time to wait after the first failed request before retrying. Subsequent requests are retried with exponential backoff.

The webhook configuration file uses the kubeconfig format to specify the remote address of the service and the credentials used to connect to it.
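
On the receiving side, a webhook service has to unpack the events from each request body. The sketch below assumes that the body is a JSON object of kind EventList in the audit.k8s.io API group with the events in an items array. The function name extract_events and the sample body are hypothetical, and a real receiver would also need an HTTP server and authentication.

```python
import json


def extract_events(body: bytes) -> list:
    """Parse a webhook request body and return the audit events it carries.

    Assumes the body is a JSON object of kind EventList with an "items" array.
    """
    payload = json.loads(body)
    if payload.get("kind") != "EventList":
        raise ValueError(f"unexpected kind: {payload.get('kind')!r}")
    return payload.get("items", [])


# Made-up request body for illustration.
body = json.dumps({
    "kind": "EventList",
    "apiVersion": "audit.k8s.io/v1",
    "items": [
        {"auditID": "a1", "verb": "create", "stage": "ResponseComplete"},
        {"auditID": "a2", "verb": "delete", "stage": "ResponseComplete"},
    ],
}).encode()

for event in extract_events(body):
    print(event["auditID"], event["verb"])
```

Rejecting bodies of an unexpected kind early keeps malformed or unrelated requests out of whatever storage the receiver writes to.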

6. Event batching

Both the log and webhook backends support batching. Using webhook as an example, here is the list of available flags. To get the same flag for the log backend, replace webhook with log in the flag name. By default, batching is enabled in webhook and disabled in log. Similarly, by default, throttling is enabled in webhook and disabled in log.

--audit-webhook-mode defines the buffering strategy. It is one of the following:

  • batch - Buffer events and process them asynchronously in batches. This is the default.
  • blocking - Block API server responses while each individual event is processed.
  • blocking-strict - Same as blocking, but when audit logging fails at the RequestReceived stage, the whole request to the kube-apiserver fails.

The following flags are used only in batch mode:

  • --audit-webhook-batch-buffer-size: Defines the number of events to buffer before batching. If the rate of incoming events overflows the buffer, events are dropped.
  • --audit-webhook-batch-max-size: Defines the maximum number of events in one batch.
  • --audit-webhook-batch-max-wait: Defines the maximum amount of time to wait before unconditionally batching the events in the queue.
  • --audit-webhook-batch-throttle-qps: Defines the maximum average number of batches generated per second.
  • --audit-webhook-batch-throttle-burst: Defines the maximum number of batches generated at the same moment if the allowed QPS was previously underutilized.
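
How the buffer size and the maximum batch size interact can be shown with a toy model. This is only a sketch of the idea, not the kube-apiserver implementation: the BatchBuffer class and its methods are hypothetical, and throttling and the maximum wait are left out.

```python
from collections import deque


class BatchBuffer:
    """Toy model of a bounded event buffer drained in fixed-size batches."""

    def __init__(self, buffer_size: int, batch_max_size: int) -> None:
        self.buffer_size = buffer_size
        self.batch_max_size = batch_max_size
        self.queue: deque = deque()
        self.dropped = 0

    def add(self, event: dict) -> bool:
        """Queue an event; drop it and return False if the buffer is full."""
        if len(self.queue) >= self.buffer_size:
            self.dropped += 1
            return False
        self.queue.append(event)
        return True

    def next_batch(self) -> list:
        """Remove and return up to batch_max_size events, oldest first."""
        size = min(self.batch_max_size, len(self.queue))
        return [self.queue.popleft() for _ in range(size)]


buf = BatchBuffer(buffer_size=5, batch_max_size=2)
for i in range(7):
    buf.add({"auditID": str(i)})

print(buf.dropped)            # 2
print(len(buf.next_batch()))  # 2
print(len(buf.queue))         # 3
```

Seven events arrive before anything is drained, so the last two overflow the five-slot buffer and are dropped, which is the situation the buffer size flag is meant to prevent.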

7. Parameter tuning

Parameters should be set to accommodate the load on the API server.

For example, if kube-apiserver receives 100 requests each second, and each request is audited only at the ResponseStarted and ResponseComplete stages, you should expect about 200 audit events to be generated each second. Assuming that a batch holds up to 100 events, you should set the throttling level to at least 2 queries per second. Assuming that the backend can take up to 5 seconds to write events, you should set the buffer size to hold up to 5 seconds of events; that is, 10 batches, or 1000 events.
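
The arithmetic in this example can be written down as a small helper, which makes it easy to redo for a different load. The function name batch_tuning and its parameter names are hypothetical; the numbers passed in are the ones from the example above.

```python
import math


def batch_tuning(requests_per_second: int, stages_per_request: int,
                 batch_max_size: int, backend_write_seconds: int) -> dict:
    """Derive throttle and buffer settings from the expected load."""
    events_per_second = requests_per_second * stages_per_request
    # Batches per second needed to keep up with the event rate.
    throttle_qps = math.ceil(events_per_second / batch_max_size)
    # The buffer must hold everything produced while the backend is busy writing.
    buffer_size = events_per_second * backend_write_seconds
    return {
        "events_per_second": events_per_second,
        "throttle_qps": throttle_qps,
        "buffer_size_events": buffer_size,
        "buffer_size_batches": math.ceil(buffer_size / batch_max_size),
    }


print(batch_tuning(100, 2, 100, 5))
# {'events_per_second': 200, 'throttle_qps': 2, 'buffer_size_events': 1000, 'buffer_size_batches': 10}
```

The results are lower bounds; leaving some headroom above them is a reasonable precaution against bursts.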

In most cases, however, the default parameters are sufficient and you do not have to set them manually. To monitor the state of the auditing subsystem, you can look at the kube-apiserver logs and at the following Prometheus metrics that kube-apiserver exposes:

  • apiserver_audit_event_total: Metric that contains the total number of audit events exported.
  • apiserver_audit_error_total: Metric that contains the total number of events dropped due to an error during exporting.
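
Once these two counters have been scraped, a rough drop ratio can be computed from them. The sketch below assumes the metrics are available in the Prometheus text exposition format without timestamps. The helper sum_metric is hypothetical, and the sample scrape, including its label and numbers, is made up for illustration.

```python
def sum_metric(exposition: str, name: str) -> float:
    """Sum all samples of one metric in Prometheus text exposition format."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # A sample line is "<name>[{labels}] <value>"; split the value off the end.
        series, _, value = line.rpartition(" ")
        if series == name or series.startswith(name + "{"):
            total += float(value)
    return total


# Made-up scrape output; the label and the numbers are illustrative only.
scrape = """\
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 1200
# TYPE apiserver_audit_error_total counter
apiserver_audit_error_total{plugin="webhook"} 6
"""

events = sum_metric(scrape, "apiserver_audit_event_total")
errors = sum_metric(scrape, "apiserver_audit_error_total")
print(f"dropped {errors / events:.2%} of audit events")  # dropped 0.50% of audit events
```

A ratio that keeps growing suggests the backend cannot keep up and that the batching parameters or the backend itself need attention.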

7.1 Log entry truncation

Both the log and webhook backends support limiting the size of the events that are logged. As an example, the following is the list of flags available for the log backend:

  • audit-log-truncate-enabled: Whether event and batch truncation is enabled.
  • audit-log-truncate-max-batch-size: Maximum size (in bytes) of the batch sent to the underlying backend.
  • audit-log-truncate-max-event-size: Maximum size (in bytes) of the audit event sent to the underlying backend.

By default, truncation is disabled in both webhook and log. A cluster administrator should set audit-log-truncate-enabled or audit-webhook-truncate-enabled to enable the feature.

8. Audit log format example

The following is an example of a Kubernetes audit log entry. Each key in the JSON structure carries information that helps you understand what is happening in the cluster. Nested values that are elided here are shown as "...".

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "RequestResponse",
  "auditID": "fbc474df-2466-4612-ae36-69af2c927f9d",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/pods/nginx",
  "verb": "get",
  "user": {
    "username": "system:node:minikube",
    "groups": [ ... ]
  },
  "sourceIPs": [ ... ],
  "userAgent": "kubelet/v1.21.2 (linux/amd64) kubernetes/092fbfb",
  "objectRef": {
    "resource": "pods",
    "namespace": "default",
    "name": "nginx",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": { ... },
    "code": 200
  },
  "responseObject": {
    "kind": "Pod",
    "apiVersion": "v1",
    "metadata": { ... },
    "spec": { ... },
    "status": { ... }
  },
  "requestReceivedTimestamp": "2022-01-18T06:57:18.944663Z",
  "stageTimestamp": "2022-01-18T06:57:18.968543Z",
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": ""
  }
}

8.1 user information and sourceIPs

The user key tells you which user or service account initiated the request through the Kubernetes API server, and the sourceIPs key provides the IP address of that account. An IP address can tell you the location (for example, the city, the zip code, or the area code) and the name of the user's ISP. IP addresses are not always reliable, because the requesting user can change or hide them, but this information can be useful when you are trying to block malicious activity from a range of IP addresses.

8.2 Information about the request that was initiated and executed

The verb, requestURI, and objectRef keys provide information about the requests users make in the cluster. The verb key indicates the Kubernetes HTTP request method that was performed: get, post, list, watch, patch, or delete. requestURI gives you information about the API request made in the cluster, for example, a request to get all pods or to create a new deployment. objectRef contains information about the Kubernetes object associated with the request. Together, objectRef, verb, and requestURI provide complete information about the request the user initiated.

8.3 Response status of the executed operation

Combined, the responseStatus, responseObject, and annotations keys give insight into the response to a request made by a user or service account. annotations.authorization.k8s.io/decision provides an allow or forbid value. These keys are very useful when auditing a Kubernetes cluster to detect anomalous behavior.
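
The keys discussed in this section can be pulled together into a one-line summary per event. The following is a minimal sketch: the function name summarize is hypothetical, the sample event reuses values from the example entry above, and the source IP is a made-up placeholder because the example does not show one.

```python
def summarize(event: dict) -> str:
    """Build a one-line who/what/outcome summary from an audit event."""
    user = event.get("user", {}).get("username", "<unknown>")
    ips = ",".join(event.get("sourceIPs", [])) or "<unknown>"
    ref = event.get("objectRef", {})
    # Join whichever of namespace, resource, and name are present.
    parts = (ref.get("namespace"), ref.get("resource"), ref.get("name"))
    target = "/".join(p for p in parts if p)
    code = event.get("responseStatus", {}).get("code")
    decision = event.get("annotations", {}).get("authorization.k8s.io/decision", "<none>")
    return f"{user} from {ips}: {event.get('verb')} {target} -> {code} ({decision})"


# Values taken from the example entry; the source IP is a made-up placeholder.
event = {
    "verb": "get",
    "user": {"username": "system:node:minikube"},
    "sourceIPs": ["192.0.2.10"],
    "objectRef": {"resource": "pods", "namespace": "default", "name": "nginx"},
    "responseStatus": {"code": 200},
    "annotations": {"authorization.k8s.io/decision": "allow"},
}
print(summarize(event))
# system:node:minikube from 192.0.2.10: get default/pods/nginx -> 200 (allow)
```

Every lookup uses a default, so events logged at a lower level, which may lack some of these keys, still produce a summary instead of raising an error.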

9. Audit policy scenarios

In most cases, GKE applies the following rules to record entries from the Kubernetes API server:

  • Entries representing create, delete, and update requests are written to the Admin Activity log.
  • Entries representing get, list, and updateStatus requests are written to the Data Access log.
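
These two rules amount to a lookup from the request verb to a log. The sketch below covers only the verbs named above; the function name gke_log_for and the "unclassified" fallback are hypothetical, and since the rules hold "in most cases", the result is an approximation rather than a guarantee.

```python
ADMIN_ACTIVITY_VERBS = {"create", "delete", "update"}
DATA_ACCESS_VERBS = {"get", "list", "updateStatus"}


def gke_log_for(verb: str) -> str:
    """Return the log a request verb is usually written to, per the rules above."""
    if verb in ADMIN_ACTIVITY_VERBS:
        return "Admin Activity"
    if verb in DATA_ACCESS_VERBS:
        return "Data Access"
    # Verbs the rules above do not mention are left undecided in this sketch.
    return "unclassified"


for verb in ("create", "list", "watch"):
    print(verb, "->", gke_log_for(verb))
```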

The rules at the beginning of the Kubernetes audit policy file specify which events should not be logged. For example, the following rule specifies that get requests sent by kubelet to a nodes or nodes/status resource should not be logged. As mentioned earlier, the None level means that no matching event should be logged:

- level: None
  users: ["kubelet"] # legacy kubelet identity
  verbs: ["get"]
  resources:
    - group: "" # core
      resources: ["nodes", "nodes/status"]

After the list of level: None rules, the policy file contains a list of rules for special cases. For example, the following special-case rule specifies that certain requests are logged at the Metadata level:

- level: Metadata
  resources:
    - group: "" # core
      resources: ["secrets", "configmaps"]
    - group: authentication.k8s.io
      resources: ["tokenreviews"]
  omitStages:
    - "RequestReceived"

An event matches the rule if all of the following are true:

  • The event does not match any earlier rule in the policy file.
  • The request is for a resource of type secrets, configmaps, or tokenreviews.
  • The event is not for the RequestReceived stage of the call.

After the list of special-case rules, the policy file lists some general rules. To see the general rules in the script, you must substitute the value of known_apis for ${known_apis}. After the substitution, you get the following rule:

- level: Request
  verbs: ["get", "list", "watch"]
  resources:
    - group: "" # core
    - group: "admissionregistration.k8s.io"
    - group: "apiextensions.k8s.io"
    - group: "apiregistration.k8s.io"
    - group: "apps"
    - group: "authentication.k8s.io"
    - group: "authorization.k8s.io"
    - group: "autoscaling"
    - group: "batch"
    - group: "certificates.k8s.io"
    - group: "extensions"
    - group: "metrics.k8s.io"
    - group: "networking.k8s.io"
    - group: "policy"
    - group: "rbac.authorization.k8s.io"
    - group: "settings.k8s.io"
    - group: "storage.k8s.io"
  omitStages:
    - "RequestReceived"

This rule applies to events that do not match any earlier rule in the policy file and are not in the RequestReceived stage. The rule specifies that get, list, and watch requests for any resource belonging to one of the listed groups should be logged at the Request level.

Note: get, list, and watch requests have virtually no request body, so the real effect is to create log entries at the Metadata level.

The next rule in the policy file looks like this:

- level: RequestResponse
  resources:
    - group: "" # core
    - group: "admissionregistration.k8s.io"
    - group: "apiextensions.k8s.io"
    - group: "apiregistration.k8s.io"
    - group: "apps"
    - group: "authentication.k8s.io"
    - group: "authorization.k8s.io"
    - group: "autoscaling"
    - group: "batch"
    - group: "certificates.k8s.io"
    - group: "extensions"
    - group: "metrics.k8s.io"
    - group: "networking.k8s.io"
    - group: "policy"
    - group: "rbac.authorization.k8s.io"
    - group: "settings.k8s.io"
    - group: "storage.k8s.io"
  omitStages:
    - "RequestReceived"

This rule applies to events that do not match any earlier rule in the policy file and are not in the RequestReceived stage. In particular, it does not apply to the read requests get, list, and watch, because the preceding rule already matches them. Instead, it applies to write requests such as create, update, and delete. The rule specifies that write requests should be logged at the RequestResponse level.


