Notes on the Splunk OpenTelemetry Collector for Kubernetes

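01 Introduction to the Splunk OpenTelemetry Collector for Kubernetes

1.1 Introduction

The Splunk OpenTelemetry Collector is a distribution of the OpenTelemetry Collector. It gives Splunk Observability Cloud, Splunk Enterprise, and Splunk Cloud a unified way to receive, process, and export metrics, traces, and logs.

Supported platforms: this distribution is supported on, and packaged for, a range of platforms.

1.2 System architecture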

https://docs.splunk.com/Documentation/SVA/current/Architectures/OTelKubernetes

[image: system architecture diagram]

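02 Installation and configuration

Note: the test environment is a local Kubernetes cluster, and Helm 3 is used for installation and configuration.

2.1 Installing Helm

Helm is a package manager for Kubernetes, much like Apt or Yum on Linux. It helps developers and system administrators deploy, upgrade, and uninstall applications on a Kubernetes cluster more easily.

The three main concepts in Helm:

Concept      Description
Chart        A package containing all the resource definitions needed to deploy an application on a Kubernetes cluster
Release      An instance of a Chart deployed on a Kubernetes cluster
Repository   The place where Charts are stored, similar to a software repository; used to distribute and share Charts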
1. Add Helm's official GPG key
root@k8s-master:~# curl https://baltocdn.com/helm/signing.asc | gpg --dearmor -o /usr/share/keyrings/helm-keyring.gpg

2. Add Helm's official APT repository
root@k8s-master:~# echo "deb [signed-by=/usr/share/keyrings/helm-keyring.gpg] https://baltocdn.com/helm/stable/debian/ all main" | tee /etc/apt/sources.list.d/helm-stable-debian.list

3. Update the apt package index
root@k8s-master:~# apt-get update

4. Install Helm
root@k8s-master:~# apt-get install -y helm

5. Check that Helm is installed correctly
root@k8s-master:~# helm version
version.BuildInfo{Version:"v3.13.3", GitCommit:"c8b948945e52abba22ff885446a1486cb5fd3474", GitTreeState:"clean", GoVersion:"go1.20.11"}
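
The version check above can also be scripted, which is handy when the install steps run unattended. The following is a minimal sketch, not part of the original notes: the helper names helm_major_version and require_helm3 are hypothetical, and it assumes the short version string has the form printed by helm version --short (for example v3.13.3+gc8b9489).

```shell
# Extract the major version from a Helm version string such as "v3.13.3+gc8b9489".
# (Hypothetical helper, not part of Helm itself.)
helm_major_version() {
  version="${1#v}"                 # drop the leading "v"
  printf '%s\n' "${version%%.*}"   # keep everything before the first dot
}

# Succeeds only when the given version string belongs to Helm 3.
require_helm3() {
  [ "$(helm_major_version "$1")" = "3" ]
}

# Typical use once Helm is installed:
#   require_helm3 "$(helm version --short)" || echo "Helm 3 is required" >&2
```

The functions only parse a string, so they can be checked without a cluster or even without Helm being present.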

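2.2 Installing or uninstalling the Splunk OpenTelemetry Collector

2.2.1 Configuring the Helm repo

To install the Splunk OpenTelemetry Collector, you only need to add the corresponding chart repository to Helm. After that, run helm install xxxx splunk-otel-collector-chart/splunk-otel-collector whenever you want to deploy it.
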
helm repo add splunk-otel-collector-chart https://signalfx.github.io/splunk-otel-collector-chart

2.2.2 Installing the OpenTelemetry deployment

  1. Send data to Splunk Enterprise

    helm install otel --set="splunkPlatform.endpoint=https://127.0.0.1:8088/services/collector,splunkPlatform.token=2fd976bb-be11-4f5c-bba4-240675b5b76d,splunkPlatform.metricsIndex=k8s-metrics,splunkPlatform.index=main,clusterName=my-cluster" splunk-otel-collector-chart/splunk-otel-collector
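  2. Create the otel deployment from values.yaml

    -n: deploys the release into the specified namespace (the namespace must already exist, or add --create-namespace)

    -f: specifies the Helm values file
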
    helm -n otel install otel -f values.yaml splunk-otel-collector-chart/splunk-otel-collector
  3. Deploy with a specific YAML values file

    --values: equivalent to -f

    helm install otel --values my_values.yaml splunk-otel-collector-chart/splunk-otel-collector

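2.2.3 Uninstalling the OpenTelemetry deployment

Uninstall (delete) the release named otel. In Helm 3, helm delete is an alias of helm uninstall, so either one of the following commands is sufficient:

helm delete otel

helm uninstall otel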

2.2.4 Upgrading the OpenTelemetry deployment

helm upgrade otel --values my_values.yaml splunk-otel-collector-chart/splunk-otel-collector

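03 Usage

3.1 Parameters

No.  Parameter                    Description
1    splunkPlatform.endpoint      Splunk HEC URL, i.e. the URL of the Splunk instance, for example "http://localhost:8088/services/collector"
2    splunkPlatform.token         Splunk HTTP Event Collector (HEC) token
3    splunkPlatform.metricsIndex  Index on the Splunk instance that stores metrics data
4    splunkPlatform.index         Index on the Splunk instance that stores event data
5    clusterName                  Name of the cluster, used to tell which cluster the data comes from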
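
Before installing the chart it can help to confirm that the endpoint and token from the table actually work. The sketch below is an illustration under assumptions: the helper name hec_health_url is hypothetical, SPLUNK_HEC_TOKEN is an assumed environment variable, and the check relies on the HEC health endpoint at /services/collector/health.

```shell
# Derive the HEC health-check URL from a splunkPlatform.endpoint value.
# (Hypothetical helper; it only rewrites the URL string.)
hec_health_url() {
  endpoint="$1"
  base="${endpoint%%/services/collector*}"   # strip the HEC path, keep scheme://host:port
  printf '%s/services/collector/health\n' "$base"
}

# Typical use (-k skips certificate verification; for self-signed test setups only):
#   curl -k "$(hec_health_url "https://127.0.0.1:8088/services/collector")"
#
# Sending one test event with the token (SPLUNK_HEC_TOKEN is an assumed variable name):
#   curl -k "https://127.0.0.1:8088/services/collector" \
#     -H "Authorization: Splunk $SPLUNK_HEC_TOKEN" \
#     -d '{"event": "hello from the otel test"}'
```

If the health check fails, the chart will not be able to deliver data either, so this narrows a problem down to Splunk or the network before Kubernetes is involved.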

3.2 Simple usage examples

  • Send data to Splunk Enterprise or Splunk Cloud (a generic example, followed by the command used in the test environment)

    helm install my-splunk-otel-collector --set="splunkPlatform.endpoint=https://127.0.0.1:8088/services/collector,splunkPlatform.token=xxxxxx,splunkPlatform.metricsIndex=k8s-metrics,splunkPlatform.index=main,clusterName=my-cluster" splunk-otel-collector-chart/splunk-otel-collector

    helm install my-splunk-otel-collector --set="splunkPlatform.endpoint=http://10.10.0.121:8088/services/collector,splunkPlatform.token=2fd976bb-be11-4f5c-bba4-240675b5b76d,splunkPlatform.metricsIndex=local_k8s_metrics,splunkPlatform.index=local_k8s,clusterName=local-k8s" splunk-otel-collector-chart/splunk-otel-collector
  • Specify the namespace at install time

    Use the -n flag to choose the namespace the chart is deployed into. The following example deploys into the otel namespace:

    helm -n otel install my-splunk-otel-collector --set="splunkPlatform.endpoint=https://127.0.0.1:8088/services/collector,splunkPlatform.token=xxxxxx,splunkPlatform.metricsIndex=k8s-metrics,splunkPlatform.index=main,clusterName=my-cluster" splunk-otel-collector-chart/splunk-otel-collector
  • Deploy with --values values.yaml

    Instead of passing Helm values as arguments, you can supply a YAML file.

    helm install my-splunk-otel-collector --values values.yaml splunk-otel-collector-chart/splunk-otel-collector

    values.yaml examples: https://github.com/signalfx/splunk-otel-collector-chart/tree/main/examples

    values.yaml parameter reference: https://github.com/signalfx/splunk-otel-collector-chart/blob/main/helm-charts/splunk-otel-collector/values.yaml
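
A values file equivalent to the --set examples above might look like the sketch below. The helper name write_values and the SPLUNK_HEC_TOKEN variable are assumptions made for illustration, and all values are placeholders. Note that metricsEnabled is set explicitly: according to the chart defaults reproduced in section 05, splunkPlatform.metricsEnabled is false, so setting metricsIndex alone does not turn on metrics collection.

```shell
# Write a minimal values file; every value below is a placeholder.
# (write_values is a hypothetical helper name.)
write_values() {
  out="$1"
  cat > "$out" <<EOF
clusterName: "my-cluster"
splunkPlatform:
  endpoint: "https://127.0.0.1:8088/services/collector"
  token: "${SPLUNK_HEC_TOKEN:-xxxxxx}"
  index: "main"
  metricsIndex: "k8s-metrics"
  metricsEnabled: true
EOF
}

write_values values.yaml

# Then install with:
#   helm install my-splunk-otel-collector --values values.yaml splunk-otel-collector-chart/splunk-otel-collector
```

Reading the token from the environment keeps it out of the shell history, which the --set form does not.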

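04 Advanced usage examples

4.1 Integrating with etcd

4.1.1 Overview of the etcd integration

Setting agent.controlPlaneMetrics.{component}.enabled=true makes the Helm chart configure the otel-collector agent to collect metrics from that control plane component. Most control plane metrics can be collected without further configuration. etcd is the exception: because of its TLS security requirements, the agent needs an etcd client certificate and key to authenticate, so extra configuration steps are required.

To collect control plane metrics, the Helm chart uses the otel-collector agent on each node and instantiates control plane receivers at runtime through the receiver creator. The receiver creator has a set of discovery rules that determine which control plane receivers to create. The default discovery rules can vary by Kubernetes distribution and version. If your control plane uses a non-standard specification, you can supply a custom configuration so that the otel-collector agent can still connect.

The otel-collector agent relies on Pod-level network access to collect metrics from control plane Pods. Most managed Kubernetes services do not expose control plane Pods to end users, so collecting metrics from those distributions is not supported.

  • Supported versions:

    • kubernetes 1.22 (kops created)
    • openshift v4.9
  • Unsupported distributions: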

    • aks
    • eks
    • eks/fargate
    • gke
    • gke/autopilot

4.1.2 Obtaining the etcd client certificate

Integrating with etcd therefore requires the etcd client certificate and key. Obtain them as follows:

  • Steps listed in the official documentation:

    # The steps for kubernetes and openshift are listed here.
    # For kubernetes:
    etcd_pod_name=$(kubectl get pods -n kube-system -l k8s-app=etcd-manager-events -o=name | sed "s/^.\{4\}//" | head -n 1)
    kubectl exec -it -n kube-system "${etcd_pod_name}" -- cat /etc/kubernetes/pki/etcd-manager/etcd-clients-ca.crt
    kubectl exec -it -n kube-system "${etcd_pod_name}" -- cat /etc/kubernetes/pki/etcd-manager/etcd-clients-ca.key

    # For openshift:
    etcd_pod_name=$(kubectl get pods -n openshift-etcd -l k8s-app=etcd -o=name | sed "s/^.\{4\}//" | head -n 1)
    kubectl exec -it -n openshift-etcd "${etcd_pod_name}" -- cat "/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-serving-metrics-${etcd_pod_name}.crt"
    kubectl exec -it -n openshift-etcd "${etcd_pod_name}" -- cat "/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-serving-metrics-${etcd_pod_name}.key"
  • Steps actually used to obtain the certificate and key:

    sudo ls /etc/kubernetes/pki/

    # Example output:
    apiserver.crt apiserver.key ca.crt front-proxy-ca.crt front-proxy-ca.key sa.key
    apiserver-etcd-client.crt apiserver-kubelet-client.crt ca.key front-proxy-client.crt front-proxy-client.key sa.pub

    # View the contents of the certificate and the key
    cat /etc/kubernetes/pki/apiserver-etcd-client.crt
    cat /etc/kubernetes/pki/apiserver-etcd-client.key

    # Verify that the certificate and key match and are accepted by etcd
    curl -k --cert /etc/kubernetes/pki/apiserver-etcd-client.crt --key /etc/kubernetes/pki/apiserver-etcd-client.key https://127.0.0.1:2379/metrics

    Example output: [image]
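
Copying PEM data into values.yaml by hand makes it easy to get the indentation of the block scalars wrong. As a rough sketch, the fragment can be generated from the two files instead. The helper name etcd_values_fragment and the output file name are hypothetical; the key layout follows the etcd example from the official documentation quoted in section 4.1.3.

```shell
# Print a values.yaml fragment that embeds the given client certificate and key.
# (etcd_values_fragment is a hypothetical helper name.)
etcd_values_fragment() {
  cert_file="$1"
  key_file="$2"
  echo "agent:"
  echo "  controlPlaneMetrics:"
  echo "    etcd:"
  echo "      enabled: true"
  echo "      secret:"
  echo "        create: true"
  echo "        clientCert: |"
  sed 's/^/          /' "$cert_file"   # indent the PEM lines under the block scalar
  echo "        clientKey: |"
  sed 's/^/          /' "$key_file"
}

# Usage (run as a user that may read the key):
#   etcd_values_fragment /etc/kubernetes/pki/apiserver-etcd-client.crt \
#     /etc/kubernetes/pki/apiserver-etcd-client.key > etcd-values.yaml
```

The generated file contains a private key, so it deserves the same file permissions and handling as the key itself.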

4.1.3 Editing values.yaml

  • Configuration example from the official documentation

    agent:
      controlPlaneMetrics:
        etcd:
          enabled: true
          secret:
            create: true
            # The PEM-format CA certificate for this client.
            clientCert: |
              -----BEGIN CERTIFICATE-----
              ...
              -----END CERTIFICATE-----
            # The private key for this client.
            clientKey: |
              -----BEGIN RSA PRIVATE KEY-----
              ...
              -----END RSA PRIVATE KEY-----
            # Optional. The CA cert that has signed the TLS cert.
            # caFile: |
  • Actual configuration

    agent:
      controlPlaneMetrics:
        # Kubernetes etcd database: collect its metrics
        etcd:
          enabled: true
          secret:
            create: true
            # The PEM-format CA certificate for this client.
            clientCert: |
              -----BEGIN CERTIFICATE-----
              MIIDKDCCAhCgAwIBAgIIehpqh5a8toswDQYJKoZIhvcNAQELBQAwEjEQMA4GA1UE
              AxMHZXRjZC1jYTAeFw0yNTAxMTYxMjEyNTNaFw0yNjAxMTYxMjE3NTRaMD4xFzAV
              BgNVBAoTDnN5c3RlbTptYXN0ZXJzMSMwIQYDVQQDExprdWJlLWFwaXNlcnZlci1l
              dGNkLWNsaWVudDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBALZRD4jZ
              63qc56QR4jBkQB3Yi9UEJvHq9duBdlax72KsR9uootkHxS3Wz+Fy5freJqgzmnTx
              DbYcb7GMEQpxpnSYl0dpIeYsxSx2HbSRyfF/hs5AJh7HXeUzkmS2OO07WNFGcBrz
              4qxQhvYqg22N8oLYBZnon7EWYNqvaSvP2CGwWX4QaWCuGuZxnIsE5BnJMXlRmKy7
              +IsNRtqs7n/AYnUYFcILiKP99YvXo7QIZeNCLwHoFKfw17ODS4sjD4TeuM9KjVck
              RXfp8i1EHPWAlYN7c9M1T4MIAI/fKp6QYLuesPbRgMBNj90OEUc2dDwHn0ZXYZMk
              sM4fqxJ9WhOVsUsCAwEAAaNWMFQwDgYDVR0PAQH/BAQDAgWgMBMGA1UdJQQMMAoG
              CCsGAQUFBwMCMAwGA1UdEwEB/wQCMAAwHwYDVR0jBBgwFoAU4CyZJRib7cjN/oxN
              HNaVQMCpST4wDQYJKoZIhvcNAQELBQADggEBAAnXIbYiNNbQnGibfuqmcAm2mdjY
              xv/kbUsiMLt6OBqdqfh4XlA63LkQE8UML6U5qFgPGlznRzrXnRrCOMn82jXCQyyk
              iOE0gAGjBHozOeu5dFKMH01P6bXxIXaRmFhtxD9gnUAC02zA4UNbIxcEL651JZC8
              oL0fyoV5f2W236Wpo+PfzlOB+CSVTvJUb5NMGGs17/zGnTZXLDzChChfmFMhw06Q
              V40EZhx4YYbZQPls0662+a4P9j3YaS221nuNCvYvoCIimEIaAvfQw4RAmbzRfabh
              KX+djjDgdWSxnoM2IbrgWZuFstKCABg5Mzw11sxd0tjtBwQJCliHF19J14I=
              -----END CERTIFICATE-----
            # The private key for this client.
            clientKey: |
              -----BEGIN RSA PRIVATE KEY-----
              MIIEowIBAAKCAQEAtlEPiNnrepznpBHiMGRAHdiL1QQm8er124F2VrHvYqxH26ii
              2QfFLdbP4XLl+t4mqDOadPENthxvsYwRCnGmdJiXR2kh5izFLHYdtJHJ8X+GzkAm
              Hsdd5TOSZLY47TtY0UZwGvPirFCG9iqDbY3ygtgFmeifsRZg2q9pK8/YIbBZfhBp
              YK4a5nGciwTkGckxeVGYrLv4iw1G2qzuf8BidRgVwguIo/31i9ejtAhl40IvAegU
              p/DXs4NLiyMPhN64z0qNVyRFd+nyLUQc9YCVg3tz0zVPgwgAj98qnpBgu56w9tGA
              wE2P3Q4RRzZ0PAefRldhkySwzh+rEn1aE5WxSwIDAQABAoIBACHBZGTsJCMxhdnk
              zcIz7YMZItqvyB4maJrZn3VxwGa+ixdqY6xXOfTAvwB464fFNdcSpthcATPkk/GF
              g2oxnKYd0nSQTIx3YZJX1CwoigFCoUzyp5wvQX08TTCEZInX4RvuNLdozGEnD7Xo
              LSlNjMcZBAB5B4gcIpaav5gzBUtHMWoiI15/uGZBRDm4kY9DsOhNgWk4MuZzz+u/
              mAMRh81EuSqDTx++/7Z0VKoVuyq2Ht5351kDM6WBX1YhAUnJglix36Uq2pCwypAE
              5XbS17gkvfgN+zuR7EwqG3meV1KZVcIEHGy3GhQ+BCk2csPXELMYvztQ0+fv8K82
              1342mWECgYEA1mHz3tFW05VM7UqerHklGl+6y0g1Y/vtFGbzCATB+VZwExcTOhuc
              yKp69tLv28YTHHhiYXiZI/1eNPDfVU3jLrcHy75F5HRIAOk0jPDhPHIZ5lU6cxZ2
              JA2HGCDf7R10XYnRhb2V/A/Z3xm2Tn3+vpH0p2QAWOE0oAFBpw/2C4kCgYEA2bWL
              MDS5aATcZMF1YXJeS5dddvYGIn2vGvVcZnIEC+SgXSatVUMB85cjpUZ7tvVdyqp1
              7DbB5OWCR8VWxRWhnje75G27/srveKdxTFeoHPXxu3fnooqWrKhahvTPnT60jP5+
              Ir3HmEnDUIhVqDRlhxP4WHVFBOkqffvEBJkT/TMCgYBpLmHSHm81G/lEKuoywLU9
              fV5OQj0/suicq+3tLzhkNs6B7z5Vshp4MXxnARMBhur1evL504t/Jt5DpzJLzgz6
              bH58rfvonEx/det8gupfF7QxV/t3X7vS8HgplGeJFHx1MBsGPQALTVOdrCXP2O1V
              XpLkVaH9+XAyWKt3ZdNX0QKBgCdkM8ULJSjvCDmqz2RMX0dqId0ucrm26AIGtytK
              IfVM7r8sClzM/QNoK2jyMdxO1SOgaCnPVpHl/QajbCnI2i9YgkS4njVh3qaEFXns
              ulxTG+QBtAWy8cRXydl1XkNjXyPLwGLk18J0RkTCBk2i/WPNdzf6L/zNe4TEExmJ
              4RYFAoGBAJA63hzOqTiBnWd9/zxh6xv36du8v1jfy8/Ib9AcOKMflQdb1N2FCcZp
              rGkRUgcqTw33hKFQ2k8rmivDfoMcx85px5qie31cU8+ZOCc6UnNMUbl+DEau4mTy
              DkqdjLms+JcgRjQowBK+4UgzNUi9bChqo/eN4sE5n5oNNMYxYLVR
              -----END RSA PRIVATE KEY-----
            # Optional. The CA cert that has signed the TLS cert.
            # caFile: |

4.2 Customizing values.yaml

4.2.1 Data forwarding settings

clusterName: "local-k8s"  # Identifies which cluster the logs or metrics come from.

splunkPlatform:
  endpoint: "https://10.10.0.121:8088/services/collector" # Splunk HEC URL
  token: "2fd976bb-be11-4f5c-bba4-240675b5b76d" # Splunk HEC token; note that it must be identical to the token configured in Splunk

  index: "local_k8s" # Index in which logs are stored
  metricsIndex: "local_k8s_metrics" # Index in which metrics are stored

  # Options to disable or enable particular telemetry data types that will be sent to
  # Splunk Platform. Only logs collection is enabled by default.
  logsEnabled: true

  # If you enable metrics collection, make sure that `metricsIndex` is provided as well.
  metricsEnabled: true # Enables or disables metrics collection
  insecureSkipVerify: true # Whether to skip certificate verification of the HEC endpoint when sending over HTTPS
  tracesEnabled: false # If you enable traces collection, make sure that tracesIndex is provided as well.

  # Field name convention to use (only for users migrating from the Splunk Connect for Kubernetes Helm chart); the defaults are kept here.
  fieldNameConvention:
    renameFieldsSck: false # Boolean: rename Pod metadata fields to match the Splunk Connect for Kubernetes Helm chart
    keepOtelConvention: true # Boolean: keep the OTel convention fields after renaming

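4.2.2 Customizing monitored objects

https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sobjectsreceiver

k8sobjects:
  # auth_type (default = serviceAccount): how to authenticate to the K8s API server. One of:
  # none (no authentication), serviceAccount (use the standard service account token provided
  # to the agent pod), or kubeConfig (use credentials from ~/.kube/config).
  auth_type: serviceAccount
  objects:
    - name: pods # Name of the resource object
      mode: pull # How objects of this type are collected: either "pull" or "watch"
      # pull mode reads all objects of this type at an interval, using the list API
      # watch mode sets up a long-lived connection with the watch API and receives only updates
      label_selector: environment in (production),tier in (frontend) # Select objects by label
      field_selector: status.phase=Running # Select objects by field
      interval: 15m # Interval at which objects are pulled; defaults to 60 minutes. Only valid in pull mode.
    - name: events
      mode: watch
      # API group name. Optional. When a resource object exists in several groups, use this setting
      # to choose the group. By default the first group is selected. For example, the events resource
      # exists in both the v1 and events.k8s.io/v1 API groups, and v1 is selected by default.
      group: events.k8s.io
      namespaces: [default] # Array of namespaces to collect events from (default = all)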
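
When the collector is installed through the Splunk Helm chart rather than as a standalone collector, the chart's own values.yaml comments (see section 05) point to clusterReceiver.k8sObjects as the place to configure this receiver. The sketch below is an assumption-laden illustration: it assumes the list entries under clusterReceiver.k8sObjects take the same fields as the receiver's objects list shown above, and the file name and release name are placeholders.

```shell
# Write a values fragment that enables Kubernetes object collection on the cluster receiver.
# ASSUMPTION: entries under clusterReceiver.k8sObjects accept the same fields
# (name, mode, interval, namespaces) as the receiver's own "objects" list.
cat > k8sobjects-values.yaml <<'EOF'
clusterReceiver:
  k8sObjects:
    - name: pods
      mode: pull
      interval: 15m
    - name: events
      mode: watch
      namespaces: [default]
EOF

# Apply it on top of an existing values file (later -f files take precedence):
#   helm upgrade otel -f values.yaml -f k8sobjects-values.yaml splunk-otel-collector-chart/splunk-otel-collector
```

Keeping the fragment in its own file makes it easy to drop again without touching the main values file.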
  1. Create a ConfigMap containing the configuration for otelcontribcol. Replace the OTLP endpoint (OTLP_ENDPOINT) with a valid value. Example endpoint values:
  • Send to an OpenTelemetry Collector:

    endpoint: otel-collector:4317  # Collector service inside the Kubernetes cluster

  • Send directly to a cloud platform:

    endpoint: https://ingest.us-central1-1.gcp.cloud.otel.com:4317
  • Send to Jaeger:

    endpoint: http://jaeger-all-in-one:4317  # Jaeger's OTLP receiver address
  • Create the configuration

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: otelcontribcol
      labels:
        app: otelcontribcol
    data:
      config.yaml: |
        receivers:
          k8sobjects:
            objects:
              - name: pods
                mode: pull
              - name: events
                mode: watch
        exporters:
          otlp:
            endpoint: otel-collector:4317
            tls:
              insecure: true

        service:
          pipelines:
            logs:
              receivers: [k8sobjects]
              exporters: [otlp]
    EOF

  2. Service Account

    Create a service account for the collector to use.

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        app: otelcontribcol
      name: otelcontribcol
    EOF
  3. RBAC (Role-Based Access Control)

    Use the following commands to create a ClusterRole with the required permissions and a ClusterRoleBinding that grants the role to the service account created above. The configuration below only covers collecting pods and events; add the appropriate rules to collect other objects.

    When using watch mode, you must also specify the list verb so that the receiver has permission to perform its initial list when no resource_version is provided, or to list again when recovering from a 410 Gone scenario.

    cat <<EOF | kubectl apply -f -
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: otelcontribcol
      labels:
        app: otelcontribcol
    rules:
      - apiGroups:
          - ""
        resources:
          - events
          - pods
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - "events.k8s.io"
        resources:
          - events
        verbs:
          - watch
          - list
    EOF

    cat <<EOF | kubectl apply -f -
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: otelcontribcol
      labels:
        app: otelcontribcol
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: otelcontribcol
    subjects:
      - kind: ServiceAccount
        name: otelcontribcol
        namespace: default
    EOF
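
Once the ClusterRole and ClusterRoleBinding exist, the granted permissions can be checked by impersonating the service account with kubectl auth can-i. The following is a small sketch; the helper name rbac_check_commands is hypothetical, and it only prints the commands so they can be reviewed before being run.

```shell
# Print one "kubectl auth can-i" check per verb/resource pair for a service account.
# (rbac_check_commands is a hypothetical helper name.)
rbac_check_commands() {
  namespace="$1"
  account="$2"
  subject="system:serviceaccount:${namespace}:${account}"
  for resource in pods events events.events.k8s.io; do
    for verb in list watch; do
      echo "kubectl auth can-i ${verb} ${resource} --as=${subject}"
    done
  done
}

rbac_check_commands default otelcontribcol

# To execute the checks instead of printing them:
#   rbac_check_commands default otelcontribcol | sh
```

Each executed check answers yes or no; a no for any pair means the corresponding rule is missing from the ClusterRole or the binding points at a different service account or namespace.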
  4. Deployment

    Create a Deployment to run the collector. Note: this receiver must be deployed with a single replica, otherwise it produces duplicate data.

    cat <<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: otelcontribcol
      labels:
        app: otelcontribcol
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: otelcontribcol
      template:
        metadata:
          labels:
            app: otelcontribcol
        spec:
          serviceAccountName: otelcontribcol
          containers:
            - name: otelcontribcol
              # This image is created by running `make docker-otelcontribcol`.
              # If you are not building the collector locally, specify a published image: `otel/opentelemetry-collector-contrib`
              image: otelcontribcol:latest
              args: ["--config", "/etc/config/config.yaml"]
              volumeMounts:
                - name: config
                  mountPath: /etc/config
              imagePullPolicy: IfNotPresent
          volumes:
            - name: config
              configMap:
                name: otelcontribcol
    EOF
  5. Troubleshooting

    If the receiver returns an error like the following, make sure the resource has been added to the ClusterRole.

    {"kind": "receiver", "name": "k8sobjects", "pipeline": "logs", "resource": "events.k8s.io/v1, Resource=events", "error": "unknown"}

05 Detailed walkthrough of values.yaml

# Configurable parameters and default values for splunk-otel-collector.
# This is a YAML-formatted file.
# Declared variables will be passed into templates.

# nameOverride replaces the name of the chart, when this is used to construct
# Kubernetes object names.
nameOverride: ""
# fullnameOverride completely replaces the generated name.
fullnameOverride: ""
# namespaceOverride can be used to override the deployment namespace for collector resources
# Useful when including this chart as a subchart, so it can be released into a different namespace than the parent
namespaceOverride: ""

################################################################################
# clusterName is an optional parameter. It can be set to an arbitrary value that identifies
# your K8s cluster. The value will be associated with every trace, metric and
# log as "k8s.cluster.name" attribute. It's optional on EKS and GKE, but required
# on all other Kubernetes services.
################################################################################

clusterName: ""

################################################################################
# Splunk Cloud / Splunk Enterprise configuration.
################################################################################

# Specify `endpoint` and `token` in order to send data to Splunk Cloud or Splunk
# Enterprise.
splunkPlatform:
  # Required for Splunk Enterprise/Cloud. URL to a Splunk instance to send data
  # to. e.g. "http://X.X.X.X:8088/services/collector/event". Setting this parameter
  # enables Splunk Platform as a destination. Use the /services/collector/event
  # endpoint for proper extraction of fields.
  endpoint: ""
  # Required for Splunk Enterprise/Cloud (if `endpoint` is specified). Splunk
  # HTTP Event Collector token.
  # Alternatively the token can be provided as a secret.
  # Refer to https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/advanced-configuration.md#provide-tokens-as-a-secret
  token: ""

  # Name of the Splunk event type index targeted. Required when ingesting logs to Splunk Platform.
  index: "main"
  # Name of the Splunk metric type index targeted. Required when ingesting metrics to Splunk Platform.
  metricsIndex: ""
  # Name of the Splunk event type index targeted. Required when ingesting traces to Splunk Platform.
  tracesIndex: ""
  # Optional. Default value for `source` field.
  source: "kubernetes"
  # Optional. Default value for `sourcetype` field. For container logs, it will
  # be container name. For metrics and traces it will default to "httpevent".
  sourcetype: ""
  # Maximum HTTP connections to use simultaneously when sending data.
  maxConnections: 200
  # Whether to disable gzip compression over HTTP. Defaults to true.
  disableCompression: true
  # HTTP timeout when sending data. Defaults to 10s.
  timeout: 10s
  # Idle connection timeout. defaults to 10s
  idleConnTimeout: 10s
  # Whether to skip checking the certificate of the HEC endpoint when sending
  # data over HTTPS.
  insecureSkipVerify: false
  # The PEM-format CA certificate for this client.
  # Alternatively the clientCert, clientKey and caFile can be provided as a secret.
  # Refer to https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/advanced-configuration.md#provide-tokens-as-a-secret
  # NOTE: The content of the certificate itself should be used here, not the
  # file path. The certificate will be stored as a secret in kubernetes.
  clientCert: ""
  # The private key for this client.
  # NOTE: The content of the key itself should be used here, not the file path.
  # The key will be stored as a secret in kubernetes.
  clientKey: ""
  # The PEM-format CA certificate file.
  # NOTE: The content of the file itself should be used here, not the file path.
  # The file will be stored as a secret in kubernetes.
  caFile: ""

  # Options to disable or enable particular telemetry data types that will be sent to
  # Splunk Platform. Only logs collection is enabled by default.
  logsEnabled: true
  # If you enable metrics collection, make sure that `metricsIndex` is provided as well.
  metricsEnabled: false
  # If you enable traces collection, make sure that `tracesIndex` is provided as well.
  tracesEnabled: false
  # Field name conventions to use. (Only for those who are migrating from Splunk Connect for Kubernetes helm chart)
  fieldNameConvention:
    # Boolean for renaming pod metadata fields to match to Splunk Connect for Kubernetes helm chart.
    renameFieldsSck: false
    # Boolean for keeping Otel convention fields after renaming it
    keepOtelConvention: true

  # Refer to https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md#configuration
  # for detailed examples
  retryOnFailure:
    enabled: true
    # Time to wait after the first failure before retrying; ignored if enabled is false
    initialInterval: 5s
    # The upper bound on backoff; ignored if enabled is false
    maxInterval: 30s
    # The maximum amount of time spent trying to send a batch; ignored if enabled is false
    maxElapsedTime: 300s

  # Refer to https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md
  sendingQueue:
    enabled: true
    # Number of consumers that dequeue batches; ignored if enabled is false
    numConsumers: 10
    # Maximum number of batches kept in memory before applying backpressure; ignored if enabled is false
    # The sending queue keeps by default all elements in memory, so it is best to keep it small
    # and allow the collector to slow down ingestion.
    queueSize: 1000

    # This option enables the persistent queue to store data on the disk instead of memory before sending it to the backend.
    # It allows setting higher queue limits and preserving the data across restarts of the collector container.
    # NOTE: The File Storage extension will persist state to the node's local file system.
    # While using the persistent queue it is advised to increase memory limit for agent (agent.resources.limits.memory)
    # to 1Gi.
    # Refer to: https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/advanced-configuration.md#data-persistence
    persistentQueue:
      # Specifies whether to persist log/metric/trace data.
      enabled: false
      storagePath: "/var/addon/splunk/exporter_queue"

  # Option to set fsync value for filestorage extension used by the agent. Enabling this option will ensure
  # database integrity at the cost of performance. If not set it will use default value for this extension.
  # Refer to: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/extension/storage/filestorage#file-storage
  # fsyncEnabled: true

################################################################################
# Splunk Observability configuration
################################################################################

# Specify `realm` and `accessToken` to send telemetry data to Splunk Observability
# Cloud.
splunkObservability:
  # Required for Splunk Observability. Splunk Observability realm to send
  # telemetry data to. Setting this parameter enables Splunk Observability as a
  # destination.
  realm: ""
  # Required for Splunk Observability (if `realm` is specified). Splunk
  # Observability org access token.
  # Alternatively the accessToken can be provided as a secret.
  # Refer to https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/advanced-configuration.md#provide-tokens-as-a-secret
  accessToken: ""

  # Optional. Splunk Observability ingest URL, default:
  # "https://ingest.<realm>.signalfx.com".
  ingestUrl: ""
  # Optional. Splunk Observability API URL, default:
  # "https://api.<realm>.signalfx.com".
  apiUrl: ""

  # Options to disable or enable particular telemetry data types.
  metricsEnabled: true
  tracesEnabled: true
  logsEnabled: false

  # Option to send Kubernetes events to Splunk Observability Infrastructure Monitoring as data events:
  # https://docs.splunk.com/Observability/alerts-detectors-notifications/view-data-events.html
  # To send Kubernetes events to Splunk Observability Log Observer, configure clusterReceiver.k8sObjects
  # and set splunkObservability.logsEnabled to true.
  infrastructureMonitoringEventsEnabled: false

  # This option just enables the shared pipeline for logs and profiling data.
  # There is no active collection of profiling data.
  # Instrumentation libraries must be configured to send it to the collector.
  # If you don't use AlwaysOn Profiling for Splunk APM, you can disable it.
  profilingEnabled: false

################################################################################
# Logs collection engine:
# - `fluentd`: deploy a fluentd sidecar that will collect logs and send them to
# otel-collector agent for further processing.
# - `otel`: utilize native OpenTelemetry log collection.
#
# `fluentd` will be deprecated in October 2025, so it's recommended to use `otel` instead.
################################################################################

logsEngine: otel

################################################################################
# Cloud provider, if any, the collector is running on. Leave empty for none/other.
# - "aws" (Amazon Web Services)
# - "gcp" (Google Cloud Platform)
# - "azure" (Microsoft Azure)
################################################################################

cloudProvider: ""

################################################################################
# Kubernetes distribution being run. Leave empty for other.
# - "aks" (Azure Kubernetes Service)
# - "eks" (Amazon Elastic Kubernetes Service)
# - "eks/fargate" (Amazon Elastic Kubernetes Service with Fargate profiles)
# - "gke" (Google Kubernetes Engine / Standard mode)
# - "gke/autopilot" (Google Kubernetes Engine / Autopilot mode)
# - "openshift" (RedHat OpenShift)
################################################################################

distribution: ""

################################################################################
# Optional "environment" parameter that will be added to all the telemetry
# data (traces/logs/metrics) as an attribute. It allows Splunk Observability
# users to investigate data coming from different sources separately.
# See: https://docs.splunk.com/observability/apm/set-up-apm/environments.html#setting-the-deployment-environment-span-tag
################################################################################

# environment: production

################################################################################
# Optional: Automatic detection of additional metric sources.
# Set autodetect.prometheus=true if you want the otel-collector agent to scrape
# prometheus metrics from pods that have prometheus-style annotations like
# "prometheus.io/scrape".
# Set autodetect.istio=true in istio environment.
################################################################################

autodetect:
  prometheus: false
  # This option is recommended for istio environments. It does the following things:
  # - Enables scraping istio control plane metrics from Prometheus endpoints.
  # - Adds a `service.name` resource attribute to logs with the same value as istio generates for
  #   traces to enable correlation between logs and traces using this attribute.
  istio: false

################################################################################
# Optional: Configuration for additional metadata that will be added to all the
# telemetry as extra attributes.
# IMPORTANT: Additional attributes configured with `fromLabels` and
# `fromAnnotations` options are only applied to traces and logs. Pod labels are
# always sent to Splunk Observability (if enabled) as metric properties.
################################################################################

extraAttributes:

  # Labels that will be collected from k8s pods (or namespaces) (in case they are set)
  # and added as extra attributes to the telemetry in the following format:
  #   k8s.<pod|namespace>.labels.<label_name>: <label_value>
  # For example, if you want to collect "my_key" label from your namespaces, you could use the following:
  # fromLabels:
  #   - key: my_key
  #     from: namespace
  #
  # If you want to change the default attribute name `k8s.pod.labels.<label_name>`, you could do that using a `tag_name` field:
  # fromLabels:
  #   - key: my_key
  #     tag_name: my_tag
  #     from: pod
  #
  # `key_regex` field can be used to get a specific set of labels that match a regex.
  # If `key_regex` is used, the `tag_name` field accepts regexp matching groups.
  # The following example will fetch all the pod labels and propagate them to the attributes as is,
  # without "k8s.pod.labels." prefix. "$" from the matching group must be escaped as "$$".
  # fromLabels:
  #   - key_regex: (.*)
  #     from: pod
  #     tag_name: "$$1"
  fromLabels:
    - key: app

  # Annotations that will be collected from k8s pods (or namespaces) (in case they are set)
  # and added as extra attributes to the telemetry in the following format:
  #   k8s.<pod|namespace>.annotations.<annotation_name>: <annotation_value>
  # fromAnnotations uses the same extraction rules as the fromLabels option, so refer to the examples for fromLabels.
  fromAnnotations: []

  # List of hardcoded key/value pairs that will be added as attributes to
  # all the telemetry.
  custom: []
  # - name: "account_id"
  #   value: "1234567890"

################################################################################
# OPTIONAL CONFIGURATIONS OF PARTICULAR O11Y COLLECTOR COMPONENTS
################################################################################

################################################################################
# OpenTelemetry collector running as a daemonset agent on every node.
# It collects metrics and traces and sends them to the Observability Cloud backend.
################################################################################

agent:
enabled: true

# Metric collection from k8s control plane components.
# For control plane configuration details see: docs/advanced-configuration.md#control-plane-metrics
controlPlaneMetrics:
apiserver:
# Specifies whether to collect apiserver metrics.
enabled: true
controllerManager:
# Specifies whether to collect controller manager metrics.
enabled: true
coredns:
# Specifies whether to collect coredns metrics.
enabled: true
etcd:
# Specifies whether to collect etcd metrics.
# For etcd metrics setup details, see: docs/advanced-configuration.md#setting-up-etcd-metrics
enabled: false
secret:
# The name of the secret the helm chart will create (if name is empty the default name is used) or the name
# of a secret that the user created (empty names are not valid for user created secrets).
name: ""
# Option for creating a new secret or using an existing one.
# When secret.create=true, a new kubernetes secret will be created by the helm chart that will contain the
# values from clientCert, clientKey, and caFile.
# When secret.create=false, the user must set secret.name to a name of a k8s secret the user created.
create: false
# Used when secret.create=true. The PEM-format certificate for the etcd client.
# NOTE: The content of the certificate itself should be used here, not the
# file path. The certificate will be stored as a secret in kubernetes.
clientCert: ""
# Used when secret.create=true. The private key for the etcd client.
# NOTE: The content of the key itself should be used here, not the file path.
# The key will be stored as a secret in kubernetes.
clientKey: ""
# Optional. Used when secret.create=true and skipVerify=false. The PEM-format CA certificate file.
# NOTE: The content of the file itself should be used here, not the file path.
# The file will be stored as a secret in kubernetes.
caFile: ""
# Secret annotations
annotations: {}
# Specifies whether verification of etcd's TLS cert is skipped. If set to false, a CA certificate must be made
# available as part of the etcd secret to verify the TLS cert with.
skipVerify: true
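# Illustrative sketch (not a chart default): enabling etcd metrics with a secret that
# was created beforehand, following the options described above. The secret name
# "my-etcd-client-secret" is a hypothetical placeholder.
# agent:
#   controlPlaneMetrics:
#     etcd:
#       enabled: true
#       secret:
#         create: false
#         name: my-etcd-client-secret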
proxy:
# Specifies whether to collect proxy metrics.
enabled: true
scheduler:
# Specifies whether to collect scheduler metrics.
enabled: true

# The ports to be exposed by the agent to the host.
# Make sure that only necessary ports are exposed; the <hostIP, hostPort, protocol> combination must
# be unique across all the nodes in the k8s cluster. Any port can be disabled.
# For example, to disable the zipkin port, set `agent.ports.zipkin: null`.
ports:
otlp:
containerPort: 4317
hostPort: 4317
protocol: TCP
enabled_for: [traces, metrics, logs, profiling]
otlp-http:
containerPort: 4318
protocol: TCP
enabled_for: [metrics, traces, logs, profiling]
sfx-forwarder:
containerPort: 9080
hostPort: 9080
protocol: TCP
enabled_for: [traces]
zipkin:
containerPort: 9411
hostPort: 9411
protocol: TCP
enabled_for: [traces]
jaeger-thrift:
containerPort: 14268
hostPort: 14268
protocol: TCP
enabled_for: [traces]
jaeger-grpc:
containerPort: 14250
hostPort: 14250
protocol: TCP
enabled_for: [traces]
fluentforward:
containerPort: 8006
hostPort: 8006
protocol: TCP
enabled_for: [logs]
signalfx:
containerPort: 9943
hostPort: 9943
protocol: TCP
enabled_for: [metrics]
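# Illustrative sketch of a user-supplied override (not a chart default): disabling
# receiver ports that are not needed, as described in the comment preceding `ports`.
# Which ports can safely be disabled depends on what your workloads send.
# agent:
#   ports:
#     zipkin: null
#     jaeger-thrift: null
#     jaeger-grpc: null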

resources:
limits:
cpu: 200m
# This value is being used as a source for default memory_limiter processor configurations
memory: 500Mi

# To collect container logs and journald logs, the agent runs as the root user.
# To run it as a non-root user, uncomment the `securityContext` options below.
# Setting runAsUser and runAsGroup to a non root user enables an init container that patches group
# permissions of container logs directories on the host filesystem to make logs readable by this non root user.
# Please note that on uninstallation of the chart, the permissions added to the
# host log directories for given uid/gid are not reverted.

securityContext: {}
# runAsUser: 20000
# runAsGroup: 20000

# Specifies DaemonSet update strategy.
# Possible values: "OnDelete" and "RollingUpdate".
updateStrategy: RollingUpdate

# Specifies the maximum number of pods that can be unavailable during the update process.
# Applicable only when updateStrategy is set to "RollingUpdate".
# Can be an absolute number or a percentage. The default is 1.
maxUnavailable: 1

service:
# create a service for the agents with a local internalTrafficPolicy
# so that agent pods can be discovered via dns etc
enabled: true

# hostNetwork schedules the pod with the host's network namespace.
# Disabling this value will affect monitoring of some control plane
# components. Enabling the agent service is recommended (see above).
# Disregarded for Windows (unsupported by k8s).
hostNetwork: true

# Set this to true to skip all the init containers. If you are running the agent as a non-root user,
# you must then patch the log directories on the host filesystem manually.
skipInitContainers: false

# OTel agent annotations
annotations: {}
podAnnotations: {}

# OTel agent extra pod labels
podLabels: {}

# Extra environment variables to be set in the OTel agent container
extraEnvs: []

# Extra volumes to be mounted to the agent daemonset.
# The volumes will be available for both OTel agent and fluentd containers.
extraVolumes: []
extraVolumeMounts: []

# Enable or disable features of the agent.
featureGates: ""

# OpenTelemetry Collector configuration for otel-agent daemonset can be overridden in this field.
# The default configuration is defined in templates/config/_otel-agent.tpl
# Any additional fields will be merged into the defaults;
# existing fields can be disabled by setting them to a null value.
config: {}
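# Illustrative sketch (assumption: the default agent configuration defines a
# `hostmetrics` receiver): overriding a single field leaves the other defaults in
# place, because the values given here are merged into the default configuration.
# agent:
#   config:
#     receivers:
#       hostmetrics:
#         collection_interval: 30s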

# Discovery mode attempts to automatically configure the agent with bundled metric receiver configuration.
# For more details, refer to: https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/advanced-configuration.md#discovery-mode

discovery:
enabled: false
properties:
extensions: {}
receivers: {}

################################################################################
# OpenTelemetry Kubernetes cluster receiver
# This is an extra 1-replica deployment of OpenTelemetry collector used
# specifically for collecting metrics from kubernetes API.
################################################################################

# Cluster receiver collects cluster level metrics from the Kubernetes API.
# It has to be running on one pod, so it uses its own dedicated deployment with 1 replica.

clusterReceiver:
enabled: true

# Needs to be adjusted based on the size of the monitored cluster
resources:
limits:
cpu: 200m
memory: 500Mi

# Scheduling configurations
nodeSelector: {}
tolerations: []
affinity: {}

# Pod configurations
podSecurityContext: {}
terminationGracePeriodSeconds: 600
priorityClassName: ""

# Security context applied to the otel-collector container in the cluster receiver deployment.
containerSecurityContext: {}

# k8s cluster receiver collector annotations
annotations: {}
podAnnotations: {}

# This flag enables Kubernetes events collection using OpenTelemetry Kubernetes Events Receiver
# https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8seventsreceiver
# This option requires `logsEnabled` to be set to `true` for either `splunkObservability` or `splunkPlatform`
# depending on where you want to send the events. Otherwise this option will not have any effect.
# The receiver is currently in alpha state, which means that the event format might change over time.
# Once the receiver is stabilized, it'll be enabled by default in this helm chart.
eventsEnabled: false
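# Illustrative sketch: events are only exported when logs are enabled for the chosen
# destination, so both settings are needed. Splunk Platform is used here as an example;
# use `splunkObservability.logsEnabled` instead if that is your destination.
# splunkPlatform:
#   logsEnabled: true
# clusterReceiver:
#   eventsEnabled: true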

# Kubernetes objects collection using OpenTelemetry Kubernetes Object Receiver
# https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sobjectsreceiver
# This option requires `logsEnabled` to be set to `true` for either `splunkObservability` or `splunkPlatform`
# depending on where you want to send the events. Otherwise, this option will not have any effect.
# The receiver is currently in alpha state, which means that the event format might change over time.
# Once the receiver is stabilized, it'll be enabled by default in this helm chart.

#
# == Schema ==
# ‍```
# k8sObjects:
# - <objectDefinition>
# ‍```
# Each `objectDefinition` has the following fields:
# * mode:
#     defines how this type of object is collected, either "pull" or "watch".
#     - "pull" mode reads all objects of this type using the list API at an interval. Default mode.
#     - "watch" mode sets up a long-lived connection using the watch API to receive only updates.
# * name: [REQUIRED]
#     name of the object, e.g. `pods`, `namespaces`.
# * namespaces:
#     only collects objects from the specified namespaces; by default, all namespaces.
# * label_selector:
#     select objects by label(s)
# * field_selector:
#     select objects by field(s)
# * interval:
#     the interval at which objects are pulled, default 60 seconds.
#     Only useful for "pull" mode.
#
#
# == Example ==
# ‍```
# k8sObjects:
# - name: pods
# mode: pull
# label_selector: environment in (production),tier in (frontend)
# field_selector: status.phase=Running
# interval: 15m
# - name: events
# mode: watch
# group: events.k8s.io
# namespaces: [default]
# ‍```
#
# The configuration format in details is described here:
# https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sobjectsreceiver
k8sObjects: []

# k8s cluster receiver extra pod labels
podLabels: {}

# Extra environment variables to be set in the OTel Cluster Receiver container
extraEnvs: []

# Extra volumes to be mounted to the k8s cluster receiver container.
extraVolumes: []
extraVolumeMounts: []

# Enable or disable features of the cluster receiver.
featureGates: ""

# OpenTelemetry Collector configuration for K8s Cluster Receiver deployment can be overridden in this field.
# The default configuration is defined in templates/config/_otel-k8s-cluster-receiver-config.tpl
# Any additional fields will be merged into the defaults;
# existing fields can be disabled by setting them to a null value.
config: {}

#################################################################
# Native OpenTelemetry logs collection
# Applicable only if "logsEngine: otel" (set by default).
# Receiver Documentation: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/filelogreceiver
# OpenTelemetry Logging Documentation: https://opentelemetry.io/docs/specs/otel/logs
#################################################################

logsCollection:

# Container logs collection
containers:
enabled: true
# Container runtime. One of `docker`, `cri-o`, or `containerd`
# Automatically discovered if not set.
containerRuntime: ""
# Paths of logfiles to exclude. object type is array:
# i.e. to exclude `kube-system` namespace,
# excludePaths: ["/var/log/pods/kube-system_*/*/*.log"]
excludePaths: []
# Whether to exclude the agent's own logs from ingestion.
excludeAgentLogs: true
# Extra operators for container logs.
# https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/README.md
extraOperators: []

# Multiline logs processing configuration. Multiline logs that are written by containers to stdout
# are usually broken down into several one-line logs and can be reconstructed with a regex
# expression that matches the first line of each log batch. The following operator is
# used for this purpose:
# https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/recombine.md
# At the time a multiline log is reconstructed, the following information is available to
# identify the source of the logs: namespace, pod and container names. At least one source
# identifier has to be specified for each multiline config.
# The following example shows how to set up multiline log processing for logs having subsequent
# log lines written with an offset. Let's say a k8s deployment called "buttercup-app" is
# scheduled to run in "default" namespace with a java container called "server", and the
# container produces the following log example:
# .........
# Exception in thread "main" java.lang.NumberFormatException: For input string: "3.1415"
# at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
# at java.lang.Integer.parseInt(Integer.java:580)
# at ExampleCli.parseNumericArgument(ExampleCli.java:47)
# at ExampleCli.parseCliOptions(ExampleCli.java:27)
# at ExampleCli.main(ExampleCli.java:11)
# .........
# The following sample configuration will handle multiline logs from that specific container:
# multilineConfigs:
# - namespaceName:
# value: default
# podName:
# value: buttercup-app-.*
# useRegexp: true
# containerName:
# value: server
# firstEntryRegex: ^[^\s].*
# combineWith: ""
multilineConfigs: []
# Set the useSplunkIncludeAnnotation flag to `true` to collect logs only from pods with the `splunk.com/include: true` annotation.
# All other logs will be ignored.
useSplunkIncludeAnnotation: false
# maxRecombineLogSize sets the maximum size in bytes of a message recombined from cri-o, containerd and docker log entries.
# Set to 0 to remove any size limit.
maxRecombineLogSize: 1048576

# Configuration for collecting journald logs using otel collector
journald:
enabled: false
# Please update directory path for journald if it's different from below default value "/run/log/journal"
directory: /run/log/journal
# List of service units to collect journald logs for and configuration for each.
units:
- name: kubelet
priority: info
- name: docker
priority: info
- name: containerd
priority: info
# Route journald logs to their own Splunk index by specifying the index value below; otherwise leave it blank. Please make sure the index exists in Splunk and is configured to receive HEC traffic. Not applicable to Splunk Observability.
index: ""

checkpointPath: "/var/addon/splunk/otel_pos"

# Files on k8s nodes to tail.
# Make sure to configure volume mounts properly at `agent.extraVolumes` and `agent.extraVolumeMounts`.
extraFileLogs: {}
# Sample configuration to collect Audit logs. Please note hostPath can vary depending on the audit-policy.yaml configuration.
# extraFileLogs:
# filelog/audit-log:
# include: [/var/log/kubernetes/apiserver/audit.log]
# start_at: beginning
# include_file_path: true
# include_file_name: false
# resource:
# com.splunk.source: /var/log/kubernetes/apiserver/audit.log
# host.name: 'EXPR(env("K8S_NODE_NAME"))'
# com.splunk.sourcetype: kube:apiserver-audit
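# Illustrative sketch: the audit log example above only works if the host directory is
# mounted into the agent pod. The volume name "audit-log" is a hypothetical placeholder,
# and the path must match your audit-policy configuration.
# agent:
#   extraVolumes:
#     - name: audit-log
#       hostPath:
#         path: /var/log/kubernetes/apiserver
#   extraVolumeMounts:
#     - name: audit-log
#       mountPath: /var/log/kubernetes/apiserver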

################################################################################
# Fluentd sidecar configuration for logs collection.
# Applicable only if "logsEngine: fluentd".
# Fluentd logs engine is now deprecated and will reach End Of Support in October 2025; it is strongly recommended to use "logsEngine: otel" instead.
################################################################################

fluentd:
resources:
limits:
cpu: 500m
memory: 500Mi
requests:
cpu: 100m
memory: 200Mi

securityContext:
runAsUser: 0

# Extra environment variables to be set in the FluentD container
extraEnvs: []

config:
# Configurations for container logs
containers:
# Path to root directory of container logs
path: /var/log
# Final volume destination of container log symlinks
pathDest: /var/lib/docker/containers
# Log format type, "json" or "cri".
# If omitted (default), the value is detected automatically based on container runtime.
# "json" is set if docker runtime detected, otherwise it defaults to "cri".
logFormatType: ""
# Specify the log format for "cri" logFormatType
# It can be "%Y-%m-%dT%H:%M:%S.%N%:z" for openshift and "%Y-%m-%dT%H:%M:%S.%NZ" for IBM IKS
criTimeFormat: "%Y-%m-%dT%H:%M:%S.%N%:z"

# Directory from which to read journald logs (docker daemon logs, kubelet logs, and any other specified service logs).
journalLogPath: /run/log/journal

# Controls the output buffer for the fluentd daemonset.
# Note that, for a memory buffer, if `resources.limits.memory` is set,
# the total buffer size should not be bigger than the memory limit; it should also
# leave room for the basic memory usage of fluentd itself.
# All buffer parameters (except Argument) defined in
# https://docs.fluentd.org/v1.0/articles/buffer-section#parameters
# can be configured here.
buffer:
"@type": memory
total_limit_size: 600m
chunk_limit_size: 1m
chunk_limit_records: 100000
flush_interval: 5s
flush_thread_count: 1
overflow_action: block
retry_max_times: 3

# logLevel is to set log level of the Splunk log collector.
# Available values are: trace, debug, info, warn, error
logLevel: info

# path of logfiles, default /var/log/containers/*.log
path: /var/log/containers/*.log
# paths of logfiles to exclude. object type is array as per fluentd specification:
# https://docs.fluentd.org/input/tail#exclude_path
excludePath: []
# - /var/log/containers/kube-svc-redirect*.log
# - /var/log/containers/tiller*.log

# Prefix for pos_file tail source parameter
# Can be used if you want to run multiple instances of fluentd on the same host
# https://docs.fluentd.org/input/tail#pos_file-highly-recommended
posFilePrefix: /var/log/splunk-fluentd

# Specify the interval for refreshing the list of watched files. Defaults to 60 (seconds).
# Uncomment the line below to override the default behaviour.
# refreshInterval: 60

# Enables the stat_watcher. Defaults to true.
# See: https://docs.fluentd.org/v1.0/articles/in_tail#enable_stat_watcher
# uncomment the line below to disable it
# enableStatWatcher: false

# `customFilters` defines the custom filters to be used.
# This section can be used to define custom filters using plugins like https://github.com/splunk/fluent-plugin-jq
# It's also possible to use other filters like https://www.fluentd.org/plugins#filter
#
# The scheme to define a custom filter is:
#
# ‍```
# <name>:
# tag: <fluentd tag for the filter>
# type: <fluentd filter type>
# body: <definition of the fluentd filter>
# ‍```
#
# = fluentd tag for the filter =
# This is the fluentd tag for the record
#
# = fluentd filter type =
# This is the fluentd filter that the user wants to use for record manipulation.
#
# = definition of the fluentd filter =
# This defines the body/logic for using the filter for record manipulation.
#
# For example, if you want to define a filter which sets the cluster_name field to "my_awesome_cluster", you would use the following filter:
# <filter tail.containers.**>
# @type jq_transformer
# jq '.record.cluster_name = "my_awesome_cluster" | .record'
# </filter>
# This can be defined in the customFilters section as follows:
# ‍```
# customFilters:
# NamespaceSourcetypeFilter:
# tag: tail.containers.**
# type: jq_transformer
# body: jq '.record.cluster_name = "my_awesome_cluster" | .record'
# ‍```
customFilters: {}

# `logs` defines the source of logs, multiline support, and their sourcetypes.
#
# The scheme to define a log is:
#
# ‍```
# <name>:
# from:
# <source>
# timestampExtraction:
# regexp: "<regexp_to_extract_timestamp_from_log>"
# format: "<format_of_the_timestamp>"
# multiline:
# firstline: "<regexp_to_detect_firstline_of_multiline>"
# flushInterval: 5s
# sourcetype: "<sourcetype_of_logs>"
# ‍```
#
# = <source> =
# It supports 3 kinds of sources: journald, file, and container.
# For `journald` logs, `unit` is required for filtering using _SYSTEMD_UNIT, example:
# ‍```
# docker:
# from:
# journald:
# unit: docker.service
# ‍```
#
# For `file` logs, `path` is required to specify where the log files are. Log files are expected in `/var/log`, example:
# ‍```
# docker:
# from:
# file:
# path: /var/log/docker.log
# ‍```
#
# For `container` logs, `pod` field is required. It represents part of
# the pod name, can be name of a deployment or replica set. Use "*" to
# apply the configuration to all pods. Optional `container` value can be
# used to apply configuration to a particular container.
# ‍```
# kube-apiserver:
# from:
# pod: kube-apiserver
#
# etcd:
# from:
# pod: etcd-server
# container: etcd-container
# ‍```
#
# = timestamp =
# `timestampExtraction` defines how to extract timestamp from logs. This *only* works for `file` source.
# To use `timestampExtraction` you need to define both:
# - `regexp`: the Regular Expression used to find the timestamp from a log entry.
# The timestamp part must be in a `time` named group. E.g.
# (?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})
# - `format`: a format string that defines how to parse the timestamp, e.g. "%Y-%m-%d %H:%M:%S".
# More details can be found at: http://ruby-doc.org/stdlib-2.5.0/libdoc/time/rdoc/Time.html#method-c-strptime
#
# = multiline =
# `multiline` options provide basic multiline support. Two options:
# - `firstline`: a Regular Expression used to detect the first line of a multiline log.
# - `flushInterval`: The interval between data flushes, default value: 5s.
#
# = sourcetype =
# sourcetype of each kind of log can be defined using the `sourcetype` field.
# If `sourcetype` is not defined, `name` will be used.
#
# ---
# Here we have some default timestampExtraction and multiline settings for kubernetes components.
# So, usually you just need to redefine the source of those components if necessary.
logs:
docker:
from:
journald:
unit: docker.service
timestampExtraction:
regexp: time="(?<time>\d{4}-\d{2}-\d{2}T[0-2]\d:[0-5]\d:[0-5]\d.\d{9}Z)"
format: "%Y-%m-%dT%H:%M:%S.%NZ"
sourcetype: kube:docker
kubelet: &glog
from:
journald:
unit: kubelet.service
timestampExtraction:
regexp: \w(?<time>[0-1]\d[0-3]\d [^\s]*)
format: "%m%d %H:%M:%S.%N"
multiline:
firstline: /^\w[0-1]\d[0-3]\d/
sourcetype: kube:kubelet
etcd:
from:
pod: etcd-server
container: etcd-container
timestampExtraction:
regexp: (?<time>\d{4}-\d{2}-\d{2} [0-2]\d:[0-5]\d:[0-5]\d\.\d{6})
format: "%Y-%m-%d %H:%M:%S.%N"
etcd-minikube:
from:
pod: etcd-minikube
container: etcd
timestampExtraction:
regexp: (?<time>\d{4}-\d{2}-\d{2} [0-2]\d:[0-5]\d:[0-5]\d\.\d{6})
format: "%Y-%m-%d %H:%M:%S.%N"
etcd-events:
from:
pod: etcd-server-events
container: etcd-container
timestampExtraction:
regexp: (?<time>\d{4}-[0-1]\d-[0-3]\d [0-2]\d:[0-5]\d:[0-5]\d\.\d{6})
format: "%Y-%m-%d %H:%M:%S.%N"
kube-apiserver:
<<: *glog
from:
pod: kube-apiserver
sourcetype: kube:kube-apiserver
kube-scheduler:
<<: *glog
from:
pod: kube-scheduler
sourcetype: kube:kube-scheduler
kube-controller-manager:
<<: *glog
from:
pod: kube-controller-manager
sourcetype: kube:kube-controller-manager
kube-proxy:
<<: *glog
from:
pod: kube-proxy
sourcetype: kube:kube-proxy
kubedns:
<<: *glog
from:
pod: kube-dns
sourcetype: kube:kubedns
dnsmasq:
<<: *glog
from:
pod: kube-dns
sourcetype: kube:dnsmasq
dns-sidecar:
<<: *glog
from:
pod: kube-dns
container: sidecar
sourcetype: kube:kubedns-sidecar
dns-controller:
<<: *glog
from:
pod: dns-controller
sourcetype: kube:dns-controller
kube-dns-autoscaler:
<<: *glog
from:
pod: kube-dns-autoscaler
container: autoscaler
sourcetype: kube:kube-dns-autoscaler
kube-audit:
from:
file:
path: /var/log/kube-apiserver-audit.log
timestampExtraction:
format: "%Y-%m-%dT%H:%M:%SZ"
sourcetype: kube:apiserver-audit

################################################################################
# Docker image configuration
################################################################################

image:
# Secrets to attach to the respective serviceaccount to pull docker images
imagePullSecrets: []

fluentd:
# The registry and name of the fluentd image to pull
repository: splunk/fluentd-hec
# The tag of the fluentd image to pull
tag: 1.3.3
# The policy that specifies when the user wants the fluentd images to be pulled
pullPolicy: IfNotPresent

otelcol:
# The registry and name of the opentelemetry collector image to pull
repository: quay.io/signalfx/splunk-otel-collector
# For the FIPS-140 enabled version, use this repository instead:
# repository: quay.io/signalfx/splunk-otel-collector-fips
# The tag of the Splunk OTel Collector image, default value is the chart appVersion
tag: ""
# The policy that specifies when the user wants the opentelemetry collector images to be pulled
pullPolicy: IfNotPresent

# Image to be used by init container that patches log directories on the host, so the collector can read from them as a non-root user.
# Effective only if `agent.securityContext.runAsUser` and `agent.securityContext.runAsGroup` are set to non-zero values.
initPatchLogDirs:
# The registry and name of the Universal Base Image 9 image to pull
repository: registry.access.redhat.com/ubi9/ubi
# The tag of the Universal Base Image 9, default value is latest
tag: ""
# The policy that specifies when the user wants the Universal Base images to be pulled
pullPolicy: IfNotPresent

# Image to be used by a container to validate the secret's presence ahead of starting a helm install or upgrade using pre-install and pre-upgrade Helm hooks.
# Effective only if `secret.create` is set to false and `secret.validateSecret` is set to true (default).
validateSecret:
# The registry and name of the Universal Base Image 9 image to pull
repository: registry.access.redhat.com/ubi9/ubi
# The tag of the Universal Base Image 9, default value is latest
tag: ""
# The policy that specifies when the user wants the Universal Base images to be pulled
pullPolicy: IfNotPresent


################################################################################
# Extra system configuration
################################################################################

## Limits how many pods may be unavailable due to voluntary disruptions.
## https://kubernetes.io/docs/tasks/run-application/configure-pdb/
podDisruptionBudget: {}
# Minimum number of pods (as a number or percentage) that must remain available.
# minAvailable:
# Maximum number of pods (as a number or percentage) that can be unavailable.
# maxUnavailable:

serviceAccount:
  # Specifies whether a ServiceAccount should be created
  create: true
  # The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name: ""

  # Service account annotations
  annotations: {}

rbac:
  # Create or use existing RBAC resources
  create: true
  # Specifies additional rules that will be added to the clusterRole.
  customRules: []
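# Illustrative sketch (not a chart default): an extra clusterRole rule written in
# standard Kubernetes RBAC rule syntax. The resources listed are only an example.
# rbac:
#   customRules:
#     - apiGroups: [""]
#       resources: ["configmaps"]
#       verbs: ["get", "list", "watch"]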

# Create or use an existing secret. If name is empty, the default name is used.
secret:
  create: true
  name: ""
  # Specifies whether a secret provided by the user should be validated.
  validateSecret: true
  # Secret annotations
  annotations: {}

# The tolerations for deploying the agent collector daemonset. By default, it targets control-plane, worker,
# and k8s distribution-specific nodes (infrastructure or system) to ensure logs and metrics collection from nodes.
tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
    operator: Exists
  - key: node-role.kubernetes.io/control-plane
    effect: NoSchedule
    operator: Exists
  - key: kubernetes.io/system-node
    effect: NoSchedule
    operator: Exists
  - key: node-role.kubernetes.io/infra
    effect: NoSchedule
    operator: Exists

# Defines which nodes should be selected to deploy the agent collector daemonset.
nodeSelector: {}
terminationGracePeriodSeconds: 600

# Defines node affinity to restrict deployment of the agent collector daemonset.
affinity: {}

# Defines priorityClassName to assign a priority class to pods.
priorityClassName: ""

# This tells the kubelet that it should wait for x seconds before performing the first probe.
# This is required in case you are using windows worker nodes.
# It is recommended to keep it a 60-second window but it depends on cluster specification.
readinessProbe:
  initialDelaySeconds: 0
livenessProbe:
  initialDelaySeconds: 0

# Specifies whether the chart is deployed to a k8s cluster with Windows worker nodes.
isWindows: false

# Whether to automatically create OpenShift SCC or to create it manually.
# NOTE: This config will only be used when distribution=openshift
securityContextConstraints:
  create: true

# OpenShift SecurityContextConstraints can be overridden in this field.
# These fields will be merged into the default config that can be found at
# https://github.com/signalfx/splunk-otel-collector-chart/blob/main/helm-charts/splunk-otel-collector/templates/securityContextConstraints.yaml
# NOTE: This config will only be used when distribution=openshift
securityContextConstraintsOverwrite: {}

################################################################################
# OpenTelemetry "collector" k8s deployment configuration.
# This is an additional deployment of OpenTelemetry collector that can be used
# to pass traces through it, make k8s metadata enrichment and batching.
# Another use case is to point tracing instrumentation libraries directly to
# the collector endpoint instead of local agents. The collector running in the
# passthrough mode is recommended for large k8s clusters, disabled by default.
################################################################################

gateway:
  # Defines if collector deployment is enabled
  # Recommended for large k8s clusters, disabled by default.
  enabled: false

  # Number of collector replicas
  replicaCount: 3

  # The ports exposed by the collector container.
  # Any port can be disabled by setting to null.
  # Any changes should be aligned with service.ports configuration below.
  ports:
    otlp:
      containerPort: 4317
      protocol: TCP
      enabled_for: [metrics, traces, logs]
    otlp-http:
      containerPort: 4318
      protocol: TCP
      enabled_for: [metrics, traces, logs]
    jaeger-thrift:
      containerPort: 14268
      protocol: TCP
      enabled_for: [traces]
    jaeger-grpc:
      containerPort: 14250
      protocol: TCP
      enabled_for: [traces]
    zipkin:
      containerPort: 9411
      protocol: TCP
      enabled_for: [traces]
    signalfx:
      containerPort: 9943
      protocol: TCP
      # SignalFx metrics enabled in gateway for all telemetry types since there may be
      # bundled metrics.
      enabled_for: [metrics, traces, logs]
    http-forwarder:
      containerPort: 6060
      protocol: TCP
      # Enabled for all because SignalFx exporter will always send metadata updates when enabled.
      enabled_for: [metrics, traces, logs]

  resources:
    limits:
      cpu: 4
      # Memory limit value is used as a source for default memory_limiter configuration
      memory: 8Gi

  # Scheduling configurations
  nodeSelector: {}
  tolerations: []
  affinity: {}

  # Pod configurations
  podSecurityContext: {}
  terminationGracePeriodSeconds: 600
  priorityClassName: ""

  # Security context applied to the otel-collector container in the gateway deployment.
  containerSecurityContext: {}

  # OTel collector annotations
  annotations: {}
  podAnnotations: {}

  # OTel collector extra pod labels
  podLabels: {}

  # Extra environment variables to be set in the standalone OTel collector container
  extraEnvs: []

  # Extra volumes to be mounted to the OTel Collector container.
  extraVolumes: []
  extraVolumeMounts: []

  # Enable or disable features of the gateway.
  featureGates: ""

  # OpenTelemetry Collector configuration for standalone otel-collector deployment can be overridden in this field.
  # Default configuration defined in config/otel-collector-config.yaml
  # Any additional fields will be merged into the defaults,
  # existing fields can be disabled by setting them to `null`.
  config: {}

################################################################################
# OpenTelemetry service config, used for otel collector deployment.
# Disabled by default
################################################################################

# OpenTelemetry collector service, created only if gateway.enabled = true
service:
  # Service type
  type: ClusterIP
  # Service annotations
  annotations: {}

################################################################################
# Notice: Operator related features should be considered to have an alpha
# maturity level and be experimental. There may be breaking changes or Operator
# features may be replaced entirely with a better alternative in the future.
#
# The OpenTelemetry Operator running as a deployment with a replica count of 1.
# It auto-instruments applications to emit telemetry data.
# Related documentation: https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/auto-instrumentation-install.md
# Full list of Helm value configurations: https://artifacthub.io/packages/helm/opentelemetry-helm/opentelemetry-operator?modal=values
################################################################################

# Specify whether the chart should install CRDs automatically.
# Related Documentation: https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/auto-instrumentation-install.md#crd-management
operatorcrds:
  # Set to true to install CRDs automatically, or false to manage them manually.
  install: false

operator:
  enabled: false
  # This is disabled by default in favor of using `operatorcrds.install=true`, as doing so creates
  # a race condition with helm.
  # See: https://github.com/open-telemetry/opentelemetry-helm-charts/issues/677
  # Users of this chart should _never_ set this to be true. If a user wishes
  # to install the CRDs through the opentelemetry-operator chart, it is recommended
  # to install the opentelemetry-operator chart separately and prior to the installation
  # of this chart.
  crds:
    create: false
  admissionWebhooks:
    certManager:
      # Annotate the certificate and issuer to ensure they are created after the cert-manager CRDs have been installed.
      certificateAnnotations:
        "helm.sh/hook": post-install,post-upgrade
        "helm.sh/hook-weight": "1"
      issuerAnnotations:
        "helm.sh/hook": post-install,post-upgrade
        "helm.sh/hook-weight": "1"
  # Collector deployment via the operator is not supported at this time.
  # The collector image repository is specified here to meet operator subchart constraints.
  manager:
    collectorImage:
      repository: quay.io/signalfx/splunk-otel-collector

# The default Splunk Instrumentation object deployed when operator.enabled=true.
# For more details see:
# - Splunk Documentation: https://docs.splunk.com/observability/en/gdi/opentelemetry/automatic-discovery/k8s/k8s-backend.html#optional-configure-the-instrumentation
# - OpenTelemetry Documentation: https://github.com/open-telemetry/opentelemetry-operator/blob/main/docs/api.md#instrumentation
instrumentation:
  # Optional "endpoint" parameter for exporting data to a specific target.
  # By default, the endpoint will be set to the agent if it's enabled. If the agent is not enabled, the endpoint
  # will default to the gateway, given it is enabled. If neither the agent nor the gateway is enabled, the endpoint
  # must be overridden here.
  endpoint: ""
  # endpoint: http://$(SPLUNK_OTEL_AGENT):4317
  # endpoint: http://splunk-otel-collector:4317

  # Optional "sampler" parameter for enabling trace sampling, see: https://opentelemetry.io/docs/concepts/sdk-configuration/general-sdk-configuration/#otel_traces_sampler
  sampler: {}
  #   type: traceidratio
  #   argument: "0.95"

  # Optional "environment variable" parameters that can configure all instrumentation libraries.
  # - If splunkObservability.profilingEnabled=true, environment variables enabling profiling will be added automatically.
  # - If the agent is used as the endpoint to receive traces, the SPLUNK_OTEL_AGENT environment variables will be added automatically.
  env: []
  #   - name: ENV_VAR1
  #     value: value1
  #   - name: ENV_VAR2
  #     value: value2

  # Auto-instrumentation Libraries
  # Below are configurations for the instrumentation libraries utilized in Auto-instrumentation.
  # Highlights:
  # - Maturity varies among libraries (e.g., Java is more mature than Go). Check each library's stability here: https://opentelemetry.io/docs/instrumentation/#status-and-releases
  # - Some libraries may be enabled by default. The current status can be checked here: https://github.com/open-telemetry/opentelemetry-operator#controlling-instrumentation-capabilities
  # - Splunk provides best-effort support for native OpenTelemetry libraries, while offering full support for its own distributions.
  # Each library supports the following fields:
  # - repository: Specifies the Docker image repository.
  # - tag: Indicates the Docker image tag.
  # - env: (Optional) Allows you to add any additional environment variables.
  java:
    repository: ghcr.io/signalfx/splunk-otel-java/splunk-otel-java
    tag: v2.12.0
    # env:
    #   - name: JAVA_ENV_VAR
    #     value: java_value
  nodejs:
    repository: ghcr.io/signalfx/splunk-otel-js/splunk-otel-js
    tag: v2.15.0
    # env:
    #   - name: NODEJS_ENV_VAR
    #     value: nodejs_value
  dotnet:
    repository: ghcr.io/signalfx/splunk-otel-dotnet/splunk-otel-dotnet
    tag: v1.8.0
    env:
      - name: OTEL_DOTNET_AUTO_PLUGINS
        value: Splunk.OpenTelemetry.AutoInstrumentation.Plugin,Splunk.OpenTelemetry.AutoInstrumentation
  go:
    repository: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-go
    tag: v0.10.1-alpha
    # env:
    #   - name: GO_ENV_VAR
    #     value: go_value
  apache-httpd:
    repository: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd
    tag: 1.0.4
    # env:
    #   - name: APACHE_ENV_VAR
    #     value: apache_value
  python:
    repository: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python
    tag: 0.50b0
    # env:
    #   - name: PYTHON_ENV_VAR
    #     value: python_value
  nginx:
    repository: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd
    tag: 1.0.4
    # env:
    #   - name: NGINX_ENV_VAR
    #     value: nginx_value
  # Auto-instrumentation Libraries (End)

# The cert-manager is a CNCF application deployed as a subchart and used for supporting operators that require TLS certificates.
# Full list of Helm value configurations: https://artifacthub.io/packages/helm/cert-manager/cert-manager?modal=values
certmanager:
  enabled: false
  installCRDs: true

################################################################################
# Target Allocator
# Notice: Target Allocator related features should be considered to have an alpha
# maturity level and be experimental. There may be breaking changes or Operator
# features may be replaced entirely with a better alternative in the future.
#
# The Target Allocator is running as a deployment with a replica count of 1.
# It discovers scraping configurations from ServiceMonitor and PodMonitor CRDs and
# assigns them to collectors.
# Related documentation: https://github.com/open-telemetry/opentelemetry-operator/tree/main/cmd/otel-allocator
################################################################################

targetAllocator:
  enabled: false
  image: ghcr.io/open-telemetry/opentelemetry-operator/target-allocator:v0.105.0
  serviceAccount:
    # Specifies whether a ServiceAccount should be created
    create: true
    # The name of the ServiceAccount to use.
    # If not set and create is true, a name is generated using the fullname template
    name: ""

    # Service account annotations
    annotations: {}
  config:
    allocation_strategy: per-node
    collector_selector:
      matchlabels:
        component: otel-collector-agent
    prometheus_cr:
      enabled: true
      scrapeInterval: 30s
      # An empty value means any service monitor will be accepted.
      service_monitor_selector: {}
      # An empty value means any pod monitor will be accepted.
      pod_monitor_selector: {}

    filter_strategy: relabel-config

################################################################################
# Helm Chart Feature Gates.
# The following feature gates are used to enable/disable features in the Helm chart
# that are not yet ready for general availability.
# Options in this section are not guaranteed to be stable and may change at any time.
################################################################################

featureGates:
  # Use Light Prometheus Receiver for metrics collection from discovered Prometheus endpoints.
  # https://github.com/signalfx/splunk-otel-collector/tree/main/internal/receiver/lightprometheusreceiver
  # Light Prometheus Receiver is optimized for performance and reduced memory footprint.
  # On the other hand, it does not support all Prometheus configuration options.
  useLightPrometheusReceiver: false
  # The feature gate enables experimental exporter batching instead of batch processor. This feature
  # ensures the backpressure is propagated to the file readers and no data is dropped.
  # Not recommended to use with enabled gateway.
  noDropLogsPipeline: false
  # The feature gate enables an experiment to explicitly define tokens on the daemonset and gateway, cluster receiver
  # and target allocator deployments.
  explicitMountServiceAccountToken: false
  # Use a specific metrics pipeline to report control plane metrics as histograms.
  useControlPlaneMetricsHistogramData: false
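
Because the defaults above are long, it is usually easier to keep only the changed keys in a small override file. The following is a minimal sketch, not a recommended production setup: the file name `gateway-values.yaml`, the replica count, and the resource limits are assumptions chosen for illustration, while the key names and nesting follow the `gateway` and `service` defaults listed above.

```yaml
# gateway-values.yaml (hypothetical file name)
# Sketch of an override that turns on the gateway deployment.
# The numbers below are illustrative assumptions, not sizing advice.
gateway:
  enabled: true
  replicaCount: 2
  resources:
    limits:
      cpu: 2
      # The memory limit also feeds the default memory_limiter configuration.
      memory: 4Gi

# Service in front of the gateway. ClusterIP is already the default;
# it is repeated here only to show where the key lives.
service:
  type: ClusterIP
```

Assuming the release is named `otel` as in section 2.2.2, a file like this could be applied with `helm upgrade otel splunk-otel-collector-chart/splunk-otel-collector --reuse-values -f gateway-values.yaml`, where `--reuse-values` keeps the `splunkPlatform.*` settings from the original install.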

06 Searching metrics data in Splunk

| mpreview index=local_k8s_metrics

References:

https://github.com/signalfx/splunk-otel-collector-chart/blob/main/helm-charts/splunk-otel-collector/values.yaml

https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/advanced-configuration.md

https://docs.splunk.com/observability/en/get-started/o11y-architecture.html


Splunk OpenTelemetry Collector for Kubernetes Notes
https://hesc.info/post/splunk-opentelemetry-collector-for-kubernetesbi-ji-zwobtk.html
Author: 需要哈气的纸飞机
Published: January 17, 2025
License