Skip to main content

Command Palette

Search for a command to run...

初嚐 OpenTelemetry Go Log Beta

Updated
6 min read
初嚐 OpenTelemetry Go Log  Beta

在當初寫書跟在公司應用時,OpenTelemetry Go 其實對於 Log 這類型遙測信號還沒實踐,去年都還在草案討論,今年終於在 OTel Go v1.24.0 提出設計。從它們迭代過程中我們也能看到都是從 API 和 Bridge API 開始著手,接著是 Record 結構(v1.25.0),最後才是 SDK (v1.26.0)的實踐。

當初小弟在書上第五章就層介紹到 OTel log record 的結構︰

今天就能來嘗試看看。在此之前 Go 有出 slog 套件,但我之前都覺得用起來沒 OTel + 自己客製化一點點 zap 來的順手。但直到今日我覺得 OTel log bridge 搭配 slog ,用起來幾乎零門檻!

絕大部分內容都參考自官方網站 Doc/Language APIs & SDKs/Go/Getting Started 的內容。今天的內容我有上傳至 OpenTelemetry 入門指南的 Ch5/rolldice 中。

在 Log instrument 的頁面還有提到 log 直接傳送給 collector 與寫到 stdout和file的比較。這部份有機會再寫一篇介紹。

Logger Provider

Logger Provider 主要是給 OTel Log API 的的一個入口點,提供對 Logger 的建立存取與管理。因此 API 應該會提供方法來設定或註冊以及訪問全域的 LoggerProvider。

Logger 才是真的去發送 Log Record 的實例。

這些都在書上第5章有更全面的說明。

func newLoggerProvider() (*log.LoggerProvider, error) {
    logExporter, err := stdoutlog.New(stdoutlog.WithPrettyPrint())
    if err != nil {
        return nil, err
    }

    loggerProvider := log.NewLoggerProvider(
        log.WithProcessor(log.NewSimpleProcessor(logExporter)),
    )
    return loggerProvider, nil
}

接著就能開始設定 SDK ,shutdownFuncs很重要,如果我們要做 Graceful shutdown,把還queue在logger中的records 都送出去,就要依賴它去調用loggerProvider.Shutdown

Shutdown 的作用在書上第6章有提。

// setupOTelSDK bootstraps the OpenTelemetry pipeline.
// If it does not return an error, make sure to call shutdown for proper cleanup.
func setupOTelSDK(ctx context.Context) (shutdown func(context.Context) error, err error) {
    var shutdownFuncs []func(context.Context) error

    // shutdown calls cleanup functions registered via shutdownFuncs.
    // The errors from the calls are joined.
    // Each registered cleanup will be invoked once.
    shutdown = func(ctx context.Context) error {
        var err error
        for _, fn := range shutdownFuncs {
            err = errors.Join(err, fn(ctx))
        }
        shutdownFuncs = nil
        return err
    }

    // handleErr calls shutdown for cleanup and makes sure that all errors are returned.
    handleErr := func(inErr error) {
        err = errors.Join(inErr, shutdown(ctx))
    }

    // Set up propagator.
    prop := newPropagator()
    otel.SetTextMapPropagator(prop)

    // Set up trace provider.

    // Set up meter provider.

    // Set up logger provider.
    loggerProvider, err := newLoggerProvider()
    if err != nil {
        handleErr(err)
        return
    }
    shutdownFuncs = append(shutdownFuncs, loggerProvider.Shutdown)
    global.SetLoggerProvider(loggerProvider)

    return
}

業務模組的 Logger

通常每個業務模組都會有一個具名的 Logger。這裡只用 Slog 來示範。OTel Go contrib 有提供了 Slog 的 bridge。

import (
    "go.opentelemetry.io/contrib/bridges/otelslog"
)

const name = "go.opentelemetry.io/otel/example/dice"

var (
    logger  = otelslog.NewLogger(name)
)

// 業務行為
func rolldice(w http.ResponseWriter, r *http.Request) {
    // 從 parent span 產生 current span
    ctx, span := tracer.Start(r.Context(), "roll")
    defer span.End()

    roll := 1 + rand.Intn(6)

    var msg string
    if player := r.PathValue("player"); player != "" {
        msg = fmt.Sprintf("%s is rolling the dice", player)
    } else {
        msg = "Anonymous player is rolling the dice"
    }

    // 呼叫 logger 紀錄 Info 事件
    logger.InfoContext(ctx, msg, 
        "result", roll)

    rollValueAttr := attribute.Int("roll.value", roll)
    span.SetAttributes(rollValueAttr)
    rollCnt.Add(ctx, 1, metric.WithAttributes(rollValueAttr))

    resp := strconv.Itoa(roll) + "\n"
    if _, err := io.WriteString(w, resp); err != nil {
        log.Printf("Write failed: %v\n", err)
    }

    // 呼叫 logger 紀錄 Error 事件
    logger.ErrorContext(ctx, errors.New("errrrr").Error(),
        "argName", "1212",
        "argName2", 4444)
}

首次執行

make run

curl -X GET http://localhost:8080/rolldice/Alice

Info Log

從下方 Json資料能看見上面表5-2的欄位幾乎都存在於這資料中。連 Trace 與 Span 還有重要的 Recouce 資訊都自動代入了。這裡的Severity Number 是9

{
    "Timestamp": "2024-11-17T23:30:14.457791572+08:00",
    "ObservedTimestamp": "2024-11-17T23:30:14.457828932+08:00",
    "Severity": 9,
    "SeverityText": "",
    "Body": {
        "Type": "String",
        "Value": "Alice is rolling the dice"
    },
    "Attributes": [
        {
            "Key": "result",
            "Value": {
                "Type": "Int64",
                "Value": 3
            }
        }
    ],
    "TraceID": "6c554d8e71da75aec261bbd8dcfa9171",
    "SpanID": "e0a4005b7be246ef",
    "TraceFlags": "01",
    "Resource": [
        {
            "Key": "service.name",
            "Value": {
                "Type": "STRING",
                "Value": "unknown_service:rolldice_log_demo"
            }
        },
        {
            "Key": "telemetry.sdk.language",
            "Value": {
                "Type": "STRING",
                "Value": "go"
            }
        },
        {
            "Key": "telemetry.sdk.name",
            "Value": {
                "Type": "STRING",
                "Value": "opentelemetry"
            }
        },
        {
            "Key": "telemetry.sdk.version",
            "Value": {
                "Type": "STRING",
                "Value": "1.32.0"
            }
        }
    ],
    "Scope": {
        "Name": "go.opentelemetry.io/otel/example/dice",
        "Version": "",
        "SchemaURL": "",
        "Attributes": {}
    },
    "DroppedAttributes": 0
}

Error Log

我們自定義的參數也都被自動帶入於Attributes 欄位中且判斷出型別。

這裡的Severity Number 是17

{
    "Timestamp": "2024-11-17T23:30:14.458313377+08:00",
    "ObservedTimestamp": "2024-11-17T23:30:14.45831527+08:00",
    "Severity": 17,
    "SeverityText": "",
    "Body": {
        "Type": "String",
        "Value": "errrrr"
    },
    "Attributes": [
        {
            "Key": "argName",
            "Value": {
                "Type": "String",
                "Value": "1212"
            }
        },
        {
            "Key": "argName2",
            "Value": {
                "Type": "Int64",
                "Value": 4444
            }
        }
    ],
    "TraceID": "6c554d8e71da75aec261bbd8dcfa9171",
    "SpanID": "e0a4005b7be246ef",
    "TraceFlags": "01",
    "Resource": [
        {
            "Key": "service.name",
            "Value": {
                "Type": "STRING",
                "Value": "unknown_service:rolldice_log_demo"
            }
        },
        {
            "Key": "telemetry.sdk.language",
            "Value": {
                "Type": "STRING",
                "Value": "go"
            }
        },
        {
            "Key": "telemetry.sdk.name",
            "Value": {
                "Type": "STRING",
                "Value": "opentelemetry"
            }
        },
        {
            "Key": "telemetry.sdk.version",
            "Value": {
                "Type": "STRING",
                "Value": "1.32.0"
            }
        }
    ],
    "Scope": {
        "Name": "go.opentelemetry.io/otel/example/dice",
        "Version": "",
        "SchemaURL": "",
        "Attributes": {}
    },
    "DroppedAttributes": 0
}

SeverityNumber 與 Log級別 的對應

但其實 OTel 還有個 SeverityText ,這部份可以自己團隊定義,或者就按照 Log level 填入也行。

看起來都很完美!只是!

我習慣看 Caller 資訊,是哪隻模組的哪個檔案的那一行產生 Log event 的。

自定義 Logger

package main

import (
    "context"
    "log/slog"
    "runtime"
)

type CustLogger struct {
    base *slog.Logger
}

func NewCustLoggerLogger(base *slog.Logger) *CustLogger {
    return &CustLogger{base: base}
}

func getCallerInfo() (file string, line int, funcName string) {
    pc, file, line, ok := runtime.Caller(2)
    if !ok {
        return "unknown", 0, "unknown"
    }
    funcName = runtime.FuncForPC(pc).Name()
    return file, line, funcName
}

func (l *CustLogger) InfoContext(ctx context.Context, msg string, keysAndValues ...any) {
    file, line, funcName := getCallerInfo()
    extendedKeysAndValues := append(keysAndValues,
        "caller.file", file,
        "caller.line", line,
        "caller.func", funcName,
    )
    l.base.InfoContext(ctx, msg, extendedKeysAndValues...)
}

func (l *CustLogger) ErrorContext(ctx context.Context, err error, keysAndValues ...any) {
    file, line, funcName := getCallerInfo()
    extendedKeysAndValues := append(keysAndValues,
        "caller.file", file,
        "caller.line", line,
        "caller.func", funcName,
    )
    l.base.ErrorContext(ctx, err.Error(), extendedKeysAndValues...)
}

Logger 也改一下

logger = NewCustLoggerLogger(otelslog.NewLogger(name))

再次執行

make run

curl -X GET http://localhost:8080/rolldice/Alice

漂亮!我有 Caller 資訊了。

{
    "Timestamp": "2024-11-17T23:39:49.11538527+08:00",
    "ObservedTimestamp": "2024-11-17T23:39:49.11542815+08:00",
    "Severity": 9,
    "SeverityText": "",
    "Body": {
        "Type": "String",
        "Value": "Alice is rolling the dice"
    },
    "Attributes": [
        {
            "Key": "result",
            "Value": {
                "Type": "Int64",
                "Value": 5
            }
        },
        {
            "Key": "caller.file",
            "Value": {
                "Type": "String",
                "Value": "/home/nathan/Project/OpenTelemetryEntryBeook/ch5/rolldice/rolldice.go"
            }
        },
        {
            "Key": "caller.line",
            "Value": {
                "Type": "Int64",
                "Value": 51
            }
        },
        {
            "Key": "caller.func",
            "Value": {
                "Type": "String",
                "Value": "main.rolldice"
            }
        }
    ],
    "TraceID": "aa3f83b7b738ce028cfb9f00e1c566dd",
    "SpanID": "b3fa02d56626b129",
    "TraceFlags": "01",
    "Resource": [
        {
            "Key": "service.name",
            "Value": {
                "Type": "STRING",
                "Value": "unknown_service:rolldice_log_demo"
            }
        },
        {
            "Key": "telemetry.sdk.language",
            "Value": {
                "Type": "STRING",
                "Value": "go"
            }
        },
        {
            "Key": "telemetry.sdk.name",
            "Value": {
                "Type": "STRING",
                "Value": "opentelemetry"
            }
        },
        {
            "Key": "telemetry.sdk.version",
            "Value": {
                "Type": "STRING",
                "Value": "1.32.0"
            }
        }
    ],
    "Scope": {
        "Name": "go.opentelemetry.io/otel/example/dice",
        "Version": "",
        "SchemaURL": "",
        "Attributes": {}
    },
    "DroppedAttributes": 0
}
{
    "Timestamp": "2024-11-17T23:39:49.115839158+08:00",
    "ObservedTimestamp": "2024-11-17T23:39:49.115841733+08:00",
    "Severity": 17,
    "SeverityText": "",
    "Body": {
        "Type": "String",
        "Value": "errrrr"
    },
    "Attributes": [
        {
            "Key": "argName",
            "Value": {
                "Type": "String",
                "Value": "1212"
            }
        },
        {
            "Key": "argName2",
            "Value": {
                "Type": "Int64",
                "Value": 4444
            }
        },
        {
            "Key": "caller.file",
            "Value": {
                "Type": "String",
                "Value": "/home/nathan/Project/OpenTelemetryEntryBeook/ch5/rolldice/rolldice.go"
            }
        },
        {
            "Key": "caller.line",
            "Value": {
                "Type": "Int64",
                "Value": 64
            }
        },
        {
            "Key": "caller.func",
            "Value": {
                "Type": "String",
                "Value": "main.rolldice"
            }
        }
    ],
    "TraceID": "aa3f83b7b738ce028cfb9f00e1c566dd",
    "SpanID": "b3fa02d56626b129",
    "TraceFlags": "01",
    "Resource": [
        {
            "Key": "service.name",
            "Value": {
                "Type": "STRING",
                "Value": "unknown_service:rolldice_log_demo"
            }
        },
        {
            "Key": "telemetry.sdk.language",
            "Value": {
                "Type": "STRING",
                "Value": "go"
            }
        },
        {
            "Key": "telemetry.sdk.name",
            "Value": {
                "Type": "STRING",
                "Value": "opentelemetry"
            }
        },
        {
            "Key": "telemetry.sdk.version",
            "Value": {
                "Type": "STRING",
                "Value": "1.32.0"
            }
        }
    ],
    "Scope": {
        "Name": "go.opentelemetry.io/otel/example/dice",
        "Version": "",
        "SchemaURL": "",
        "Attributes": {}
    },
    "DroppedAttributes": 0
}

結論

這次 OTel GO 於 log 的開發結果讓應用開發者的我很滿意,幾乎能無腦轉移。只是覺得SeverityText 能幫忙自動設定log level 進去阿 XD。不過這其實自己客製化也不難。

重點是保證是 structured log 以及自動整合了resource context 與 trace context。蠻多框架輸出的其實並不太是 structured log。但目前還是處於 Beta 階段,很可能使用的方式上還會改動。

至於 slog 怎麼能寫入 log 時被多塞這麼多 OTel 資訊呢?那是因為OTel slog bridge , 就新增一個handler 給slog了。這細節能自己在追蹤就好。

// NewLogger returns a new [slog.Logger] backed by a new [Handler]. See
// [NewHandler] for details on how the backing Handler is created.
func NewLogger(name string, options ...Option) *slog.Logger {
    return slog.New(NewHandler(name, options...))
}
func (l *Logger) log(ctx context.Context, level Level, msg string, args ...any) {
    ... ignore
    _ = l.Handler().Handle(ctx, r)
}

有興趣能購買OpenTelemetry 入門指南:建立全面可觀測性架構(iThome鐵人賽系列書) 電子書來閱讀。

程式內容︰OpenTelemetry 入門指南的 Ch5/rolldice

More from this blog

Claude Code 監控秘錄:OpenTelemetry(OTel/OTLP)實戰指南

稟告主公:此乃司馬懿進呈之兵書,詳解如何以 OpenTelemetry 陣法,令臥龍神算之一舉一動盡在掌握,知糧草消耗、察兵器效能、辨戰報異常,使主公運籌帷幄於大帳之中。 為何需要斥候情報? 司馬懿稟告主公: 臥龍神算(Claude Code)乃當世利器,然若無斥候回報,主公便如蒙眼行軍——兵器耗損幾何、糧草消費幾許、哪路斥候出了差錯,一概不知。臣以為,此乃兵家大忌。 無情報之弊,有四: 軍

Feb 19, 202610 min read162
Claude Code 監控秘錄:OpenTelemetry(OTel/OTLP)實戰指南

工程師的 Claude Code 實戰指南:從零開始到高效開發

工程師的 Claude Code 實戰指南:從零開始到高效開發 本文整合 Anthropic 官方 Best Practices 與社群實戰 Tips,帶你由淺入深掌握 Claude Code。 什麼是 Claude Code?為什麼值得學? 如果你還在用「複製程式碼貼到 ChatGPT,再複製答案貼回去」的工作流程,Claude Code 會讓你大開眼界。 Claude Code 是 Anthropic 推出的命令列工具,它直接活在你的 terminal 裡,能夠讀懂你的整個 codeb...

Feb 18, 20265 min read72
工程師的 Claude Code 實戰指南:從零開始到高效開發

System Design Interview Ch 12 Digital Wallet

確立問題與設計範疇 角色對話內容 面試者我們應該只關注兩個數位錢包之間的餘額轉帳操作嗎?我們是否需要擔心其他功能? 面試官讓我們只關注餘額轉帳操作。 面試者該系統需要支援多少 TPS(每秒交易次數)? 面試官讓我們假設是 1,000,000 TPS (每秒 100 萬次交易)。 面試者數位錢包對正確性有嚴格的要求。我們可以假設事務保證 就足夠了嗎? 面試官聽起來不錯。 面試者我們需要證明正確性嗎? 面試官這是一個很好的問題。正確性(Correctness)通常只有在交...

Feb 2, 202610 min read190
System Design Interview Ch 12 Digital Wallet

Claude Code 利用 Event-Driven Hooks 打造自動化開發大腦

在現代 AI 輔助開發中,我們不僅需要 AI 寫程式,更需要它懂規則、記性好,並且能自動處理那些繁瑣的雜事。透過 Claude Code Hooks 機制,我們可以介入 AI 的思考與執行迴圈,實現真正的「人機協作自動化」。 一、 動機與痛點:為什麼你需要介入 AI 的生命週期? 在預設狀態下,Claude Code 雖然強大,但它是「被動」且「無狀態」的,這導致了開發者常遇到以下痛點: 記憶重置 (Session Amnesia): 痛點:每次重啟終端機,AI 就像失憶一樣。 解法:你...

Jan 24, 20266 min read441
Claude Code 利用 Event-Driven Hooks 打造自動化開發大腦
M

MicroFIRE

71 posts