Information Security News

Root from Just Reading a File?
The AF_ALG Zero-Copy Attack and the Surprising Power of Copy Fail in the Linux Kernel

Abstract

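This research report gives an in-depth technical analysis of CVE-2026-31431, a severe logic flaw found in the Linux kernel's authencesn crypto template. Nicknamed "Copy Fail", the flaw lets an unprivileged local user perform a deterministic 4-byte write into the page cache of any file they can read. By combining the AF_ALG socket interface with the splice() system call, an attacker can bypass the file system's normal write protections and achieve local privilege escalation or container escape. The report examines the root cause in the kernel's internal scatterlist management and the specific scratch-space write behavior of the authencesn algorithm ^[1].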


1. Introduction

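The Linux kernel's crypto subsystem provides a robust framework for a wide range of encryption and authentication algorithms. The complexity of memory management in this subsystem, however, can give rise to subtle logic flaws, particularly where it interacts with user-space interfaces such as AF_ALG. CVE-2026-31431 differs markedly from traditional memory-corruption bugs (such as race conditions or buffer overflows): it exploits a straight-line logic error in how the authencesn template handles internal state during decryption ^[1].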

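Unlike earlier well-known vulnerabilities such as Dirty COW (CVE-2016-5195), which relied on a complex race condition in the virtual memory subsystem, CVE-2026-31431 is highly portable and deterministic. It exploits a basic design property of the AF_ALG in-place decryption path: page cache pages are referenced directly from a writable scatterlist. This analysis focuses on the technical mechanism that makes this "stealthy" write possible, a write that leaves the file on disk unchanged yet corrupts the in-memory representation the operating system actually uses ^[1].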

2. Technical Background: AF_ALG and Splice

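The AF_ALG socket type gives user-space applications access to the kernel's internal crypto API. An important performance optimization in Linux is the splice() system call, which enables zero-copy data transfer between a file descriptor and a pipe. When a file is spliced into an AF_ALG socket, the kernel does not copy the data; instead, it builds a scatterlist (SGL) that points directly at the physical pages in the page cache that back the file ^[1].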

In the context of AEAD (Authenticated Encryption with Associated Data), the input typically consists of additional authenticated data (AAD), ciphertext, and an authentication tag. To save memory, the kernel often decrypts "in place", meaning that the source and destination scatterlists overlap. As related kernel vulnerability research has noted (for example, the analysis of CVE-2025-37947), mishandling such shared memory regions can lead to out-of-bounds (OOB) or use-after-free (UAF) conditions ^[2] ^[3].

3. Root Cause Analysis: The authencesn Scratch-Space Write

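At the heart of the vulnerability is the authencesn template, which is used mainly by IPsec to support Extended Sequence Numbers (ESN). During decryption, authencesn has to rearrange the sequence-number bytes in order to compute the HMAC. It uses the caller-supplied destination buffer as temporary "scratch space" for this rearrangement ^[1].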

The code fragment below illustrates the vulnerable logic in crypto_authenc_esn_decrypt():

/* 
 * Detailed Analysis of authencesn scratch write logic.
 * The algorithm uses the destination scatterlist (dst) as temporary storage
 * to rearrange sequence numbers for HMAC verification.
 */
// Step 1: Read the first 8 bytes of the AAD (additional authenticated data).
// This contains the sequence numbers (seqno_hi and seqno_lo).
// scatterwalk_map_and_copy(buffer, sgl, offset, length, direction)
// direction 0 = read from SGL to buffer.
scatterwalk_map_and_copy(tmp, dst, 0, 8, 0); 
// Step 2: Overwrite bytes 4-7 of the destination with seqno_hi.
// This is a temporary modification required for the HMAC calculation.
// direction 1 = write from buffer to SGL.
scatterwalk_map_and_copy(tmp, dst, 4, 4, 1); 
// Step 3: The Vulnerable Write - Writing seqno_lo past the AEAD tag boundary.
// assoclen + cryptlen points to the memory immediately following the authentication tag.
// The algorithm assumes this is safe scratch space, but in AF_ALG in-place mode,
// this offset can point into the next page of the page cache.
scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 1);

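In step 3, the algorithm writes 4 bytes at offset assoclen + cryptlen. In a standard AEAD operation, valid output ends at assoclen + (cryptlen - authsize). By writing at assoclen + cryptlen, authencesn writes 4 bytes outside the intended destination buffer. This may be harmless when the destination is a private user-space buffer, but it becomes highly destructive once combined with AF_ALG and splice() ^[1].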
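The offset arithmetic in the preceding paragraph can be checked with a small user-space sketch. This is not kernel code: the helper names are hypothetical, and it assumes, as the paragraph does, that cryptlen counts the ciphertext plus the authentication tag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; none of these are kernel APIs. */

/* One past the last byte a decrypt is expected to produce:
 * the AAD followed by the plaintext (ciphertext minus the tag). */
static size_t valid_output_end(size_t assoclen, size_t cryptlen, size_t authsize)
{
    assert(cryptlen >= authsize);
    return assoclen + (cryptlen - authsize);
}

/* First byte touched by the 4-byte scratch write of step 3. */
static size_t scratch_write_start(size_t assoclen, size_t cryptlen)
{
    return assoclen + cryptlen;
}

/* One past the last byte touched by the scratch write. */
static size_t scratch_write_end(size_t assoclen, size_t cryptlen)
{
    return scratch_write_start(assoclen, cryptlen) + 4;
}

/* Distance from the end of the valid output to the end of the scratch
 * write; under the stated assumption this is always authsize + 4. */
static size_t bytes_past_valid_output(size_t assoclen, size_t cryptlen,
                                      size_t authsize)
{
    return scratch_write_end(assoclen, cryptlen) -
           valid_output_end(assoclen, cryptlen, authsize);
}
```

For example, with an 8-byte AAD, 16 bytes of ciphertext and a 16-byte tag (cryptlen = 32), valid output ends at offset 24, while the scratch write covers offsets 40 through 43.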

4. Attack Mechanism: Page Cache Corruption

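When AF_ALG performs in-place decryption, it builds the output scatterlist by chaining the user's receive buffer to the original input pages that hold the authentication tag. Because those tag pages were obtained through splice(), they are in fact page cache pages ^[1].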

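The sg_chain() function links the page cache pages onto the end of the output SGL. The authencesn scratch write then targets the offset immediately after the tag. If the attacker aligns the input carefully so that the tag ends exactly on a page boundary, the 4-byte scratch write lands at the start of the "next page" in the page cache ^[1].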
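The page-boundary condition described above comes down to simple modular arithmetic. The sketch below classifies where a 4-byte write that starts right after the tag would fall. It assumes 4 KiB pages and treats tag_end as a byte offset into the backing file; both the constant and the function name are illustrative, not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

enum write_placement {
    WRITE_SAME_PAGE, /* all 4 bytes stay on the page holding the tag's last byte */
    WRITE_STRADDLES, /* the write starts on the tag's page and spills over */
    WRITE_NEXT_PAGE  /* all 4 bytes land at the start of the following page */
};

/* tag_end is the offset one past the last tag byte (hypothetical model). */
static enum write_placement classify_scratch_write(size_t tag_end)
{
    assert(tag_end >= 1);

    size_t tag_page   = (tag_end - 1) / MODEL_PAGE_SIZE; /* page of last tag byte */
    size_t first_page = tag_end / MODEL_PAGE_SIZE;       /* page of first written byte */
    size_t last_page  = (tag_end + 4 - 1) / MODEL_PAGE_SIZE; /* page of last written byte */

    if (first_page != tag_page)
        return WRITE_NEXT_PAGE;
    if (last_page != tag_page)
        return WRITE_STRADDLES;
    return WRITE_SAME_PAGE;
}
```

Only a tag that ends on an exact multiple of the page size puts the whole write on the following page, which is the case the text singles out.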

This write is especially dangerous for the following reasons:

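No dirty bit: The write goes through the crypto API's scatterwalk mechanism and never triggers the kernel's VFS writeback machinery, so the page is never marked "dirty".
Persists in memory: Because the page is not dirty, the kernel does not write it back to disk. Any process that reads the file nevertheless sees the corrupted version in memory.
Bypasses integrity checks: A check of the file's on-disk contents (for example sha256sum run against data that is not served from the corrupted cached page) reports the file as valid, while the execve() system call runs the corrupted in-memory version ^[1].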
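The three properties above can be illustrated with a toy model of a cached page. This is a deliberately simplified user-space sketch, not how the kernel represents pages: struct toy_page and every toy_* function are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_LEN 16

/* Toy model: one page with its on-disk bytes, its cached copy, and a dirty flag. */
struct toy_page {
    unsigned char disk[TOY_PAGE_LEN];  /* what is stored on disk */
    unsigned char cache[TOY_PAGE_LEN]; /* what readers are served */
    bool dirty;                        /* set only by the tracked write path */
};

/* Populate disk and cache with the same contents, as after a first read. */
static void toy_load(struct toy_page *p, const unsigned char *data)
{
    memcpy(p->disk, data, TOY_PAGE_LEN);
    memcpy(p->cache, data, TOY_PAGE_LEN);
    p->dirty = false;
}

/* Tracked write: updates the cache and marks the page dirty. */
static void toy_tracked_write(struct toy_page *p, size_t off,
                              const void *src, size_t len)
{
    assert(off <= TOY_PAGE_LEN && len <= TOY_PAGE_LEN - off);
    memcpy(p->cache + off, src, len);
    p->dirty = true;
}

/* Untracked write: changes the cache but leaves the dirty flag alone. */
static void toy_untracked_write(struct toy_page *p, size_t off,
                                const void *src, size_t len)
{
    assert(off <= TOY_PAGE_LEN && len <= TOY_PAGE_LEN - off);
    memcpy(p->cache + off, src, len);
}

/* Writeback: only dirty pages are flushed to disk. */
static void toy_writeback(struct toy_page *p)
{
    if (p->dirty) {
        memcpy(p->disk, p->cache, TOY_PAGE_LEN);
        p->dirty = false;
    }
}

static bool toy_cache_matches_disk(const struct toy_page *p)
{
    return memcmp(p->disk, p->cache, TOY_PAGE_LEN) == 0;
}
```

In this model an untracked write followed by writeback leaves disk and cache permanently out of sync, whereas a tracked write is flushed and the two agree again.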

5. Technical Comparison with Other Kernel Vulnerabilities

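To appreciate the severity of CVE-2026-31431, it helps to compare it with other kernel memory-management flaws. For example, CVE-2025-37947 in the ksmbd module involves an OOB write caused by improper length truncation when handling extended attributes ^[2]. Although both issues involve an OOB write, the ksmbd vulnerability targets the kernel heap and requires elaborate heap shaping to reach code execution. CVE-2026-31431, by contrast, attacks the page cache directly, which provides a more stable and more direct path to privilege escalation ^[1].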

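Similarly, CVE-2024-50264 involves a race condition in the AF_VSOCK subsystem that leads to a use-after-free ^[3]. Whereas the AF_VSOCK attack needs precise timing and signal manipulation to win the race, the "Copy Fail" attack is a straight-line logic error that a simple script can trigger reliably. The ability to corrupt the page cache of a setuid executable (for example /usr/bin/sudo) means an unprivileged user can gain root privileges by modifying only a few bytes of that executable in memory ^[1].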

6. Code Analysis: Scatterlist Chaining in algif_aead.c

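How algif_aead.c handles the authentication tag during in-place operation is the key to this vulnerability. The pseudocode below shows the logic that chains the input pages onto the output SGL: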

/*
 * Simplified logic from algif_aead.c showing SGL chaining.
 * This is where page cache pages are placed into a writable SGL.
 * Detailed comments explain the memory management implications.
 */
// The input SGL contains AAD, Ciphertext, and the Tag.
// For in-place decryption, the kernel copies AAD and CT to the user buffer.
// This is a safe memory copy operation.
memcpy_sglist(user_rx_buf, input_sgl, assoclen + cryptlen - authsize);
// The Tag pages are NOT copied; they are chained to the output SGL.
// This is the critical design choice: it retains direct references 
// to the original page cache pages of the source file.
// sg_chain(dst_sgl, max_ents, src_sgl)
sg_chain(output_sgl, MAX_SGL_ENTS, input_sgl_tag_start);
// The AEAD request is then initialized with the combined SGL.
// req->src and req->dst now both point to a chain that includes 
// both user memory and kernel page cache memory.
req->src = output_sgl;
req->dst = output_sgl;

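This design assumes that an AEAD algorithm will write only to the user_rx_buf portion of the SGL. As described earlier, however, authencesn violates that assumption: it writes into the input_sgl_tag_start region, which points at the page cache ^[1].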
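To make the broken assumption concrete, the following user-space sketch models a chained scatterlist as a linked list of byte segments and writes at a logical offset across the chain. It is a simplified stand-in, not the kernel's struct scatterlist or scatterwalk_map_and_copy(); struct toy_seg and toy_chain_write() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for one scatterlist entry: a byte range plus a link
 * to the next entry in the chain. */
struct toy_seg {
    unsigned char *buf;
    size_t len;
    struct toy_seg *next;
};

/* Copy len bytes from src into the chain, starting at logical offset off.
 * Returns the number of bytes actually written, which is less than len
 * only if the chain ends first. */
static size_t toy_chain_write(struct toy_seg *seg, size_t off,
                              const void *src, size_t len)
{
    const unsigned char *in = src;
    size_t done = 0;

    /* Skip whole segments that lie before the target offset. */
    while (seg && off >= seg->len) {
        off -= seg->len;
        seg = seg->next;
    }

    /* Write into consecutive segments until the data is consumed. */
    while (seg && done < len) {
        size_t room = seg->len - off;
        size_t n = (len - done < room) ? len - done : room;

        memcpy(seg->buf + off, in + done, n);
        done += n;
        off = 0;
        seg = seg->next;
    }
    return done;
}
```

If the chain is a 24-byte user buffer, then a 16-byte tag segment at the end of one cached page, then the following cached page, a 4-byte write at logical offset 40 (assoclen 8 plus cryptlen 32) skips both of the first two segments and modifies the first bytes of the third. The writer never needs to know which segments are private memory and which are shared pages.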

7. Conclusion

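CVE-2026-31431 exposes a fundamental weakness in the interaction between the Linux kernel's zero-copy capabilities and its crypto templates. By letting an AEAD algorithm use its own destination buffer as scratch space without enforcing strict bounds checks, the kernel permits unauthorized modification of the system-wide page cache. The vulnerability highlights the need for stricter memory isolation inside the kernel's crypto API, especially when it handles shared resources such as the page cache. Mitigations include patching the authencesn template to avoid the out-of-bounds scratch write and possibly revising the AF_ALG implementation so that page cache pages can no longer be included in a writable scatterlist ^[1].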
