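Information Security News

Goodbye to Manual Debugging: Automating QuasarRAT Configuration Extraction with Python + dnlib

Introduction

Analysis of Remote Access Trojans (RATs) is a core part of modern cybersecurity and threat intelligence work. QuasarRAT is an open-source RAT written in C# for the .NET Framework [1]; its ease of use and rich feature set have led to wide adoption by threat actors. A key step in analyzing any malware is extracting its configuration, which typically holds critical details such as Command and Control (C2) server addresses, port numbers, and installation settings. This article examines in technical detail how to extract the configuration from QuasarRAT binaries programmatically, focusing on Intermediate Language (IL) and the dnlib library [1].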

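.NET Intermediate Language and dnlib

QuasarRAT is a .NET application, so it is compiled to Intermediate Language (IL), also known as MSIL or CIL. IL is a stack-based bytecode that the Common Language Runtime (CLR) compiles to native machine code at run time through Just-In-Time (JIT) compilation [1]. Understanding the structure of IL is essential for static analysis and configuration extraction.

The extraction process relies on dnlib, an open-source .NET library designed for inspecting and modifying .NET assemblies in depth. Combined with pythonnet, which serves as the bridge for calling .NET APIs from Python, dnlib lets an analyst walk an assembly's structure programmatically, including its namespaces, classes, methods, and IL instructions [1].

QuasarRAT stores its core configuration data as static fields in a dedicated class, usually Config.Settings. In .NET, static fields that have initializers are assigned in the class's static constructor, which appears in IL as .cctor (class constructor) [1]. The extraction strategy therefore centers on the IL instructions in this .cctor method.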
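
To make the stack-based model concrete, here is a toy interpreter in plain Python. It is not the CLR and does not use dnlib: the tuple-based instruction format, the `run_il` name, and the sample values are made up for illustration. It models only two opcodes, `ldstr` (push a string literal) and `stsfld` (pop a value into a static field), which are the ones a static constructor typically uses to initialize string fields.

```python
# Toy model of two IL opcodes; not a real CLR and not dnlib.
def run_il(instructions):
    """Execute (opcode, operand) tuples on a tiny evaluation stack."""
    stack = []           # the evaluation stack
    static_fields = {}   # field name -> stored value
    for opcode, operand in instructions:
        if opcode == "ldstr":
            stack.append(operand)                 # push a string literal
        elif opcode == "stsfld":
            static_fields[operand] = stack.pop()  # pop the value into the named field
        elif opcode == "ret":
            break                                 # end of the method body
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return static_fields


# Hypothetical instruction stream resembling a static constructor body.
sample = [
    ("ldstr", "1.4.0"),
    ("stsfld", "VERSION"),
    ("ldstr", "example.invalid:4782;"),
    ("stsfld", "HOSTS"),
    ("ret", None),
]
fields = run_il(sample)
print(fields)
```

Because each value is pushed immediately before it is stored, a static extractor can recover the same pairs by reading the instruction list without executing anything.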

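Configuration Extraction Methodology

The extraction process breaks down into three main stages: assembly loading, class and method identification, and instruction analysis.

1. Assembly Loading and Initialization

The first step sets up the environment and loads the target .NET assembly. This requires importing the necessary components from pythonnet and dnlib.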

# Python prelude to load a .NET sample and initialize dnlib
import os
import clr
# Add reference to System.Memory for low-level memory handling
clr.AddReference("System.Memory")
# Make the dnlib assembly visible to pythonnet before importing it.
# Assumption: DNLIB_PATH points at a local copy of dnlib.dll; adjust as needed.
DNLIB_PATH = "path/to/dnlib.dll"
clr.AddReference(os.path.abspath(DNLIB_PATH))
# Import dnlib components
import dnlib
from dnlib.DotNet import ModuleDefMD
from dnlib.DotNet.Emit import OpCodes
# Define the path to the malware sample
MALWARE_PE_PATH = "path/to/QuasarRAT.exe"
# Load the .NET module using dnlib
# This provides programmatic access to the assembly's structure
module = ModuleDefMD.Load(MALWARE_PE_PATH)
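
Before handing a file to dnlib, it can help to confirm that it is a managed PE at all. The following is a minimal sketch in plain Python, independent of dnlib and pythonnet, that checks whether the PE data directory table contains a non-empty CLR runtime header entry (index 14). The `has_clr_header` name is hypothetical, and the check covers only the header fields shown, so treat it as a quick pre-filter rather than a full validation.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # slot of the CLR runtime header in the data directory table


def has_clr_header(data: bytes) -> bool:
    """Return True if the PE headers declare a non-empty CLR runtime header."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return False
    opt = pe_offset + 24  # skip the 4-byte signature and the 20-byte COFF header
    if len(data) < opt + 2:
        return False
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32
        dirs = opt + 96
    elif magic == 0x20B:    # PE32+
        dirs = opt + 112
    else:
        return False
    entry = dirs + CLR_DIRECTORY_INDEX * 8  # each directory entry is RVA + size
    if len(data) < entry + 8:
        return False
    (count,) = struct.unpack_from("<I", data, dirs - 4)  # NumberOfRvaAndSizes
    if count <= CLR_DIRECTORY_INDEX:
        return False
    rva, size = struct.unpack_from("<II", data, entry)
    return rva != 0 and size != 0


# Anything that is not a PE file is rejected early.
print(has_clr_header(b"not a PE file"))
```

In practice the bytes would come from reading the sample in binary mode, and only files that pass this check would be passed on to the loader.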

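2. Class and Method Identification

The configuration lives in the Config.Settings class. The extraction code first has to locate this class by iterating over all types in the module.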

# Python code to identify the Config.Settings class
target_ns = "Config"
target_class = "Settings"
found_setting_class = None
cctor_method = None  # defined up front so later code can test it safely
# Iterate through all top-level types (classes) in the module
for t in module.Types:
    # Convert dnlib's UTF8String names to Python strings before comparing.
    # Suffix matching handles cases where the full namespace is Quasar.Client.Config
    if str(t.Namespace).endswith(target_ns) and str(t.Name).endswith(target_class):
        found_setting_class = t
        break
if found_setting_class is not None:
    # Locate the static constructor (.cctor) method
    for m in found_setting_class.Methods:
        # The static constructor is marked as static and named ".cctor"
        if m.IsStatic and str(m.Name) == ".cctor":
            cctor_method = m
            break
    if cctor_method is not None:
        print("Successfully located Config.Settings class and .cctor method.")
    else:
        print("Error: .cctor method not found in Config.Settings.")
else:
    print("Error: Config.Settings class not found.")
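
A suffix test of this kind also accepts unrelated names, for example a namespace ending in "AppConfig" or a class called "UserSettings". One possible tightening, sketched here on plain Python strings (dnlib names would first be converted with `str()`), compares the last namespace segment and the class name exactly. The `is_settings_type` helper is hypothetical, and whether the stricter rule is preferable depends on how much the samples at hand deviate from the open-source layout.

```python
def is_settings_type(namespace: str, name: str,
                     target_ns: str = "Config",
                     target_class: str = "Settings") -> bool:
    """Return True when the last namespace segment and the class name match exactly."""
    last_segment = namespace.rsplit(".", 1)[-1]
    return last_segment == target_ns and name == target_class


# Illustrative (namespace, class) pairs; only the first two should match.
candidates = [
    ("Quasar.Client.Config", "Settings"),
    ("Config", "Settings"),
    ("Quasar.Client.AppConfig", "Settings"),
    ("Quasar.Client.Config", "UserSettings"),
]
for ns, cls in candidates:
    print(f"{ns}.{cls}: {is_settings_type(ns, cls)}")
```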

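3. Instruction Analysis and Value Extraction

The key step is analyzing the IL instructions in the .cctor method. In unobfuscated QuasarRAT samples, each string configuration field is initialized by a sequence of two instructions:
1. ldstr (Load String): pushes the configuration value, a string literal, onto the evaluation stack.
2. stsfld (Store into Static Field): pops the value off the stack and stores it in the corresponding static field (for example, HOSTS or VERSION).

The extraction code iterates over the IL instructions and keeps track of the previous instruction to spot each ldstr / stsfld pair.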

# Python code to extract configuration values from the .cctor method
# Assumes cctor_method is the dnlib MethodDefMD instance for .cctor
config_data = {}
prev_instr = None # Track the instruction immediately preceding the current one
# Iterate through all IL instructions in the method body
for instr in cctor_method.Body.Instructions:
    # Check if the current instruction is 'stsfld' (Store into Static Field)
    if instr.OpCode == OpCodes.Stsfld and instr.Operand is not None:
        # The operand of stsfld is the field definition (e.g., Config.Settings::HOSTS)
        field = instr.Operand
        # Check if the previous instruction was 'ldstr' (Load String)
        # This indicates that the value was pushed onto the stack just before being stored
        if prev_instr is not None and prev_instr.OpCode == OpCodes.Ldstr:
            # The operand of ldstr is the string value itself
            value = prev_instr.Operand
            # Extract the field name (e.g., HOSTS) as a plain Python string
            field_name = str(field.Name)
            # Store the extracted configuration pair
            config_data[field_name] = value
            # Detailed logging of the extraction process
            # print(f"Extracted: {field_name} = {value}")
    # Update the previous instruction for the next iteration
    prev_instr = instr
# Output the extracted configuration
# print("\n--- Extracted Configuration ---")
# for key, value in config_data.items():
#     print(f"{key}: {value}")

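This programmatic approach recovers the string-valued static configuration fields, such as VERSION, HOSTS, MUTEX, and ENCRYPTIONKEY, without relying on dynamic analysis or a debugger [1]. Fields that are not strings, which in the open-source client includes RECONNECTDELAY, are loaded with other opcodes such as ldc.i4 and would need an analogous check in front of stsfld.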
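
Once extracted, a value such as HOSTS is still a raw string. The sketch below splits it into host and port pairs, assuming the semicolon-separated `host:port;` layout used by the open-source Quasar client; modified builds may use a different layout, and encrypted values would have to be decrypted first. The `parse_hosts` name and the sample value are illustrative only.

```python
def parse_hosts(raw: str):
    """Split a 'host:port;host:port;' string into (host, port) tuples."""
    endpoints = []
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # skip the empty piece after a trailing ';'
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            continue  # ignore entries that do not look like host:port
        endpoints.append((host, int(port)))
    return endpoints


# Made-up value for illustration; not taken from a real sample.
print(parse_hosts("c2.example.invalid:4782;192.0.2.10:8080;"))
```

Normalizing the endpoints this way makes them easier to feed into blocklists or threat intelligence platforms.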

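Process Flow Visualization

The whole extraction process, from loading the assembly to producing the final key-value pairs, can be visualized as a flowchart. The diagram highlights the sequential and conditional steps involved in analyzing the IL code.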

graph TD
    A[Start: Load QuasarRAT Assembly] --> B{Find Config.Settings Class?}
    B -- Yes --> C{Find .cctor Method?}
    B -- No --> F[Error: Class Not Found]
    C -- Yes --> D[Initialize: prev_instr = None]
    C -- No --> G[Error: .cctor Not Found]
    D --> H{Iterate through IL Instructions}
    H --> I{Current Instruction is stsfld?}
    I -- Yes --> J{Previous Instruction is ldstr?}
    I -- No --> K[Update: prev_instr = Current Instruction]
    J -- Yes --> L[Extract: field = stsfld.Operand, value = ldstr.Operand]
    J -- No --> K
    L --> M[Store Config Pair]
    M --> K
    K --> H
    H -- End of Instructions --> E[End: Output Extracted Configuration]
    F --> E
    G --> E

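Handling Obfuscation

The approach above works well on clean, unobfuscated QuasarRAT samples, but real-world malware usually applies obfuscation to hinder static analysis. Common obfuscation techniques in .NET malware include:
* String Encryption: configuration strings are stored encrypted and decrypted at run time, often inside the .cctor method itself.
* Control Flow Flattening: the logical flow of .cctor is obscured, so the ldstr / stsfld pairs no longer appear in sequence.
* Renaming: class, field, and method names (for example, Config.Settings) are replaced with meaningless strings. The .cctor name itself is reserved by the runtime and survives renaming.

Handling string encryption requires extending the extraction logic. Instead of expecting ldstr to be followed directly by stsfld, the code looks for a call instruction that invokes the string decryption routine. The operand of that call is the decryption method itself; the encrypted string, and any key or IV, reach it as arguments pushed by the preceding instructions. The code then has to emulate or invoke the decryption function to recover the plaintext configuration [2].

For renaming, the analyst has to fall back on heuristics to locate the configuration class, such as looking for classes with a large number of static fields or methods that perform network-related operations. For control flow flattening, more advanced techniques such as symbolic execution or dynamic analysis are usually needed to reconstruct the logical flow before static extraction can be applied.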
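
One common shape of string encryption is an `ldstr` that loads the ciphertext, a `call` to the decryption routine, and then the `stsfld`. The sketch below models that three-instruction window on plain Python tuples rather than dnlib objects. Everything in it is a placeholder: the `Instr` format, the `Crypto::Decrypt` method name, and the string-reversing "cipher", which stands in for whatever routine a real sample uses.

```python
from collections import namedtuple

# Simplified stand-in for an IL instruction; not a dnlib type.
Instr = namedtuple("Instr", ["opcode", "operand"])


def extract_encrypted_fields(instructions, decryptors, decrypt):
    """Find ldstr -> call <decryptor> -> stsfld windows and decrypt the literal."""
    config = {}
    windows = zip(instructions, instructions[1:], instructions[2:])
    for first, second, third in windows:
        if (first.opcode == "ldstr"
                and second.opcode == "call" and second.operand in decryptors
                and third.opcode == "stsfld"):
            config[third.operand] = decrypt(first.operand)
    return config


def toy_decrypt(ciphertext):
    """Placeholder 'cipher': reverses the string. Real samples use real crypto."""
    return ciphertext[::-1]


body = [
    Instr("ldstr", "2874:dilavni.elpmaxe"),
    Instr("call", "Crypto::Decrypt"),
    Instr("stsfld", "HOSTS"),
    Instr("ldstr", "1.4.0"),       # plain literal, no decryptor call
    Instr("stsfld", "VERSION"),
    Instr("ret", None),
]
result = extract_encrypted_fields(body, {"Crypto::Decrypt"}, toy_decrypt)
print(result)
```

Passing the decryption function in as a parameter keeps the pattern matching separate from the cryptography, so the same scan can be reused once the real routine has been reimplemented.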
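
The static-field heuristic can be sketched as a simple ranking. The example below works on plain dictionaries that summarize each type; in a real pipeline those counts would be gathered from the loaded module. The `rank_config_candidates` name, the summary layout, and the threshold of five fields are all arbitrary choices made for illustration.

```python
def rank_config_candidates(type_summaries, min_static_strings=5):
    """Return type names ordered by how many static string fields they declare."""
    scored = [
        (summary["static_string_fields"], summary["name"])
        for summary in type_summaries
        if summary["static_string_fields"] >= min_static_strings
    ]
    # Highest count first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


# Made-up summaries of renamed types.
summaries = [
    {"name": "a.b", "static_string_fields": 14},
    {"name": "a.c", "static_string_fields": 1},
    {"name": "d.e", "static_string_fields": 6},
]
ranked = rank_config_candidates(summaries)
print(ranked)
```

A ranking like this only narrows the search; the top candidates still need to be confirmed by inspecting their static constructors.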

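Conclusion

Programmatic extraction of QuasarRAT's configuration with dnlib and IL analysis gives malware analysts a robust, repeatable method. By relying on the predictable structure of the .NET static constructor (.cctor) and the ldstr / stsfld instruction pairing, an analyst can reliably recover key C2 and operational settings from unobfuscated samples. Obfuscation complicates this, but a basic understanding of IL and the modularity of tools like dnlib make it possible to build tailored extensions for encrypted strings and complex control flow, which keeps configuration extraction a practical and important technique in threat intelligence.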
