Variable Transformation: Evidential versus Frequentist Approaches
Objective
How do the two approaches behave when a change of variables (reparameterization) is required?
Evidential Inference
In statistical inference, the likelihood ratio (or its logarithm, the support $S$) measures the relative support that the data $x$ lend to competing hypotheses:
\[ S = \log\frac{L(\theta_1)}{L(\theta_2)}. \]
Any proportionality constant that does not depend on the data $x$ (for example, a Jacobian factor introduced by a parameter transformation) is of no interest. As a result, under a monotone transformation the support $S$ itself, and hence the ranking of hypotheses, is unchanged, giving an intuitive measure of evidence that is invariant to the parameterization.
Frequentist Approach
The frequentist approach typically requires the full probability density function (pdf) to compute $p$-values or carry out tests. Under a change of variables, the pdf itself must therefore be transformed correctly (via a Jacobian correction or numerically), which adds complexity to both implementation and interpretation.
Derivation
1. Likelihood on the Original Scale
Given data $x$ and parameter \(\theta\), define the likelihood as
\[ L(\theta)= f(x\mid \theta), \]
where \(f(x\mid \theta)\) carries all the information the data $x$ provide. Note that the likelihood is defined only up to a proportionality constant: if we set
\[ L^*(\theta)= c\cdot L(\theta) \quad (c \text{ independent of } x \text{ and } \theta), \]
then when comparing parameter values (for example, when forming a likelihood ratio), the constant $c$ cancels between numerator and denominator.
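As a quick numeric sketch of this cancellation (reusing the illustrative binomial data x = 60, n = 100 from the visualization code below; the helper name L is ours):

```python
# Sketch: a constant c independent of x and theta cancels in the likelihood ratio.
x, n = 60, 100          # illustrative data: 60 deaths out of 100 patients
theta1, theta2 = 0.6, 0.5

def L(theta):
    # Binomial likelihood kernel, dropping the binomial coefficient
    return theta**x * (1 - theta)**(n - x)

c = 12345.0  # arbitrary constant, independent of x and theta
lr_plain = L(theta1) / L(theta2)
lr_scaled = (c * L(theta1)) / (c * L(theta2))
print(abs(lr_plain - lr_scaled) < 1e-9 * lr_plain)  # True: c has cancelled
```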
2. Definition of Relative Evidence (Support)
To compare two parameter values \(\theta_1\) and \(\theta_2\), define the likelihood ratio
\[ \text{LR} = \frac{L(\theta_1)}{L(\theta_2)} = \frac{f(x\mid \theta_1)}{f(x\mid \theta_2)}, \]
and take its logarithm to obtain the support
\[ S = \log\frac{f(x\mid \theta_1)}{f(x\mid \theta_2)}. \]
This $S$ depends only on how strongly the data $x$ support each parameter value.
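Concretely, with the illustrative data from the visualization code below (x = 60 deaths out of n = 100, comparing θ₁ = 0.6 against θ₂ = 0.5), $S$ can be computed on the log scale; this is a sketch, and the helper name log_lik is ours:

```python
import numpy as np

x, n = 60, 100
theta1, theta2 = 0.6, 0.5

def log_lik(theta):
    # Log of the binomial kernel f(x | theta); the binomial
    # coefficient is constant in theta and drops out of S
    return x * np.log(theta) + (n - x) * np.log(1 - theta)

S = log_lik(theta1) - log_lik(theta2)
print(round(S, 2))  # support for theta1 = 0.6 over theta2 = 0.5
```

A positive $S$ favors $\theta_1$; here $S \approx 2.01$.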
3. Monotone Transformation of the Parameter
Now transform the parameter monotonically,
\[ \phi = g(\theta), \]
where $g$ is one-to-one and monotone with inverse \(g^{-1}\). Transforming the likelihood as if it were a density, the change-of-variables formula gives, on the new scale,
\[ L^*(\phi)= L\bigl(g^{-1}(\phi)\bigr)\,\left|\frac{d\,g^{-1}(\phi)}{d\phi}\right|. \]
Here:
- \(L\bigl(g^{-1}(\phi)\bigr)= f\bigl(x\mid g^{-1}(\phi)\bigr)\) is the part that depends on the data $x$,
- while the Jacobian factor \(\left|\frac{d\,g^{-1}(\phi)}{d\phi}\right|\) reflects only the change of measure induced by the reparameterization and does not involve $x$.
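For instance, for the reciprocal transformation $\phi = g(\theta) = 1/\theta$ used in the visualization code below, $g^{-1}(\phi) = 1/\phi$ and the Jacobian factor is $|d\,g^{-1}(\phi)/d\phi| = 1/\phi^2$, a quantity that involves only $\phi$ and never the data $x$. A minimal numeric sketch (the function names are ours):

```python
def g_inv(phi):
    # Inverse of g(theta) = 1/theta
    return 1.0 / phi

def jacobian(phi, h=1e-6):
    # Central finite-difference approximation to |d g_inv(phi) / d phi|
    return abs((g_inv(phi + h) - g_inv(phi - h)) / (2 * h))

phi = 2.0                                      # corresponds to theta = 0.5
print(abs(jacobian(phi) - 1 / phi**2) < 1e-6)  # True: the factor is 1/phi^2
```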
4. The Jacobian Drops Out of Relative Evidence
Comparing two transformed parameter values \(\phi_1\) and \(\phi_2\), the full likelihood ratio is
\[ \frac{L^*(\phi_1)}{L^*(\phi_2)} = \frac{f\bigl(x\mid g^{-1}(\phi_1)\bigr)}{f\bigl(x\mid g^{-1}(\phi_2)\bigr)} \cdot \frac{\left|\frac{d\,g^{-1}(\phi_1)}{d\phi_1}\right|}{\left|\frac{d\,g^{-1}(\phi_2)}{d\phi_2}\right|}. \]
The second factor is not 1 in general, but it is determined entirely by the transformation $g$ and does not involve the data $x$. In evidential inference we therefore keep only the data-dependent part,
\[ \frac{f\bigl(x\mid g^{-1}(\phi_1)\bigr)}{f\bigl(x\mid g^{-1}(\phi_2)\bigr)}, \]
and, discarding the purely transformation-induced factor, recover
\[ S = \log\frac{f\bigl(x\mid \theta_1\bigr)}{f\bigl(x\mid \theta_2\bigr)}, \quad \theta_i = g^{-1}(\phi_i), \]
exactly the support obtained on the original scale.
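This invariance can be checked numerically. The sketch below (again with the illustrative binomial data x = 60, n = 100 and the transformation φ = 1/θ) computes $S$ on both scales; the Jacobian simply never enters:

```python
import numpy as np

x, n = 60, 100
theta1, theta2 = 0.6, 0.5

def log_lik(theta):
    # Log binomial kernel f(x | theta), up to an additive constant
    return x * np.log(theta) + (n - x) * np.log(1 - theta)

# Support on the original scale
S_theta = log_lik(theta1) - log_lik(theta2)

# Support on the transformed scale phi = 1/theta: the data-dependent
# part is f(x | g^{-1}(phi)), so the Jacobian factor is never included
phi1, phi2 = 1 / theta1, 1 / theta2
S_phi = log_lik(1 / phi1) - log_lik(1 / phi2)

print(np.isclose(S_theta, S_phi))  # True: S is parameterization-invariant
```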
Practical Implications and Comparison with the Frequentist Approach
Evidential Inference:
We care only about the likelihood ratio, or the support $S$, i.e., the relative degree to which the data $x$ support competing hypotheses. Ignoring the Jacobian part, $S$ is unchanged under monotone transformations, so the evidential approach is robust to the choice of parameterization, which simplifies interpretation and comparison.
Frequentist Approach:
The frequentist approach typically computes $p$-values and test statistics from the full pdf. Under a change of variables, the Jacobian factor must be incorporated so that the transformed density is correct; this changes the shape of the distribution and adds computational and practical complexity.

As the figure produced by the code below shows:
- In evidential inference (ignoring the Jacobian), the support $S$ takes the same value under the different parameterizations, so the ranking of hypotheses by the data is visibly unchanged.
- In the frequentist approach, the full pdf (including the Jacobian) must be computed, so the transformed density curve changes shape; this in turn affects the $p$-value calculation and makes the method harder to apply in practice.
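The frequentist bookkeeping can itself be sketched directly. Assuming the same normal approximation p ~ N(0.5, 0.05²) used in the visualization code below, integrating the Jacobian-corrected density of T = 1/p over the matching tail reproduces the original one-tailed p-value, whereas the delta-method normal shown in the middle panel does not:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

p_ref, se, p_obs = 0.5, 0.05, 0.6   # normal approximation, as in the code below
T_obs = 1 / p_obs

# One-tailed p-value on the original scale: P(p_hat >= 0.6)
p_value_orig = 1 - norm.cdf(p_obs, loc=p_ref, scale=se)

# Jacobian-corrected density of T = 1/p
def f_T(t):
    return norm.pdf(1 / t, loc=p_ref, scale=se) / t**2

# Matching tail {T <= T_obs}; start at t = 1 since T = 1/p > 1 for p in (0, 1)
# and the approximating normal puts negligible mass at p > 1
p_value_T, _ = quad(f_T, 1.0, T_obs)

print(np.isclose(p_value_orig, p_value_T, atol=1e-4))  # True: tail mass preserved
```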
Code for Visualization
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy.integrate import quad
# ------------------- Data and Likelihood Functions -------------------
# Data: 60 deaths out of 100 patients
x = 60
n = 100
# Reference and MLE values on the original scale
p_ref = 0.5 # reference value
p_mle = 0.6 # maximum likelihood estimate
# Binomial likelihood (ignoring combinatorial constant)
def likelihood(p, x, n):
    return (p ** x) * ((1 - p) ** (n - x))

# Transformed likelihood with the Jacobian for q = 1/p (correct density transformation)
def likelihood_transformed(q, x, n):
    p = 1 / q
    return likelihood(p, x, n) / (q ** 2)

# Transformed likelihood without the Jacobian for q = 1/p (direct relabeling, no correction)
def likelihood_transformed_no_jacobian(q, x, n):
    p = 1 / q
    return likelihood(p, x, n)
# ------------------- Likelihood-based Calculations -------------------
# Higher resolution: use 5000 points
p_vals = np.linspace(0.3, 0.8, 5000)
L_vals = likelihood(p_vals, x, n)
L_vals_norm = L_vals / np.max(L_vals)
idx_ref = np.argmin(np.abs(p_vals - p_ref))
L_at_p_ref = L_vals_norm[idx_ref]
# Use a single mask (mask2): restrict the range of q = 1/p to [1.2, 2.5]
q_vals = 1 / p_vals
mask2 = (q_vals >= 1.2) & (q_vals <= 2.5)
q_vals2 = q_vals[mask2]
# Transformation with the Jacobian correction
L_q_vals = likelihood_transformed(q_vals, x, n)
L_q_vals_norm = L_q_vals / np.max(L_q_vals)
L_q_vals_norm2 = L_q_vals_norm[mask2]
# Transformation without the Jacobian correction
Lq_vals_nojac = likelihood_transformed_no_jacobian(q_vals, x, n)
Lq_vals_nojac_norm = Lq_vals_nojac / np.max(Lq_vals_nojac)
Lq_vals_nojac_norm2 = Lq_vals_nojac_norm[mask2]
q_ref = 1 / p_ref # 2.0
q_mle = 1 / p_mle # ≈1.667
idx_q_ref = np.argmin(np.abs(q_vals2 - q_ref))
L_at_q_ref = L_q_vals_norm2[idx_q_ref]
L_at_q_ref_nojac = Lq_vals_nojac_norm2[idx_q_ref]
# ------------------- p-value-based Calculations -------------------
# z-test on the original scale: p ~ N(0.5, 0.05^2)
p_pdf = norm.pdf(p_vals, loc=p_ref, scale=0.05)
p_value_orig = 1 - norm.cdf(p_mle, loc=p_ref, scale=0.05)
# Delta-method adjustment for T = 1/p
T0 = 1 / p_ref # 2.0
SE_T = 4 * 0.05 # 0.2
T_obs = 1 / p_mle # ≈1.667
# Higher resolution on the T axis: use 2000 points
x_T = np.linspace(1.2, 2.5, 2000)
T_pdf = norm.pdf(x_T, loc=T0, scale=SE_T)
p_value_delta = norm.cdf((T_obs - T0) / SE_T)
# Full pdf transformation (with Jacobian): the correctly transformed density of T
def f_T(t):
    return norm.pdf(1/t, loc=p_ref, scale=0.05) * (1 / t**2)
x_T_full = np.linspace(1.2, 2.5, 2000)
T_full_pdf = f_T(x_T_full)
p_value_full, _ = quad(f_T, 0, T_obs)
# ------------------- Plot Settings -------------------
pink_color = 'pink'
pink_alpha = 0.5
# ------------------- Create 2x3 Subplots -------------------
fig, axs = plt.subplots(2, 3, figsize=(18, 10))
# Log of the normalized likelihoods for the top row
logL_vals = np.log(L_vals_norm)
logL_q_vals = np.log(L_q_vals_norm2)
logL_q_nojac = np.log(Lq_vals_nojac_norm2)
logL_ref = np.log(L_at_p_ref)
logL_q_ref = np.log(L_at_q_ref)
logL_q_ref_nojac = np.log(L_at_q_ref_nojac)
# Fix the y-axis range for the top row (narrow range)
overall_ymin = -10.0
overall_ymax = 0.0
# ----- Top Row: Log Likelihood Plots -----
# Column 1: Original ($\pi$)
axs[0, 0].plot(p_vals, logL_vals, color='gray', label=r'$\ln(L)$ (Original)')
axs[0, 0].axvline(p_ref, color='red', linestyle='--', label=r'$\pi_0 = 0.5$')
axs[0, 0].axhline(y=logL_ref, color='red', linestyle=':', label=r'Support at $\pi_0$')
axs[0, 0].fill_between(p_vals, logL_ref, 0, color=pink_color, alpha=pink_alpha)
axs[0, 0].set_xlabel(r'$\pi$')
axs[0, 0].set_xlim(0.3, 0.8)
axs[0, 0].set_ylim(overall_ymin, overall_ymax)
axs[0, 0].legend(loc="center")
axs[0, 0].grid(True)
axs[0, 0].set_ylabel("Support")
# Column 2: Transformed (No Jacobian) ($T = 1/\pi$)
axs[0, 1].plot(q_vals2, logL_q_nojac, color='gray',
label=r'$\ln(L)$ ($T=1/\pi$, no Jacobian)')
axs[0, 1].axvline(q_ref, color='red', linestyle='--',
label=r'$T_0 = 1/\pi_0 = 2.0$')
axs[0, 1].axhline(y=logL_q_ref_nojac, color='red', linestyle=':',
label=r'Support at $T_0$')
axs[0, 1].fill_between(q_vals2, logL_q_ref_nojac, 0, color=pink_color, alpha=pink_alpha)
axs[0, 1].set_xlabel(r'$T=1/\pi$')
axs[0, 1].set_xlim(1.2, 2.5)
axs[0, 1].set_ylim(overall_ymin, overall_ymax)
axs[0, 1].legend(loc="center")
axs[0, 1].grid(True)
# Column 3: Transformed (With Jacobian) ($T = 1/\pi$)
axs[0, 2].plot(q_vals2, logL_q_vals, color='gray',
label=r'$\ln(L)$ ($T=1/\pi$, with Jacobian)')
axs[0, 2].axvline(q_ref, color='red', linestyle='--',
label=r'$T_0 = 1/\pi_0 = 2.0$')
axs[0, 2].axhline(y=logL_q_ref, color='red', linestyle=':',
label=r'Support at $T_0$')
axs[0, 2].fill_between(q_vals2, logL_q_ref, 0, color=pink_color, alpha=pink_alpha)
axs[0, 2].set_xlabel(r'$T=1/\pi$')
axs[0, 2].set_xlim(1.2, 2.5)
axs[0, 2].set_ylim(overall_ymin, overall_ymax)
axs[0, 2].legend(loc="center")
axs[0, 2].grid(True)
# Show the top-row y-axis tick labels as positive values (absolute value), uniformly formatted
for ax in axs[0]:
    ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, pos: f"{abs(x):.2f}"))
    ax.set_ylabel(r"Support")
# Annotate each top-row subplot with its support value (red text)
support1 = abs(logL_ref)
support2 = abs(logL_q_ref_nojac)
support3 = abs(logL_q_ref)
axs[0, 0].text(0.05, 0.90, r"$\mathrm{Support} = %.2f$" % support1,
transform=axs[0, 0].transAxes, fontsize=12, color='red',
bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))
axs[0, 1].text(0.05, 0.90, r"$\mathrm{Support} = %.2f$" % support2,
transform=axs[0, 1].transAxes, fontsize=12, color='red',
bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))
axs[0, 2].text(0.05, 0.90, r"$\mathrm{Support} = %.2f$" % support3,
transform=axs[0, 2].transAxes, fontsize=12, color='red',
bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))
# ----- Bottom Row: Probability Density Plots (Linear Scale) -----
# Column 1: Original ($\pi$)
axs[1, 0].plot(p_vals, p_pdf, color='gray', label=r'PDF (Original)')
axs[1, 0].fill_between(p_vals, p_pdf, where=(p_vals >= p_mle),
color=pink_color, alpha=pink_alpha, label='Rejection Region')
axs[1, 0].axvline(p_mle, color='red', linestyle='--', label=r'Observed $\pi = 0.6$')
axs[1, 0].set_xlabel(r'$\pi$')
axs[1, 0].set_ylabel('Probability Density')
axs[1, 0].set_xlim(0.3, 0.8)
axs[1, 0].legend(loc="center")
axs[1, 0].grid(True)
axs[1, 0].text(0.05, 0.90, r"$p\text{-value} = %.4f$" % p_value_orig,
transform=axs[1, 0].transAxes, fontsize=12, color='red',
bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))
# Column 2: Transformed (Delta Method, no Jacobian)
axs[1, 1].plot(x_T, T_pdf, color='gray',
label=r'PDF ($T=1/\pi$, SE adjusted by Delta Method, no Jacobian)')
axs[1, 1].fill_between(x_T, T_pdf, where=(x_T <= T_obs),
color=pink_color, alpha=pink_alpha, label='Rejection Region')
axs[1, 1].axvline(T_obs, color='red', linestyle='--', label=r'Observed $T\approx 1.667$')
axs[1, 1].set_xlabel(r'$T=1/\pi$')
axs[1, 1].set_ylabel('Probability Density')
axs[1, 1].set_xlim(1.2, 2.5)
axs[1, 1].legend(loc="center")
axs[1, 1].grid(True)
axs[1, 1].text(0.05, 0.90, r"$p\text{-value} = %.4f$" % p_value_delta,
transform=axs[1, 1].transAxes, fontsize=12, color='red',
bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))
# Column 3: Transformed (Full PDF with Jacobian)
axs[1, 2].plot(x_T_full, T_full_pdf, color='gray',
label=r'PDF ($T=1/\pi$, with Jacobian)')
axs[1, 2].fill_between(x_T_full, T_full_pdf, where=(x_T_full <= T_obs),
color=pink_color, alpha=pink_alpha, label='Rejection Region')
axs[1, 2].axvline(T_obs, color='red', linestyle='--', label=r'Observed $T\approx 1.667$')
axs[1, 2].set_xlabel(r'$T=1/\pi$')
axs[1, 2].set_ylabel('Probability Density')
axs[1, 2].set_xlim(1.2, 2.5)
axs[1, 2].legend(loc="center")
axs[1, 2].grid(True)
axs[1, 2].text(0.05, 0.90, r"$p\text{-value} = %.4f$" % p_value_full,
transform=axs[1, 2].transAxes, fontsize=12, color='red',
bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))
plt.tight_layout()
plt.show()
print(f"Original test one-tailed p-value: {p_value_orig:.4f}")
print(f"Delta method one-tailed p-value: {p_value_delta:.4f}")
print(f"Full PDF transformation one-tailed p-value: {p_value_full:.4f}")
Summary
- Evidential Inference: focuses only on the relative support the data give to the hypotheses, so only the likelihood ratio (the support $S$) is computed; ignoring the Jacobian part, $S$ is invariant under monotone transformations.
- Frequentist Approach: must compute $p$-values from the full pdf; under a parameter transformation the Jacobian factor has to be included, which changes the shape of the density curve and complicates practice.
References
- Cahusac, P. M. B. (2021). Evidence Based Statistics: An Introduction to the Evidential Approach – from Likelihood Principle to Statistical Practice. John Wiley & Sons, Inc.
- Edwards, A. W. F. (1992). Likelihood. Baltimore: Johns Hopkins University Press.