Variable Transformation: Evidential Method vs. Frequency Method

Objective

To compare how the evidential approach and the frequency approach handle variable (parameter) transformations when such transformations are needed, and to show why the measure of support used in evidential inference is invariant under monotonic transformations while frequentist $p$-value calculations are not.

Evidential Inference (Evidence-based Approach)

In statistical inference, we often use the likelihood ratio (or its logarithm, i.e. the support $S$) to measure the relative degree to which data $x$ supports different hypotheses:

\[ S = \log\frac{L(\theta_1)}{L(\theta_2)}. \]

Any multiplicative constant that is independent of the data $x$ (for example, the Jacobian factor introduced by a parameter transformation) is not our primary concern: under a monotonic transformation the support $S$ is unchanged, so it provides an intuitive measure of evidence that is invariant to the choice of parameterization.

Frequency Approach

In contrast, the frequency approach usually requires the complete probability density function (pdf) to calculate $p$-values or perform hypothesis tests. When a variable is transformed, the pdf must therefore be transformed correctly (via the Jacobian or numerical methods), which increases complexity in both implementation and interpretation.
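A minimal sketch of this requirement (using the same normal approximation and reciprocal transformation as the example later in this note; the function names are illustrative): dropping the Jacobian leaves a curve that no longer integrates to one, so it is not a valid density.

```python
from scipy.stats import norm
from scipy.integrate import quad

# Sketch: p ~ N(0.5, 0.05^2) (the normal approximation used later), and the
# monotonic transformation q = 1/p, so p = 1/q and |dp/dq| = 1/q**2.

def f_q_with_jacobian(q):
    # Correctly transformed density of q = 1/p
    return norm.pdf(1.0 / q, loc=0.5, scale=0.05) / q**2

def f_q_no_jacobian(q):
    # Naive substitution only -- not a valid density
    return norm.pdf(1.0 / q, loc=0.5, scale=0.05)

# Integrate over a range that captures essentially all of the mass
area_with, _ = quad(f_q_with_jacobian, 1.2, 5.0)
area_without, _ = quad(f_q_no_jacobian, 1.2, 5.0)

print(round(area_with, 3))     # ~1.0, as a pdf must integrate to
print(round(area_without, 3))  # far from 1 when the Jacobian is omitted
```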


Derivation

1. Likelihood on the Original Scale

Given data $x$ and parameter \(\theta\), suppose the likelihood is defined as

\[ L(\theta)= f(x\mid \theta), \]

where \(f(x\mid \theta)\) fully reflects the information contained in the data $x$. Note that the likelihood is defined only up to a multiplicative constant; that is, if we define

\[ L^*(\theta)= c\cdot L(\theta) \quad (c \text{ does not depend on } \theta), \]

then when comparing different parameter values (for example, computing the likelihood ratio), the constant $c$ cancels out in the numerator and denominator.
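As a concrete sketch (the data $x = 60$, $n = 100$ are those of the example below; `lik_full` and `lik_kernel` are illustrative names): the binomial coefficient is exactly such a constant, and it cancels in the likelihood ratio.

```python
from math import comb

# Data from the example below: x = 60 events out of n = 100
x, n = 60, 100

def lik_full(p):
    # Full binomial pmf, including the constant comb(n, x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def lik_kernel(p):
    # Likelihood kernel only; comb(n, x) does not depend on p
    return p**x * (1 - p) ** (n - x)

lr_full = lik_full(0.6) / lik_full(0.5)
lr_kernel = lik_kernel(0.6) / lik_kernel(0.5)
print(abs(lr_full - lr_kernel) < 1e-9)  # True: the constant cancels
```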

2. Definition of Relative Evidence (Support)

When comparing two parameter values \(\theta_1\) and \(\theta_2\), we define the likelihood ratio as

\[ \text{LR} = \frac{L(\theta_1)}{L(\theta_2)} = \frac{f(x\mid \theta_1)}{f(x\mid \theta_2)}, \]

and taking the logarithm gives the support

\[ S = \log\frac{f(x\mid \theta_1)}{f(x\mid \theta_2)}. \]

This $S$ depends solely on how the data $x$ supports different parameter values.
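For example, with the binomial data used later in this note ($x = 60$ out of $n = 100$), the support for $\theta_1 = 0.6$ over $\theta_2 = 0.5$ can be computed directly (a sketch; `log_lik` is an illustrative kernel log-likelihood):

```python
from math import log

x, n = 60, 100  # data from the example below

def log_lik(theta):
    # Binomial log-likelihood kernel (constants omitted; they cancel in S)
    return x * log(theta) + (n - x) * log(1 - theta)

S = log_lik(0.6) - log_lik(0.5)
print(round(S, 2))  # 2.01
```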

3. Monotonic Parameter Transformation

Suppose we perform a monotonic transformation of the parameter by letting

\[ \phi = g(\theta), \]

where $g$ is a one-to-one monotonic function with inverse \(g^{-1}\). If we transform the likelihood as we would a density (applying the change-of-variable formula), the likelihood on the new parameter scale becomes

\[ L^*(\phi)= L\bigl(g^{-1}(\phi)\bigr)\,\left|\frac{d\,g^{-1}(\phi)}{d\phi}\right|. \]

Here:

  • \(L\bigl(g^{-1}(\phi)\bigr)= f\bigl(x\mid g^{-1}(\phi)\bigr)\) is the part that depends on the data $x$,
  • while \(\left|\frac{d\,g^{-1}(\phi)}{d\phi}\right|\) (the Jacobian factor) merely reflects the change in measure due to the parameter transformation and is independent of $x$.
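For the reciprocal transformation used in the example below, \(g(\theta) = 1/\theta\), we have \(g^{-1}(\phi) = 1/\phi\) and Jacobian factor \(1/\phi^2\). A quick finite-difference sketch (illustrative helper names, not part of the visualization code) confirms the analytic factor and that it involves only \(\phi\), never the data:

```python
def g_inv(phi):
    # Inverse of g(theta) = 1/theta
    return 1.0 / phi

def jacobian_numeric(phi, h=1e-6):
    # Central finite difference of g_inv -- should match 1/phi**2
    return abs((g_inv(phi + h) - g_inv(phi - h)) / (2 * h))

for phi in (1.25, 2.0, 2.5):
    # Numeric estimate vs. the analytic factor 1/phi^2
    print(round(jacobian_numeric(phi), 4), round(1 / phi**2, 4))
# Note: the factor depends only on phi (i.e. on g), never on the data x.
```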

4. Ignoring the Jacobian in Relative Evidence

When comparing two transformed parameter values \(\phi_1\) and \(\phi_2\), the full likelihood ratio becomes

\[ \frac{L^*(\phi_1)}{L^*(\phi_2)} = \frac{f\bigl(x\mid g^{-1}(\phi_1)\bigr)}{f\bigl(x\mid g^{-1}(\phi_2)\bigr)} \cdot \frac{\left|\frac{d\,g^{-1}(\phi_1)}{d\phi_1}\right|}{\left|\frac{d\,g^{-1}(\phi_2)}{d\phi_2}\right|}. \]

Although the second ratio is not necessarily equal to 1 numerically, it depends only on the transformation function $g$ and is independent of the data $x$. In evidential inference, we are interested only in the part driven by the data:

\[ \frac{f\bigl(x\mid g^{-1}(\phi_1)\bigr)}{f\bigl(x\mid g^{-1}(\phi_2)\bigr)}, \]

and by ignoring the purely transformational multiplicative factor, we obtain

\[ S = \log\frac{f\bigl(x\mid \theta_1\bigr)}{f\bigl(x\mid \theta_2\bigr)}, \]

which is completely consistent with the original scale.
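This invariance is easy to verify numerically (a sketch using the binomial data from the example below; the helper names are illustrative):

```python
from math import log

x, n = 60, 100  # data from the example below

def log_lik(theta):
    # Binomial log-likelihood kernel
    return x * log(theta) + (n - x) * log(1 - theta)

theta1, theta2 = 0.6, 0.5
phi1, phi2 = 1 / theta1, 1 / theta2

# Support on the original scale, and on the phi scale with the Jacobian ignored
S_orig = log_lik(theta1) - log_lik(theta2)
S_phi = log_lik(1 / phi1) - log_lik(1 / phi2)
print(abs(S_orig - S_phi) < 1e-9)  # True: S is invariant

# The discarded log-Jacobian term depends only on g, not on the data x:
jacobian_term = log(1 / phi1**2) - log(1 / phi2**2)
print(round(jacobian_term, 3))  # 2*ln(1.2), a transformation-only quantity
```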


Practical Implications and Comparison with the Frequency Approach

  • Evidential Inference:
    We focus solely on the likelihood ratio or support $S$ (i.e. the relative measure of evidence provided by the data $x$ for different hypotheses). Thus, if we ignore the Jacobian factor under a monotonic transformation, the numerical value of $S$ remains unchanged across different parameterizations. This renders the evidential approach robust to the choice of parameterization and simplifies both interpretation and comparison.

  • Frequency Approach:
    In contrast, the frequency approach relies on the complete pdf to compute $p$-values and test statistics. When transforming variables, the Jacobian factor must be correctly incorporated to ensure the transformed density is accurate. This changes the shape of the density curve and adds complexity in practical computations.

The figure below illustrates this concept:

  • In evidential inference (ignoring the Jacobian), the support $S$ remains the same under different parameterizations, clearly showing that the ordering of evidence provided by the data is invariant.
  • In the frequency approach, because the complete pdf (including the Jacobian) is used, the density curves change shape after transformation, affecting the computation of $p$-values and increasing practical difficulty.


Code for Visualization

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy.integrate import quad

# ------------------- Data and Likelihood Functions -------------------
# Data: 60 deaths out of 100 patients
x = 60
n = 100

# Reference and MLE values on the original scale
p_ref = 0.5   # reference value
p_mle = 0.6   # maximum likelihood estimate

# Binomial likelihood (ignoring combinatorial constant)
def likelihood(p, x, n):
    return (p ** x) * ((1 - p) ** (n - x))

# Transformed likelihood with Jacobian for q = 1/p (correct transformation)
def likelihood_transformed(q, x, n):
    p = 1 / q
    return likelihood(p, x, n) / (q ** 2)

# Transformed likelihood without Jacobian for q = 1/p (direct transformation, without Jacobian correction)
def likelihood_transformed_no_jacobian(q, x, n):
    p = 1 / q
    return likelihood(p, x, n)

# ------------------- Likelihood-based Calculations -------------------
# Increase resolution: 5000 points
p_vals = np.linspace(0.3, 0.8, 5000)
L_vals = likelihood(p_vals, x, n)
L_vals_norm = L_vals / np.max(L_vals)
idx_ref = np.argmin(np.abs(p_vals - p_ref))
L_at_p_ref = L_vals_norm[idx_ref]

# Use mask2: set the range for q = 1/p to [1.2, 2.5]
q_vals = 1 / p_vals
mask2 = (q_vals >= 1.2) & (q_vals <= 2.5)
q_vals2 = q_vals[mask2]

# Transformed likelihood with Jacobian
L_q_vals = likelihood_transformed(q_vals, x, n)
L_q_vals_norm = L_q_vals / np.max(L_q_vals)
L_q_vals_norm2 = L_q_vals_norm[mask2]

# Transformed likelihood without Jacobian
Lq_vals_nojac = likelihood_transformed_no_jacobian(q_vals, x, n)
Lq_vals_nojac_norm = Lq_vals_nojac / np.max(Lq_vals_nojac)
Lq_vals_nojac_norm2 = Lq_vals_nojac_norm[mask2]

q_ref = 1 / p_ref   # 2.0
q_mle = 1 / p_mle   # ≈1.667
idx_q_ref = np.argmin(np.abs(q_vals2 - q_ref))
L_at_q_ref = L_q_vals_norm2[idx_q_ref]
L_at_q_ref_nojac = Lq_vals_nojac_norm2[idx_q_ref]

# ------------------- p-value-based Calculations -------------------
# z-test on the original scale: p ~ N(0.5, 0.05^2)
p_pdf = norm.pdf(p_vals, loc=p_ref, scale=0.05)
p_value_orig = 1 - norm.cdf(p_mle, loc=p_ref, scale=0.05)

# Delta method adjustment: T = 1/p
T0 = 1 / p_ref         # 2.0
SE_T = 4 * 0.05        # 0.2
T_obs = 1 / p_mle      # ≈1.667
# Increase resolution for T axis: 2000 points
x_T = np.linspace(1.2, 2.5, 2000)
T_pdf = norm.pdf(x_T, loc=T0, scale=SE_T)
p_value_delta = norm.cdf((T_obs - T0) / SE_T)

# Full PDF Transformation (including Jacobian): correctly transformed T density
def f_T(t):
    return norm.pdf(1/t, loc=p_ref, scale=0.05) * (1 / t**2)
x_T_full = np.linspace(1.2, 2.5, 2000)
T_full_pdf = f_T(x_T_full)
p_value_full, _ = quad(f_T, 0, T_obs)

# ------------------- Plot Settings -------------------
pink_color = 'pink'
pink_alpha = 0.5

# ------------------- Create 2x3 Subplots -------------------
fig, axs = plt.subplots(2, 3, figsize=(18, 10))

# Calculate the log-transformed normalized likelihood for the top row
logL_vals = np.log(L_vals_norm)
logL_q_vals = np.log(L_q_vals_norm2)
logL_q_nojac = np.log(Lq_vals_nojac_norm2)
logL_ref = np.log(L_at_p_ref)
logL_q_ref = np.log(L_at_q_ref)
logL_q_ref_nojac = np.log(L_at_q_ref_nojac)

# Fix the y-axis range for the top row (narrow range)
overall_ymin = -10.0
overall_ymax = 0.0

# ----- Top Row: Log Likelihood Plots -----
# Column 1: Original (pi)
axs[0, 0].plot(p_vals, logL_vals, color='gray', label=r'$\ln(L)$ (Original)')
axs[0, 0].axvline(p_ref, color='red', linestyle='--', label=r'$\pi_0 = 0.5$')
axs[0, 0].axhline(y=logL_ref, color='red', linestyle=':', label=r'Support at $\pi_0$')
axs[0, 0].fill_between(p_vals, logL_ref, 0, color=pink_color, alpha=pink_alpha)
axs[0, 0].set_xlabel(r'$\pi$')
axs[0, 0].set_xlim(0.3, 0.8)
axs[0, 0].set_ylim(overall_ymin, overall_ymax)
axs[0, 0].legend(loc="center")
axs[0, 0].grid(True)
axs[0, 0].set_ylabel("Support")

# Column 2: Transformed (No Jacobian) (T = 1/pi)
axs[0, 1].plot(q_vals2, logL_q_nojac, color='gray',
               label=r'$\ln(L)$ ($T=1/\pi$, no Jacobian)')
axs[0, 1].axvline(q_ref, color='red', linestyle='--',
                  label=r'$T_0 = 1/\pi_0 = 2.0$')
axs[0, 1].axhline(y=logL_q_ref_nojac, color='red', linestyle=':',
                  label=r'Support at $T_0$')
axs[0, 1].fill_between(q_vals2, logL_q_ref_nojac, 0, color=pink_color, alpha=pink_alpha)
axs[0, 1].set_xlabel(r'$T=1/\pi$')
axs[0, 1].set_xlim(1.2, 2.5)
axs[0, 1].set_ylim(overall_ymin, overall_ymax)
axs[0, 1].legend(loc="center")
axs[0, 1].grid(True)

# Column 3: Transformed (With Jacobian) (T = 1/pi)
axs[0, 2].plot(q_vals2, logL_q_vals, color='gray',
               label=r'$\ln(L)$ ($T=1/\pi$, with Jacobian)')
axs[0, 2].axvline(q_ref, color='red', linestyle='--',
                  label=r'$T_0 = 1/\pi_0 = 2.0$')
axs[0, 2].axhline(y=logL_q_ref, color='red', linestyle=':',
                  label=r'Support at $T_0$')
axs[0, 2].fill_between(q_vals2, logL_q_ref, 0, color=pink_color, alpha=pink_alpha)
axs[0, 2].set_xlabel(r'$T=1/\pi$')
axs[0, 2].set_xlim(1.2, 2.5)
axs[0, 2].set_ylim(overall_ymin, overall_ymax)
axs[0, 2].legend(loc="center")
axs[0, 2].grid(True)

# Format the y-axis labels of the top row as positive values (absolute value), using unified mathjax formatting
for ax in axs[0]:
    ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, pos: f"{abs(x):.2f}"))
    ax.set_ylabel(r"Support")

# Add annotations for support values in each top row subplot (using mathjax formatting, red text)
support1 = abs(logL_ref)
support2 = abs(logL_q_ref_nojac)
support3 = abs(logL_q_ref)
axs[0, 0].text(0.05, 0.90, r"$\mathrm{Support} = %.2f$" % support1,
               transform=axs[0, 0].transAxes, fontsize=12, color='red',
               bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))
axs[0, 1].text(0.05, 0.90, r"$\mathrm{Support} = %.2f$" % support2,
               transform=axs[0, 1].transAxes, fontsize=12, color='red',
               bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))
axs[0, 2].text(0.05, 0.90, r"$\mathrm{Support} = %.2f$" % support3,
               transform=axs[0, 2].transAxes, fontsize=12, color='red',
               bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))

# ----- Bottom Row: Probability Density Plots (Linear Scale) -----
# Column 1: Original (pi)
axs[1, 0].plot(p_vals, p_pdf, color='gray', label=r'PDF (Original)')
axs[1, 0].fill_between(p_vals, p_pdf, where=(p_vals >= p_mle),
                       color=pink_color, alpha=pink_alpha, label='Rejection Region')
axs[1, 0].axvline(p_mle, color='red', linestyle='--', label=r'Observed $\pi = 0.6$')
axs[1, 0].set_xlabel(r'$\pi$')
axs[1, 0].set_ylabel('Probability Density')
axs[1, 0].set_xlim(0.3, 0.8)
axs[1, 0].legend(loc="center")
axs[1, 0].grid(True)
axs[1, 0].text(0.05, 0.90, r"$p\text{-value} = %.4f$" % p_value_orig,
               transform=axs[1, 0].transAxes, fontsize=12, color='red',
               bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))

# Column 2: Transformed (Delta Method, no Jacobian)
axs[1, 1].plot(x_T, T_pdf, color='gray', 
               label=r'PDF ($T=1/\pi$, SE adjusted by Delta Method, no Jacobian)')
axs[1, 1].fill_between(x_T, T_pdf, where=(x_T <= T_obs),
                       color=pink_color, alpha=pink_alpha, label='Rejection Region')
axs[1, 1].axvline(T_obs, color='red', linestyle='--', label=r'Observed $T\approx 1.667$')
axs[1, 1].set_xlabel(r'$T=1/\pi$')
axs[1, 1].set_ylabel('Probability Density')
axs[1, 1].set_xlim(1.2, 2.5)
axs[1, 1].legend(loc="center")
axs[1, 1].grid(True)
axs[1, 1].text(0.05, 0.90, r"$p\text{-value} = %.4f$" % p_value_delta,
               transform=axs[1, 1].transAxes, fontsize=12, color='red',
               bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))

# Column 3: Transformed (Full PDF with Jacobian)
axs[1, 2].plot(x_T_full, T_full_pdf, color='gray', 
               label=r'PDF ($T=1/\pi$, with Jacobian)')
axs[1, 2].fill_between(x_T_full, T_full_pdf, where=(x_T_full <= T_obs),
                       color=pink_color, alpha=pink_alpha, label='Rejection Region')
axs[1, 2].axvline(T_obs, color='red', linestyle='--', label=r'Observed $T\approx 1.667$')
axs[1, 2].set_xlabel(r'$T=1/\pi$')
axs[1, 2].set_ylabel('Probability Density')
axs[1, 2].set_xlim(1.2, 2.5)
axs[1, 2].legend(loc="center")
axs[1, 2].grid(True)
axs[1, 2].text(0.05, 0.90, r"$p\text{-value} = %.4f$" % p_value_full,
               transform=axs[1, 2].transAxes, fontsize=12, color='red',
               bbox=dict(facecolor='white', alpha=0.5, edgecolor='none'))

plt.tight_layout()
plt.show()

print(f"Original test one-tailed p-value: {p_value_orig:.4f}")
print(f"Delta method one-tailed p-value: {p_value_delta:.4f}")
print(f"Full PDF transformation one-tailed p-value: {p_value_full:.4f}")

Summary

  • Evidential Inference: Focuses solely on the relative support provided by the data, by computing the likelihood ratio (support $S$). Theoretically, if the Jacobian factor is ignored under a monotonic transformation, $S$ remains unchanged.
  • Frequency Approach: Requires the complete pdf to compute $p$-values, and the Jacobian factor must be included during parameter transformation. This alters the shape of the density curve and increases practical complexity.

References

  1. Cahusac, P. M. B. (2021). Evidence-Based Statistics: An Introduction to the Evidential Approach – from Likelihood Principle to Statistical Practice. John Wiley & Sons.
  2. Edwards, A. W. F. (1992). Likelihood. Baltimore: Johns Hopkins University Press.