Hi-Light: A path to high-fidelity, high-resolution video relighting with a refined evaluation paradigm.

Xiangrui Liu¹, Haoxiang Li², Yezhou Yang¹
¹ Arizona State University  ·  ² Pixocial Technology
Lighting is the blank page. It's the canvas. It's the thing that you start with — you can't do anything until you have a light.
— Sir Roger Deakins
§ 01 / Abstract

A training-free framework for stable, detailed video relighting.

Video relighting offers immense creative potential and commercial value, yet remains hindered by three persistent challenges: severe light flickering, the degradation of fine-grained detail, and the absence of an adequate evaluation metric.

We introduce Hi-Light, a novel, plug-and-play framework that delivers high-fidelity, high-resolution video relighting through three technical innovations: a lightness-prior-anchored guided relighting diffusion that stabilises the intermediate relit video, a Hybrid Motion-Adaptive Lighting Smoothing Filter (HMA-LSF) that leverages optical flow to ensure temporal stability without motion blur, and a LAB-based Detail Fusion module (LAB-DF) that preserves high-frequency information from the original video.

To address the evaluation gap, we further propose the Light Stability Score — the first quantitative metric designed specifically to measure lighting consistency over time. Hi-Light significantly outperforms state-of-the-art methods in both qualitative and quantitative comparisons, producing stable, highly detailed relit videos.

Comparison between CapCut filters and Hi-Light relighting from a grayscale input
Fig. 02 Traditional filters (centre) recolour the scene globally: they cannot synthesise directionality, highlights, or new shadows. Hi-Light (right) projects a physically grounded sunset onto the same input.
+56%  ·  SSIM gain  ·  vs. second-best on detail preservation
+80%  ·  SLS gain  ·  vs. second-best on light stability
2160p  ·  resolution  ·  only method that scales to 4K input
0 training  ·  data-free  ·  backbone-agnostic, plug-and-play
§ 02 / Contributions

What's new.

Contribution 01

A training-free, backbone-agnostic framework

The only method capable of processing high-resolution video on a single GPU, extending accessibility to a wider community of users without specialised hardware or paired-light datasets.

Contribution 02

Two plug-and-play stabilisation modules

A lightness-prior-anchored progressive fusion scheme paired with HMA-LSF (flicker removal) and LAB-DF (texture restoration). Both modules drop into any video diffusion backbone.

Contribution 03

A principled evaluation paradigm

The Light Stability Score — the first quantitative metric for lighting consistency — complements standard fidelity metrics, with 95.6% rank-1 agreement against a 30-rater blind human study.

§ 03 / Method

Three modules

Hi-Light decouples the problem: generate stable lighting on a downsampled latent, then project it back onto the high-resolution source. Each module surgically targets one failure mode of existing SOTA.

The Hi-Light framework: guided relighting diffusion, HMA-LSF, and LAB-DF
Fig. 03 Overall architecture. The downsampled video passes through a guided relighting diffusion loop anchored by a lightness prior; the intermediate output is stabilised by HMA-LSF; LAB-DF then transfers the stabilised illumination onto the high-resolution source.
▸ Step 01

Lightness-prior anchored diffusion

A static, DC-insensitive residual extracted from the input's L channel is injected at every diffusion step (γ = 0.3), damping low-to-mid-frequency luminance oscillations without drifting global brightness.
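The anchoring idea in Step 01 can be illustrated with a small NumPy sketch. This page does not specify how the residual is formed or injected, so everything here is an assumption: the hypothetical `lightness_prior` removes each frame's spatial mean from the input L channel (one reading of "DC-insensitive"), and the hypothetical `inject_prior` blends that fixed residual into the predicted lightness with weight γ = 0.3 while leaving each frame's mean untouched.

```python
import numpy as np

def lightness_prior(L_input):
    """Zero-mean residual of the input L channel (assumed form of the prior).

    L_input: float array of shape (T, H, W). Computed once from the input
    and reused unchanged at every diffusion step.
    """
    L_input = np.asarray(L_input, dtype=np.float64)
    # Subtract each frame's spatial mean so the prior carries no DC term.
    return L_input - L_input.mean(axis=(1, 2), keepdims=True)

def inject_prior(L_pred, prior, gamma=0.3):
    """Blend the fixed prior into the predicted lightness (assumed rule).

    The per-frame mean of L_pred is kept as-is; only its zero-mean part is
    pulled toward the prior, so global brightness does not drift.
    """
    L_pred = np.asarray(L_pred, dtype=np.float64)
    dc = L_pred.mean(axis=(1, 2), keepdims=True)
    residual = L_pred - dc
    return dc + (1.0 - gamma) * residual + gamma * prior

# Toy usage: random arrays stand in for the input and predicted L channels.
rng = np.random.default_rng(0)
L_in = rng.uniform(0, 100, size=(4, 8, 8))
L_step = rng.uniform(0, 100, size=(4, 8, 8))
L_anchored = inject_prior(L_step, lightness_prior(L_in), gamma=0.3)
```

Because both the prior and the residual are zero-mean, this blend cannot change a frame's average brightness. That is the property the step description attributes to a DC-insensitive anchor.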

▸ Step 02

HMA-LSF — Hybrid smoothing

An optical-flow-aware motion-adaptive blend (Farneback, CPU-only) cancels flicker on stable regions while a bilateral filter removes residual compression noise — without smearing fast-moving objects.
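A motion-adaptive temporal blend of the kind Step 02 describes might look like the sketch below. It is not the authors' implementation. The function `motion_adaptive_smooth`, the exponential weighting, and the parameters `tau` and `max_weight` are all illustrative assumptions. The flow magnitude is taken as an input so the example stays NumPy-only; in practice it could come from OpenCV's `cv2.calcOpticalFlowFarneback`. The bilateral-filter stage is omitted.

```python
import numpy as np

def motion_adaptive_smooth(frames, flow_mag, tau=1.0, max_weight=0.8):
    """Recursive temporal smoothing that backs off where motion is large.

    frames:   (T, H, W) float array, e.g. the L channel of the relit video.
    flow_mag: (T, H, W) per-pixel optical-flow magnitude in pixels between
              frame t-1 and frame t (index 0 is unused).
    tau, max_weight: hypothetical tuning parameters, not from the source.
    """
    frames = np.asarray(frames, dtype=np.float64)
    out = np.empty_like(frames)
    out[0] = frames[0]
    for t in range(1, len(frames)):
        # Weight on the smoothed history: near max_weight for static pixels,
        # near zero for fast-moving ones, so moving objects are not smeared.
        w = max_weight * np.exp(-np.asarray(flow_mag[t]) / tau)
        out[t] = w * out[t - 1] + (1.0 - w) * frames[t]
    return out

# Toy usage: a flickering clip whose right half is "moving".
T, H, W = 10, 4, 4
flicker = 0.5 + 0.1 * (-1.0) ** np.arange(T)
clip = np.broadcast_to(flicker[:, None, None], (T, H, W)).copy()
mag = np.zeros((T, H, W))
mag[:, :, W // 2:] = 100.0  # large flow on the right half
smoothed = motion_adaptive_smooth(clip, mag)
```

In the toy clip the static left half has its frame-to-frame flicker attenuated, while the right half passes through essentially unchanged.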

▸ Step 03

LAB Detail-Preserve Fusion

Inverts the transfer: the input's L channel carries the detail, and only the low-frequency illumination from the relit L channel is added back. The relit chroma channels (A, B) preserve the new colour. No afterimages.
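The fusion in Step 03 can be sketched as a frequency split on the L channel. The sketch assumes both frames are already in LAB (for example via a standard RGB-to-LAB conversion), that the relit frame has been upsampled to the source resolution, and that L lies in [0, 100]. The Gaussian low-pass and its `sigma` are stand-ins, since this page does not say which filter separates detail from illumination.

```python
import numpy as np

def gaussian_blur(img, sigma=5.0):
    """Separable Gaussian low-pass on a 2-D array, with edge padding."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()

    def blur_axis(a, axis):
        pad = [(0, 0)] * a.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(a, pad, mode="edge")
        return np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )

    return blur_axis(blur_axis(np.asarray(img, dtype=np.float64), 0), 1)

def lab_detail_fusion(lab_src, lab_relit, sigma=5.0):
    """Fuse source detail with relit illumination and relit chroma.

    lab_src, lab_relit: (H, W, 3) float arrays in LAB, same size.
    """
    L_src = lab_src[..., 0].astype(np.float64)
    L_relit = lab_relit[..., 0].astype(np.float64)
    detail = L_src - gaussian_blur(L_src, sigma)    # high-frequency detail
    illumination = gaussian_blur(L_relit, sigma)    # low-frequency lighting
    fused = lab_relit.astype(np.float64).copy()     # keeps relit A and B
    fused[..., 0] = np.clip(detail + illumination, 0.0, 100.0)
    return fused
```

A quick sanity check on this formulation: if the relit frame equals the source, the detail and illumination terms sum back to the source L, so the fusion returns the input unchanged.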

LAB feature maps and the LAB-DF detail preservation strategy
Fig. 04 LAB-DF fuses high-frequency detail from the input video with low-frequency illumination from the intermediate relit video, recombined with relit chroma to retain the new colour and tonal style.
§ 04 / Results

A new SOTA: best or tied-best on every metric.

Evaluated on 100 clips spanning 1080p–2160p, 81 frames at 24 fps, indoor / outdoor, static / dynamic, portrait / environment. Hyperparameters fixed across all comparisons.

Qualitative comparison: TC-Light, LAV (AnimateDiff/CogVideoX/Wan), Hi-Light
Fig. 05 Qualitative comparison under the prompt "sunset lighting". TC-Light produces a washed-out look; LAV variants smear textures on the building and foliage. Hi-Light renders sharp highlights while preserving the original high-frequency detail.
Model               SSIM ↑   LPIPS ↓   FID ↓   VBench ↑   SLS ↑
TC-Light            0.484    0.464     120     0.718      0.281
LAV (AnimateDiff)   0.552    0.434     241     0.714      0.098
LAV (CogVideoX)     0.597    0.402     133     0.736      0.267
LAV (Wan)           0.604    0.395     135     0.728      0.279
Hi-Light            0.943    0.247     76      0.736      0.509

Table 01 · Quantified comparison against open-source SOTA. Hi-Light achieves the best score on every metric (tied with LAV (CogVideoX) on VBench at 0.736), with the largest margins on SSIM (+56%) and Light Stability (+80%).
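The headline margins can be checked directly from the table values. The short calculation below assumes "gain" means relative improvement over the second-best score in each column: LAV (Wan) for SSIM and TC-Light for SLS.

```python
# (Hi-Light score, second-best score) taken from Table 01.
scores = {
    "SSIM": (0.943, 0.604),  # second-best: LAV (Wan)
    "SLS": (0.509, 0.281),   # second-best: TC-Light
}

def relative_gain(best, second):
    """Relative improvement of `best` over `second`, in percent."""
    return 100.0 * (best - second) / second

gains = {name: relative_gain(*pair) for name, pair in scores.items()}
for name, gain in gains.items():
    print(f"{name}: +{gain:.1f}%")
```

Under that reading the SSIM gain comes out at about 56% and the SLS gain at about 81%, which is consistent with the rounded +56% and +80% figures quoted on this page.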

Performance scatter: SSIM vs Light Stability Score
Fig. 06 Hi-Light occupies the top-right quadrant — high detail fidelity and robust lighting stability — and sits closest to the input video, the ideal target.
Frequency-domain spectra and per-frame brightness traces
Fig. 07 Left: Fourier magnitude spectra. Hi-Light's spectrum is nearly indistinguishable from the input — the others show a pronounced attenuation of high-frequency content, a quantitative signature of detail loss. Right: per-frame brightness traces. Competitors exhibit erratic flicker; Hi-Light maintains a smooth, stable profile.
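The two diagnostics in Fig. 07 are easy to reproduce in spirit. The sketch below uses hypothetical helper names. It computes a per-frame brightness trace and the fraction of spectral energy above a radial frequency cutoff. Neither function is the Light Stability Score, whose definition is not given on this page, and the `cutoff` value is an arbitrary choice.

```python
import numpy as np

def brightness_trace(frames):
    """Mean brightness of each frame; frames has shape (T, H, W)."""
    frames = np.asarray(frames, dtype=np.float64)
    return frames.reshape(len(frames), -1).mean(axis=1)

def high_freq_fraction(img, cutoff=0.25):
    """Share of spectral energy above `cutoff` (in cycles per pixel)."""
    img = np.asarray(img, dtype=np.float64)
    power = np.abs(np.fft.fft2(img)) ** 2
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return power[radius > cutoff].sum() / power.sum()

# Toy usage: a 2x2 box blur should lower the high-frequency share.
rng = np.random.default_rng(0)
sharp = rng.standard_normal((32, 32))
blurred = 0.25 * (
    sharp
    + np.roll(sharp, 1, axis=0)
    + np.roll(sharp, 1, axis=1)
    + np.roll(np.roll(sharp, 1, axis=0), 1, axis=1)
)
print(high_freq_fraction(sharp), high_freq_fraction(blurred))
```

A flat brightness trace and a high-frequency share close to the input's correspond to the behaviour the figure attributes to Hi-Light. An erratic trace or a depleted high-frequency share corresponds to the flicker and detail loss it attributes to the other methods.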
§ 05 / Citation

If you find this useful…

@inproceedings{liu2026hilight,
  title     = {Hi-Light: A Path to High-fidelity, High-resolution Video Relighting With A Refined Evaluation Paradigm},
  author    = {Liu, Xiangrui and Li, Haoxiang and Yang, Yezhou},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}