Hi-Light: A path to high-fidelity, high-resolution video relighting, with a refined evaluation paradigm.

Xiangrui Liu¹, Haoxiang Li², Yezhou Yang¹

¹ Arizona State University · ² Pixocial Technology

“Lighting is the blank page. It's the canvas. It's the thing that you start with — you can't do anything until you have a light.”

— Sir Roger Deakins

§ 01 / Abstract

A training-free framework for stable, detailed video relighting.

Video relighting offers immense creative potential and commercial value, yet remains hindered by three persistent challenges: severe light flickering, the degradation of fine-grained detail, and the absence of an adequate evaluation metric.

We introduce Hi-Light, a novel, plug-and-play framework that delivers high-fidelity, high-resolution video relighting through three technical innovations: a lightness-prior-anchored guided relighting diffusion that stabilises the intermediate relit video, a Hybrid Motion-Adaptive Lighting Smoothing Filter (HMA-LSF) that leverages optical flow to ensure temporal stability without motion blur, and a LAB-based Detail Fusion module (LAB-DF) that preserves high-frequency information from the original video.

To address the evaluation gap, we further propose the Light Stability Score — the first quantitative metric designed specifically to measure lighting consistency over time. Hi-Light significantly outperforms state-of-the-art methods in both qualitative and quantitative comparisons, producing stable, highly detailed relit videos.

Fig. 02 · Comparison between CapCut filters and Hi-Light relighting from a grayscale input. Traditional filters (centre) recolour the scene globally: they cannot synthesise directionality, highlights, or new shadows. Hi-Light (right) projects a physically grounded sunset onto the same input.

+56% SSIM gain · vs. second-best on detail preservation

+80% S_LS gain · vs. second-best on light stability

2160p resolution · only method that scales to 4K input

0 training · data-free · backbone-agnostic, plug-and-play

§ 02 / Contributions

What's new.

Contribution 01

A training-free, backbone-agnostic framework

The only method capable of processing high-resolution video on a single GPU, extending accessibility to a wider community of users without specialised hardware or paired-light datasets.

Contribution 02

Two plug-and-play stabilisation modules

A lightness-prior–anchored progressive fusion scheme paired with HMA-LSF (flicker removal) and LAB-DF (texture restoration). Both modules drop into any video diffusion backbone.

Contribution 03

A principled evaluation paradigm

The Light Stability Score — the first quantitative metric for lighting consistency — complements standard fidelity metrics, with 95.6% rank-1 agreement against a 30-rater blind human study.

§ 03 / Method

Three modules

Hi-Light decouples the problem: generate stable lighting on a downsampled latent, then project it back onto the high-resolution source. Each module targets one failure mode of existing state-of-the-art methods.

Fig. 03 · Overall architecture of the Hi-Light framework (guided relighting diffusion, HMA-LSF, and LAB-DF). The downsampled video passes through a guided relighting diffusion loop anchored by a lightness prior; the intermediate output is stabilised by HMA-LSF; LAB-DF then transfers the stabilised illumination onto the high-resolution source.

▸ Step 01

Lightness-prior anchored diffusion

A static, DC-insensitive residual extracted from the input's L channel is injected at every diffusion step (γ = 0.3), damping low-to-mid-frequency luminance oscillations without drifting global brightness.
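One way to picture the anchoring step is the minimal sketch below. It is an illustration under assumptions, not the authors' implementation: it works on pixel-space L channels rather than diffusion latents, treats "DC-insensitive" as "per-frame zero-mean", and uses hypothetical helper names (`lightness_residual`, `anchor_lightness`). Only γ = 0.3 comes from the text.

```python
import numpy as np

GAMMA = 0.3  # injection strength quoted in the text


def lightness_residual(l_input):
    """Per-frame zero-mean (DC-free) residual of an L channel, shape (T, H, W)."""
    l_input = np.asarray(l_input, dtype=np.float64)
    return l_input - l_input.mean(axis=(1, 2), keepdims=True)


def anchor_lightness(l_current, residual, gamma=GAMMA):
    """Blend the prior's structure into the current estimate.

    Assumed update rule: the residual is re-centred on the current frame mean,
    so the blend changes structure but never the global brightness.
    """
    l_current = np.asarray(l_current, dtype=np.float64)
    dc = l_current.mean(axis=(1, 2), keepdims=True)
    return (1.0 - gamma) * l_current + gamma * (dc + residual)


# Toy data: a noisy, globally brighter estimate of the input lightness.
rng = np.random.default_rng(0)
l_input = rng.uniform(0.0, 100.0, size=(4, 8, 8))
l_step = l_input + rng.normal(0.0, 5.0, size=l_input.shape) + 10.0

prior = lightness_residual(l_input)  # static: computed once, reused every step
l_anchored = anchor_lightness(l_step, prior)
```

Because the residual has zero mean, each frame's average lightness is unchanged by the blend, while its deviation from the prior's structure shrinks by a factor of (1 − γ) per application.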

▸ Step 02

HMA-LSF — Hybrid smoothing

An optical-flow-aware motion-adaptive blend (Farneback, CPU-only) cancels flicker on stable regions while a bilateral filter removes residual compression noise — without smearing fast-moving objects.
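The motion-adaptive blend can be sketched as a temporal average whose history weight falls off with flow magnitude. This is a simplified illustration, not the published filter: the function name, `alpha_max`, and `tau` are hypothetical, the flow magnitude is passed in precomputed (it could come from, for example, OpenCV's `cv2.calcOpticalFlowFarneback`), and both the warping of the previous frame and the bilateral pass are omitted.

```python
import numpy as np


def motion_adaptive_smooth(frames, flow_mag, alpha_max=0.8, tau=1.0):
    """Temporal smoothing that backs off where motion is large.

    frames:   (T, H, W) lightness values
    flow_mag: (T, H, W) per-pixel optical-flow magnitude in pixels
              (flow_mag[0] is unused)
    """
    frames = np.asarray(frames, dtype=np.float64)
    flow_mag = np.asarray(flow_mag, dtype=np.float64)
    out = np.empty_like(frames)
    out[0] = frames[0]
    for t in range(1, len(frames)):
        # History weight: close to alpha_max on static pixels, ~0 on fast ones.
        w = alpha_max * np.exp(-flow_mag[t] / tau)
        out[t] = w * out[t - 1] + (1.0 - w) * frames[t]
    return out


# Toy clip: every pixel flickers by +/-5 around 50.
T, H, W = 20, 4, 4
flicker = 5.0 * (-1.0) ** np.arange(T)
frames = 50.0 + flicker[:, None, None] * np.ones((T, H, W))

# Left half is static, right half moves fast (50 px per frame).
flow_mag = np.zeros((T, H, W))
flow_mag[:, :, W // 2:] = 50.0

smoothed = motion_adaptive_smooth(frames, flow_mag)
```

In this toy clip the static half has its flicker damped, while the fast-moving half passes through essentially untouched, which is the behaviour that avoids smearing.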

▸ Step 03

LAB-DF: LAB-based Detail Fusion

LAB-DF inverts the usual transfer: the input's L channel carries the detail, and only the low-frequency illumination from the relit L channel is added back. The relit chroma (A, B) preserves the new colour. No afterimages.
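The fusion rule described above can be written in a few lines. The sketch below assumes both videos are already in LAB at the same resolution (the relit frame upsampled beforehand) and uses a plain box blur as the low-pass filter; the actual filter and its radius are not given here, so `box_blur`, `lab_detail_fusion`, and `radius` are illustrative choices.

```python
import numpy as np


def box_blur(img, radius):
    """Box low-pass filter with edge padding (stand-in for the real filter)."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def lab_detail_fusion(lab_input, lab_relit, radius=4):
    """Fuse input detail with relit illumination and relit chroma.

    lab_input, lab_relit: (H, W, 3) arrays in LAB order.
    """
    l_in = np.asarray(lab_input[..., 0], dtype=np.float64)
    l_re = np.asarray(lab_relit[..., 0], dtype=np.float64)
    detail = l_in - box_blur(l_in, radius)       # high-frequency detail (input)
    illumination = box_blur(l_re, radius)        # low-frequency lighting (relit)
    fused = np.empty(lab_relit.shape, dtype=np.float64)
    fused[..., 0] = detail + illumination
    fused[..., 1:] = lab_relit[..., 1:]          # relit chroma keeps new colour
    return fused


# Toy frame: the "relit" version is uniformly brighter with new chroma.
rng = np.random.default_rng(0)
lab_input = rng.uniform(0.0, 100.0, size=(16, 16, 3))
lab_relit = lab_input.copy()
lab_relit[..., 0] += 20.0
lab_relit[..., 1:] = rng.uniform(-40.0, 40.0, size=(16, 16, 2))

fused = lab_detail_fusion(lab_input, lab_relit)
```

With a uniform brightness shift as the only lighting change, the fused L channel equals the input L channel plus that shift, so every texture in the input survives intact.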

Fig. 04 · LAB feature maps and the LAB-DF detail preservation strategy. LAB-DF fuses high-frequency detail from the input video with low-frequency illumination from the intermediate relit video, recombined with relit chroma to retain the new colour and tonal style.

§ 04 / Results

A new SOTA across every metric.

Evaluated on 100 clips at 1080p to 2160p, 81 frames at 24 fps, covering indoor and outdoor scenes, static and dynamic content, and portrait and environment subjects. Hyperparameters are fixed across all comparisons.

Fig. 05 · Qualitative comparison of TC-Light, LAV (AnimateDiff/CogVideoX/Wan), and Hi-Light under the prompt *"sunset lighting"*. TC-Light produces a washed-out look; LAV variants smear textures on the building and foliage. Hi-Light renders sharp highlights while preserving the original high-frequency detail.

Model	SSIM ↑	LPIPS ↓	FID ↓	VBench ↑	S_LS ↑
TC-Light	0.484	0.464	120	0.718	0.281
LAV (AnimateDiff)	0.552	0.434	241	0.714	0.098
LAV (CogVideoX)	0.597	0.402	133	0.736	0.267
LAV (Wan)	0.604	0.395	135	0.728	0.279
Hi-Light ★	0.943	0.247	76	0.736	0.509

Table 01 · Quantitative comparison against open-source SOTA. Hi-Light achieves the best score on every metric (tying LAV (CogVideoX) on VBench), with the largest margins on SSIM (+56%) and Light Stability (+80%).
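The headline margins follow directly from the table as relative gains over the runner-up. The short check below recomputes them from the SSIM and S_LS columns; the helper name `gain_over_runner_up` is only for illustration.

```python
# SSIM and S_LS values copied from Table 01.
scores = {
    "TC-Light":          {"SSIM": 0.484, "S_LS": 0.281},
    "LAV (AnimateDiff)": {"SSIM": 0.552, "S_LS": 0.098},
    "LAV (CogVideoX)":   {"SSIM": 0.597, "S_LS": 0.267},
    "LAV (Wan)":         {"SSIM": 0.604, "S_LS": 0.279},
    "Hi-Light":          {"SSIM": 0.943, "S_LS": 0.509},
}


def gain_over_runner_up(metric, ours="Hi-Light"):
    """Relative improvement of `ours` over the best competing score."""
    best_other = max(v[metric] for k, v in scores.items() if k != ours)
    return (scores[ours][metric] - best_other) / best_other


ssim_gain = gain_over_runner_up("SSIM")  # runner-up: LAV (Wan), 0.604
sls_gain = gain_over_runner_up("S_LS")   # runner-up: TC-Light, 0.281

print(f"SSIM gain: {ssim_gain:.1%}")
print(f"S_LS gain: {sls_gain:.1%}")
```

The results come out at roughly 56% for SSIM and 81% for S_LS, in line with the rounded figures quoted above.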

Fig. 06 · Performance scatter: SSIM vs Light Stability Score. Hi-Light occupies the top-right quadrant (high detail fidelity *and* robust lighting stability) and sits closest to the input video, the ideal target.

Fig. 07 · Frequency-domain spectra and per-frame brightness traces. **Left:** Fourier magnitude spectra. Hi-Light's spectrum is nearly indistinguishable from the input, while the others show a pronounced attenuation of high-frequency content, a quantitative signature of detail loss. **Right:** per-frame brightness traces. Competitors exhibit erratic flicker; Hi-Light maintains a smooth, stable profile.
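Both diagnostics in this figure are straightforward to reproduce on your own clips. The sketch below is a rough stand-in: `flicker_proxy` and `high_freq_ratio` are hypothetical helpers with an arbitrary cutoff, and neither is the Light Stability Score, whose definition is not given on this page.

```python
import numpy as np


def brightness_trace(frames):
    """Mean brightness of each frame; frames has shape (T, H, W)."""
    return np.asarray(frames, dtype=np.float64).mean(axis=(1, 2))


def flicker_proxy(frames):
    """Mean absolute frame-to-frame change in brightness (simple proxy only)."""
    return float(np.abs(np.diff(brightness_trace(frames))).mean())


def high_freq_ratio(frame, cutoff=0.25):
    """Share of spectral energy above `cutoff` cycles/pixel (radial frequency)."""
    frame = np.asarray(frame, dtype=np.float64)
    power = np.abs(np.fft.fft2(frame - frame.mean())) ** 2
    fy = np.fft.fftfreq(frame.shape[0])[:, None]
    fx = np.fft.fftfreq(frame.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    total = power.sum()
    return float(power[radius > cutoff].sum() / total) if total > 0 else 0.0


# Toy data: a detailed frame and a 2x2-averaged (blurred) copy of it.
rng = np.random.default_rng(0)
sharp = rng.uniform(0.0, 100.0, size=(32, 32))
blurred = 0.25 * (sharp + np.roll(sharp, 1, axis=0) + np.roll(sharp, 1, axis=1)
                  + np.roll(sharp, (1, 1), axis=(0, 1)))

# Toy clips: constant brightness vs. brightness alternating by +/-5.
steady = np.full((10, 8, 8), 50.0)
flickering = steady + 5.0 * ((-1.0) ** np.arange(10))[:, None, None]
```

On the toy data the blurred frame has a lower high-frequency share than the sharp one, and the flickering clip has a larger proxy value than the steady one, mirroring the two panels of the figure.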

§ 05 / Citation

If you find this useful…

@inproceedings{liu2026hilight,
  title     = {Hi-Light: A Path to High-fidelity, High-resolution Video Relighting With A Refined Evaluation Paradigm},
  author    = {Liu, Xiangrui and Li, Haoxiang and Yang, Yezhou},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}