DPO Loss

Direct Preference Optimization (DPO) and its variants are used for aligning models with human preferences without requiring a separate reward model.

DPOLoss

The standard DPO loss supports multiple loss types and optional reference-free mode.

from twinkle.loss import DPOLoss

loss_fn = DPOLoss(
    loss_type='sigmoid',  # 'sigmoid', 'hinge', 'ipo', 'kto'
    beta=0.1,
    sft_weight=0.0,       # optional SFT regularization weight
    reference_free=False,
)

model.set_loss(loss_fn)

Parameters:

  • loss_type: DPO variant — sigmoid (default), hinge, ipo, or kto

  • beta: Temperature parameter controlling preference strength

  • sft_weight: Weight for an additional SFT loss term on chosen responses

  • reference_free: If True, skips reference model log-probabilities

The loss expects interleaved chosen/rejected pairs in the batch. It computes sequence-level log-probabilities and optimizes the policy to prefer chosen over rejected responses.

SimPOLoss

Simplified Preference Optimization that removes the need for a reference model by using length-normalized log-probabilities.

from twinkle.loss import SimPOLoss

loss_fn = SimPOLoss(beta=2.0, gamma=1.0)

Parameters:

  • beta: Scaling factor for the logit difference

  • gamma: Margin term added to preference gap

CPOLoss

Contrastive Preference Optimization combines preference learning with behavior cloning.

from twinkle.loss import CPOLoss

loss_fn = CPOLoss(beta=0.1, cpo_alpha=1.0)

Parameters:

  • beta: Temperature for the preference loss

  • cpo_alpha: Weight of the behavior cloning (NLL) loss on chosen responses

ORPOLoss

Odds Ratio Preference Optimization unifies SFT and preference alignment in a single loss.

from twinkle.loss import ORPOLoss

loss_fn = ORPOLoss(beta=0.1)

The loss combines a standard NLL term on chosen responses with a log-odds-ratio penalty that pushes the model away from rejected responses.

All preference losses inherit shared utilities from PreferenceLossBase, including log-probability computation, chosen/rejected splitting, and sequence-level aggregation.