Submitted by Maksim Afanasyev
SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization
Floating Point Sigma Lab