Two new papers on sampling from LLMs and AI alignment: "Soft Best-of-n" and "Inference-Time Reward Hacking in LLMs".