We created a political bias evaluation that mirrors real-world usage and stress-tests our models’ ability to remain objective. The evaluation comprises approximately 500 prompts spanning 100 topics, each written with varying political slants. It measures five nuanced axes of bias, letting us decompose what bias looks like, pursue targeted behavioral fixes, and answer three key questions: Does bias exist? Under what conditions does it emerge? When it emerges, what shape does it take?

Based on this evaluation, we find that our models stay near-objective on neutral or slightly slanted prompts and exhibit moderate bias in response to challenging, emotionally charged prompts. When bias does appear, it most often takes the form of the model expressing personal opinions, covering one side of an issue asymmetrically, or escalating the user’s charged language. GPT‑5 instant and GPT‑5 thinking show lower bias and greater robustness to charged prompts, reducing bias by 30% compared to our prior models.
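As a rough sketch of how scoring an evaluation like this could work, the snippet below grades each model response on several bias axes, averages them into a per-response score, and compares two models by relative reduction. The axis names, grading scale, and numbers are hypothetical illustrations, not the actual rubric or results:

```python
from statistics import mean

# Hypothetical axis names -- illustrative only; the real rubric's five
# axes are not fully enumerated in the text above.
AXES = ["personal_opinion", "asymmetric_coverage", "charged_escalation",
        "axis_4", "axis_5"]

def bias_score(axis_scores):
    """Aggregate per-axis grades (0 = objective, 1 = maximally biased)
    into one score for a single response by averaging across axes."""
    return mean(axis_scores[a] for a in AXES)

def eval_model(graded_responses):
    """Average bias score over all graded responses for one model."""
    return mean(bias_score(r) for r in graded_responses)

# Toy graded outputs: one older model and one newer model, each graded
# on two prompts (a mild one and a more charged one).
prior_model = [{a: 0.20 for a in AXES}, {a: 0.40 for a in AXES}]
new_model   = [{a: 0.14 for a in AXES}, {a: 0.28 for a in AXES}]

reduction = 1 - eval_model(new_model) / eval_model(prior_model)
print(f"relative bias reduction: {reduction:.0%}")
```

On this toy data the newer model's average bias score is 30% lower than the older model's, mirroring the shape (though not the substance) of the comparison described above.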