arXiv AI recent: Frame-Conditioned Moral Computation in LLaMA 3.1-8B-Instruct: A Mechanistic Interpretability Audit of Ethical Reasoning
Researchers used the Transluce mechanistic-interpretability platform to analyze the internal computations of LLaMA 3.1-8B-Instruct on 54 moral prompts. They identified a Situational Ancho...
The audit examined four prompt batteries (B1, B3, B4, B5) covering dilemmas, policy, meta-ethical questions, role-playing scenarios, and controlled trolley variations. Two metric families (five cluster-level metrics and a six-metric neuron-level panel) converged on the finding that the model’s ethi...