I’m a postdoc at TakeLab, University of Zagreb. My research interests within natural language processing are faithful explainability, safety, and controllability of language models.
Previously, I did a postdoc at the Technion with Yonatan Belinkov, working on unlearning and faithful explainability of language models. Before that, I did a postdoc at the UKP Lab at TU Darmstadt with Iryna Gurevych, working on the InterText initiative. I obtained my PhD at the University of Zagreb under the supervision of Jan Šnajder. Before my PhD, I worked at the European Commission’s Joint Research Centre in Ispra on using NLP to update the Sendai Framework for Disaster Risk Reduction 2015–2030.
I am on the job market for academic opportunities. Check out my CV and reach out if you think I would be a good fit.
News
November 2025.
- FUR has received an outstanding paper award at EMNLP 2025! If you haven’t yet, read our paper on using parametric interventions to measure CoT faithfulness!
- Our benchmark investigating the capacity of LLMs to track and model local world states in conversations has been accepted to AAAI 2026 as an oral! Check out the paper!
October 2025.
- We released a benchmark evaluating whether LLM agents are safe for use in managerial decisions. Check out the [paper & data]!
September 2025.
- We released a new preprint on directly encoding contextual information into adapter parameters in a compositional manner! Check out the [paper]!
August 2025.
- We released a new preprint on using SAEs to precisely & permanently erase harmful concepts from LMs! Check out the preprint: [paper].
July 2025.
- Our paper “Predicting Success of Model Editing via Intrinsic Features” has been accepted to the Interplay workshop at COLM 2025!
June 2025.
- Our paper studying diachronic word embeddings trained on Croatian corpora has been accepted to the Slavic NLP workshop at ACL 2025! Check out our [paper].
May 2025.
- REVS, our gradient-free method for erasing sensitive information from language models, has been accepted to Findings of ACL 2025! Check out the [paper & code].
April 2025.
- We released the Mechanistic Interpretability Benchmark, a step towards standardizing evaluation in mechanistic interpretability! The paper describing our effort has been accepted to ICML 2025.
