I’m a postdoc at TakeLab, University of Zagreb, where I am substituting for Jan Šnajder in the summer semester of 2025. My research interests within natural language processing are interpretability, efficient prompting of LLMs, LLM distillation, and enforcing the safety and faithfulness of (LLM-)generated texts.
Previously, I was a postdoc at the Technion, working with Yonatan Belinkov, and at the UKP Lab at TU Darmstadt, where I worked on the InterText initiative. I obtained my PhD at the University of Zagreb under the supervision of Jan Šnajder. Before my PhD, I worked at the European Commission’s Joint Research Centre in Ispra on using NLP to update the Sendai Framework for Disaster Risk Reduction 2015–2030.
I am currently on the job market for academic opportunities. Have a look at my CV and reach out if you think I would be a good fit.
News
April 2025.
- We released a Mechanistic Interpretability Benchmark, a step towards standardizing evaluation in mechanistic interpretability! The paper describing our effort has been accepted to ICML 2025.
February 2025.
- New preprint: Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps
- I am substituting for Jan Šnajder at the University of Zagreb during the summer semester, teaching Introduction to AI.
September 2024.
- Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs is accepted to EMNLP 2024 main!
June 2024.
- New preprint: REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space
February 2024.
- I started as a postdoc at the Technion, working with Yonatan Belinkov.
January 2024.
- New preprint: Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs
- CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration is accepted to EACL 2024 main!
- BLOOD: Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness is accepted to ICLR 2024 as a poster!