Publications
2025
Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
Laura Kopf, Nils Feldhus, Kirill Bykov, Philine Lou Bommer, Anna Hedström, Marina M.-C. Höhne, and Oliver Eberle
NeurIPS 2025 & Mechanistic Interpretability Workshop @ NeurIPS 2025
OpenReview | arXiv | GitHub
Interpreting Language Models Through Concept Descriptions: A Survey
Nils Feldhus and Laura Kopf
BlackboxNLP @ EMNLP 2025
ACL Anthology | arXiv | OpenReview
Human and LLM-based Assessment of Teaching Acts in Expert-led Explanatory Dialogues
Aliki Anagnostopoulou, Nils Feldhus, Yi-Sheng Hsu, Milad Alshomary, Henning Wachsmuth, and Daniel Sonntag
CODI @ EMNLP 2025 (Rejected from ACL 2025, SIGDIAL 2025)
ACL Anthology | OpenReview | GitHub

Multilingual Datasets for Custom Input Extraction and Explanation Requests Parsing in Conversational XAI Systems
Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann, Fedor Splitt, Jiaao Li, Yoana Tsoneva, Sebastian Möller, and Vera Schmitt
EMNLP 2025 Findings
ACL Anthology | arXiv | GitHub
Truth or Twist? Model Selection Strategy for Reliable Label Flipping Evaluation in LLM-based Counterfactuals
Qianli Wang, Van Bach Nguyen, Nils Feldhus, Luis Felipe Villa-Arenas, Christin Seifert, Sebastian Möller, and Vera Schmitt
INLG 2025 (Oral)
ACL Anthology | arXiv | GitHub
Free-text Rationale Generation under Readability Level Control
Yi-Sheng Hsu, Nils Feldhus, and Sherzod Hakimov
GEM² @ ACL 2025 (Rejected from EMNLP 2024, NAACL 2025, ACL 2025)
ACL Anthology | arXiv | GitHub

FitCF: A Framework for Automatic Feature Importance-based Counterfactual Example Generation
Qianli Wang, Nils Feldhus, Simon Ostermann, Luis Felipe Villa-Arenas, Sebastian Möller, and Vera Schmitt
ACL 2025 Findings
ACL Anthology | arXiv | GitHub
Table Understanding and (Multimodal) LLMs: A Cross-Domain Case Study on Scientific vs. Non-Scientific Data
Ekaterina Borisova, Fabio Barth, Nils Feldhus, Raia Abu Ahmad, Malte Ostendorff, Pedro Ortiz Suarez, Georg Rehm, and Sebastian Möller
TRL @ ACL 2025 (Best Runner-up Paper 🥈)
ACL Anthology | OpenReview | GitHub
Exploring Semantic Filtering Heuristics for Efficient Claim Verification
Max Upravitelev, Arthur Hilbert, Premtim Sahitaj, Tatiana Anikina, Nils Feldhus, Jing Yang, Veronika Solopova, Simon Ostermann, and Vera Schmitt
FEVER @ ACL 2025
ACL Anthology

Infherno: End-to-end Agent-based FHIR Resource Synthesis from Free-form Clinical Notes
Johann Frei, Nils Feldhus, Lisa Raithel, Roland Roller, Alexander Meyer, and Frank Kramer
In submission
arXiv | GitHub | Video | Demo | HuggingFace Spaces
Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods
Mahdi Dhaini, Ege Erdogan, Nils Feldhus, and Gjergji Kasneci
FAccT 2025
ACM Digital Library | arXiv | GitHub

Through a Compressed Lens: Investigating the Impact of Quantization on LLM Explainability and Interpretability
Qianli Wang, Mingyang Wang, Nils Feldhus, Simon Ostermann, Yuan Cao, Sebastian Möller, Hinrich Schütze, and Vera Schmitt
In submission
arXiv
Cross-Refine: Improving Natural Language Explanation Generation by Learning in Tandem
Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann, Sebastian Möller, and Vera Schmitt
COLING 2025
ACL Anthology | arXiv | GitHub
2024
CoXQL: A Dataset for Explanation Request Parsing in Conversational XAI Systems
Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann, and Sebastian Möller
EMNLP 2024 Findings
ACL Anthology | arXiv | GitHub
Democratizing Advanced Attribution Analyses for Generative Language Models with the Inseq Toolkit
Gabriele Sarti, Nils Feldhus, Jirui Qi, Malvina Nissim, and Arianna Bisazza
xAI 2024 Late-Breaking Work & Demos
CEUR-WS
Conversational XAI and Explanation Dialogues
Nils Feldhus
YRRSDS @ SIGDIAL 2024
ACL Anthology

German Voter Personas can Radicalize LLM Chatbots via the Echo Chamber Effect
Maximilian Bleick, Nils Feldhus, Aljoscha Burchardt, and Sebastian Möller
INLG 2024
ACL Anthology | GitHub

Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues
Nils Feldhus, Aliki Anagnostopoulou, Qianli Wang, Milad Alshomary, Henning Wachsmuth, Daniel Sonntag, and Sebastian Möller
ACM GoodIT 2024 (Work in Progress track)
ACM Digital Library
The Value of Text for Multimodal Decision Support using XAI
Ajay Madhavan Ravichandran, Julianna Grune, Nils Feldhus, Roland Roller, Aljoscha Burchardt, and Sebastian Möller
BioNLP @ ACL 2024
ACL Anthology
QoEXplainer: Mediating Explainable Quality of Experience Models with Large Language Models
Nikolas Wehner, Nils Feldhus*, Michael Seufert, Sebastian Möller, and Tobias Hoßfeld
16th International Conference on Quality of Multimedia Experience (QoMEX 2024) Demos / ACM IMX 2024
IEEE Xplore
* joint first authorship

LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools
Qianli Wang, Tatiana Anikina, Nils Feldhus, Josef van Genabith, Leonhard Hennig, and Sebastian Möller
NAACL 2024 Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP) (Rejected from NAACL 2024 System Demonstrations)
ACL Anthology | arXiv | GitHub
The Role of Explainability in Collaborative Human-AI Disinformation Detection
Vera Schmitt, Luis Felipe Villa-Arenas, Nils Feldhus, Joachim Meyer, Robert P. Spang, and Sebastian Möller
FAccT 2024 (Rejected from Mediate @ ICWSM 2023, HCOMP 2023, and CHI 2024)
ACM Digital Library
2023
InterroLang: Exploring NLP Models and Datasets through Dialogue-based Explanations
Nils Feldhus, Qianli Wang, Tatiana Anikina, Sahil Chopra, Cennet Oguz, and Sebastian Möller
EMNLP 2023 Findings & BlackboxNLP Workshop
ACL Anthology | arXiv | GitHub
Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods
Nils Feldhus, Leonhard Hennig, Maximilian Dustin Nasert, Christopher Ebert, Robert Schwarzenberg, and Sebastian Möller
ACL 2023 Workshop on Natural Language Reasoning and Structured Explanations (NLRSE) (Rejected from BlackboxNLP 2022 and EACL 2023)
ACL Anthology | arXiv | GitHub
Inseq: An Interpretability Toolkit for Sequence Generation Models
Gabriele Sarti, Nils Feldhus, Ludwig Sickert, Oskar van der Wal, Malvina Nissim, and Arianna Bisazza
ACL 2023 System Demonstrations
ACL Anthology | arXiv | GitHub | Project page
Pre-trained Language Models for the Automatic Evaluation of Customer Chatbot Dialogs
Mika Rebensburg, Stefan Hillmann, and Nils Feldhus
ESSV 2023
Proceedings
Adapters for Resource-Efficient Deployment of NLU Models
Jan Nehring, Akhyar Ahmed, and Nils Feldhus
ESSV 2023 (Rejected from COLING 2022)
Proceedings | GitHub
Fighting Disinformation - Overview of Recent AI-based Collaborative Human-Computer Interaction for Intelligent Decision Support Systems
Tim Polzehl, Vera Schmitt, Nils Feldhus, Joachim Meyer, and Sebastian Möller
HUCAPP 2023
SciTePress
2022
XAINES: Explaining AI with Narratives
Mareike Hartmann, Han Du, Nils Feldhus, Ivana Kruijff-Korbayová, and Daniel Sonntag
KI - Künstliche Intelligenz
Journal article on Springer
Mediators: Conversational Agents Explaining NLP Model Behavior
Nils Feldhus, Ajay Madhavan Ravichandran, and Sebastian Möller
IJCAI-ECAI 2022 Workshop on XAI
arXiv | Slides
Towards Personality-aware Chatbots
Daniel Fernau, Stefan Hillmann, Nils Feldhus, Tim Polzehl, and Sebastian Möller
SIGDIAL 2022
ACL Anthology | Video (Live presentation)
Towards Automated Dialog Personalization using MBTI Personality Indicators
Daniel Fernau, Stefan Hillmann, Nils Feldhus, and Tim Polzehl
INTERSPEECH 2022
ISCA Proceedings
A Comparison of Feature Extraction Models for Medical Image Captioning
Sebastian Germer, Hristina Uzunova, Jan Ehrhardt, Nils Feldhus, Philippe Thomas, and Heinz Handels
GMDS-TMF 2022
PDF
An Annotated Corpus of Textual Explanations for Clinical Decision Support
Roland Roller, Aljoscha Burchardt, Nils Feldhus, Laura Seiffe, Klemens Budde, Simon Ronicke, and Bilgin Osmanodja
LREC 2022
ACL Anthology
What to explain when explaining is difficult? An interdisciplinary primer on XAI and meaningful information in automated decision-making
Hadi Asghari, Nadine Birner, Aljoscha Burchardt, Daniela Dicks, Judith Fassbinder, Nils Feldhus, Freya Hewett, Vincent Hofmann, Matthias C. Kettemann, Wolfgang Schulz, Judith Simon, Jakob Stolberg-Larsen, and Theresa Züger
Project report (published 2022-03-22)
Full report
2021
Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools
Nils Feldhus, Robert Schwarzenberg, and Sebastian Möller
2021 Conference on Empirical Methods in Natural Language Processing (EMNLP): System Demonstrations
ACL Anthology | arXiv | GitHub | Video
Efficient Explanations from Empirical Explainers
Robert Schwarzenberg, Nils Feldhus, and Sebastian Möller
4th Workshop on Analyzing and Interpreting Neural Networks for NLP (co-located with EMNLP 2021) (Retracted from ACL 2021, Rejected from EMNLP 2021)
BlackboxNLP 2021 proceedings | arXiv | GitHub
Combining Open Domain Question Answering with a Task-Oriented Dialog System
Jan Nehring, Nils Feldhus, Harleen Kaur, and Akhyar Ahmed
1st Workshop on Document-grounded Dialogue and Conversational Question Answering (DialDoc 2021)
ACL Anthology
European Language Grid: A Joint Platform for the European Language Technology Community
Georg Rehm et al.
16th Conference of the European Chapter of the Association for Computational Linguistics (EACL): System Demonstrations
EACL 2021 Proceedings
2020
Evaluating German Transformer Language Models with Syntactic Agreement Tests
Karolina Zaczynska, Nils Feldhus*, Robert Schwarzenberg, Aleksandra Gabryszak, and Sebastian Möller
5th Swiss Text Analytics Conference (SwissText) & 16th Conference on Natural Language Processing (KONVENS)
SwissText/KONVENS 2020 Proceedings | arXiv | GitHub
* joint first authorship
Towards an Interoperable Ecosystem of AI and LT Platforms: A Roadmap for the Implementation of Different Levels of Interoperability
Georg Rehm et al.
1st International Workshop on Language Technology Platforms (co-located with LREC 2020)
IWLTP 2020 Proceedings