AI Visualization & Interpretability (AIVI) seminar

When and where the weekly meetings will take place

Description

As AI systems grow more powerful, there is an increasing need to make these complex black-box models interpretable and explainable. This seminar explores how data visualization techniques and interactive interfaces can reveal how AI models operate, how they arrive at their outputs, and which factors they take into account. It also covers how to communicate interpretability visualizations and fairness and bias evaluations effectively to different stakeholders.

Logistics

Attendees will take turns presenting explorables, demonstrators, and research papers to gain an understanding of how visualization can demystify AI, foster transparency, and enable real-world deployment of these systems in high-stakes domains. Presentations take place throughout the semester, starting in week 4. Attendees are expected to participate in person and, when they are not presenting themselves, to provide feedback to the presenters.

Project requirements

  • Intermediate or advanced knowledge of machine learning
  • Course language: English (student consultations can optionally be done in German)

Grading

  • Regular in-person attendance
  • Preparatory reading
  • 20 min presentation (+ 10 min Q&A) of an existing paper (“journal club style”)
  • 4-page paper on the area determined by the presented paper

Eligible modules

Schedule

Date        Session     Topic
2025-10-24  Session 1   Introduction (presentation of areas and topics)
2025-10-31  No session  Deadline for Moses registration (14:00); Deadline for the ranked-choice poll on papers to present (23:59)
2025-11-07  No session  Preparation of presentations
2025-11-14  Session 2
2025-11-21  Session 3
2025-11-28  Session 4
2025-12-05  Session 5
2025-12-12  Session 6
2025-12-19  No session  Academic holidays
2025-12-26  No session  Academic holidays
2026-01-02  No session  Academic holidays
2026-01-09  Session 7
2026-01-16  Session 8
2026-01-23  Session 9
2026-01-30  Session 10
2026-02-06  Session 11
2026-02-13  Session 12
2026-03-13              Submission deadline for papers

Literature

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models (Fel et al., ICML 2025)
A Visual Guide to LLM Agents (Grootendorst, 2025)
A Visual Guide to Quantization (Grootendorst, 2024)
Building Appropriate Mental Models: What Users Know and Want to Know about an Agentic AI Chatbot (Brachman et al., IUI 2025)
Circuit Tracing: Revealing Computational Graphs in Language Models (Ameisen et al., 2025)
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM (Lam et al., CHI 2024)
DeepLens: Interactive Out-of-distribution Data Detection in NLP Models (Song et al., CHI 2023)
Demystifying Verbatim Memorization in Large Language Models (Huang et al., EMNLP 2024)
exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformer Models (Hoover et al., ACL 2020)
Explainability Perspectives on a Vision Transformer: From Global Architecture to Single Neuron (Marx et al., VISxAI 2024)
Explaining Text-to-Command Conversational Models (Stupar et al., VISxAI 2024)
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping (Wang et al., CHI 2024)
From Discovery to Adoption: Understanding the ML Practitioners’ Interpretability Journey (Ashtari et al., DIS 2023)
Interactive Model Cards: A Human-Centered Approach to Model Documentation (Crisan et al., FAccT 2022)
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models (Kahng et al., CHI 2024)
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models (Tufanov et al., ACL 2024)
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens (Liu et al., ACL 2025)
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models (Ghandeharioun et al., ICML 2024)
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning (Bricken et al., Transformer Circuits 2023)
Toy Models of Superposition (Elhage et al., Transformer Circuits 2022)
Transformer Explainer: Interactive Learning of Text-Generative Models (Cho et al., VIS 2024)
Understanding and Comparing Multi-Modal Models (Humer et al., VISxAI 2024)
Where is the information in data? (Murphy & Bassett, VISxAI 2024)
Xplique: A Deep Learning Explainability Toolbox (Fel et al., CVPR 2022)

(sorted alphabetically)