ICLR 2026 – Institutional Affiliations Dataset and Analysis
End-to-end pipeline that turns 5,356 ICLR 2026 accepted papers into a clean, PDF-derived institutional-affiliation dataset and a publication-ready treemap of who is shaping AI research right now.

This avoids the OpenReview-profile drift problem, where an author's current job appears on every paper they ever wrote (e.g. listing Wyoming as the affiliation for a paper actually written at UBC). Affiliations come from each paper's PDF title block, not from author profiles.

Follow me for more analysis like this, plus AI engineering and research insights. If this dataset or the pipeline is useful to your work, a follow / star is the easiest way to encourage me to keep publishing this kind of analysis.

The treemap
Each rectangle is one institution, sized by the number of accepted papers it appears on (counted once per paper, regardless of how many of the paper's authors are affiliated with it). Region cells are sized by the cumulative count of their top-50 institutions. Lighter shade = academia / research institute; darker shade = industry.

Square version (for social posts): charts/iclr2026_top50_treemap_unique_grouped_square.png

Regenerating the chart
The chart script reads data/iclr2026_public.csv and writes the treemap PNGs/SVGs into charts/.
- Add --shape square for a 1:1 version.
- Add --source openreview to compare against the OpenReview-profile-only version (requires running the scraper first).

Re-deriving the dataset
You only need this if you want to re-derive the dataset (e.g., for a new conference). It takes roughly 1–2 hours of network time and about 5 GB of disk for the PDF cache.

parse_pdf_affiliations.py handles four layout patterns common in ICLR template papers, plus a footnote-text filter that catches and discards strings such as "Equal contribution", "Corresponding author", "Project lead", and "These authors contributed equally", which used to leak into affiliation strings before being filtered out.

Result: 96% of papers parse successfully; the remaining 4% fall back to OpenReview profile data, transparently flagged in the Affiliation_source column.

License
MIT.
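The once-per-paper counting rule above can be sketched as follows. This is a minimal illustration, not the repository's actual code; the (paper_id, institution) pair representation is an assumption about how rows in data/iclr2026_public.csv could be read.

```python
from collections import Counter

def institution_paper_counts(rows):
    """Count, per institution, the number of distinct papers it appears on.

    `rows` is an iterable of (paper_id, institution) pairs. An institution
    listed by several authors of the same paper is counted only once for
    that paper, matching the treemap sizing rule described above.
    """
    counts = Counter()
    for _paper_id, institution in set(rows):  # dedupe (paper, institution) pairs
        counts[institution] += 1
    return counts
```

The `set()` over pairs is what implements "counted once per paper": repeated author rows for the same paper collapse before tallying.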
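The footnote-text filter described above could look like the sketch below. The real parse_pdf_affiliations.py is not reproduced here, so the function name and exact patterns are illustrative assumptions based on the phrases the README says it discards.

```python
import re

# Footnote phrases that should never be treated as affiliations.
FOOTNOTE_PATTERNS = re.compile(
    r"equal contribution"
    r"|corresponding author"
    r"|project lead"
    r"|these authors contributed equally",
    re.IGNORECASE,
)

def clean_affiliations(lines):
    """Drop title-block lines that are footnotes rather than affiliations."""
    return [
        line.strip()
        for line in lines
        if line.strip() and not FOOTNOTE_PATTERNS.search(line)
    ]
```

A filter like this runs after the title block is split into candidate lines, so footnote markers never reach the affiliation strings.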
The data is derived from publicly available OpenReview submissions and ICLR 2026 paper PDFs; please cite this repository if you use it in published work.

If you build something on top of this, ping me; I'm always interested in seeing where this kind of pipeline gets used. And if you want more posts like this (research-engineering deep dives, applied AI analysis, papers I'm reading), the best place is:

— Dmytro Lopushanskyy
- First seen: May 15, 2026, 6:50 AM
- Last updated: May 15, 2026, 8:01 AM