2.5
What can go wrong
~5 min
Author disambiguation
Same name, different people. Different spellings, same person. Left unresolved, both skew every author count: a split profile undercounts one person, a merged one inflates another. Check author profiles and verify before you trust the numbers.
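A minimal sketch of how you might flag name variants for manual review, assuming names arrive as "Surname, Given" or "Given Surname" strings. The function names `name_key` and `candidate_groups` are illustrative, not part of any bibliometric tool, and the key (accent-free surname plus first initial) is deliberately crude. It only surfaces candidates. It cannot tell two different people with the same name apart, so every group still needs checking against author profiles.

```python
import unicodedata
from collections import defaultdict


def name_key(name: str) -> str:
    """Crude matching key: accent-free, lowercase surname plus first initial."""
    # Decompose accented characters and drop the combining marks (e.g. u-umlaut -> u).
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if "," in plain:
        # "Surname, Given names"
        surname, given = (part.strip() for part in plain.split(",", 1))
    else:
        # Assumption: "Given names Surname", with the last token as the surname.
        tokens = plain.split()
        if not tokens:
            return ""
        surname, given = tokens[-1], " ".join(tokens[:-1])
    surname = surname.lower().replace("-", " ")
    initial = given[:1].lower()
    return f"{surname} {initial}".strip()


def candidate_groups(names):
    """Return only the keys shared by more than one distinct spelling."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return {key: variants for key, variants in groups.items()
            if len(set(variants)) > 1}


if __name__ == "__main__":
    # Made-up names, purely for illustration.
    sample = ["Müller, Jürgen", "Muller, J.", "J. Muller", "Smith, Anna"]
    for key, variants in candidate_groups(sample).items():
        print(key, "->", variants)
```

Treat each group as a question, not an answer: compare affiliations and active years before merging anything.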
Messy data
Clean the export before you map it. Deduplicate, standardise (e.g. 'covid-19' vs 'COVID 19' vs 'SARS-CoV-2'), filter, then visualise. Rubbish in, confident rubbish out.
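The deduplicate-and-standardise steps could look roughly like this in plain Python. Everything here is an assumption for illustration: the record fields (`doi`, `title`), the `SYNONYMS` map, and the function names are hypothetical, and a real export will have its own column names. Which terms count as synonyms is a judgment call you make for your own dataset.

```python
import re

# Hypothetical synonym map: normalised variant -> canonical term.
# Extend it with the variants you actually find in your export.
SYNONYMS = {
    "covid 19": "covid-19",
    "sars cov 2": "covid-19",
}


def standardise(term: str) -> str:
    """Lowercase, treat hyphens/underscores as spaces, then apply the map."""
    key = re.sub(r"[\s\-_]+", " ", term.strip().lower())
    return SYNONYMS.get(key, key)


def dedupe(records):
    """Keep the first record per DOI, or per normalised title if no DOI."""
    seen = set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        title = re.sub(r"\W+", " ", rec.get("title", "").lower()).strip()
        key = ("doi", doi) if doi else ("title", title)
        if key in seen:
            continue
        seen.add(key)
        unique.append(rec)
    return unique


if __name__ == "__main__":
    print([standardise(t) for t in ["covid-19", "COVID 19", "SARS-CoV-2"]])
    # Made-up records, purely for illustration.
    records = [
        {"doi": "10.1/ABC", "title": "A Study"},
        {"doi": "10.1/abc", "title": "A study."},
        {"doi": "", "title": "Other"},
    ]
    print(len(dedupe(records)))
```

One known gap in this sketch: a record with a DOI and a copy of the same paper without one will not be matched, so a second pass on titles may still be worth doing by hand.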
Over-trusting AI labels
A model will name a cluster with total confidence and be wrong. Use AI as a starting point, not the final word. Read the papers, then decide.
AI prompts (1)
Prompt
Author disambiguation checker
When: Your top-authors table looks suspicious.
I'll give you a list of author name variants from a bibliometric export. Identify likely duplicates (same person, different spellings) and likely collisions (same name, different people).

Author list (name; affiliation if available; total papers; years active):
<PASTE>

Return:
1. Likely-same-person groups, each with a recommended canonical name and the reason (initial style, accent, hyphenation, affiliation overlap).
2. Likely-different-people warnings (same name but mismatched affiliation/era).
3. Cases you can't tell from the data. For each, list what extra field would resolve it.

Be cautious. When in doubt, mark UNCERTAIN rather than merge.