At CCC, we talk a lot about “affiliation disambiguation.” It’s a mouthful, but it’s a struggle publishers know very well: the challenge of untangling authors and institutions in their customer records. Inaccurate and incomplete affiliation data disrupts operations across the research lifecycle, with negative impacts on researchers, institutions, funders, and publishers.
Fortunately, there are solutions and best practices to help publishers tackle this problem. I spoke with Shannon Reville and Stephen Howe, product management leaders at CCC, to break down the concept of affiliation disambiguation, why it matters in scholarly publishing, and what publishers can do to improve data quality.
In Part 2 of this blog series, we discuss solutions and best practices for the challenges of affiliation disambiguation in scholarly publishing. For more on the challenges themselves, see Part 1.
What types of tools or technologies does CCC offer to help publishers tackle affiliation disambiguation?
CCC’s Ringgold service offers an expertly curated database of organizations that fund, create, and use scholarly research. With over 750,000 Ringgold persistent identifiers, metadata records with rich hierarchies, and more than 30 descriptive metadata elements, Ringgold helps publishers normalize and disambiguate organization data. Ringgold is the only solution to leverage structured organizational hierarchies and consortium relationships to describe complex relationships among parties in the scholarly communications ecosystem. The database also includes ISNI (International Standard Name Identifier) IDs, which follow an open ISO standard, to enable flexibility and broad interoperability. Over 99.9% of Ringgold identifiers have a corresponding ISNI ID.
Ringgold also includes auditing services to help publishers obtain expertly curated, normalized, and enriched organization data to enable data-driven business decisions.
Sister to Ringgold, CCC’s OA Intelligence can help publishers automate disambiguation of historical publication records, especially where affiliation data was collected without standardization or where standardized collection was optional. A simple import of publication data results in an overnight overhaul.
How does OA Intelligence approach the problem of affiliation disambiguation?
OA Intelligence steps in post-acceptance or post-publication when earlier attempts to collect PIDs have failed or produced inconsistent results. Our AI-powered disambiguation engine reads a variety of manuscript metadata, either as it flows in real time from CCC’s RightsLink for Scientific Communications solution into OA Intelligence, or as imported by publishers who do not use RightsLink in their publishing workflow. OA Intelligence can disambiguate some of the sparsest affiliation data by using email domains (even long subdomains), free-text institution names (including those with spelling mistakes and abbreviations), APC payment details, and more. Every match is assigned a Ringgold ID and rated on a confidence scale from 0 to 100%, and the high-confidence matches are surfaced in a searchable interface.
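To make the idea of matching sparse affiliation data concrete, here is a minimal sketch in Python. It is not how OA Intelligence works internally; it simply illustrates combining an email domain (including subdomains) with a fuzzy comparison of a misspelled free-text name to produce a 0-100 confidence score. The reference records, the `ORG-…` identifiers, the `.example` domains, and the size of the domain boost are all made-up assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple

# Hypothetical reference records; the IDs and domains are placeholders,
# not real Ringgold identifiers.
REFERENCE = [
    {"id": "ORG-0001", "name": "University of Anywhere",
     "domains": ["anywhere.example"]},
    {"id": "ORG-0002", "name": "Somewhere Institute of Technology",
     "domains": ["somewhere.example"]},
]


def domain_matches(email: str, domain: str) -> bool:
    """True if the email's host is the domain or any subdomain of it."""
    host = email.rsplit("@", 1)[-1].lower()
    return host == domain or host.endswith("." + domain)


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(free_text_name: str,
               email: Optional[str] = None) -> Tuple[Optional[str], int]:
    """Return (organization id, confidence 0-100) for the best candidate."""
    best_id, best_score = None, 0.0
    for org in REFERENCE:
        score = name_similarity(free_text_name, org["name"])
        # Assumption: a matching email domain adds an arbitrary fixed boost.
        if email and any(domain_matches(email, d) for d in org["domains"]):
            score = min(1.0, score + 0.3)
        if score > best_score:
            best_id, best_score = org["id"], score
    return best_id, round(best_score * 100)


org_id, confidence = best_match("Univ of Anywere",
                                "jane@physics.lab.anywhere.example")
print(org_id, confidence)
```

A real system would weigh many more signals and would only surface matches above a chosen confidence threshold, but the shape is the same: messy input in, one identifier and a score out.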
What role do authoritative databases like Ringgold play in disambiguation solutions?
The world is full of different references to the same thing. Look at any set of addresses and you will see references to ‘United States’, ‘US’, ‘USA’, ‘United States of America’, ‘U.S.A.’, etc. How do we know what country all these different examples refer to and whether they refer to the same country? Well, it helps to have a standardized list of countries that accurately describes the real countries of the world and describes each of those countries in a standardized way. The same is true with organizations. We need a standardized and authoritative reference dataset of organizations to help us interpret and evaluate the local organization references we encounter in our workflows. The Ringgold database provides an accurate list of organizations relevant to scholarly research and publishing, and it describes each organization using standardized naming conventions and a unique identifier.
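The country example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the canonical list, the variant table, and the `resolve_country` helper are hypothetical, and the two-letter keys are borrowed from ISO 3166-1 alpha-2 codes. An authoritative organization database plays the same role for institutions, just at a far larger scale and with curated hierarchies.

```python
import re
from typing import Optional

# Hypothetical canonical list: one standardized record per country,
# keyed by a unique identifier (here, an ISO 3166-1 alpha-2 code).
CANONICAL_COUNTRIES = {
    "US": "United States of America",
    "GB": "United Kingdom",
}

# Hypothetical table of known local variants, mapped to the identifier.
VARIANTS = {
    "us": "US",
    "usa": "US",
    "united states": "US",
    "united states of america": "US",
    "uk": "GB",
    "united kingdom": "GB",
    "great britain": "GB",
}


def normalize_key(raw: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", raw.lower())
    return " ".join(cleaned.split())


def resolve_country(raw: str) -> Optional[str]:
    """Map a local reference to its canonical identifier, if known."""
    return VARIANTS.get(normalize_key(raw))


for reference in ["United States", "US", "USA",
                  "United States of America", "U.S.A."]:
    code = resolve_country(reference)
    print(reference, "->", code, CANONICAL_COUNTRIES.get(code))
```

Every variant resolves to the same identifier, which is what lets downstream systems count, group, and report on records reliably.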
What best practices would you recommend to publishers looking to improve their affiliation data quality?
- Review your submission process, in particular:
  - The way you communicate the need for affiliation information to your submitting authors. Why is it important for them to provide up-to-date info for the corresponding author? What benefits might they expect from an accurate institution affiliation? Will errors cause delays in publication?
  - Stop allowing free-text entry of affiliation info. Invest in and require selection of a standardized institution name, backed by a persistent identifier.
  - Make the affiliation field required. Yes, there are edge cases. Yes, you’ll have to deal with them or allow a workaround when requested, but don’t let perfection be the enemy of very, very good.
- Be sure to push that affiliation data everywhere! To all the downstream publication systems and beyond.
- Consider a pre-acceptance affiliation checkpoint. We try to avoid recommending manual work, but publishers doing this reap the benefits of automation and program scalability in the long run. If affiliation data is collected right the first time, and you’ve made sure of it, you will spare yourself years of making sense of gaps.
- Tackle that backlog. You’ve got a long list of ambiguous authors or manuscripts sitting in an Excel spreadsheet, and your sales/operations/OA/executive team needs to know how many came from the University of Anywhere.
Affiliation disambiguation may seem like a behind-the-scenes concern, but the impact of inaccurate metadata ripples across nearly every aspect of scholarly publishing. Investing in affiliation disambiguation solutions is more than a metadata cleanup project—it’s a strategic move toward greater transparency, accountability, and trust.
Fortunately, there are tools like Ringgold and OA Intelligence to help make sense of the chaos. Reach out to CCC to learn more about how we support publishers with affiliation disambiguation challenges.
Keep learning
- What Sets Ringgold Solutions Apart
- Case Study: IEEE Uses Ringgold Solutions to Normalize and Disambiguate Organization Data to Strengthen the Integrity of Article Metadata
- Case Study: The Company of Biologists shortens publication timeline and scales Read & Publish program with CCC’s RightsLink for Scientific Communications and OA Intelligence