Knowledge Graphs as Belief System Encapsulations

By Babis Marmanis, 6 January 2021

This blog is the continuation of a talk that I gave at the Outsell Signature Event on 12 November 2020, where I participated in a panel discussion with CCC President & CEO Tracey Armstrong and moderator David Worlock on “Using AI to Create Collaboration, Partnership, and New Business Opportunities: Launching the CCC Knowledge Graph.”

When the COVID-19 pandemic hit us, the global research community went into high gear to study the disease and to share their research in hopes of finding a solution. This increase in research output created a new challenge for scientific publishers: finding enough qualified peer reviewers to keep up with the influx of manuscript submissions.

Since early 2020, thousands of manuscripts had been submitted every week in that area of research alone. From the perspective of a publisher overseeing peer-reviewed journals, that is a tremendous number of new manuscripts to vet, edit, and publish. The demand for quick turnaround of high-quality reviews, in order to accelerate progress, further intensified the pressure to identify good candidates.

As everyone rushed to combat COVID, we at CCC also wanted to contribute what we could. By leveraging our data and technology, we developed a knowledge graph to help publishers address the problem of identifying suitable candidates for peer reviews in the COVID space. In what follows, I will describe how our approach can be significant beyond the COVID-specific work.

To begin with, we need to emphasize that the key term in “knowledge graph” is the word “knowledge” rather than “graph.” Hence, we should first define what the word “knowledge” means for our discussion. There is a long-standing debate about the data value chain as promoted by Russell Ackoff. The Data-Information-Knowledge-Wisdom (DIKW) hierarchy, as it came to be known, was brought to prominence by his address to the International Society for General Systems Research in 1989. At the highest level, it is generally accepted that the data value chain can be summarized by two key transitions:

  1. A transition from “raw data” to “information”, and
  2. A transition from “information” to “knowledge”.

Now, let us look at these terms more closely. We will define “information” as data that is fit for purpose within a specific context. For any set of data to be considered as “information,” a certain degree of data cleansing, data integration, and possibly data enrichment must take place.

With that in mind, let us now define “knowledge” as “actionable information.” It is important to note that knowledge must necessarily be associated with a degree of confidence that expresses the strength of our conviction about the accuracy of the information. Therefore, much like our own beliefs, it cannot be static. Our beliefs continuously evolve and adjust to accommodate new information, and, in turn, that results in adjustments of the confidence that we have about our knowledge.
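To make the idea of evolving confidence concrete, here is a minimal sketch of a belief whose confidence is revised as evidence arrives. It is only an illustration, not a description of how the CCC knowledge graph works: the `Belief` class, the odds-form Bayesian update, and the likelihood ratios are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Belief:
    """A statement held with a confidence in (0, 1), plus its evidence trail."""
    statement: str
    confidence: float
    provenance: list = field(default_factory=list)

    def update(self, likelihood_ratio: float, source: str) -> None:
        """Bayesian update in odds form: posterior odds = prior odds * ratio."""
        prior_odds = self.confidence / (1.0 - self.confidence)
        posterior_odds = prior_odds * likelihood_ratio
        self.confidence = posterior_odds / (1.0 + posterior_odds)
        # Keep the provenance so we can explain why we believe what we believe.
        self.provenance.append((source, likelihood_ratio))


# Hypothetical evidence; the ratios are made up for illustration.
belief = Belief("mention A and mention B refer to the same author", confidence=0.5)
belief.update(4.0, "shared co-author")       # evidence in favor
belief.update(0.5, "different institution")  # weak evidence against
```

A ratio above 1 raises the confidence and a ratio below 1 lowers it, so the same record can move in either direction as new information comes in.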

True knowledge is not attainable. Take the field of physics, for example: the true nature of things is not possible to find. As Feynman put it: “We are never definitely right, we can only be sure we are wrong.” Yet, this hasn’t stopped us from creating very successful models of reality and using them to exert our control over nature in numerous ways.

Creating conceptual models, based on data, about our businesses will be essential for success in the 21st century, and a knowledge-based system is a great way of creating these conceptual models. Once you have a model, you can integrate it into your operational environment, measure its variables, observe its dynamics, incorporate operational measures based on different model criteria, and continuously refine and adjust it. In my opinion, that is where the true value of data science lies.

That’s something that any sensible person would agree with and many people claim to have accomplished. I think that it is far from trivial to accomplish, even if you narrow the scope of your knowledge-based system to a specific area of your business. Take, for example, the CCC knowledge graph that I mentioned earlier.

Our graph relies on a dataset that consists of published scientific articles in virology, with special attention to coronaviruses including SARS, MERS, and SARS-CoV-2. We used bibliographic citation metadata for articles listed by LitCovid, CORD-19, and other sources. All in all, we processed over 120,000 articles.

Our thinking was fairly straightforward: if we can show the various authors, their associated literature, their collaborators (co-authors), and some general characterization of their field of study, then a match between an arriving manuscript and an appropriate reviewer can be readily made. However, even with such a limited set of data, there are plenty of questions to answer and a significant degree of uncertainty to deal with.
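The matching idea above can be sketched in a few lines. This is a toy illustration under stated assumptions, not CCC's implementation: the records, the function names, the overlap score, and the rule that excludes the submitters' own co-authors are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records: (article id, authors, topic labels).
ARTICLES = [
    ("a1", ["Author A", "Author B"], {"coronavirus", "vaccines"}),
    ("a2", ["Author C"], {"coronavirus", "epidemiology"}),
    ("a3", ["Author D", "Author C"], {"vaccines"}),
    ("a4", ["Author E"], {"influenza"}),
]


def build_graph(articles):
    """Index the author-to-topic and author-to-coauthor edges."""
    topics = defaultdict(set)     # author -> topics they have published on
    coauthors = defaultdict(set)  # author -> collaborators
    for _, authors, labels in articles:
        for author in authors:
            topics[author] |= labels
            coauthors[author] |= set(authors) - {author}
    return topics, coauthors


def suggest_reviewers(manuscript_authors, manuscript_topics, topics, coauthors):
    """Rank authors by topic overlap, skipping submitters and their co-authors."""
    excluded = set(manuscript_authors)
    for author in manuscript_authors:
        excluded |= coauthors.get(author, set())
    scored = [
        (len(author_topics & manuscript_topics), author)
        for author, author_topics in topics.items()
        if author not in excluded
    ]
    # Highest overlap first; ties broken alphabetically; zero overlap dropped.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [author for score, author in ranked if score > 0]


topics, coauthors = build_graph(ARTICLES)
candidates = suggest_reviewers(
    ["Author A"], {"coronavirus", "vaccines"}, topics, coauthors
)
print(candidates)  # ['Author C', 'Author D']
```

Even this toy version only works if "Author C" on one article is known to be the same person as "Author C" on another, which is exactly where the uncertainty comes in.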

Is “Ralph S Baric” of publication A the same author as “R S Baric” of publication B? And how about that “Ralph A Baric” guy from publication C? Is he the same person, a cousin, a lexicographic coincidence, or simply an error? When we assign a MeSH term to an article, at what level of the MeSH hierarchy should we make the assignment? Should that depend on our level of confidence or be fixed a priori? Should we consider the full text in making our classification (if available) or use only bibliographic metadata? Should we provide the provenance of our beliefs or simply store the present state? How about the institution names? At what level should we capture the affiliation? If there is more than one affiliation, are any of them transient? Which one really matters for the purpose of contacting the author? I could go on and on with a list of questions that one needs to consider in order to arrive at a stage where the information in the system has reached a level of confidence that makes it actionable. The state of the data that raises these questions is directly tied to the information entropy in the system, and therefore these questions multiply as the size of the system grows.
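The name question can be turned into a first, very rough check. The sketch below is an assumption-laden illustration rather than the method used for the CCC graph: the function names and the three verdicts are hypothetical, and it assumes non-empty Western-style names with the surname last.

```python
def name_tokens(name):
    """Split 'Ralph S Baric' into (['ralph', 's'], 'baric'); surname assumed last."""
    parts = name.replace(".", " ").lower().split()
    return parts[:-1], parts[-1]


def tokens_agree(a, b):
    """Two given-name tokens agree if equal, or if one is an initial of the other."""
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return a == b


def compare_names(name_a, name_b):
    """Return 'different', 'conflict', or 'compatible' for two author name strings."""
    given_a, family_a = name_tokens(name_a)
    given_b, family_b = name_tokens(name_b)
    if family_a != family_b:
        return "different"
    # zip() stops at the shorter list, so a missing middle name is not a conflict.
    if all(tokens_agree(a, b) for a, b in zip(given_a, given_b)):
        return "compatible"
    return "conflict"


print(compare_names("Ralph S Baric", "R S Baric"))      # compatible
print(compare_names("Ralph S Baric", "Ralph A Baric"))  # conflict
```

Note that "compatible" only means the strings do not contradict each other, and "conflict" may still be a typo for the same person. Either verdict is one piece of evidence to weigh alongside co-authors, affiliations, and topics, not a decision.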

To address the above questions, and many others, we processed the data through a specially crafted data pipeline in order to extract the appropriate metadata, and disambiguate author names, author affiliations, and their publishing relationships to other authors. That process produced approximately 440,000 unique authors.
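One common way to go from pairwise "same author?" decisions to a set of unique authors is to cluster the name mentions with a union-find structure. The sketch below shows that general technique only; the source does not say which algorithm the pipeline uses, and the mention ids, the `matched` pairs, and the function names are hypothetical.

```python
def cluster_mentions(mentions, same_author):
    """Group mention ids into clusters using union-find over pairwise matches."""
    parent = {m: m for m in mentions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    # Compare every pair once and merge the clusters of matching mentions.
    for i, a in enumerate(mentions):
        for b in mentions[i + 1:]:
            if same_author(a, b):
                parent[find(a)] = find(b)

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), []).append(m)
    return list(clusters.values())


# Hypothetical pairwise decisions: m1 matches m2, and m2 matches m3.
matched = {("m1", "m2"), ("m2", "m3")}


def same_author(a, b):
    return (a, b) in matched or (b, a) in matched


clusters = cluster_mentions(["m1", "m2", "m3", "m4"], same_author)
print(clusters)  # [['m1', 'm2', 'm3'], ['m4']]
```

Here m1 and m3 end up in the same cluster even though they were never matched directly, and the number of clusters is the number of unique authors. Comparing every pair is quadratic, so at the scale of hundreds of thousands of authors one would first restrict the comparisons, for instance to mentions that share a surname.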

Although we are only visualizing that knowledge at the moment, we have built an extensible and open architecture that will allow the knowledge to be transfused into many other applications. One can’t help but think of what would be possible when our approach brings together more data from our customers, our partners, and even other third parties. Since a knowledge graph represents a belief system, there isn’t a single knowledge graph to rule them all!

Sure, there is a common denominator between any two knowledge graphs that are produced from the same data or to serve in the same field, but a large part of the business value is to be sought in their differences, rather than their similarities. Our view at CCC is that building a knowledge graph system, essentially, means building a belief system for your business.

A system that can understand the intent of your users in various circumstances and provide the power of knowledge to employees and end users alike, at the right place and at the right time.

A living, breathing system that continuously evolves and absorbs new information, and that is tightly coupled to the “organs” of your business, presenting the “truth” as your business perceives it.

In that way, data, content, and services become semantically interoperable, allowing AI agents to understand your business and perform tasks with great effectiveness. The time when people browsed through large numbers of documents, websites, and other sources of content, manually extracting and interpreting the information within them, is not the future.

In fact, it is increasingly becoming the past. Users nowadays ask their personal assistants to perform knowledge-backed tasks without delving into the process required for those tasks themselves.

If you take nothing else away from this post, remember this:

  • A Knowledge Graph is a great way to encapsulate the view of the world in the context of your business, i.e. your belief system.
  • A Knowledge Graph will continuously provide an ROI if it constantly evolves and incorporates new information that enables new uses.

Businesses that do that will be able to further expand the reach of their services, improve the quality of their operations, and bring new products to many new customers. That is not an easy task, but it can be a very rewarding endeavor. CCC’s data experts are here to help.

Author: Babis Marmanis

Babis Marmanis, Ph.D., is responsible for defining the technology vision of all software systems at CCC. He leads a global team of professionals with a passion for continuous technological innovation and the delivery of high-quality software systems. Babis has written the books Spend Analysis and Algorithms of the Intelligent Web, and contributes to journals, conferences and technical magazines.

Materials available on copyright.com are protected by the copyright laws of the United States and other countries.
© 1995–2020 Copyright Clearance Center, Inc. All rights reserved.
