6 Tips to Build and Work with Knowledge Graphs

As technology evolves and the amount of information available continues to grow, knowledge graphs will become increasingly vital as a tool for information and knowledge managers as well as the researchers and executives they support. But it’s important to remember that not all knowledge graphs are created equal, and some approaches yield better results than others.

The following tips are an excerpt from our latest white paper, Knowledge Graphs: Connecting Your Data to Solve Real-World Problems in R&D, Business Intelligence, and Strategy.

1. Don’t throw away information unnecessarily

This first piece of advice speaks to the philosophy of knowledge graphs. Traditional information management approaches often rely on forcing information into controlled structures that inherently impose assumptions.

A simple example is controlled vocabularies. If two words are judged to be synonyms and one is replaced with the other in your database, any difference in meaning between them is lost. It may be better to keep both terms and map the relationship between them, so that the mapping can be revised at a later date.
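As a minimal sketch of that idea, the Python below keeps both terms and stores the link between them as data instead of overwriting one with the other. The `TermRelation` class, the relation labels, and the example terms are all assumptions made for illustration; they are not taken from the white paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermRelation:
    """A typed link between two vocabulary terms (hypothetical structure)."""
    source: str
    target: str
    relation: str  # illustrative labels: "exact_synonym", "related"


def expand_term(term, relations, allowed=("exact_synonym",)):
    """Return the term plus every term linked to it by an allowed relation."""
    found = {term}
    for r in relations:
        if r.relation not in allowed:
            continue
        if r.source == term:
            found.add(r.target)
        elif r.target == term:
            found.add(r.source)
    return found


# Both original terms survive; only the mapping between them is recorded.
relations = [
    TermRelation("heart attack", "myocardial infarction", "exact_synonym"),
    TermRelation("heart attack", "cardiac arrest", "related"),
]

strict = expand_term("heart attack", relations)
broad = expand_term("heart attack", relations, allowed=("exact_synonym", "related"))
```

Because the relation is a record and not a rewrite, a mapping that later turns out to be too loose can be relabelled or removed without touching the underlying data.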

2. Take the time to understand the data and information you have

While it’s often better to not discard information, sometimes lack of data cleanliness creates the appearance of more information than there really is. It’s therefore important to understand your data before attempting to put it into a knowledge graph. Make informed decisions about where data needs to be cleaned and where it needs to be mapped.

Define data governance metrics such as completeness and data quality at each stage of the data processing pipeline, and conduct quality control to catch bad data. That can be as simple as detecting single-character entries, impossible entries like dates in the future, character set encoding errors, and even gene names that have been coerced into dates.
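The checks listed above are simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not a complete quality-control pipeline: the record fields (`name`, `published`, `gene`) and the issue labels are hypothetical, and the gene check only looks for the day-month pattern that spreadsheets tend to produce when a symbol such as MARCH1 is read as a date.

```python
import re
from datetime import date

# Gene symbols that a spreadsheet has turned into dates look like "1-Mar" or "2-Sep".
COERCED_GENE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)


def check_record(record, today):
    """Return a list of issue labels for one record (hypothetical field names)."""
    issues = []
    name = record.get("name") or ""
    if len(name.strip()) == 1:
        issues.append("single_character_name")
    if "\ufffd" in name:  # Unicode replacement character left by a failed decode
        issues.append("encoding_error")
    published = record.get("published")
    if published is not None and published > today:
        issues.append("future_date")
    if COERCED_GENE.match(record.get("gene") or ""):
        issues.append("gene_coerced_to_date")
    return issues


def completeness(records, field):
    """Share of records in which the given field is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


today = date(2024, 1, 1)  # fixed reference date so the example is repeatable
bad = {"name": "X", "published": date(2999, 1, 1), "gene": "1-Mar"}
good = {"name": "Ada Lovelace", "published": date(2020, 1, 1), "gene": "MARCH1"}
```

Running `check_record` at each stage of the pipeline, and tracking `completeness` per field, gives a basic picture of where data needs cleaning before it goes into the graph.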

3. Think about the questions your organization needs to ask

Each organization’s data is unique and has its own management challenges driven, in part, by the types of business problems the organization needs to solve. Identification and deduplication of authors, for example, is a common challenge. Names are not unique, sometimes change, and can be styled in multiple ways. Mapping across affiliations and topics enables clusters of articles that likely share an author to be linked. Connections can be changed to refine the graph as new information becomes available.
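To make the author example concrete, here is a rough Python sketch that links articles whose author names reduce to the same key and that also share an affiliation or a topic, then groups the linked articles into clusters. The name normalization, the linking rule, and the sample records are deliberately crude assumptions for illustration; a production approach would need far more careful matching.

```python
from itertools import combinations


def normalize(name):
    """Reduce 'Smith, Jane' or 'Jane A. Smith' to a crude key: (surname, first initial)."""
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
    else:
        parts = name.split()
        first, last = parts[0], parts[-1]
    return (last.lower(), first[0].lower())


def cluster_articles(articles):
    """Group article ids that likely share an author, using a small union-find."""
    parent = {a["id"]: a["id"] for a in articles}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(articles, 2):
        same_name = normalize(a["author"]) == normalize(b["author"])
        shared_context = (
            a["affiliation"] == b["affiliation"]
            or bool(set(a["topics"]) & set(b["topics"]))
        )
        if same_name and shared_context:
            parent[find(a["id"])] = find(b["id"])

    clusters = {}
    for a in articles:
        clusters.setdefault(find(a["id"]), set()).add(a["id"])
    return sorted(clusters.values(), key=min)


# Hypothetical records: every name reduces to the same key, so context decides.
articles = [
    {"id": 1, "author": "Smith, Jane", "affiliation": "Univ A", "topics": ["genomics"]},
    {"id": 2, "author": "J. Smith", "affiliation": "Univ A", "topics": ["proteomics"]},
    {"id": 3, "author": "Jane Smith", "affiliation": "Univ B", "topics": ["proteomics"]},
    {"id": 4, "author": "John Smith", "affiliation": "Univ C", "topics": ["economics"]},
]

clusters = cluster_articles(articles)
```

Here articles 1, 2, and 3 end up in one cluster, linked through a shared affiliation and then a shared topic, while article 4 stays separate even though its name key matches. Because the links are stored separately from the articles, they can be revised as new information arrives.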

Remember that the value of knowledge graphs lies in the connections between objects. Those connections answer questions and surface the value of the data and the information. It is therefore important to design the connections to answer the sorts of questions you need to ask.

4. Maintain and show provenance

One criticism of Google’s knowledge graph is that it doesn’t tell users enough about the reliability of its sources. If the interface you use doesn’t show researchers and executives where information comes from, they’re unlikely to believe it, and they may be vocal about their doubts.
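One way to make provenance hard to lose is to store the source on every assertion, so that any interface built on the graph can display it. The sketch below assumes a simple subject-predicate-object record with two extra fields; the `Fact` class, its field names, and the sample values are hypothetical placeholders, not real findings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One assertion in the graph, carrying its own provenance (hypothetical schema)."""
    subject: str
    predicate: str
    obj: str
    source: str     # where the assertion came from
    retrieved: str  # ISO date on which the source was read


def describe(fact):
    """Render a fact together with its source, as an interface might show it."""
    return (
        f"{fact.subject} {fact.predicate} {fact.obj} "
        f"(source: {fact.source}, retrieved {fact.retrieved})"
    )


# Placeholder values only; this is not a real result or a real reference.
fact = Fact("Compound X", "inhibits", "Protein Y", "example internal report", "2024-01-15")
label = describe(fact)
```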

5. Remember, knowledge graphs can be employed in ways that meet the needs of different users

One of the big advantages of knowledge graphs is that they are designed with the uses of the information in mind. The same knowledge graph may serve a range of user groups including researchers, analysts, executives, and even HR, but if they ask different questions, they may need the information presented in different ways.

For example, some knowledge graph interfaces allow users to visually navigate the graph, by jumping from node to node. This approach might be best used by data scientists or biomedical researchers who have the domain expertise to interpret the information in a direct way.

For other users, navigating a knowledge graph interface could be too complicated and time-consuming a way to discover useful information or insights. For those users, perhaps in business intelligence or strategy, applying the knowledge graph under the hood and simply presenting its results the way a Google search does is likely preferable.

6. Let them change; knowledge graphs should be living objects

Thinking carefully about the design of the graph and how things are mapped is vital, but it’s naive to think it will be perfect from the start. Allowing users to provide feedback is an excellent way to evolve the knowledge graph and make it even more powerful.

That feedback could take the form of requests to make different connections or questions about the data. It could also take the form of weighting data points. If you maintain and show provenance, end users can indicate that a particular article has erroneous results or that a market sizing estimate is known to be wrong. Letting users feed those judgments back into the graph captures that tacit organizational knowledge.
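As a rough illustration of weighting by feedback, the Python sketch below records reviewer judgments against a fact and turns them into a simple reliability weight. The `FeedbackLog` class, the identifiers, and the scoring rule (share of reviewers who judged the fact reliable) are assumptions chosen for brevity; other weighting schemes would be equally reasonable.

```python
from collections import defaultdict


class FeedbackLog:
    """Collects user judgments about individual facts (hypothetical design)."""

    def __init__(self):
        self._votes = defaultdict(list)

    def flag(self, fact_id, user, reliable, note=""):
        """Record one user's judgment, keeping who said it and why."""
        self._votes[fact_id].append((user, reliable, note))

    def weight(self, fact_id):
        """Share of reviewers who judged the fact reliable; 1.0 if nobody has reviewed it."""
        votes = self._votes.get(fact_id, [])
        if not votes:
            return 1.0
        return sum(1 for _, reliable, _ in votes if reliable) / len(votes)


log = FeedbackLog()
log.flag("market-size-2023", "analyst_a", False, "estimate known to be wrong")
log.flag("market-size-2023", "analyst_b", True)

reviewed = log.weight("market-size-2023")
unreviewed = log.weight("some-other-fact")
```

Keeping the user and the note alongside each judgment means the feedback itself has provenance, so later reviewers can see why a data point was down-weighted.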
