The following is an excerpt from Mary Ellen Bates’ recent white paper Vocabularies, Text Mining and FAIR Data: The Strategic Role Information Managers Play.

Information managers have a number of possible avenues for engagement with user groups involved in using AI technologies for insight. This requires a shift in perspective for information managers, as the focus changes from supporting Boolean searching of a data collection to enabling text mining and increasing discoverability. By using specialized ontologies across both structured and unstructured content, information managers can:

  • Add synonyms to queries automatically, ensuring more comprehensive results
  • Increase access points to unstructured text and content that is not in an easily searched format, such as images and charts
  • Create links among classes of concepts, such as genes and diseases, so researchers can discover new relationships
  • Reduce query abandonment by dissatisfied users by returning more relevant results, which accelerates discovery and research
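
As an illustration of the first point, automatic synonym expansion can be sketched in a few lines of Python. This is a minimal sketch, not a method described in the white paper: the SYNONYMS table and the expand_query function are hypothetical names, the vocabulary entries are illustrative only, and matching is naive substring matching on preferred terms. Words that are not in the vocabulary are dropped from the expanded query in this simplified version.

```python
# Hypothetical mini-vocabulary: preferred term -> synonyms (illustrative only).
SYNONYMS = {
    "heart attack": ["myocardial infarction", "MI"],
    "aspirin": ["acetylsalicylic acid", "ASA"],
}


def expand_query(query: str, synonyms: dict[str, list[str]]) -> str:
    """Build a Boolean query that ORs each recognized term with its synonyms."""
    lowered = query.lower()
    clauses = []
    for term, alternatives in synonyms.items():
        # Naive substring match; a real system would tokenize the query first.
        if term in lowered:
            options = [term, *alternatives]
            clauses.append("(" + " OR ".join(f'"{o}"' for o in options) + ")")
    # If no vocabulary term was recognized, return the query unchanged.
    return " AND ".join(clauses) if clauses else query


expanded = expand_query("aspirin after heart attack", SYNONYMS)
print(expanded)
```

The user types a short keyword query, and the vocabulary supplies the alternative names a specialist would have added by hand.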

Search abandonment is a particular concern because younger users expect to run simple keyword or natural language searches in structured databases, just as they query general search engines on a mobile device by speaking their question. Their searches in enterprise information sources succeed only if an information manager has set up domain-specific search filters to enhance precision and recall.

Some text mining projects will mine unstructured content to generate insights from a data collection, but these are one-off benefits, as the underlying content is not modified by the insights derived from the analysis. If, on the other hand, that content is semantically enriched with meaningful metadata and domain-specific ontologies, the information is more actionable across the organization, by a variety of users. While the dataset may have been acquired for a project in one field, adding metadata with links to other enterprise ontologies makes the same dataset valuable to researchers in other fields as well. In other words, enrich once, and benefit many times.
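
To make "enrich once, and benefit many times" concrete, here is a minimal sketch of semantic enrichment, assuming a tiny in-memory ontology. The ONTOLOGY table, its "EX:" concept identifiers, and the enrich function are all hypothetical and purely illustrative; a production system would draw on a managed ontology and a proper entity-recognition step rather than simple pattern matching.

```python
import re

# Hypothetical ontology fragment: concept ID -> labels.
# The "EX:" identifiers are made up for illustration, not real ontology IDs.
ONTOLOGY = {
    "EX:0001": ["BRCA1", "breast cancer 1"],
    "EX:0002": ["breast cancer", "breast carcinoma"],
}


def enrich(text: str, ontology: dict[str, list[str]]) -> list[str]:
    """Return the sorted concept IDs whose labels occur in the text."""
    found = set()
    for concept_id, labels in ontology.items():
        for label in labels:
            # Whole-word, case-insensitive match on each label.
            pattern = r"\b" + re.escape(label) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(concept_id)
                break
    return sorted(found)


# The concept IDs are stored with the record, so later users can
# search by concept instead of re-mining the raw text.
record = {"text": "Mutations in BRCA1 raise the risk of breast carcinoma."}
record["concepts"] = enrich(record["text"], ONTOLOGY)
print(record["concepts"])
```

Because the concept identifiers are saved alongside the content, the enrichment persists after the original project ends.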

When information managers are brought into these one-off text mining projects to add semantic enrichment to the data, they can offer a unique perspective. They understand how various user groups discover and access content, what external and internal content a project requires, and how to enhance users’ queries to better get the insights they need. They understand the wide variety of formats to consider — everything from electronic laboratory notebooks to patent filings, news and press releases, and genomic mapping data — and they know what ontologies to use to add structure to the information.

One challenge information managers often encounter is identifying and accessing ontologies or vocabularies that are used by one group but not coordinated with other parts of the organization. These may be locally modified versions of a public ontology, reference databases developed by a nonprofit organization or a for-profit company, or semantic vocabularies developed entirely in-house. Each group may have its own policies regarding who is allowed to contribute changes or who is responsible for keeping the local version reconciled with the original or authoritative version. Unless an information manager or other information professional with an enterprise-wide perspective is managing these local ontologies, the organization cannot leverage the time, expense, and focus invested in developing and maintaining these resources for the benefit of the entire enterprise.

While decisions about ontologies are often driven by subject matter experts, information managers can provide centralized management of ontologies and vocabularies within an enterprise, mapping the domain and the different vocabularies being used by each group. Information managers can bring a FAIR mindset to ontology management, by looking at the entire workflow process and identifying points at which data, metadata and tools can be made more findable, accessible, interoperable and reusable, while also being mindful of oversight and authorization concerns.

With a robust ontology management tool, information managers can facilitate collaborative editing of vocabularies, strategic linking of related concepts between ontologies, better interoperability among data sets, consistency of terminology and meaning across projects, and support for multilingual vocabularies. Information managers can identify inconsistencies between ontologies—a particular problem in industries that are experiencing a high level of acquisitions and mergers—and identify new entities or synonyms in rapidly developing fields before third-party ontologies are updated.
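
One of these tasks, spotting inconsistencies between vocabularies, can be sketched as a simple comparison. This is an illustrative sketch under stated assumptions: each vocabulary is modeled as a plain mapping from a label to a preferred term, and the names find_conflicts, group_a, and group_b, along with the sample entries, are hypothetical. Real ontology management tools work with much richer structures.

```python
def find_conflicts(
    vocab_a: dict[str, str], vocab_b: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Find labels that both vocabularies define but map to different terms."""
    # Normalize label case so "MI" and "mi" are treated as the same label.
    norm_a = {label.lower(): term for label, term in vocab_a.items()}
    norm_b = {label.lower(): term for label, term in vocab_b.items()}
    shared = norm_a.keys() & norm_b.keys()
    return {
        label: (norm_a[label], norm_b[label])
        for label in sorted(shared)
        if norm_a[label].lower() != norm_b[label].lower()
    }


# Hypothetical vocabularies from two groups, for example after a merger.
group_a = {"MI": "myocardial infarction", "ASA": "acetylsalicylic acid"}
group_b = {
    "mi": "mitral insufficiency",
    "ASA": "Acetylsalicylic Acid",
    "CHF": "congestive heart failure",
}

conflicts = find_conflicts(group_a, group_b)
print(conflicts)
```

Here the abbreviation "MI" means different things to the two groups, which is the kind of conflict an information manager would flag for review by subject matter experts.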

Author: Molly Tainter

Molly Buccini is a marketing communications manager at CCC. Her background before CCC includes B2B content marketing and local news reporting. Outside of the office, she enjoys reading, traveling, and theater.