The following is an excerpt from Phill Jones’ recent white paper The Top Trends in Knowledge and Information Management. Download the full paper here.
The goal of aggregated search is to provide integrated search across multiple, heterogeneous sources. Broadly, two technology approaches have been used to achieve this: federated search and web-scale indexing. Those aren’t the only options, however.
Federated search
This approach passes the search query to multiple databases behind the scenes. Early versions used screen-scraping, which failed whenever a content provider changed its website. More recently, web technologies like APIs have made this approach more robust.
Advantages:
- Compatible with a broader variety of information sources. Many proprietary content providers won’t allow content to be indexed.
Disadvantages:
- Can be fragile, for example when a content provider changes its website or interface.
- Search speed is limited by content provider systems.
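As a rough illustration of the fan-out described above, here is a minimal Python sketch. It assumes each content provider is wrapped in a plain function that takes a query and returns a list of results; the names `federated_search`, `journals` and `broken_provider` are hypothetical and stand in for real provider APIs, which are not specified here.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_search(query, sources, timeout=2.0):
    """Send one query to every source and merge whatever comes back.

    `sources` maps a source name to a callable that takes the query and
    returns a list of results (hypothetical stand-ins for provider APIs).
    Returns the merged (source, hit) pairs and the names of failed sources.
    """
    results, failed = [], []
    pool = ThreadPoolExecutor(max_workers=max(1, len(sources)))
    # Submit every query up front so the sources are searched concurrently.
    futures = {name: pool.submit(fn, query) for name, fn in sources.items()}
    for name, future in futures.items():
        try:
            # Wait up to `timeout` seconds for this source's answer.
            for hit in future.result(timeout=timeout):
                results.append((name, hit))
        except Exception:
            # A slow or broken provider is recorded instead of failing the search.
            failed.append(name)
    pool.shutdown(wait=False)
    return results, failed


# Hypothetical providers used only for illustration.
def journals(query):
    return [f"journal article about {query}"]


def broken_provider(query):
    raise RuntimeError("provider changed its interface")


hits, failures = federated_search(
    "kinase", {"journals": journals, "broken": broken_provider}
)
```

The sketch makes both disadvantages visible: a provider that errors out ends up in `failures`, and the search can only return as fast as the providers respond.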
Web-scale indexing
The best-known web-scale indexing service is Google. This technique involves creating a database of all the content that needs to be searched. The index is searched like the index of a book and linked back to the content.
Advantages:
- Fast and robust search creates a compelling user experience.
- Computational techniques like indexed knowledge graphs can automatically surface connections between different sources.
Disadvantages:
- Not all sources can be indexed. Google, for example, can’t index content behind a paywall.
- Content needs to be regularly re-crawled to keep the index up to date.
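The book-index analogy can be sketched as a simple inverted index. This is a toy illustration, not how any particular service works: it assumes documents are short strings already collected in one place, splits them on whitespace, and uses the hypothetical helper names `build_index` and `search`.

```python
from collections import defaultdict


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the first term's documents, then narrow with each further term.
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


# Hypothetical content used only for illustration.
documents = {
    "a": "federated search overview",
    "b": "web-scale search index",
}
index = build_index(documents)
matches = search(index, "search index")
```

Lookups only touch the prebuilt index, which is why this approach is fast. The index reflects the documents as they were when `build_index` ran, so changed content has to be indexed again.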
A third way — fully aggregated search
Neither federated search nor web-scale indexing provides the perfect solution for a commercial knowledge and information management environment. A fully integrated approach can index content where possible and use it to build a knowledge graph of objects, concepts and connections. Although creating knowledge graphs in real time is not computationally feasible, results that need to be retrieved in real time can be readily mapped onto a pre-existing graph. This hybrid approach can provide the best of both worlds.
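One way to picture the mapping step is the Python sketch below. It is an assumption-laden toy: the pre-existing graph is a small dictionary from single-word concepts to related concepts, live results are plain strings, and matching is done by exact word lookup. The names `GRAPH` and `map_to_graph` are hypothetical.

```python
# Pre-existing graph, built ahead of time from indexed content.
# Toy data: each single-word concept maps to its related concepts.
GRAPH = {
    "kinase": {"enzyme", "phosphorylation"},
    "aspirin": {"analgesic"},
}


def map_to_graph(live_results, graph):
    """Attach precomputed graph neighbours to results retrieved in real time."""
    enriched = []
    for text in live_results:
        words = set(text.lower().split())
        # Concepts are matched by exact word, so only single-word keys match.
        concepts = sorted(c for c in graph if c in words)
        related = sorted(set().union(*(graph[c] for c in concepts)))
        enriched.append(
            {"result": text, "concepts": concepts, "related": related}
        )
    return enriched


# A result that could only be fetched live, for example from a paywalled source.
enriched = map_to_graph(["Kinase inhibitor trial report"], GRAPH)
```

The expensive work of building the graph happens in advance, and the live result only needs a cheap lookup to pick up its connections.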