The following is an excerpt from Accessing and Analyzing Relevant Content in Today’s Information Chaos.
As consumers, we experience personalization daily via targeted online advertising while browsing or in our social media accounts. We see it in the shows and movies that Netflix and Amazon suggest to us and in the music that Spotify or Pandora recommends while we scroll through our playlists. This technology is based on our past purchase and usage behaviors. What we have liked and disliked, whom we follow, and what we have previously searched for and ultimately purchased are used by machine learning to predict, and suggest, what might be of interest to us in the future.
Let’s consider how personalization techniques impact scientific content discovery and information management. According to the latest data, approximately 8.5 billion searches are conducted each day on Google. Google (the tool, the term, and the technology) is omnipresent in our collective global culture. This has caused a fundamental change in what individuals expect from search results. Our private-life search habits and expectations have unsurprisingly spilled over into our business life.
Where personalization comes into play
With this shift in expectations, it’s important to understand the different types of personalization used by search engines so we can recognize the benefits of applying these tools to data searches in the business environment as well.
When users search for content, personalization allows them to find relevant content faster by moving artificial intelligence (AI)-informed recommendations to the top. With explicit personalization, search results are driven by a user’s chosen preferences, such as selecting specific data sources and/or marking sources as “favorites.” Based on these choices, the user expects more relevant search results to appear. Implicit personalization delivers personalized content recommendations based upon a user’s past actions and behaviors.
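The difference between the two types of personalization can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search product works: the `Result` and `UserProfile` classes, the boost weights, and the sample titles and sources are all hypothetical. Explicit personalization is modeled as a fixed boost for sources the user has marked as favorites; implicit personalization is modeled as a boost proportional to how often the user has clicked on a topic in the past.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    title: str
    source: str
    topic: str
    base_score: float  # relevance from the underlying engine (assumed to be 0..1)


@dataclass
class UserProfile:
    # Explicit signal: sources the user has chosen as favorites.
    favorite_sources: set = field(default_factory=set)
    # Implicit signal: number of past clicks per topic, observed from behavior.
    topic_clicks: dict = field(default_factory=dict)


def personalized_score(result, profile, explicit_boost=0.2, implicit_weight=0.3):
    """Combine the base relevance with explicit and implicit signals.

    The weights are arbitrary illustrative values, not tuned parameters.
    """
    score = result.base_score
    if result.source in profile.favorite_sources:
        score += explicit_boost
    total_clicks = sum(profile.topic_clicks.values())
    if total_clicks:
        share = profile.topic_clicks.get(result.topic, 0) / total_clicks
        score += implicit_weight * share
    return score


def rerank(results, profile):
    """Return results ordered by personalized score, highest first."""
    return sorted(results, key=lambda r: personalized_score(r, profile), reverse=True)


# Made-up sample data for illustration only.
results = [
    Result("Paper on insulin analogs", "Journal A", "diabetes", 0.80),
    Result("Paper on FOP biology", "Journal B", "rare disease", 0.70),
    Result("FOP registry report", "Preprint Server", "rare disease", 0.60),
]
profile = UserProfile(
    favorite_sources={"Journal B"},
    topic_clicks={"rare disease": 9, "diabetes": 1},
)

ranked = rerank(results, profile)
for r in ranked:
    print(f"{personalized_score(r, profile):.2f}  {r.title}")
```

For this hypothetical rare disease researcher, both rare disease results move above the diabetes paper even though the diabetes paper has the highest base relevance. With an empty profile, the order falls back to the base score.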
Research indicates that 75% of people never scroll past the first page of a Google search, drastically limiting the range of information they see. In light of this, it is more important than ever that the information most relevant to the individual researcher appears at the top of search results.
Challenges and opportunities
At R&D-intensive companies, the questions that researchers and other employees attempt to answer are far more complex than a simple Google query can answer. For example, what is most relevant to a researcher working on a promising early-stage drug candidate for fibrodysplasia ossificans progressiva (otherwise known as FOP or Stoneman’s Disease, which is estimated to affect only 4,000 individuals worldwide) is quite different from what the same researcher would find valuable in the mature diabetes market.
Why? The rare disease field is known for its small patient populations, limited understanding of the diseases themselves, and an overwhelming lack of education. Comprehensive and relevant information may be very difficult to find and requires content discovery solutions that scour scientific literature, patent information, real-world evidence, and patients’ lived experiences from as many sources as possible, including scientific societies and congresses, scholarly publications, clinical trials, and social media platforms. Casting as broad a net as possible helps generate novel insights and new discoveries and drives results in this market.
In contrast, let’s look at the diabetes market. Diabetes was accurately described for the first time in the 2nd century A.D.; in January 1922, the first insulin injection was given to a 14-year-old boy dying of the disease. For the last 100 years, diabetes has been studied by countless principal investigators, labs, and drug development companies. The sheer volume of clinical and observational data and content is vast, and as a result, the researcher’s challenge becomes one of technical relevancy and prioritization. R&D users in the diabetes field need solutions and methods to narrow down and contextualize information, to manage the deluge of data, and to help them recognize newly established patterns and trends.
Relevancy in scientific, medical, and technology search
An article’s median half-life (the time it takes to receive more than one-half of its total downloads) across all publishers was between 2 and 4 years. Because signals such as citations and impact factor can also take years to develop, traditional search engines can be biased toward older publications, missing the mark in identifying potentially novel discoveries and innovation.
This leads us to recognize why R&D professionals need software solutions that are based on the right kind of machine learning: in particular, implicit and explicit personalization tools that better “understand” a user’s goals and result in more serviceable content discovery, regardless of the lifespan of a significant scientific paper. The right machine learning tool combines implicit and explicit personalization with the right content to find relevant results.
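To make the bias toward older publications concrete, here is a small Python sketch. It compares ranking by raw citation count with ranking by citations per year since publication. The per-year normalization is just one simple heuristic chosen for illustration, not a method attributed to any product mentioned here, and the `Paper` class, the function names, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    year: int
    citations: int


def raw_citation_rank(papers):
    """Rank by total citations; older papers have had more time to accumulate them."""
    return sorted(papers, key=lambda p: p.citations, reverse=True)


def age_adjusted_rank(papers, current_year):
    """Rank by citations per year since publication (illustrative heuristic)."""
    def citations_per_year(p):
        # Count the publication year itself so a new paper never divides by zero.
        age = max(current_year - p.year + 1, 1)
        return p.citations / age
    return sorted(papers, key=citations_per_year, reverse=True)


# Made-up sample data for illustration only.
papers = [
    Paper("Established study", 2012, 120),
    Paper("Recent finding", 2023, 30),
]

by_raw = raw_citation_rank(papers)
by_age = age_adjusted_rank(papers, current_year=2024)
print([p.title for p in by_raw])
print([p.title for p in by_age])
```

In this made-up case, the established study leads on raw citations, while the recent finding leads once citations are divided by the paper’s age.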
One user of CCC’s RightFind Navigate at a global pharmaceutical company recognizes the value of creating a unified search experience from disparate, siloed content from trusted internal and external sources, saying, “RightFind offers a tremendous benefit as a place to go when you don’t know where to start.”
Breakthroughs in cancer treatments, rare diseases, and rocket science are possible not because individual experts know everything on the subject but because people can draw on knowledge that does not reside in their own heads, which makes finding the most relevant information at the right time so critical to driving innovation.
Learn more about personalized search across multiple sources of data and information for highly relevant discovery with RightFind Navigate.