Search and Discovery – Copyright Clearance Center

Scientific Search: 5 Key Concepts You Need to Know Tue, 18 Sep 2018 09:02:26 +0000 How can researchers cope with the deluge of data at their disposal and search more efficiently? Here’s a look at several key scientific search concepts.

The post Scientific Search: 5 Key Concepts You Need to Know appeared first on Copyright Clearance Center.

Think about the questions you type into Google. Chances are, you’re looking for instant answers to simple questions. You might look through a few different pages of results to confirm findings, but if the search engine has done its job correctly, you only need one result to be satisfied.

Now think about the types of questions researchers attempt to answer when they use search engines. The experience is far more complex than a simple query with an instant answer.

When a researcher is tasked with understanding all the genes involved in a disease or pathway, all the compounds that inhibit a target, or all the different ways that patients talk about a drug on the market, a casual scan of top results isn’t good enough. A comprehensive, systematic view of all the information that’s out there is the only way to make an accurate claim.

So how can researchers cope with the deluge of data at their disposal and search more efficiently? Here’s a look at several key scientific search concepts:

Aggregated Search

Aggregated search is designed to bring together multiple, dissimilar information sources. These may include structured or semi-structured data, such as feeds or APIs that provide company-, drug-, or clinical-trial-related information.

Aggregated search presents multiple information types to end users, enabling them to explore different types of content as well as visualizations, analytics, or extracted information. These act as signposts for users, helping them to explore the information and direct themselves to the most appropriate resources for their question.

Here’s an example:

Google illustrates this from a consumer search perspective: it displays location and commercial information alongside summary information boxes and the traditional list of web links, while also offering access to specific media types such as images and videos.
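The grouping idea can be sketched in a few lines of Python. This is a hypothetical illustration – the source names and records are invented:

```python
# Hypothetical sketch: an aggregated search page keeps each source type in
# its own group, instead of merging everything into one ranked list.
from dataclasses import dataclass, field

@dataclass
class AggregatedPage:
    groups: dict = field(default_factory=dict)  # source type -> results

    def add(self, source_type, results):
        self.groups.setdefault(source_type, []).extend(results)

# Each backend returns results in its own shape; they are shown side by side.
page = AggregatedPage()
page.add("web", ["Overview of EGFR inhibitors", "EGFR signalling review"])
page.add("clinical_trials", [{"trial_id": "T-001", "phase": "II"}])  # invented record
page.add("images", ["egfr_pathway_diagram.png"])
```

Each group can then carry its own visualizations or analytics, acting as the signposts described above.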

Personalized Search

Personalization means tailoring the user experience by leveraging signals collected through a user’s interaction with a system. More specifically, personalized search tailors the search experience by considering the user’s context in addition to the submitted query.

This can be accomplished through explicit data knowingly provided by the user or administrators, such as user profiles that include topics of interest or areas of specialty, or through implicit signals the user provides as they go about retrieving information – such as submitting queries, filtering, and clicking on results.

The goal of personalized search is to help users find what they need faster.
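Combining explicit and implicit signals can be sketched as a simple re-ranking step. The weights, topics, and scores below are invented for illustration and don’t reflect any particular product’s ranking formula:

```python
# Hypothetical sketch: re-rank results using explicit profile topics and
# implicit click-history topics.
def personalize(results, profile_topics, clicked_topics):
    """results: list of (title, topics, base_score); returns titles, best first."""
    def score(result):
        title, topics, base = result
        boost = 0.5 * len(topics & profile_topics)    # explicit signal
        boost += 0.25 * len(topics & clicked_topics)  # implicit signal
        return base + boost
    return [title for title, _, _ in sorted(results, key=score, reverse=True)]

results = [
    ("Statistics primer", {"statistics"}, 1.0),
    ("GSK3 inhibitors in oncology", {"oncology", "kinases"}, 0.9),
]
# An oncology specialist who clicks on kinase papers sees that result first.
ranked = personalize(results, profile_topics={"oncology"}, clicked_topics={"kinases"})
```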

Contextualized Search

Contextualized search is similar to personalized search but broader in scope.

Contextualization means that the system considers the context of an interaction (such as organization, location, and information about the user) to improve the quality of its output, whether that is a set of search results or the overall user experience.

Enterprise Search

This is search across an organization’s internal information, as contrasted with, for example, public web search.

Federated Search

Federated search technology has a long history. It is an approach to integrating information sources for information retrieval that relies on the system to take the user’s query and submit it to various underlying data sources. The federated search system then compiles the results from the different sources and presents them to the user in a single, unified relevance sorting.
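The fan-out-and-merge mechanic can be sketched in a few lines. The sources and scores below are invented stand-ins for real underlying search services:

```python
# Hypothetical sketch: fan one query out to several sources, then merge
# everything into a single relevance-sorted list.
def federated_search(query, sources):
    """sources: callables returning [(doc, score), ...] for the query."""
    merged = []
    for search in sources:
        merged.extend(search(query))
    # One unified sort across all sources is the defining trait here.
    return sorted(merged, key=lambda pair: pair[1], reverse=True)

# Invented stand-ins for real underlying search services.
literature = lambda q: [("Review of BRAF signalling", 0.80)]
patents = lambda q: [("BRAF inhibitor patent", 0.95), ("Assay kit patent", 0.40)]

hits = federated_search("BRAF", [literature, patents])
```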

One problem with federated search is that it presumes the underlying data is largely alike – all text, for example. This means that many rich sources of information and insight for R&D users – such as semi-structured drug pipeline data, competitive intelligence information, and other content – may not be included or effectively integrated in such systems.

A second problem is that the unified relevance sorting approach presents information all together. This may inhibit the user’s ability to explore different information types or get direct answers to questions.

The Future of Search

For R&D teams, the ability to seek (and, more importantly, find) information is central to success. Whether that information is internal or external, structured or unstructured, information management and informatics professionals need to work toward removing information roadblocks and creating a clear path to the content users seek.



Wondering how R&D teams use RightFind to search, access, share and collaborate on copyrighted materials? Contact us for more information.

Join CCC at #BioIT18 – Bio-IT World Conference & Expo in Boston Tue, 01 May 2018 07:43:32 +0000 Join CCC for a special presentation at Bio-IT World in Boston on May 16 - we'll be discussing ontology learning and personalization.

The post Join CCC at #BioIT18 – Bio-IT World Conference & Expo in Boston appeared first on Copyright Clearance Center.

Copyright Clearance Center (CCC) will be among 3,400+ life science, pharmaceutical, clinical, healthcare and IT professionals from over 35 countries at the Bio-IT World Conference & Expo ’18 on May 15-17 at the Seaport World Trade Center in Boston, MA. We invite you to visit CCC (booth #301) to talk about your team’s data and information integration challenges and how CCC solutions can help.

This year’s conference features over 280 technology and scientific presentations covering big data, smart data, cloud computing, trends in IT infrastructure, omics technologies, high-performance computing, data analytics, open source and precision medicine, from the research realm to the clinical arena.

Join Anna Lyubetskaya (Data Scientist, Engineering, CCC) on Wednesday, May 16 at 5:00 p.m. (Track 6: Bioinformatics) for her presentation, Ontology Learning and Personalization.

In this talk, Anna will discuss a framework that enables semi-supervised learning of ontologies through the best machine learning and distributed computing approaches. The talk will also cover issues inherent to data science: input data filtering and enrichment, robust iterative learning, cross-validation, rapid prototyping, and the transition between prototyping and production.

Bio-IT World ’18 conference hours:

  • Tuesday, May 15 (5 – 7 p.m.)
  • Wednesday, May 16 (9:45 a.m. – 6:30 p.m.)
  • Thursday, May 17 (9:45 a.m. – 1:55 p.m.)

Be sure to stop by the CCC booth (#301) and say hello.

We’ll be exhibiting throughout the conference, and we’re excited to talk to you about your data and information integration challenges and how solutions from CCC can help accelerate your most challenging initiatives to optimize all phases of drug discovery, development and commercialization.

Not attending the conference? Follow all the action using hashtag #BioIT18 and connect with Bio-IT World (@bioitworld) and CCC (@copyrightclear) on Twitter for up-to-the-minute dispatches from the conference.

2 Real World Examples: Using Real World Data for Commercial Pharmaceutical Product Insights Tue, 17 Apr 2018 07:58:15 +0000 Here is a look at two pharmaceutical use cases where text mining has transformed real world data into real world evidence.

The post 2 Real World Examples: Using Real World Data for Commercial Pharmaceutical Product Insights appeared first on Copyright Clearance Center.

When people think about real world evidence, they generally think about using this data to address questions around drug effectiveness or population-level safety effects. But real world data can address many other questions.

If you think of real world data as any type of information gathered about drugs in non-trial settings, a whole world of possibilities opens.

  • Social media data can be used to understand how well packaging and formulations are working.
  • Customer call feeds can be analyzed for trends in drug switching, off-label use, or contra-indicated medications among concomitant drugs.
  • Full-text literature can be mined for information about epidemiology, disease prevalence, and more.

Text mining transforms real world data into real world evidence

Many of these real world sources have free text fields, and this is where text analytics and natural language processing (NLP) fit in. Linguamatics customers use text analytics to extract actionable insight from real world data – and find valuable intelligence that can inform commercial business strategies.

Here is a look at two use cases where text mining has transformed real world data into real world evidence.

Related Reading: Pharma Turns to Real World Evidence to Overcome the Odds

Use case 1: Evidence landscape from literature for drug economics

Understanding the potential for market access is essential for all pharma companies, and information to characterize the burden of disease and local standard of care in different countries across the globe is critical for any new drug launch. Companies need an assessment of the landscape of epidemiological data, health economics and outcomes information to inform the optimal commercial strategy.

Valuable data is published every month in scientific journals, abstracts, and conferences. One of Linguamatics’ Top 10 pharma customers decided to utilize text mining to extract, normalize, and visualize these data. They then used this structured data to generate a comprehensive understanding of the available evidence, thus establishing the market “gaps” they could address. Focusing on a particular therapeutic area of immunological diseases, the organization was able to develop precise searches with increased recall across these different data sources, including full-text literature.

Linguamatics I2E enables the use of ontologies to improve disease coverage, and to incorporate domain knowledge to increase the identification of particular geographical regions (for example, enabling the use of the adjectival form of the country, e.g. French as well as France, and cities, e.g. Paris, Toulouse). I2E also extracts and normalizes numbers, which is useful to standardize epidemiological reports for incidence and prevalence of disease. Searching within full-text papers can be noisy, and I2E allows search to be specific, and to exclude certain parts of the document from a search, such as the references.
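The geography-expansion idea can be sketched as follows. To be clear, this is a hypothetical illustration, not the actual I2E API, and the lookup table is abbreviated:

```python
# Hypothetical sketch (not the actual I2E API): expand a country name to its
# adjectival form and major cities to improve recall in literature search.
GEO = {
    "France": {"adjective": "French", "cities": ["Paris", "Toulouse"]},
}

def expand_geography(country):
    entry = GEO.get(country, {})
    terms = [country, entry.get("adjective")] + entry.get("cities", [])
    return [t for t in terms if t]  # drop missing entries

terms = expand_geography("France")
```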

I2E can provide the starting point for efficiently performing evidence based systematic reviews over very large sets of scientific literature, enabling researchers to answer questions around commercial business decisions.

Use case 2: Gaining insights from medical science liaison professionals

Conversations between medical science liaison (MSL) professionals and patients or healthcare professionals (HCPs) can lead to valuable insights. The role of the MSL is to ensure the effective use, and success, of a pharmaceutical company’s drug. MSLs act as the therapy area experts for internal colleagues, and maintain good relationships with external experts, such as leading physicians, to educate and inform on new drugs and therapeutics.

Thierry Breyette, Novo Nordisk, presented at Linguamatics Text Mining Summit 2016 on “Generating actionable insights from real world data”. The figure shows a map of where particular topics are discussed, and what materials are used.

Top pharma company Novo Nordisk uses text mining to gain clinical insights from MSL interactions with HCPs. These interactions may be broad ranging, covering topics such as safety and efficacy, dosing, cost, special populations, indication, comparisons, competitor products, etc. MSLs may use approved slide decks, package inserts (PIs), factsheets, studies or publications to answer HCP questions. Linguamatics’ text mining platform I2E is used to structure these source files with custom ontologies (e.g. for material types, product, disease terminology variation, topics).

This analysis enables Novo Nordisk to better address what support HCPs may need in their interactions with patients, insurance providers, and other clinicians, and to invest in resource development appropriately.



Drug Repurposing, Rare Diseases and Semantic Analytics Tue, 03 Apr 2018 07:05:09 +0000 Drug repurposing could help find cures for rare diseases faster, but trawling through research is a time-consuming and resource-heavy task.

The post Drug Repurposing, Rare Diseases and Semantic Analytics appeared first on Copyright Clearance Center.

Rare diseases affect around 6-7% of the population in the developed world (a rare disease is defined as one affecting fewer than 1 in 2,000 people in Europe, or fewer than 200,000 individuals in the US).

Because each rare disease, by definition, serves a relatively small population of people, the cost of developing brand-new drugs for this audience (orphan drugs) can be prohibitively expensive – yet legislation in the U.S. (the FDA Orphan Drug Act, 1983), Japan, Australia and Europe incentivises treatment development.

So what’s a pharmaceutical company to do?  Is there a more cost-effective way to reach cures faster?

Enter drug repurposing

On the surface, drug repurposing promises much – known safety profiles of existing drugs, a reduced development timeline and, as a result, a significantly reduced cost to market (we’re talking about bringing expenditure down from billions of dollars to millions here).

There’s still a large amount of research to trawl through, however – a time-consuming and resource-heavy task.  This is why drug companies are currently focusing on automated literature analysis.

Let’s look at the example of Arteriovenous Malformation (AVM), which has been in the news recently in the UK.  It’s a condition which affects hundreds of thousands of people across the world, causing abnormalities in blood vessels.  These abnormalities can result in dangerous complications and disfigurements.  Now, researchers have identified drugs which could target the underlying cause of the condition.

Take a look at this diagram, which simplifies the repurposing pipeline from this piece of research:

A simplified repurposing pipeline

Here, the disease in question has been taken as the starting point, and faulty genes have been identified on the RAS/MAPK pathway, which controls cell growth.

Once these genes were identified, the next step in this particular repurposing study was to screen for drugs that targeted the relevant proteins. In this case there were a number of candidate drugs already used in cancer therapy.

A simplified repurposing pipeline, part 2

In this case, we see that treatment of AVM BRAF-mutant zebrafish with the BRAF inhibitor vemurafenib restored blood flow in AVM.

How could semantic analytics play a part?

Drug repurposing relies on making connections, but as mentioned earlier, this is not easy when you’re faced with millions of documents, all with unstructured text.

Semantic annotation 

Wouldn’t it be helpful if a computer could recognise key scientific information in unstructured text, such as scientific papers?  Of course, the answer is yes, but one of the main hurdles with this approach is getting the computer to do this quickly, whilst being able to process scientific synonyms and ambiguity.

Semantic Search

Building on this is semantic search: a tool which allows a researcher to find relevant information about their target.  In this case, we’re looking for drugs that inhibit BRAF.  The search tool also picks up synonyms, ensuring that you don’t miss out on potentially valuable data.  Contrast this with a conventional search engine: if you search for “drug”, you’ll get results which mention the word “drug”.  With a semantically enriched search engine, however, the computer knows that this actually means anything which is defined as a drug.

Related Reading: Semantic Search vs. Keyword Search

Extracting associations


And the results go beyond just highlighting individual entities, allowing you to extract information about relationships between entities, such as gene-phenotype or drug-target.  Extrapolate this over 28 million Medline abstracts, and you have an incredibly powerful tool.

Building a knowledge network

Image: Lopez-Pajares V et al 2013

These relationships can then be built into networks, providing you with a computer readable framework for searching the data and making new connections.

Labelling the entities in the text with unique identifiers allows you to take this a step further and map to other data systems, connecting related diseases, adverse events, pathways, and drug labels.

And of course, this method can be turned on its head to discover new information.  For example, you could compare diseases based on their phenotype profiles.  Once you know that two diseases are strongly related, if there’s a drug which treats one of these conditions, you can hypothesise that you have a potential repurposing candidate on your hands for the other condition. This is a technique we’ve explored before.
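One simple way to compare diseases by phenotype profile is a Jaccard overlap score. This is a minimal sketch: the disease names, phenotype sets, drug name, and threshold are all invented for illustration:

```python
# Hypothetical sketch: score disease similarity by phenotype overlap
# (Jaccard), then propose repurposing candidates for strong matches.
def jaccard(a, b):
    return len(a & b) / len(a | b)

phenotypes = {
    "disease_A": {"vascular malformation", "lesion", "bleeding"},
    "disease_B": {"vascular malformation", "lesion", "headache"},
}
known_drugs = {"disease_A": ["drug_X"]}  # placeholder: a drug treating disease_A

candidates = []
similarity = jaccard(phenotypes["disease_A"], phenotypes["disease_B"])
if similarity > 0.4:  # threshold chosen purely for illustration
    # Drugs for the related disease become repurposing hypotheses.
    candidates = known_drugs.get("disease_A", [])
```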

Ready to learn more? Listen to SciBite & CCC’s on-demand webinar: Exploring Drug Repurposing for Rare Diseases Through Semantic Analytics

*Editor’s Note: This blog post was originally published on SciBite’s blog on Feb. 28, 2018. 

What are Ontologies – And How Are They Built? An Interview with SciBite’s Founder Lee Harland Wed, 07 Feb 2018 10:20:36 +0000 What are ontologies, and how are they created? To answer this question, we spoke with our partners at SciBite in CCC’s Beyond the Book podcast series.

The post What are Ontologies – And How Are They Built? An Interview with SciBite’s Founder Lee Harland appeared first on Copyright Clearance Center.

Here at CCC, scientific ontologies are hugely important to the semantic search capabilities built into RightFind Insight.

But what exactly are ontologies, and how are they created? To answer this question, we spoke with our partners at SciBite in CCC’s Beyond the Book podcast series.

Listen here to the full interview with SciBite’s founder Lee Harland, or check out our summary below.


What is an ontology?

Oxford Dictionaries defines an ontology as:

A set of concepts and categories in a subject area or domain that shows their properties and the relations between them.

Essentially, an ontology’s purpose is to properly define something. In life sciences organizations, ontologies might be created to categorize diseases, drugs, genotypes/phenotypes, mechanisms of action, and other biomedical concepts. Adding this layer of meaning to raw text makes a document easier to synthesize and process further.


“For those who do study ontologies, there’s a very famous concept learned in their first year of university: the pizza ontology,” Lee said. “The idea is that pizzas are split up into bases and toppings, and how those relate to each other.  It’s really a conceptualization of a particular domain in a computer-readable format.”
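Lee’s pizza example can be sketched as a tiny computer-readable structure – here, a hypothetical Python dictionary with subclass (“is_a”) relations; real ontologies use richer formats such as OWL:

```python
# Hypothetical sketch of the pizza ontology idea: concepts and subclass
# ("is_a") relations in a computer-readable structure.
ontology = {
    "Pizza": {"is_a": "Food", "has": ["Base", "Topping"]},
    "Margherita": {"is_a": "Pizza", "toppings": ["Tomato", "Mozzarella"]},
    "Topping": {"is_a": "Food"},
}

def is_a(onto, child, ancestor):
    """Walk up the subclass chain to test whether child descends from ancestor."""
    while child in onto:
        parent = onto[child].get("is_a")
        if parent == ancestor:
            return True
        child = parent
    return False
```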

How are ontologies produced?

Ontologies are produced by the scientific community, and funded by both private and public money across the globe.  SciBite is in a unique space, in that the organization is both a consumer and a producer of ontologies.

“The ontologies we work with aren’t the result of one, two, three, four people,” Lee said. “They’re the result of thousands of experts, everyone contributing a tiny little bit of knowledge to an overall coherent map of a particular set of cells, tissues, diseases, etc. The power to be able to leverage that expertise in a computer-readable format is incredible.”

This collaborative process gets to the heart of why organizations are doing this research in the first place.

“I think the power is in the openness, the fact that they are done in the public domain, they are free to use by everybody,” Lee said. “It promotes data interoperability, and the ability to do these experiments.”

How can ontologies be applied to text?

When ontologies are applied to text, the result is a semantically-enriched text document.

Lee breaks down the concept with the example of a hedgehog. If you’re not a scientist in the life sciences realm, you’re likely to think of a hedgehog as a little, spiky animal. But to many scientists, hedgehog is the better-known name of a protein that’s critical in cell division, a major process involved in cancer.

“When you say hedgehog to a life scientist, particularly in molecular biology or human genetics, they’re much more likely to be thinking about the hedgehog gene or protein, and not the loveable animal,” Lee said.  “When you are trying to apply ontologies to text, and you see the word hedgehog, you’ve got to build systems that say right, OK, this could mean one of two things. I’m not going to annotate it as the hedgehog protein unless I really think it is, and similarly I’m not going to annotate as the hedgehog animal unless there’s something that tells me that it is the animal.  That’s disambiguation.”
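The disambiguation step Lee describes can be sketched as a simple cue-word check. Real systems use much richer context models; the cue lists below are invented for illustration:

```python
# Hypothetical sketch: annotate "hedgehog" as a protein or an animal only
# when context words support the reading; otherwise, do not annotate at all.
PROTEIN_CUES = {"gene", "protein", "signaling", "pathway", "cell"}
ANIMAL_CUES = {"spines", "garden", "mammal", "nocturnal"}

def disambiguate_hedgehog(sentence):
    words = set(sentence.lower().replace(".", "").split())
    if words & PROTEIN_CUES:
        return "PROTEIN"
    if words & ANIMAL_CUES:
        return "ANIMAL"
    return "AMBIGUOUS"  # no evidence either way: leave unannotated

label = disambiguate_hedgehog("Hedgehog signaling drives cell division.")
```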

Today, when organizations like SciBite apply ontologies to text, they’re providing the ability to search through a document, or thousands of documents, to find relevant terms, ultimately enhancing and accelerating the R&D process.


What R&D and Life Sciences Organizations Need to Know About IDMP Tue, 16 Jan 2018 08:31:17 +0000 Beset by delays and revisions, IDMP, the set of new international standards for identifying and describing “medicinal products,” is nonetheless being rolled out, month by month. Here’s what you need to know, with insights from Paul Milligan, Senior Product Manager at Linguamatics.

The post What R&D and Life Sciences Organizations Need to Know About IDMP appeared first on Copyright Clearance Center.

“Complex” doesn’t begin to describe IDMP, the new set of international standards for identifying and describing medicinal products that is currently being rolled out in Europe in phases, despite a plethora of delays and revisions. Short for “Identification of Medicinal Products,” IDMP is actually meant to streamline the tracking of medicinal products in a global market by standardizing descriptions of substances, including dose forms and units of measurement.

That may sound simple, but many life sciences and R&D organizations don’t have IDMP on their radar, and are not set up to meet these standards. In a 2017 survey of life sciences companies conducted during a Pistoia Alliance webinar:

  • 42% of respondents said they knew very little about IDMP
  • 25% said they had only a basic understanding of the upcoming global regulations

Just as surprising in our high-tech age, 40% said their regulatory and R&D divisions still use “unstructured paper and PDF-based reports” to exchange information on substances.

“The goal of IDMP is to make sure companies have a truly standardized description of their products. But most still store their data in all sorts of formats, files and databases, which means an organization’s internal description of a product or substances may be very different from what the regulators want,” says Paul Milligan, senior product manager at Linguamatics, a text-mining software company based in Cambridge, England. (CCC and Linguamatics are partners—Linguamatics’ I2E software is integrated with CCC’s RightFind™ XML for Mining.)

The good news is that, with foresight and the right systems in place, life sciences and R&D organizations will not only be able to comply with the new standards but can also reap benefits that translate into time saved, problems solved, and the potential for more profits down the line.

I talked with Paul Milligan about what IDMP issues should be top of mind:

What are the problems life sciences and R&D organizations face when it comes to complying with IDMP standards?

Paul Milligan: The basic challenge is for companies to get their own internal data into a format that can then be shared with regulators. It’s not that companies haven’t been providing this information—they have. The problem is, the different sources of information necessary to meet the IDMP labeling standards have typically been siloed in different databases. That creates a challenge in terms of bringing the necessary pieces of information together in a timely fashion that makes sense with a company’s workflow.

Meeting IDMP standards is going to require a big push from pharmaceutical companies, biotechs and other stakeholders to break these silos down and find a systematic way of overcoming the technical barriers. The good news is, once that happens, it will be easier for everyone involved to learn from and explore the data, spotting new patterns, speeding regulatory submissions, and tracking adverse events.

What else do companies need to do, beyond gathering and standardizing the required information?

PM: The whole idea of IDMP is that a broad set of data elements need to be tied in with the product, such as manufacturer, indication, adverse events, along with dosage strength and formulation. On a basic level, that means organizations will need to establish a scalable process where it’s easy to tell what information is going in and what is coming out, where they can extract information easily, and where everything is done systematically, so nothing is inadvertently omitted. After that, you have to be able to put the information into context, meaning that if you spot an adverse side effect somewhere, you’ll also want to know the drug that caused it, the dosage, and any information that can give meaning to the adverse event. These are prerequisites.

Is it possible for organizations to do this manually?

PM: It’s possible, but it would take a lot of people and time to gather the information, and there’s more chance of introducing human error.  Pulling out the required IDMP data elements from regulatory text sources can be very time-intensive, and of course it needs to be kept up-to-date with new information. Here’s an example: Let’s say you need to review the literature for any new mentions of adverse events. If you do a standard keyword search, you type in an adverse effect and a drug and then you have to wade through all the documents to find the relationship between these two terms, otherwise you won’t be able to tell if a particular drug is causing the adverse effect.

If you’re using a machine-based approach to information extraction, you can immediately say, “We’ve found this term and it’s being reported as an adverse event caused by this or that drug.” Text mining can be a powerful way to pull out the adverse events in the data without having to read every document—the machine is doing the initial info-grabbing and summarizing—and specific, relevant documents can be read later.
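The contrast Paul describes can be sketched in a few lines: instead of a document-level keyword match, require the drug and the event in one sentence joined by a causal verb before reporting a relationship. The drug names, events, and verb list below are invented for illustration:

```python
import re

# Hypothetical sketch: report a drug/adverse-event pair only when both
# appear in one sentence linked by a causal verb.
CAUSAL = re.compile(r"\b(caused|induced|associated with)\b")

def extract_relations(text, drugs, events):
    found = []
    for sentence in text.split("."):  # naive sentence splitting
        for drug in drugs:
            for event in events:
                if drug in sentence and event in sentence and CAUSAL.search(sentence):
                    found.append((drug, event))
    return found

text = ("DrugX induced severe headache in two patients. "
        "DrugY was well tolerated. Headache history was recorded.")
relations = extract_relations(text, drugs=["DrugX", "DrugY"], events=["headache"])
```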

Are there any unlooked-for benefits that could result from the push to satisfy IDMP?

PM: Organizations will be able to identify potential problems with products earlier in the development process. Text mining software, for instance, can rapidly sift through the scientific literature on a particular drug, extracting relevant notes on patients from clinical trials and identifying any adverse events sooner rather than later. That’s going to save organizations time, effort and money so they can focus their attention on what really matters—developing drugs, designing trials and getting the products submitted to regulators.

Organizations don’t want to throw out processes that have been working for them for years. How can new and old systems be easily integrated?

PM: By definition, people who pay attention to regulatory processes are cautious—and no one wants to have to reinvent the wheel to meet these new requirements.

One way for pharmaceutical companies to approach IDMP would be to have their normal team of reviewers looking at data and spotting errors, and to add a layer of automation for faster review cycles.



Semantic Search vs. Keyword Search Tue, 14 Nov 2017 06:13:25 +0000 Ever tried searching for medical papers using a standard search engine?  Happy with the results you get?  Probably not.  There are serious limitations to using keyword search in the pharmaceutical industry.  Phil Verdemato from SciBite explains how they can be overcome with the power of semantic search.

The post Semantic Search vs. Keyword Search appeared first on Copyright Clearance Center.

Ever tried searching for medical papers using a standard search engine?  Happy with the results you get?  Probably not.  There are serious limitations to using keyword search in the pharmaceutical industry.

Imagine you were looking for papers featuring the enzyme type ‘GSK’. Using a generic search engine, you would get articles that mentioned ‘glycogen synthase kinase’, as well as articles about the company ‘GlaxoSmithKline’ – which is not particularly relevant to your search here.  However, a semantic search engine, powered by scientific vocabularies and a disambiguation system, will just focus on results featuring the protein, giving you context specificity.

If you needed even more accuracy and wanted to find a specific protein such as GSK3, you would be required to do a search for:

glycogen synthase kinase 3 alpha, GSK-3-A, GSK3A, alpha glycogen synthase kinase-3, glycogen synthase kinase-3A…

It’s a pretty long list of synonym derivatives, right?  A good semantic search system, on the other hand, does all this for you when it indexes, so that you don’t have to worry when searching.
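At index time, that burden moves to the system: every surface form maps to one canonical ID. Here is a minimal, hypothetical sketch (the synonym table abbreviates the list above, and real systems use proper tokenization rather than substring matching):

```python
# Hypothetical sketch: map every synonym to one canonical ID at index time,
# so a single query for the canonical ID finds all surface forms.
SYNONYMS = {
    "glycogen synthase kinase 3 alpha": "GSK3A",
    "gsk-3-a": "GSK3A",
    "gsk3a": "GSK3A",
    "glycogen synthase kinase-3a": "GSK3A",
}

def index_document(doc_id, text, inverted_index):
    for surface, canonical in SYNONYMS.items():
        if surface in text.lower():
            inverted_index.setdefault(canonical, set()).add(doc_id)

idx = {}
index_document("doc1", "We assayed GSK-3-A activity.", idx)
index_document("doc2", "Glycogen synthase kinase 3 alpha was inhibited.", idx)
# Querying the canonical ID now retrieves both papers.
```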

Transformative Data Integration

Having done this, you are then set up for better downstream data analysis because your conversion from unstructured to structured (typed) data is way more accurate.

You can then connect your enriched, structured data to databases and other systems, giving enhanced data connectivity across the organisation and speeding up analysis.

Group Level Searches

Great semantic search provides taxonomic relationships between its entities, so higher order searches are possible.  Let’s take the example of ‘Viagra’ – whose current use was found as an adverse effect during its trials for pulmonary hypertension.

I’d find a bunch of articles that would mention things like Viagra’s protein target, Phosphodiesterase 5A (PDE5A).  The image below shows how PDE5A and Phosphodiesterase 11A (PDE11A) were found in an article and where they sit in the taxonomy.


We can see that PDE5A sits in an enzyme taxonomy under the wider ‘Phosphodiesterase’ class. I could click on the ‘Phosphodiesterase’ class and get the system to search for anything under it:


You can see how PDE8B and PDE10A were identified in this way.


This becomes incredibly useful if, say, you’re interested in finding out which competitors have developed drugs for a target you’re working on.

What you’re looking for is a rich set of taxonomies covering areas such as diseases, drugs, protein classes and so on.
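
The group-level expansion described above can be sketched as a small recursive walk over a taxonomy; the tree below is a simplified, illustrative fragment, not a curated enzyme ontology:

```python
# Illustrative parent -> children taxonomy fragment.
TAXONOMY = {
    "Phosphodiesterase": ["PDE5A", "PDE8B", "PDE10A", "PDE11A"],
    "Kinase": ["GSK3A"],
}

def expand(concept):
    """Return the concept plus everything beneath it in the taxonomy."""
    terms = [concept]
    for child in TAXONOMY.get(concept, []):
        terms.extend(expand(child))
    return terms
```

Clicking the ‘Phosphodiesterase’ class in the interface amounts to running the query over `expand("Phosphodiesterase")` – which is how PDE8B and PDE10A turn up without being named explicitly.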

A good semantic search engine will embed the concepts (that is, entities such as “PDE5A”, entity classes, e.g. “gene”, or higher-level abstractions like “protein class”) within the plain text.  How is this useful?  Query time becomes fast and extremely accurate, because synonym expansion has already been done for you at indexing.
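
A toy illustration of such inline concept embedding, with an invented tag syntax and concept identifiers:

```python
def annotate(text, entities):
    """Embed concept IDs and classes inline.

    entities: surface form -> (concept_id, entity_class); both the
    bracket syntax and the IDs here are invented for illustration.
    """
    for surface, (cid, klass) in entities.items():
        text = text.replace(surface, f"{surface}[{cid}|{klass}]")
    return text

doc = annotate("Sildenafil inhibits PDE5A.",
               {"PDE5A": ("ENZ:PDE5A", "protein")})
```

Once the concept identifier sits in the indexed text itself, a query for the identifier (or its class) matches directly, with no expansion step at query time.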

In essence, you have far more control over the granularity of your searches than in generic search engines.  You could, for example, search for articles in Medline that mention any Orphan Disease:


That’s hard to do in one step in a generic search engine that doesn’t leverage life science taxonomy data.


Additionally, you could examine the co-occurrence data to get a feel for the landscape.  In this example, I could look at the indications commonly associated with documents mentioning PDE5A:



Here, we quickly see that Erectile Dysfunction and Pulmonary Hypertension are associated with PDE5A – and also how much time this view can save when working in drug repurposing.

You could also look at co-occurrences at the sentence level.  Sentence-level co-occurrences are stronger indicators of a real association between entities than document-level ones.  Why? Because at the document level you might find entities in a keywords section that holds spurious and unrelated terms.
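
The difference between the two levels can be sketched with a toy co-occurrence check; the per-sentence entity sets below are hard-coded stand-ins for the output of an entity recognizer:

```python
# Entities found in each sentence of one (invented) document.
doc_sentences = [
    {"PDE5A", "Erectile Dysfunction"},   # a real in-sentence association
    {"Pulmonary Hypertension"},
    {"PDE11A"},                          # e.g. noise from a keywords list
]

def cooccurs(a, b, sentences, level="sentence"):
    """Check whether two entities co-occur at sentence or document level."""
    if level == "document":
        found = set().union(*sentences)
        return a in found and b in found
    return any(a in s and b in s for s in sentences)
```

Document-level counting links PDE5A to PDE11A merely because both appear somewhere in the document; sentence-level counting does not, which is why it is the stronger signal.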


A comprehensive autocomplete index helps guide your searches – going a little deeper than GSK the company versus GSK the protein!

Fit-For-Science Search

But you’re not limited to entities and types that have already been curated.  You can build your own vocabularies or use plain text:


Note how Gilenya is searched for, but FTY720 (a synonym of the drug) is correctly identified.  See also how ‘Indication’ is an entity type and how ‘worldwide or global’ is a plain text query to identify documents that mention either word.

Remember that semantically enabled search is only as good as the vocabularies it’s built on.  An excellent vocabulary with a huge number of synonyms means that typing in the brand name of a drug also brings up papers associated with its clinical name.

And there you have it – pitted against the depth and breadth that semantic search offers, keyword search simply cannot compete in terms of accuracy, comprehensiveness, or efficiency.  Semantic search allows you to buy back valuable time that would otherwise be spent sifting through huge numbers of documents, and even to convert textual data into something you can integrate across your systems, thanks to entity recognition.


Ready to learn more? Check out:

The post Semantic Search vs. Keyword Search appeared first on Copyright Clearance Center.

Taking Semantic Search to Full Text [Upcoming Webinar] Tue, 31 Oct 2017 07:43:12 +0000 How will you take your R&D program to the next level in 2018? One way to accelerate your research initiatives…

The post Taking Semantic Search to Full Text [Upcoming Webinar] appeared first on Copyright Clearance Center.

How will you take your R&D program to the next level in 2018?

One way to accelerate your research initiatives and inform critical business decisions is through the semantic enrichment of full-text articles.

Semantic enrichment describes the process of adding a layer of meaning to raw content. This enhancement of content with information about its meaning adds structure to unstructured information, enabling users to move quickly to more intelligence-rich information activities.

Semantic search can have an immediate impact across your organization, and taking it a step further with full-text scientific literature improves these outcomes by enabling access to more facts and relationships, secondary study findings and adverse event data.

Even though using abstracts seems like a reasonable approach, there are limitations to what can be discovered through that process.  Researchers need access to the full text of the articles to ensure they don’t miss vital data and undiscovered assertions that can lead to new discoveries.

This all sounds well and good, but it doesn’t come without its challenges. Semantic enrichment projects can be resource-intensive and can take time to demonstrate business value. Plus, obtaining full-text articles in a machine-readable format across multiple publishers can be a struggle.

Join us on 7 November for a Webinar: Taking Semantic Search to Full Text

CCC will be joined by SciBite on Tuesday, 7 November at 9:00 a.m. or 1:00 p.m. EST for a live webinar.

CCC’s Product Manager Mike Iarrobino, alongside SciBite founder Lee Harland, will discuss:

  • Content challenges facing R&D teams in the life sciences
  • Benefits of semantic enrichment of full-text content
  • Solutions that enable you to reduce manual and administrative overhead, while adding value to information discovery and innovation initiatives

Need some background information before you attend the webinar? Check out:


5 Ways to Apply Semantic Search Across Your Organization Tue, 17 Oct 2017 07:04:46 +0000 Semantic search can have an immediate impact across your organization. Here are five common use cases.

The post 5 Ways to Apply Semantic Search Across Your Organization appeared first on Copyright Clearance Center.

Information managers must balance the needs of multiple internal constituencies to support information discovery. In R&D-intensive industries such as the life sciences and chemical manufacturing, semantic search can help – delivering value by giving us the ability to turn content into insight.

Semantic enrichment is the enhancement of content with information about its meaning, thereby adding structure to unstructured content. Semantic search builds on enriched content by matching the user’s query intent – not just the keywords they provide – to the relevant content, helping them quickly discover what they need.

The following illustrates how semantic search can have an immediate impact on five common use cases in life sciences and R&D organizations:

Early Phase Research

Researchers can discover interesting potential biomarkers and drug targets they hadn’t known to look for in advance. These initial results can be linked to supporting source content for further review prior to wet-lab work.

Competitive Intelligence

Competitor patent filings, often intended to hinder discovery, can be explored alongside non-patent literature (NPL) to provide a full picture of competitor strategy, claims, and prior art for patent landscaping or other purposes.


Pharmacovigilance

Literature monitoring for pharmacovigilance can become both more comprehensive and more precise through semantic searches that suggest links between adverse events and pharmacological substances, increasing the efficiency of these vital monitoring workflows.

Read more: Why Text Mining for Pharmacovigilance?

IDMP (Identification of Medicinal Products) Compliance

IDMP initiatives directed by the Food and Drug Administration (FDA) and European Medicines Agency (EMA) aim to standardize how information can be expressed about pharmacological products. Semantically enriched internal and external content can provide a fuller view of medicinal product attributes, supporting IDMP compliance.

Discovery of Chemical Compounds

Researchers can take advantage of well-established chemical ontologies to conduct more efficient semantic searches for chemicals, more easily identifying relevant chemical compounds, their properties, and relationships.

Use Semantic Search to Uncover Scientific Meaning

R&D and information managers routinely use keyword search to find the information they need. While keyword search may satisfy the basic needs of researchers, there are limitations that can affect productivity and slow the pace of discovery.

Learn here how semantic search can provide you with more comprehensive and relevant search results.


Understanding Text Mining: 4 Need-to-Know Terms and Their Definitions Tue, 03 Oct 2017 06:03:26 +0000 Text mining offers many benefits, but the technology is complex. Discover the four terms your team needs to know to gain maximum insights.

The post Understanding Text Mining: 4 Need-to-Know Terms and Their Definitions appeared first on Copyright Clearance Center.

As the use of text mining becomes more widespread, now is the time for information managers to make sure they understand the basics.

Text mining, the process of deriving high-quality information from text materials using software, helps researchers identify patterns or relations between concepts that would otherwise be difficult to discern. The result is faster discovery and smarter decision-making.

Looking for a place to start? Here are four key text mining terms every information manager should know:


XML

Short for Extensible Markup Language, XML is an information exchange standard designed to improve usability, especially when the data is interpreted by software. In other words, it is a more readily machine-readable version of a document. XML tends to be the preferred input format for semantic or text and data mining technology, as well as other processing software.

When acquiring full-text articles, researchers are usually able to access only the PDF format, necessitating conversion into XML for text mining. This can be an arduous and error-prone process.

Semantic enrichment

Semantic enrichment describes the process of adding a layer of meaning to raw content. This enhancement of content with information about its meaning thereby adds structure to unstructured information, making the content easier to synthesize and process further. For example, a scientific article can be enriched by adding in-line annotations or tags describing the genotypes/phenotypes, diseases, drugs, mechanisms of action, and other biomedical concepts mentioned within. Semantic enrichment is a key enabler of the various strategic initiatives undertaken by informatics and information management professionals.
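
As a hedged sketch of what such inline annotation can look like, the snippet below parses a fragment of enriched XML with Python’s standard library; the tag names and attribute values are invented, not a real publisher or NLP schema:

```python
import xml.etree.ElementTree as ET

# A minimal, invented example of an enriched article fragment.
enriched = (
    '<p>Patients taking <entity type="drug">sildenafil</entity> reported '
    'improvement in <entity type="disease">pulmonary hypertension</entity>.</p>'
)

root = ET.fromstring(enriched)
# Pull out the annotated drug mentions by entity type.
drugs = [e.text for e in root.iter("entity") if e.get("type") == "drug"]
```

Because the annotations are machine-readable tags rather than free text, downstream tools can collect every drug or disease mention without re-running any language analysis.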

White Paper: Semantic Enrichment & The Information Manager

TDM rights

Content is associated with a variety of rights. Information management professionals and librarians will be familiar with copyright licensing, reproduction rights organizations, and other frameworks and organizations that enable content consumers to use, share, and disseminate information while respecting copyright.

As may be expected, there are a number of copyright-sensitive acts that go hand-in-hand with the text and data mining (TDM) process. Content may be copied, stored, annotated or enriched, and otherwise scanned to produce a useable research output. In most cases, commercial TDM rights are not included in standard subscription agreements. Publishers may make a standard or special set of ‘TDM rights’ available as part of their subscription agreements, or as additional incremental rights.

Machine learning

Machine learning is one approach to synthesizing raw or semantically enriched content to yield insights.

Machines can be instructed to process information in many ways. One way is to apply strict rules that attempt to cover every instance that is likely to come up. For instance, one rule might be: when A is the input, B is always the output. But while this is simple in theory and easy for humans to understand, it can be difficult to maintain, scale, and capture value from this process in practice.

Machine learning is another way for machines to process information. In this case, the system is ‘trained’ by way of example, rather than given rules. For example, a system that is meant to classify images into either pictures of humans or pictures of cats would be given a set of images and told they are humans, and another set and told they are cats. From there, the system can move on to classifying other images, with feedback being given continually. It is through this feedback that the system is able to constantly adjust to improve its classification ability and yield greater insights.
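
The contrast between hand-written rules and training by example can be sketched with a toy word-count classifier; the training data is invented, and real systems would use far richer features and models:

```python
from collections import Counter

def train(examples):
    """examples: list of (text, label). Learn word evidence per label
    from the examples themselves, rather than from hand-written rules."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def classify(text, counts):
    """Pick the label whose training words best match the input."""
    scores = {label: sum(c[w] for w in text.lower().split())
              for label, c in counts.items()}
    return max(scores, key=scores.get)

model = train([
    ("whiskers purring cat", "cat"),
    ("fur cat meow", "cat"),
    ("person walking talking", "human"),
    ("human speaking person", "human"),
])
```

Feeding the system corrected examples and retraining is the feedback loop described above: the model adjusts its word evidence rather than anyone editing a rule.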

Text mining and semantic enrichment are increasingly being used as data processing techniques to enable machine learning programs. Here are a few examples of how machine learning is helping the industry to evolve.


Want to learn more? Text mining enables researchers to deliver valuable insights based on relevant data. Find out about XML for Mining and more about what your team needs to know about text mining.

