While text mining scientific article abstracts provides some value, there are limits to the data an abstract contains. Mining the full text of an article, rather than just the abstract, ensures that you don’t miss vital data, discoveries and assertions that are published only in the full text. For detailed descriptions of methods and protocols, and for complete information on all study results, full text is an essential resource.
However, obtaining full-text articles for text mining is often a struggle. Even when you do get access to the full text, you must contend with multiple formats and inconsistent license terms, both of which can inhibit your text mining efforts.
To address these issues, Copyright Clearance Center (CCC), in partnership with Linguamatics, developed an integrated solution that makes it simple for I2E end users to obtain and index full-text XML articles from a wide range of scientific publishers.
Benefits of the Combined Solution
- Automatically index full-text content in XML format in I2E from CCC’s database of millions of articles from major STM publishers
- Discover and purchase XML-formatted articles outside of your company’s subscriptions for mining, and index them in I2E
- Store, manage and analyze XML-formatted article content for all your literature-based text mining projects