PRESS RELEASE | Bangalore, India, August 2019
To promote data sharing in research, an increasing number of publishers, funders, and institutions have adopted data sharing policies that either recommend or mandate that data associated with a research article be shared in a public repository.
Even when data sharing is mandatory, compliance with these policies is patchy. Authors are often unsure which datasets they should share and which repository is most appropriate, and stakeholders cannot easily assess whether authors’ sharing efforts meet their policy requirements.
DataSeer is being developed to address these challenges. It is an open source, web-based service that uses natural language processing (NLP) to identify the datasets associated with a particular article. Authors are shown best practices for sharing their type of data, and stakeholders are sent a report detailing the completeness of the authors’ sharing efforts.
The project is led by Dr Tim Vines, a peer review workflow expert who conceived of DataSeer while trying to enforce the data sharing policy at the journal Molecular Ecology. To train the DataSeer NLP algorithm to find dataset mentions in research articles, Dr Vines engaged Molecular Connections. “Molecular Connections are working with us to generate a database of sentences in research articles that describe data collection. We are currently focused on the Methods sections of 2000 published research articles from a wide range of research fields. In each of those, Molecular Connections has identified the main data collection sentence and the type of data being collected, and noted any specialist equipment the researchers used,” commented Dr Vines.
He continued, “Our aim is to develop this database into a high-quality open resource for anyone interested in how researchers describe their data collection efforts. Going forward, we will be working with Molecular Connections to expand the database to new fields and incorporate other components of the research cycle, such as data analysis.”
Krishna K., Director of Sales and Marketing at Molecular Connections, commented, “Content curation is our core business, and we started Molecular Connections back in 2000 as a ‘text mining’ curation company. Today, we are the world leaders in this space and work with publishers (both primary and secondary), societies, and pharmaceutical companies, supporting all their content and technology needs. Over the last two decades, our 2000+ SMEs have dealt with a myriad of content types and have a 360-degree view of the challenges faced. We are delighted to work with Dr Tim Vines, sharing our experience, perspective, and the best practices learnt from our association with the industry, and translating that into a best-in-class workflow solution for DataSeer.”
About DataSeer: DataSeer is an open source project that fills a major implementation gap in open data: identifying exactly which data researchers should share. Dr Tim Vines is the founder of DataSeer and the project lead.
About Molecular Connections: Structuring the Unstructured – Data to Insights
Molecular Connections is the largest STM indexing and abstracting company from India. MC’s 360º solutions help publishers and pharmaceutical companies maximize the value of their information assets. With expertise spanning multiple domains, including machine learning, text mining, literature curation, ontology development, content analytics, and visualization, we serve various industries with proprietary services and products.
Krishna | Director – Sales and Marketing | Cell: +91 96113 85566
Molecular Connections P Ltd. | Corporate Development Center, Bangalore – India