FAIR Agronomy, where are we? The KnetMiner Use Case
Tue 11 Jan 15:50 – 16:20 CET
Marco Brandizi and Keywan Hassani-Pak, Rothamsted Research
FAIR data principles have become a driving force in life sciences and other scientific domains, helping researchers to share their data and unlock its full potential for integrating information and making novel discoveries. Knowledge graphs are an increasingly popular paradigm for modelling data according to these principles, and technologies such as graph databases are emerging as complementary to approaches like linked data. These trends extend to the agronomy, farming and food domains. How advanced is the adoption of sound data management policies in these domains? How does it compare to other life sciences? In this presentation, we will talk about our practical experience, focusing on KnetMiner, a gene and molecular biology discovery platform, which is based on building and publishing knowledge graphs according to the FAIR principles, using a mix of linked data standards for life sciences and recent graph database and API technologies. We welcome questions and discussion from the audience about similar experiences.
Panel session – Agri-semantics. Chair: Chris Baker
Tue 11 Jan 18:30 – 19:30 CET
Head, Digital Technology
Directorate for Open Science (DipSO)
French National Institute for Agricultural Research (INRAE)
Esther Dzalé is an expert in managing scientific data, and uses her computer science skills to share scientific research. Esther is in charge of the digital technology for science team of the Directorate for Open Science (DipSO) at INRAE. Research data management is a core aspect of Esther’s career, and she’s particularly passionate about sharing research data. More than a simple interest in the topic, for Esther, it’s a real conviction. “Working in open science means making sure that scientific knowledge is available and accessible beyond linguistic boundaries, between organizations and across disciplines. It’s very meaningful; that appeals to me.” In 2013, Esther began examining questions about managing and sharing data, such as how, when and if data should be shared, under what conditions, and how to ensure that others could make best use of the data. This process of reflection led to the development of Data Partage, a website to exchange best practices, and Data INRAE. The aim was twofold: to create a space, outside of thematic reference data warehouses, where all of INRAE’s scientific data could be compiled and shared, and to develop a catalogue of INRAE data. Through Data INRAE, Esther and her teams contributed to making INRAE’s data more accessible, interoperable and reusable. She is also active at the international level, participating for example in the Research Data Alliance: she co-chaired the wheat data interoperability working group and the IGAD interest group from 2014 to 2018. Her work has informed INRAE’s data management policies. Recognised as experts in their field, Esther and her teams have been entrusted with the management of an ambitious national project to develop a data repository for all universities and research institutes in France. The project is set to launch in spring 2022.
Biometry and Bioinformatics Team, Institute of Plant Genetics, Polish Academy of Sciences
Dr Hanna Ćwiek-Kupczyńska is a data scientist working in plant research, particularly in managing and analysing data obtained from plant phenotyping experiments. In collaboration with multiple European research partners from EU infrastructure projects and initiatives (such as transPLANT, EPPN2020 and the ELIXIR Plant Community), she has been involved in standardisation efforts for plant phenotyping data description. These efforts shaped the MIAPPE recommendations, a corresponding data model, and its serialisations for RDF (PPEO), web services (BrAPI) and flat files (ISA-Tab). She continues to collaborate with the community to facilitate FAIRification, sharing and integration of plant (high-throughput) phenotyping results. She is also interested in semantic modelling and in linking experimental data with statistical and scholarly information to improve research dataset discoverability.
Head of Ecoinformatics, Rothamsted Research
Richard is the team leader for the newly formed Research Data Systems group at Rothamsted Research, with responsibility for supporting research data stewardship across the institute and Rothamsted’s three data-rich National Capabilities (Long-term Experiments, North Wyke Farm Platform and Rothamsted Insect Survey). Since joining Rothamsted in 2017, Richard has advocated for the adoption of FAIR Data Principles to maximise the impact and value of the institute’s research data. Richard and his team now have a leading role in improving research data stewardship as part of Rothamsted’s ongoing Digital Transformation. As part of the transformation, Richard has led a successful project to improve and standardise data management across Rothamsted’s research farms using the open source farmos.org platform, and to pilot new field survey tools. The next phase of this project is to integrate FarmOS field experiment management with experiment data, using the Grassroots Infrastructure, to provide standards-compliant and annotated research datasets. This project has been a formative experience, highlighting that achieving interoperable and re-usable data is as much a cultural change as a technical one; data creators need to be empowered with appropriate tools and with the data literacy not just to use them but also to understand why.
Having previously worked at CEH’s Biological Records Centre, managing the National Biodiversity Network database, and at QMUL on long-term cancer trials, it was only natural that on joining Rothamsted Richard took a special interest in Rothamsted’s Long-term Experiments. Richard has led the redevelopment of e-RA, which provides comprehensive LTE descriptions and data access. He has also been an active contributor to the Global Long-term Experiments Network, leading development of the GLTEN metadata schema, which provides a consistent and semantically meaningful description of these experiments. From his work on long-term experiments, Richard believes they encapsulate many of the wider challenges faced in integrating and re-using agricultural data.
Richard is currently co-chair of the UK Environmental Observation Framework Data Advisory Group and a member of the UK Soil Observatory collaboration, and has played an active role in the RDA IGAD, where he is currently contributing to a new Crop Data Management Interoperability Work Group.
Digital Conservator, Yale University Library
Dr. Katherine Thornton is an information scientist working on the WikiFCD project. WikiFCD aims to provide web access to food composition datasets. Kat has mapped parts of the WikiFCD data model to Wikidata.
The WikiFCD SPARQL endpoint allows consumers of food composition data to query this data in combination with data from Wikidata or other available RDF data. Kat is a member of the Joint Food Ontology Workgroup.
Kat has been contributing to Wikidata since 2012 and is a member of the ShEx Community Group. Kat is also the co-founder of ScienceStories, a multimedia biography website for scientists.
Liverpool Hope University, UK / LIAAD-INESC-TEC, Portugal
Dr Brett Drury gained his PhD at the University of Porto under the direction of Prof Luis Torgo and undertook postdoctoral studies at the University of São Paulo under the guidance of Prof Alneu Lopes. He was a research fellow on the ROCSAFE H2020 Project at the National University of Ireland Galway and is a former holder of a PIPE Grant from FAPESP, as well as an Innovation Fellow at the Royal Society of Engineering. He is a referee for Computers and Electronics in Agriculture, and a PC member for a number of academic conferences such as Intelligent Data Analysis. Brett specialises in Probabilistic Reasoning and Natural Language Processing (NLP) and has deployed Deep Learning in areas such as image classification.