- How does Data Science impact the Semantic Web? by Philip Bourne
- Semantic Web and the New Industrial Revolution by Dean Allemang
- Spinning a Semantic Web for Agriculture by Medha Devare
- Artificial Intelligence and Big Data in Health: the dilemma of Truth by Christian Lovis
How does Data Science impact the Semantic Web?
Semantic Web and the New Industrial Revolution
Dean Allemang is CEO and Principal Consultant at Working Ontologist, LLC, a firm devoted to the deployment of Semantic Web solutions. In this capacity, he has provided industrial-strength semantic solutions in a variety of industries, including finance, media, and government. He has served on review boards for semantic web programs for pharmaceuticals, health care and agriculture. As a long-time traveller in knowledge science (PhD in AI in 1990, worked at five different AI labs in Europe in the early 90’s, and co-founded one of those companies in the late 90’s that tried to invent the Semantic Web when the standards were just a gleam in the eye of a few W3C folks), Dean has a broad perspective on how knowledge-based systems can be applied in a variety of contexts. Dean is co-author of Semantic Web for the Working Ontologist, the best-selling book on the Semantic Web, and is an expert on the W3C semantic web standards, including RDF, RDFS, OWL, SPARQL, SKOS and SHACL. With a PhD in Computer Science (as an NSF Fellow at Ohio State) and an MSc in Pure Mathematics (as a Marshall Scholar at Trinity College, Cambridge), he brings a formal approach to semantic modeling, which he couples with over 15 years’ experience of successful business deployments of knowledge technology. Dean is a repeat winner of the Swiss Technology Center award for Innovation.
Spinning a Semantic Web for Agriculture
CGIAR is a global research partnership of 15 Centers primarily located in developing countries, working in the agricultural research for development sector.
The CGIAR system is charged with tackling challenges at a variety of scales from the local to the global, which generally means being able to query and/or aggregate a variety of data types and streams. CGIAR aspires to FAIRness in its research outputs, and while progress is being made, the I of FAIR is the frog that doesn’t readily turn into a prince for the FAIR ringmistress in this tale. Yet interoperability – particularly semantic interoperability – is critical to providing meaning and context to CGIAR’s varied information resources and to enabling integration between linked or related data (e.g. an agronomic data set and related socioeconomic data).
CGIAR’s approach to interoperability and data harmonization focuses on the use of standard vocabularies and a strong reliance on ontologies developed across CGIAR (efforts such as the Crop Ontology, the Agronomy Ontology – AgrO, and the in-development socioeconomic ontology – SociO) and by other entities (ENVO, UO, PO, etc.). Spinning these into a semantic web for agriculture is a primary focus of CGIAR’s Big Data Platform for Agriculture and its Global Agricultural Research Data Innovation and Acceleration Network (GARDIAN). GARDIAN is intended to provide seamless, semantically linked access to CGIAR publications and data, to demonstrate the full value of CGIAR research, enable new analyses and discovery, and enhance impact.
In this talk, I will discuss CGIAR’s path to a FAIRy tale ending, complete with the sticky considerations entailed in web-spinning.
Medha Devare, Ph.D., is Senior Research Fellow with the International Food Policy Research Institute (IFPRI). She led the CGIAR System’s Open Access/Open Data Initiative, and currently leads efforts to organize data and enable semantic interoperability across CGIAR’s 15 agriculture-for-development-focused Centers through its Big Data Platform. Medha is a Cropping Systems Agronomist with significant experience leading projects addressing food security and sustainable resource management in South Asia. She also has expertise in data management and semantic web tools; while at Cornell University, she was instrumental in the development of VIVO, a semantic web application for representing academic scholarship.
Artificial Intelligence and Big Data in Health: the dilemma of Truth
In medicine, the emergence of several new instruments in the field of artificial intelligence, such as deep learning, is closely linked to the increased availability of large data sets. We are not only in a new era of data-driven life sciences; we are also seeing these enabling technologies bring major changes in numerous health sectors, such as genomics and precision medicine, disease prevention and health determinants, decision support, and citizen empowerment. These changes are leading to more than a simple evolution of medical practice: they are raising numerous challenges at all levels, reaching from science and engineering all the way to legal, societal and ethical considerations.
The data. Factors influencing health and disease cover the whole range of imaginable determinants. The current health data landscape, made up of genomics and molecular biology, is increasingly seen as only a fraction of the whole landscape that includes the healthcare system, individual behavior, socio-economic factors, environmental exposure, and finally the regulatory and geo-political system. In other words, our health comprises what we are, how we behave, and the ecosystem we inhabit and interact with. Data science applied to determinants of health thus covers a gigantic range of data that is massive in quantity, highly heterogeneous, and multimodal and multidimensional in scope. Moreover, the highest burden for humans and society is caused by chronic conditions, most of which reflect very long disease histories. Conditions with very different etiologies, such as heart failure due to hypertension, chronic obstructive pulmonary disease due to smoking, and blindness due to glaucoma, all require long exposure, usually more than 20 years, to emerge.
The knowledge. The challenges raised by the high complexity of data are reinforced by the challenge of depending on dynamic and fragile frameworks of evidence. The most important source of knowledge in life sciences is the very large digital databases of the US National Library of Medicine (NLM), part of the US National Institutes of Health (NIH). These databases cover all fields of life sciences, from molecular biology, such as Genes and Proteins, to peer-reviewed publications. The latter are an important source of “evidence-based peer-reviewed literature”, or “truth”.
The intelligence. Artificial intelligence, and especially supervised deep learning, has significant potential in medicine. However, these instruments rely on reliable sources of “truth” that in turn depend on “well annotated” data sources. Building decision-support systems, or predictive systems, also requires that we build trust in these new tools. Opening the black box in order to improve understanding of the mechanistic and deterministic processes that drive deep learning is an important but insufficient step forward. Building trust requires that we develop methodologies able to assess the reproducibility, the generalizability, and thus the reliability of these tools. It also requires assessment of their positive and negative predictive power in various a priori probability landscapes.
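To illustrate why predictive power has to be assessed against different a priori probabilities, the following minimal sketch applies Bayes' rule to a hypothetical classifier. The function name `predictive_values` and the 95% sensitivity and specificity figures are assumptions chosen for illustration; they are not taken from the talk.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    All three arguments are probabilities in (0, 1). `prevalence` plays
    the role of the a priori probability of disease in the population.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive result)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative result)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical tool: 95% sensitivity and 95% specificity (illustrative only).
    for prevalence in (0.001, 0.01, 0.1, 0.5):
        ppv, npv = predictive_values(0.95, 0.95, prevalence)
        print(f"prevalence={prevalence:<6} PPV={ppv:.3f} NPV={npv:.3f}")
```

Under these assumed figures, the same tool has a positive predictive value of 0.95 when half the population is affected, but only about 0.16 when prevalence is 1%, which is why a single accuracy figure says little about reliability in a given clinical setting.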
Societal challenges. Finally, there are many challenges at the societal level. Privacy is very difficult to preserve, as it is extremely difficult to anonymize any particular individual’s dataset without losing the granularity required for analytics. But other questions arise, such as how to build a social system based on the distribution of risks when risks and protective factors become potentially known for each individual. In this talk, the hopes, expectations and challenges described above will be discussed.
Christian is professor of clinical informatics at the University of Geneva and chairman of the division of medical information sciences at the University Hospitals of Geneva. He is a medical doctor, board certified in Internal Medicine with special emphasis on emergency medicine, and holds a master in public health from the University of Washington, with parallel education in biomedical informatics at the University of Geneva focusing on clinical information systems, clinical data interoperability and medical semantics. One focus of his team is phenotype clinical data interoperability and semantic representation to support research, including a strong lead on computational linguistics and tools to use medical narratives and texts. He is a Fellow of the American Medical Informatics Association (FACMI) and is certified in medical informatics by the German Medical Informatics Association. Christian has co-authored more than 150 publications. He is editor-in-chief of JMIR Medical Informatics and editorial board member of several peer-reviewed journals in biomedical informatics, such as the Journal of the American Medical Informatics Association (JAMIA), PLOS One, Applied Clinical Informatics (ACI) and BMC Big Data Analytics. He is a member of the executive board of the Swiss Personalized Health Network initiative. He is the 2016-2018 president of the European Federation for Medical Informatics. Christian has participated in several start-ups