Keynotes

This year’s Keynotes are:

How does Data Science impact the Semantic Web? by Philip Bourne
Semantic Web and the New Industrial Revolution by Dean Allemang
Spinning a Semantic Web for Agriculture by Medha Devare
Artificial Intelligence and Big Data in Health: the dilemma of Truth by Christian Lovis

How does Data Science impact the Semantic Web? Watch here

In establishing data science initiatives, first within a major US federal agency (the National Institutes of Health) and now within a major public university (the University of Virginia), one continues to encounter a tension that I will attribute here to different modes of discovery – looking for a needle in a haystack versus purposeful navigation of semantic content. Perhaps oversimplifying, this can also be stated as the tension between working with structured versus unstructured data, a subject on which there can be endless debate. Viewed through my simplistic lens as a researcher who wants to maximize their chances of discovery, I embrace both semantically rich data and poorly characterized, often large, data blobs. What emerges? How do we train across this divide? How do we structure our organizations to maximize success given such a data and knowledge landscape? Far from having answers, I will present what we are doing to address these weighty questions, in large part as an attempt to crowdsource answers.

Keynote Speaker Slides Philip E Bourne SWAT4HCLS 12 04 2018

Philip Bourne

Philip E. Bourne, PhD, FACMI is the Stephenson Chair of Data Science, Director of the Data Science Institute and a Professor in the Department of Biomedical Engineering at the University of Virginia.

Prior to that, he was the Associate Director for Data Science (ADDS; aka Chief Data Scientist) for the National Institutes of Health (NIH) and a Senior Investigator at the National Center for Biotechnology Information (NCBI). In his role as ADDS he led the trans-NIH Big Data to Knowledge (BD2K) research initiative, funded at US $110M per year, and contributed to data policies and infrastructure aimed at accelerating biomedical discovery. Examples include establishing the NIH Commons, supporting data and software citation, and establishing preprints as a supported form of research. Prior to joining NIH, Dr. Bourne was Associate Vice Chancellor for Innovation and Industry Alliances in the Office of Research Affairs and a Professor in the School of Pharmacy and Pharmaceutical Sciences at the University of California San Diego (UCSD). Dr. Bourne is a Past President of the International Society for Computational Biology; an elected fellow of the American Association for the Advancement of Science (AAAS), the International Society for Computational Biology (ISCB) and the American Medical Informatics Association (AMIA); and an inductee of the American Institute for Medical and Biological Engineering (AIMBE). He has published over 350 papers and 5 books and co-founded 4 companies. Awards include the Jim Gray eScience Award and the Benjamin Franklin Award.

Homepage – Linkedin – Twitter.

Semantic Web and the New Industrial Revolution Watch here

Data has come a long way. From humble roots as a computer science discipline and data centers that the business viewed as annoying cost centers, we have moved through data-driven companies to industries where data is the main product. Today, many companies can no longer operate without the insights they derive from data. We’ve seen Big Data become a thing, leading on to machine learning and deep learning. We now have professionals with 20 years’ experience as Data Scientists, even though that title was unheard of even a decade ago. The essential importance of data is recognized in every industry. Recent developments around the use and abuse of personal data, coupled with the ubiquity of social media, have raised awareness of the importance of data in the public sphere.

At each stage of this revolution, we have seen — and so far, overcome — new challenges of dealing with data. Big Data dealt with volume and velocity of data; NoSQL dealt with the variety of data. But as data applications move from the enterprise level (using data to help my business) to an industrial level (using data to progress entire industries), we face the challenge of interoperable meaning of data. How can we understand what our data means, so that we can use it on an industrial scale?

The Semantic Web has been dealing with data meaning for decades, and was designed to deal with data integration on a world-wide scale (hence its roots in the world-wide web). It is now enjoying success on an industrial scale in industries such as Life Sciences, Healthcare, Finance, Agriculture, and Media. These industries recognize the need for data integration on a world-wide scale and have made strides toward making it a reality. In this talk, I will review these efforts, compare the challenges they face, and examine the challenges ahead.

Keynote Speaker Slides Dean Allemang SWAT4LS 2018

Dean Allemang

Dean Allemang is CEO and Principal Consultant at Working Ontologist, LLC, a firm devoted to the deployment of Semantic Web solutions. In this capacity, he has provided industrial-strength semantic solutions in a variety of industries, including finance, media, and government. He has served on review boards for semantic web programs for pharmaceuticals, health care and agriculture. As a long-time traveller in knowledge science (PhD in AI in 1990, worked at five different AI labs in Europe in the early 90’s, and co-founded one of those companies in the late 90’s that tried to invent the Semantic Web when the standards were just a gleam in the eye of a few W3C folks), Dean has a broad perspective on how knowledge-based systems can be applied in a variety of contexts. Dean is co-author of Semantic Web for the Working Ontologist, the best-selling book on the Semantic Web, and is an expert on the W3C semantic web standards, including RDF, RDFS, OWL, SPARQL, SKOS and SHACL. With a PhD in Computer Science (as an NSF Fellow at Ohio State) and an MSc in Pure Mathematics (as a Marshall Scholar at Trinity College, Cambridge), he brings a formal approach to semantic modeling, which he couples with over 15 years’ experience of successful business deployments of knowledge technology. Dean is a repeat winner of the Swiss Technology Center award for Innovation.

Linkedin.

Spinning a Semantic Web for Agriculture Watch here

CGIAR is a global research partnership of 15 Centers primarily located in developing countries, working in the agricultural research for development sector.
The CGIAR system is charged with tackling challenges at a variety of scales from the local to the global, which generally means being able to query and/or aggregate a variety of data types and streams. CGIAR aspires to FAIRness in its research outputs, and while progress is being made, the I of FAIR is the frog that doesn’t readily turn into a prince for the FAIR ringmistress in this tale. Yet interoperability – particularly semantic interoperability – is critical to providing meaning and context to CGIAR’s varied information resources and enabling integration between linked or related data (e.g. an agronomic data set and related socioeconomic data).

CGIAR’s approach to interoperability and data harmonization focuses on the use of standard vocabularies and a strong reliance on ontologies developed across CGIAR (efforts such as the Crop Ontology, the Agronomy Ontology – AgrO, and the in-development socioeconomic ontology – SociO) and by other entities (ENVO, UO, PO, etc.). Spinning these into a semantic web for agriculture is a primary focus of CGIAR’s Big Data Platform for Agriculture and its Global Agricultural Research Data Innovation and Acceleration Network (GARDIAN). GARDIAN is intended to provide seamless, semantically linked access to CGIAR publications and data, to demonstrate the full value of CGIAR research, enable new analyses and discovery, and enhance impact. In this talk, I will discuss CGIAR’s path to a FAIRy tale ending, complete with the sticky considerations entailed in web-spinning.

Keynote Speaker Slides Medha Devare_SWAT4LS_4_12_2018

Medha Devare

Medha Devare, Ph.D., is a Senior Research Fellow with the International Food Policy Research Institute (IFPRI). She led the CGIAR System’s Open Access/Open Data Initiative, and currently leads efforts to organize data and enable semantic interoperability across CGIAR’s 15 agriculture-for-development-focused Centers through its Big Data Platform. Medha is a Cropping Systems Agronomist with significant experience leading projects addressing food security and sustainable resource management in South Asia. She also has expertise in data management and semantic web tools; while at Cornell University, she was instrumental in the development of VIVO, a semantic web application for representing academic scholarship.

Linkedin.

Artificial Intelligence and Big Data in Health: the dilemma of Truth Watch here

In medicine, the emergence of several new instruments in the field of artificial intelligence, such as deep learning, is closely linked to the increased availability of large data sets. We are not only in a new era of data-driven life sciences; we are also seeing these enabling technologies bring major changes in numerous health sectors, such as genomics and precision medicine, disease prevention and health determinants, decision support, and citizen empowerment. These changes are leading to more than a simple evolution of medical practice; they are raising numerous challenges at all levels, reaching from science and engineering all the way to legal, societal and ethical considerations.

The data. Factors influencing health and disease cover the whole range of imaginable determinants. The current health data landscape, made up of genomics and molecular biology, is increasingly seen as only a fraction of the whole landscape that includes the healthcare system, individual behavior, socio-economic factors, environmental exposure, and finally the regulatory and geo-political system. In other terms, our health comprises what we are, how we behave, and the ecosystem we inhabit and interact with. Data science applied to determinants of health thus covers a gigantic range of data that is massive in quantity, highly heterogeneous, as well as multimodal and multidimensional in scope. Moreover, the highest burden for humans and society is caused by chronic conditions, most of which reflect very long disease histories. Conditions with very different etiologies, such as heart failure due to hypertension, chronic obstructive pulmonary disease due to smoking, and blindness due to glaucoma, all require long exposure, usually more than 20 years, to emerge.

The knowledge. The challenges raised by the high complexity of data are reinforced by the challenge of depending on a dynamic and fragile framework of evidence. The most important source of knowledge in life sciences is the set of very large digital databases of the US National Library of Medicine (NLM) of the US National Institutes of Health (NIH). These databases cover all fields of life sciences, from molecular biology, such as Genes and Proteins, to peer-reviewed publications. The latter are an important source of “evidence-based peer-reviewed literature”, or “truth”.

The intelligence. Artificial intelligence, and especially supervised deep learning, has significant potential in medicine. However, these instruments rely on reliable sources of “truth” that in turn depend on “well annotated” data sources. Building decision-support systems, or predictive systems, also requires that we build trust in these new tools. Opening the black box in order to improve understanding of the mechanistic and deterministic processes that drive deep learning is an important but insufficient step forward. Building trust requires that we develop methodologies able to assess the reproducibility, the generalizability, and thus the reliability of these tools. It also requires assessment of their positive and negative predictive power in various a priori probability landscapes.

Societal challenges. Finally, there are many challenges at the societal level. Privacy is very difficult to preserve, as it is extremely difficult to anonymize any particular individual’s dataset without losing the granularity required for analytics. But other questions arise, such as how to build a social system based on the distribution of risks when risks and protective factors become potentially known for each individual. In this talk, the hopes, expectations and challenges described above will be discussed.

Keynote Speaker Slides Christian Lovis AI-Truth SWAT4HCLS 2018-12

Christian Lovis

Christian is professor of clinical informatics at the University of Geneva and chairman of the division of medical information sciences at the University Hospitals of Geneva. He is a medical doctor, board certified in internal medicine with special emphasis on emergency medicine, and holds a master in public health from the University of Washington, with parallel education in biomedical informatics at the University of Geneva focusing on clinical information systems, clinical data interoperability and medical semantics. One focus of his team is phenotype clinical data interoperability and semantic representation to support research, including a strong lead on computational linguistics and tools to use medical narratives and texts. He is a Fellow of the American Medical Informatics Association (FACMI) and is certified in medical informatics by the German Medical Informatics Association. Christian has co-authored more than 150 publications. He is editor-in-chief of JMIR Medical Informatics and an editorial board member of several peer-reviewed journals in biomedical informatics, such as the Journal of the American Medical Informatics Association (JAMIA), PLOS One, Applied Clinical Informatics (ACI) and BMC Big Data Analytics. He is a member of the executive board of the Swiss Personalized Health Network initiative. He is the 2016-2018 president of the European Federation for Medical Informatics. Christian has participated in several start-ups