Scientific programme

Overview

Wednesday 28th November 2012
(Hackathon day)

Location: To be confirmed (close to the workshop venue)

all day

The SWAT4LS Hackathon will bring together participants interested or involved in the standardization of biomedical information.
It is co-organized with Toshiaki Katayama (Database Center for Life Science (DBCLS)), Andra Waagmeester (WikiPathways and Open PHACTS), Ismael Navas-Delgado (University of Malaga) and Eric Prud’hommeaux (W3C).
More information can be found on the W3C SWAT4LS Hackathon page.

Thursday 29th November 2012
Tutorials day

Location: Centre de Recherche des Cordeliers in central Paris

	Track A	Track B
09:00-09:30	Registration and orientation
09:30-11:00	Thorough introduction to SPARQL for anyone in the Life Sciences	Drug-discovery knowledge integration and analysis using OWL and reasoners
11:00-11:30	Coffee break
11:30-13:00	Thorough introduction to SPARQL for anyone in the Life Sciences	Use of the BioQueries collaborative portal
13:00-14:00	Lunch break
14:30-16:00	Biological Pathways and the Semantic Web	Interconnecting Linked Open Data and R with the SPARQL package for Linked Science
16:00-16:30	Coffee break
16:30-18:00	Biological Pathways and the Semantic Web	Interconnecting Linked Open Data and R with the SPARQL package for Linked Science
18:00-19:00	Using SPARQL to Query BioPortal Ontologies and Metadata

Friday 30th November 2012 (Workshop day)

Location: Amphithéâtre-Farabeuf at the Centre de Recherche des Cordeliers in central Paris

08:30-09:00	Welcome
09:00-09:15	Introduction
09:15-10:00	Keynote: Mark Musen, Semantic Technology Goes Mainstream: The NCBO Experience
10:00-11:00	Full papers presentation (4×15′)
11:00-11:30	Coffee break, poster and demo session
11:30-12:15	Keynote: Christine Golbreich, Formal ontologies for the Semantic Web?
12:15-13:00	Industry session
13:30-14:00	Lunch break, poster and demo session
14:00-14:45	Keynote: Michael Gibson, The World is Flat, but the Earth is Round
14:45-16:00	Full papers presentation (4×15′), Position papers presentation (3×5′)
16:00-17:00	Coffee break, poster and demo session
17:00-18:00	Full papers presentation (3×15′), Position papers presentation (3×5′)
18:00-18:30	Keynote: Ann Marie Martin, Joining private and public forces to boost innovation in healthcare: the innovative medicines initiative and knowledge management
18:30-19:00	Panel discussion
19:00-19:30	Awards and closing ceremony
20:30-	(Optional) Social Dinner

Keynotes

Mark A. Musen, M.D., Ph.D: Semantic Technology Goes Mainstream: The NCBO Experience

The National Center for Biomedical Ontology (NCBO) is one of seven National Centers for Biomedical Computing in the United States supported by the National Institutes of Health. As workers in biomedicine recognize the importance of creating structured representations of the entities and the relationships among entities in experimental domains, the NCBO is developing the semantic technologies to use those representations to drive a wide range of applications in data annotation, data integration, information retrieval, natural-language processing, and decision support. The NCBO is creating Web-based software to facilitate the archiving, peer review, and application of ontologies by workers in biomedicine. In recent years, the use of NCBO resources has been growing exponentially. Currently, more than 65,000 visitors browse the BioPortal ontology repository each month, many of whom seem to visit the site nearly every day. Each month, the NCBO handles more than 3 million Web service calls. Learning what all these users are doing with NCBO technology provides an opportunity to track the requirements of the Semantic Web community in health care and the life sciences. The NCBO continues to explore methods to obtain more information about these users and their needs in an attempt to anticipate trends in the work of biomedical scientists who are embracing semantic technology.

Dr. Musen is Professor of Biomedical Informatics at Stanford University, where he is Director of the Stanford Center for Biomedical Informatics Research. He holds an MD from Brown University and a PhD from Stanford.
Dr. Musen conducts research related to intelligent systems, the Semantic Web, reusable ontologies and knowledge representations, and biomedical decision support. His long-standing work on a system known as Protégé has led to an open-source technology now used by thousands of developers around the world to build intelligent computer systems and new computer applications for e-science and the Semantic Web. He is known for his research on the application of intelligent computer systems to assist health-care workers in guideline-directed therapy and in management of clinical trials. He is principal investigator of the National Center for Biomedical Ontology, one of the eight National Centers for Biomedical Computing supported by the U.S. National Institutes of Health. He chairs the Health Informatics and Modeling Topic Advisory Group for the World Health Organization’s revision of the International Classification of Diseases (ICD-11). He is a member of the National Advisory Council of the National Institute of Biomedical Imaging and Bioengineering of the U.S. National Institutes of Health.
Early in his career, Dr. Musen received the Young Investigator Award for Research in Medical Knowledge Systems from the American Association of Medical Systems and Informatics and a Young Investigator Award from the National Science Foundation. In 2006, he was the recipient of the Donald A. B. Lindberg Award for Innovation in Informatics from the American Medical Informatics Association. He has been elected to the American College of Medical Informatics and the Association of American Physicians. Dr. Musen sits on the editorial boards of several journals related to biomedical informatics and computer science. He is co-editor of the Handbook of Medical Informatics (Springer-Verlag, 1997) and co-editor-in-chief of the journal Applied Ontology.

C. Michael Gibson, M.S., M.D., FRCP, FAHA, FSCAI, FACC: The World is Flat, but the Earth is Round

One of the tenets of Friedman’s “The World is Flat” is that innovation increases as open access to information increases. Although the wisdom of the crowd literally argued that the “World is Flat” several centuries ago, a select number of “experts” demonstrated that the world is in fact round. With the vast amount of medical information on the internet, how do we harness the “Wisdom of the Crowds” yet vet it through experts, and drive traffic to credible sites with the most relevant content? How can we streamline the process so that greater numbers of individuals and websites can participate in schema.org?

C. Michael Gibson, M.S., M.D. is an interventional cardiologist, cardiovascular researcher and educator at Harvard Medical School and Duke. Dr. Gibson is Founder and Chairman of the Board of WikiDoc Foundation (a 509(a)(1) Charitable Organization). WikiDoc is the world’s largest medical textbook / encyclopedia. There are currently over 400,000 page views daily of over 175,000 chapters of content contributed and edited over 619,000 times by over 6,910 registered users. Dr. Gibson has personally made over 69,000 edits to WikiDoc. The site is viewed 160 million times each year. Dr. Gibson was one of the co-creators of a www.schema.org-based schema that allows webmasters and content publishers to mark up health and medical content on the web.

Christine Golbreich, Ph.D: Formal ontologies for the Semantic Web?

Ontologies are widely considered a foundational technique of the Semantic Web, in which meanings of terms are defined by formal ontologies and semantic annotations facilitate access to Web content. This view has led to the standardization of the Web Ontology Language OWL 2. In this talk, we will reflect on the usefulness of OWL 2 ontologies for Life Sciences. We will do this by presenting a number of advantages of OWL 2 ontologies: interoperability, semantics, reasoning services. But we will also note that applications often only ever use terms, e.g., the classical utilization of SNOMED-CT is to use its catalogue and codes to index medical records. We will discuss a new approach reconciling ontologies and terminologies. We propose to use an OWL 2 ontology for clear semantics and reasoning, and to derive from it a lightweight terminology (or perhaps something like an OWL 2 QL ontology) for applications such as resource indexing and search. For large vocabularies, the underlying rich ontology is essential to ensure that the lightweight derived terminology is quality-controlled. Reliability is particularly important for Health Care and Life Sciences applications where safety is critical. I will illustrate this with examples, including the semantic annotation of brain MRI images (IEEE Transactions on Medical Imaging 2009), the Foundational Model of Anatomy in OWL 2 and its use for a European Portal of Health terminologies dedicated to resource indexing (AIIM 2012, in press).

Christine Golbreich is a Professor in Computer Science at the University of Versailles Saint-Quentin. She was a pioneer in France in promoting the Semantic Web for life sciences and formal ontologies in biomedicine, initiating in 2003 the first workshops in France and at the Medical Informatics Europe Conference. She managed several projects on representing large biomedical ontologies in OWL, such as the MeSH in OWL (KR-MED 2004), the Foundational Model of Anatomy in OWL (Journal of Web Semantics 2006), OBO and OWL (ISWC 2007). She was responsible for a number of projects on semantic integration, for example a system integrating transplantation and dialysis data from different medical centers in France. Her recent work has been devoted to the semantic annotation of brain MRI images (IEEE Transactions on Medical Imaging 2009), and to the Foundational Model of Anatomy in OWL 2 and its use for a European Portal of health terminologies dedicated to resource indexing (AIIM 2012).
She was a member of the W3C OWL Working Group and editor of the OWL 2 Web Ontology Language: New Features and Rationale document (W3C Recommendation, 2009). She advocated the need to extend OWL with a formal rule language at RuleML 2004 (LNCS 3323), presented a number of use cases for the RIF, which is now a W3C recommendation, and developed the first Protégé plugin, SWRLJessTab, for reasoning with ontologies and rules.
Christine Golbreich has multidisciplinary expertise: she holds an engineering degree from the Ecole Nationale Supérieure des Mines de Paris, a PhD and Habilitation in Computer Science from University Paris 11, research master's degrees in Logic and in Biomathematics, and a professional master's degree in Clinical Psychology.

Ann Marie Martin, Principal Scientific Manager, Knowledge Management, European Innovative Medicines Initiative: Joining private and public forces to boost innovation in healthcare: the innovative medicines initiative and knowledge management

IMI is a public-private partnership between the European Union, represented by the European Commission, and the pharmaceutical industry, represented by the European Federation of Pharmaceutical Industries and Associations (EFPIA). IMI’s total budget amounts to €2 billion. €1 billion is invested from the European Commission’s Seventh Framework Programme (FP7), which is matched by contributions from EFPIA and its member companies.
IMI is currently funding 42 projects representing an investment of €1,200 million (for a description, see http://www.imi.europa.eu/content/ongoing-projects). All projects have a knowledge management component, and IMI has concluded a memorandum of understanding with CDISC (Clinical Data Interchange Standards Consortium), a standards development organization well known within the pharmaceutical industry, to address the need to use both format and content standards in the projects. Furthermore, some projects have specific knowledge management objectives, including one project that specifically adopts semantic web technologies: Open PHACTS.

Ann is responsible for the Knowledge Management projects at the Innovative Medicines Initiative (IMI) and the knowledge management aspects of the IMI collaborations. IMI is Europe’s largest public-private initiative aiming to speed up the development of better and safer medicines for patients. IMI supports collaborative research projects and builds networks of industrial and academic experts in order to boost pharmaceutical innovation in Europe, and operates as a joint undertaking between the European Union and the pharmaceutical industry association EFPIA.
Between 1997 and 2009, Ann Martin held various management positions in the pharmaceutical industry as Global Head of Biostatistics for UCB Pharma (1997-2001), Global Section Head of Statistical Programming for Novartis (2001-2005) and Global Head Statistical Programming Operations, Standards and CDISC Implementation at UCB Pharma (2005-2009), giving her a broad knowledge of drug development and extensive international experience with Europe, the US and India.
Between 1987 and 1997, Ann worked for Bristol-Myers-Squibb both as a Junior and Project Biostatistician in multiple therapeutic areas, following a short period as research assistant at the University of Lancaster (1985-1987).
Ann Martin is a Chartered Statistician and holds a Master's degree in Sociology and Statistics from the London School of Economics and Political Science, UK.

Industry session

Ontotext: Linked Life Data: Increasing Research Scope without Integration Headaches
The linked data publishing principles have been in existence for more than 5 years. The technology has been adopted by many research organisations and there is an increasing volume of published information. Yet we do not see many sustainable business models based on the linked data platform. This talk will be a joint presentation between the drug discovery and technology industries and will discuss the business needs of pharmaceutical researchers. The drug discovery partner, UCB, is a leading biopharmaceutical company facing many of the challenges in the data arena related to its R&D activities. They will address the types of information services required for public RDF data and how their requests are fulfilled by the Linked Life Data (LLD) service, which allows the cost of hosting, integration, maintenance and support of big data warehouses to be outsourced. Ontotext will provide an update on the latest features available in LLD and the underlying OWLIM database, such as ‘nested repositories’, query federation and an improved web interface. Finally, the presentation will explain how you can design your own LLD service by using a copy of OWLIM.
The talk will be presented by Vassil Momtchev (Ontotext) and James Snowden (UCB).

Tutorials

Interconnecting Linked Open Data and R with the SPARQL package for Linked Science
The openly available R package SPARQL allows users to connect directly to Linked Data and use the SPARQL query language to select interesting parts of the data for analysis. It thus brings massive and rich data sets together with the analytical power of the R language and environment.
This approach and these tools contribute to the Linked Science and Open Science movements, supporting the transparency of science and helping to conduct transdisciplinary research.
In this tutorial we will introduce the idea and concepts of Linked Science, and show through illustrative examples how to query and analyze Linked Data in practice from within the R environment for statistical analysis.
The website for the tutorial will include all the tutorial materials (http://linkedscience.org/events/lodr4ls/).
Tutorial presented by Tomi Kauppinen, Willem Robert van Hage, Benedikt Gräler, Biniyam Tilahun

Biological Pathways and the Semantic Web
This tutorial will address exposing Biological Pathways to the semantic web. WikiPathways (http://www.wikipathways.org) will be at the core of this tutorial. We recently exposed WikiPathways content as linked open data, now available through a SPARQL endpoint (http://sparql.wikipathways.org).
The tutorial will cover the following topics:
How to edit pathways at WikiPathways
How to convert pathway content to RDF and link it to other semantic web resources
How to perform common semantic queries on pathways
How to build tools around semantic pathway resources
How to use Open PHACTS (http://www.openphacts.org) with pathway content
The tutorial is proposed by Andra Waagmeester (BiGCaT, @andrawaag), Martina Kutmon (BiGCaT), Egon Willighagen (BiGCaT, @egonwillighagen), Chris Evelo (BiGCaT, @Chris_Evelo) and Alex Pico (http://nrnb.org/frames/alexpico_bio.html)

Using SPARQL to Query BioPortal Ontologies and Metadata
BioPortal is a repository of biomedical ontologies, with more than 300 ontologies to date. This set includes ontologies developed in OWL and OBO format, as well as a large number of medical terminologies that the US National Library of Medicine distributes in its own proprietary format. We have published the RDF-based serializations of all these ontologies and their metadata at sparql.bioontology.org. This dataset contains 203M triples, representing both content and metadata for the 300+ ontologies, and 9M mappings between terms. This endpoint can be queried with SPARQL, which opens new usage scenarios for the biomedical domain. This tutorial will present an overview of the SPARQL endpoint, commonly used queries, and lessons learned from having redesigned several applications that today use this SPARQL endpoint to consume ontological data.
The tutorial will be presented by Trish Whetzel, Ph.D, Outreach Coordinator at the National Center for Biomedical Ontology (NCBO).
Slides

Drug-discovery knowledge integration and analysis using OWL and reasoners
Biology relies on various classifications and databases to organize and explain the living world. These resources are traditionally disparate but can nowadays be expressed with the Web Ontology Language (OWL) in order to be more easily combined and queried. In this tutorial we will go beyond RDF/SPARQL and will explore the advantages of using OWL and reasoners for knowledge integration and analysis in life science.
The tutorial is composed of a theoretical part on OWL modelling for biology, featuring hands-on exercises using Protégé and the OWL-API. A practical session then follows, in which the attendees will integrate information coming from different sources (Gene Ontology, UniProt and DrugBank) into an OWL knowledge base. The integrated data will be queried using OWL constructs and can answer drug-discovery questions, such as finding new protein targets to treat a disease, via reasoning.
The attendees are expected to have some general knowledge of the Semantic Web and Java. During the tutorial, the participants will learn how to integrate and leverage the information present in various repositories using OWL. They will understand how to formulate biomedical questions in Description Logic and retrieve hidden knowledge with the help of a reasoner.
The tutorial will be presented by Samuel Croset (http://www.samuelcroset.com) and Dietrich Rebholz-Schuhmann (http://www.ebi.ac.uk/Rebholz/)

Thorough introduction to SPARQL for anyone in the Life Sciences
Learn the basics of SPARQL with hands-on exercises. How is SPARQL different from SQL or Google?
Harness the power of combining local and remote data via federated SPARQL queries for quick data aggregation. Learn to navigate other people's data in RDF using “follow your nose” principles.
The tutorial will be presented by Jerven Bolleman
Slides

Use of the BioQueries collaborative portal
This tutorial will address the use of a collaborative portal (http://bioqueries.uma.es) for the design and execution of SPARQL queries in Life Sciences. This portal provides a collaborative environment in which biological and bioinformatics users can participate by sharing, editing and designing SPARQL queries. It thereby supports the consumption of Linked Data in the Life Science domain, where there is a lack of end-user applications. The tutorial covers the following topics:
Execution of SPARQL queries on the portal from their natural language description
Registration of simple queries targeting one Linked Data repository and correct annotation for enabling non-technical users to run queries
Collaborative aspects of BioQueries: comments, versioning of queries, management of vandalism, registration of new endpoints
Federated queries for addressing multiple endpoints
The tutorial will be presented by Ismael Navas-Delgado

Accepted full papers

Provisional list

Alison Callahan, Jose Cruz-Toledo, Peter Ansell, Dana Klassen, Giovanni Tummarello and Michel Dumontier
Improved dataset coverage and interoperability with Bio2RDF Release 2
Jerven Bolleman, Sebastien Gehant and Nicole Redaschi
Catching inconsistencies with the semantic web: a biocuration case study
Olivier Dameron, Paolo Besana, Oussama Zekri, Annabel Bourdé, Anita Burgun and Marc Cuggia
OWL Model of Clinical Trial Eligibility Criteria Compatible With Partially-known Information
Simon Jupp, Helen Parkinson and James Malone
Semantic Web Atlas: Putting Gene Expression Data Into Biological Context
Rinke Hoekstra, Anita De Waard and Richard Vdovjak
Annotating Evidence Based Clinical Guidelines — A Lightweight Ontology
Guoqian Jiang, Harold Solbrig and Christopher Chute
Building Standardized Semantic Web RESTful Services to Support ICD-11 Revision
Alexandre Riazanov, Matthew Hindle, E. Scott Goudreau, Chris Martyniuk and Christopher Baker
Ecotoxicology Data Federation with SADI Semantic Web Services
Despoina Magka
Ontology-Based Classification of Molecules: a Logic Programming Approach
Gayo Diallo and Mouhamadou Ba
Effective method for large scale ontology matching
Hugo Leroux and Laurent Lefort
Using CDISC ODM and the RDF Data Cube for the Semantic Enrichment of Longitudinal Clinical Trial Data

Accepted position papers and highlight posters