Summary

SERVICES

The iCONICS platform provides access to software solutions and methodological expertise for a variety of biomedical data management and analysis processes.

DATA MANAGEMENT

We offer support for biomedical data management. This includes defining and implementing good practices for the curation, standardization, annotation and structuring of field-specific data. The goal is to reach high standards of data consistency, reliability, value, and interoperability.

To this end, we implement dedicated community and open solutions to capture data in a digital format, such as:

  • REDCap for clinical eCRF (with the definition of a unified variable dictionary for clinical studies);
  • XNAT for neuroimaging data;
  • OMERO for cell imaging data;
  • Tumorotek for biological resource collections.

We provide training and support for users of these software solutions.

We also offer assistance in defining Data Management Plans based on the FAIR data principles (Findable, Accessible, Interoperable, and Reusable) endorsed by the European Union. Such plans are now required by most research funding agencies.

DATABASE AND SOFTWARE DEVELOPMENT

We develop software tools that fulfill two main objectives:

  • Organize data in a consistent, reliable and secure way;
  • Provide end-users with graphical tools to manage, query, browse and visualize their data.

For example, we are developing new software to efficiently handle the pseudonymization and identifiers of subjects participating in clinical trials at the Institute (across various platforms), and we contributed to improving the database and application used to manage the OncoNeuroTek resource. We also improve the community tools listed above by integrating new functionalities or plug-ins.
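To make the idea of cross-platform pseudonymization concrete, here is a minimal Python sketch based on keyed hashing (HMAC-SHA256). It is an illustration under stated assumptions, not the software developed by the platform: the function name `pseudonymize`, the platform labels, the subject identifiers and the key are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize(subject_id: str, platform: str, secret_key: bytes, length: int = 12) -> str:
    """Derive a stable, platform-specific pseudonym for a subject (illustrative only)."""
    # Bind the pseudonym to both the platform and the subject, so that the same
    # subject receives unrelated identifiers on different platforms.
    message = f"{platform}:{subject_id}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    # Truncation keeps identifiers short; longer values lower the collision risk.
    return digest[:length].upper()


# Hypothetical key and identifiers, for illustration only.
key = b"replace-with-a-secret-key"
p_redcap = pseudonymize("SUBJ-0001", "redcap", key)
p_xnat = pseudonymize("SUBJ-0001", "xnat", key)
```

With such a scheme, only the holder of the secret key can recompute the pseudonym for a given subject, and identifiers from two platforms cannot be linked without that key. A production system would also need key management, access control and a procedure for authorized re-identification, none of which this sketch covers.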

In addition, we develop and deploy graphical tools that help users run analysis methods and interpret complex data, in coordination with the analytical activities of the platform. For instance, we provide access to Shiny/R applications intended to filter and query gene variant data, explore transcriptomics data or launch integrative analyses of multimodal data.

The Paris Brain Institute also provides access to the Ingenuity Pathway Analysis software.

GENOMICS DATA ANALYSIS

We develop and provide bioinformatics pipelines and interfaces in order to process and visualize a range of genetics, genomics and epigenomics data; these tools are operated by staff members or by end-users, who also benefit from expert assistance from the platform to analyze and interpret their data.

The range of supported biological applications, mainly based on high-throughput sequencing (NGS), includes:

  • Gene panel, whole-exome sequencing and whole-genome sequencing (SNPs, CNVs, rare variants);
  • RNA-seq (differential gene expression, splicing variants, long non-coding RNA, microRNA, single-cell);
  • Bisulfite-seq (methylation profile);
  • ATAC-seq (chromatin accessibility);
  • ChIP-seq (protein binding).

A few array-based analyses are also offered (GWAS, transcriptomics, methylation).

Pipelines are built with tools chosen to ensure scalability, reproducibility and portability: the Snakemake workflow manager, the Conda package manager and the Docker container platform. Graphical Shiny/R applications provide users with efficient means to explore their results.

This activity is tightly linked with the sequencing data production operated by the iGenSeq core facility.

BIOSTATISTICS AND DATA INTEGRATION

We provide basic support in statistical data analysis. We welcome any request involving the implementation of standard (e.g., hypothesis testing, ANOVA, mixed models, clustering) or advanced (e.g., non-linear regression and classification, feature and representation learning) statistical methods to assess, validate, correlate or explore biomedical data (typically clinical examinations, cell imaging, electrophysiology, genomics or metabolomics data).
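As an illustration of the simplest kind of request in the "standard" category, here is a minimal Python sketch of a two-sample permutation test for a difference in means. It is not the platform's code: the function name `permutation_test` and the measurements are hypothetical, and in practice the choice of test depends on the study design.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in means (illustrative only)."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the group labels and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for two groups, for illustration only.
control = [4.1, 3.8, 4.4, 4.0, 3.9, 4.2]
treated = [4.9, 5.1, 4.7, 5.3, 4.8, 5.0]
p_value = permutation_test(control, treated)
```

The permutation approach makes no assumption about the distribution of the measurements, which is why it is used here as a neutral example; a parametric test or a mixed model may be more appropriate for a given study.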

In addition, we design and apply specific methods for the integrative analysis of multimodal and high-dimensional data (e.g., genetics/multiomics data, neuroimaging data, clinical observations). In particular, a versatile framework called Regularized Generalized Canonical Correlation Analysis (RGCCA) and its sparse counterpart, SGCCA, are dedicated to the analysis of data sets structured in blocks of variables. In partnership with CentraleSupelec, we develop new components for the RGCCA package (e.g., for better handling of missing values) and implement graphical interfaces to run the methods and visualize the results.

TRAINING

We offer training classes in various areas related to our domains of expertise, such as:

  • Prepare your Data Management Plan
  • Clinical research data collection and management with REDCap
  • OMERO for microscopic data management
  • Statistics for cell image analysis
  • Introduction to R for genomics
  • Statistics and R (various flavors)

Individual training may also be provided “on demand”.