Revealing the genetic background of chronic non-communicable diseases is a considerable challenge because it involves a large number of genetic variations that are influenced by environmental factors and genetic regulators. The suite is built from modules, each created to answer a specific question. The modules can be customized to the individual characteristics of a project.
Variant Detection and Annotation Module
The result of raw data analysis is an assembled and aligned nucleotide sequence. Further analysis starts with annotation: the location of each genetic variant, the gene it affects and the function it alters need to be identified. Detected variants are first filtered, then annotated, yielding dbSNP rs IDs, overlapping-gene accession numbers, SNP function (e.g. missense coding), etc. Module components:
Mapped sequences and variants need to be functionally annotated and assigned. The module performs pathway analysis, function prediction and protein-protein interaction prediction – providing genotype-phenotype associations. Module components:
Variant Filtering (e.g. missense coding)
Filtering against public SNP, mutation databases and HapMap
Function Prediction
Protein-Protein Interaction
Gene Ontology
Variant Frequency Analysis
Pathway Analysis
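The filtering and annotation steps described above can be sketched as follows. This is a minimal illustration only, not the suite's implementation: the `Variant` and `Gene` types, the function names and the in-memory sets are all hypothetical, and it assumes that "filtering against public databases" means setting aside variants that are already catalogued.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position
    ref: str
    alt: str


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # inclusive


def filter_known(variants, known):
    """Keep only variants that are absent from a set of known variants.

    `known` is a hypothetical stand-in for a public SNP or mutation database.
    """
    return [v for v in variants if v not in known]


def annotate(variants, genes):
    """Map each variant to the names of the genes whose interval contains it."""
    return {
        v: [g.name for g in genes
            if g.chrom == v.chrom and g.start <= v.pos <= g.end]
        for v in variants
    }


# Illustrative, made-up coordinates.
variants = [Variant("chr1", 150, "A", "G"), Variant("chr1", 900, "C", "T")]
known = {Variant("chr1", 900, "C", "T")}
genes = [Gene("GENE_A", "chr1", 100, 200)]

novel = filter_known(variants, known)
annotations = annotate(novel, genes)
```

In practice the known-variant set and the gene intervals would come from external databases rather than from literals, and an interval index would replace the linear scan over genes.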
Genome-Wide Association Analysis Module
Genetic characterization of multifactorial diseases is a complex matter. The output of genome-wide association mapping is a whole genome/exome scanning analysis for statistically strong associations between a set of SNVs/structural variants and a particular trait or disease. Module components:
Quality Control (statistical metrics of raw data, e.g. Ti/Tv ratio, dbSNP concordance)
Phenotype Classification
Variant Frequency Analysis
Genotype-Phenotype Association Testing
Variant Classification
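Two of the components above lend themselves to a short worked sketch: the Ti/Tv (transition/transversion) ratio used in quality control, and a basic genotype-phenotype association test. The code below is an assumption-laden illustration, not the module's actual method: the function names are hypothetical, and the association test shown is a plain allelic chi-square test on a 2x2 table of allele counts, one of several tests that could be used here.

```python
import math

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T)
# substitutions; every other single-base change is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snvs):
    """Return the transition/transversion ratio for (ref, alt) base pairs.

    Assumes every entry is a single-nucleotide change with ref != alt.
    Returns infinity when there are no transversions.
    """
    ti = sum(1 for change in snvs if change in TRANSITIONS)
    tv = len(snvs) - ti
    return ti / tv if tv else float("inf")


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table sums to zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom the survival function is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Illustrative, made-up counts.
ratio = ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")])
chi2, p = allelic_chi_square(30, 70, 10, 90)
```

A genome-wide scan would repeat such a test for every variant and then correct the resulting p-values for multiple testing, which this sketch leaves out.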
Data Integration Module
Data originating from different platforms or procedures (microarray, DNA/mRNA/ChIP/miRNA sequencing, proteomic methods, etc.) need to be integrated into a single holistic data warehouse for simultaneous interpretation. The omics module presents data modelling in a system-wide context. Module components:
Integrated Data Warehouse (add-on proteomics, metabolomics, transcriptomics, etc. databases)
Data Mining (pattern recognition, classification, clustering)
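As a rough illustration of the integration idea, the sketch below merges measurements from several platforms into one record per gene. It is a simplification under stated assumptions: the `integrate` function, the platform names and the values are hypothetical, and a real warehouse would also have to handle identifier mapping, normalization and missing data.

```python
from collections import defaultdict


def integrate(sources):
    """Merge per-platform measurements into one record per gene.

    `sources` maps a platform name to a mapping of gene identifier -> value.
    The result maps each gene identifier to a {platform: value} record.
    """
    warehouse = defaultdict(dict)
    for platform, measurements in sources.items():
        for gene, value in measurements.items():
            warehouse[gene][platform] = value
    return dict(warehouse)


# Illustrative, made-up values.
merged = integrate({
    "microarray": {"TP53": 2.1},
    "proteomics": {"TP53": 0.8, "BRCA1": 1.4},
})
```

Keying every record on a shared gene identifier is what lets downstream data mining steps such as clustering or classification see all platforms at once.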