Vitis OneGenE v1 - Downloads

The following datasets are available for download as open data.

OneGenE Data

The expansion list of each gene of the Vitis vinifera genome was produced by running the algorithms of the OneGenE system. Each expansion consisted of 2000 iterations; in each iteration, our C++ implementation (https://bitbucket.org/francesco-asnicar/pc-boinc/, last accessed on 16 June 2020) of the PC-algorithm skeleton procedure (α = 0.05) was applied to 29 sets of 1000 variables (genes), each set comprising the gene to be expanded and a random subset of 999 genes sampled without replacement. The block diagram of the OneGenE architecture is shown in Blanzieri et al. 2020, Figure 1. The input data were extracted from the 28,013 × 1131 normalized expression data matrix, which was initially obtained from the VESPUCCI repository and then filtered and preprocessed. Each expansion list is ordered by relative frequency, defined as F_rel = (number of times the gene is present in the output of the PC-algorithm) / (number of times the gene is present in the input of the PC-algorithm). Overall, computing the expansion lists of all the genes required 28,013 × 29 × 2000 = 1,624,754,000 runs of the PC-algorithm, each run taking 5.63 s on average on our reference machine (Intel i7-4770K, Ubuntu). Because this workload is impractical for a single machine, the computation was carried out within the gene@home project on the volunteer distributed computing platform TN-Grid (20 TeraFLOPS on average), powered by BOINC software (S. Pilati et al. https://www.mdpi.com/2218-273X/11/12/1744).
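To give a sense of scale, the run count and timing quoted above can be turned into an estimate of sequential compute time. This is a back-of-the-envelope sketch: it assumes the runs are executed one after another on a single core of the reference machine and uses a 365.25-day year, neither of which is stated in the text. Under those assumptions the total comes to roughly 290 years.

```python
# Figures quoted in the text above.
N_GENES = 28_013          # genes to be expanded
SETS_PER_ITERATION = 29   # sets of 1000 variables per iteration
ITERATIONS = 2_000        # iterations per expansion
SECONDS_PER_RUN = 5.63    # average run time on the reference machine

total_runs = N_GENES * SETS_PER_ITERATION * ITERATIONS

# Assumption (not from the text): strictly sequential execution on one core,
# with a year counted as 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
sequential_years = total_runs * SECONDS_PER_RUN / SECONDS_PER_YEAR

print(f"{total_runs:,} runs of the PC-algorithm")
print(f"about {sequential_years:.0f} years if run sequentially")
```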

The output presented here is a list of pairs of transcripts, each with the absolute and relative frequency with which the second transcript was detected while expanding the gene regulatory network of the first.
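The relative frequency and the resulting ordering can be illustrated with a small sketch. The record layout, the gene identifiers and the counts below are hypothetical and do not reflect the format of the downloadable file; the sketch also assumes that lists are ordered from the highest relative frequency to the lowest.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    gene: str             # placeholder identifier, not a real VIT code
    times_in_output: int  # absolute frequency of detection
    times_in_input: int   # times the gene was given as input to the PC-algorithm


def relative_frequency(c: Candidate) -> float:
    """F_rel = times present in the output / times present in the input."""
    return c.times_in_output / c.times_in_input


def order_expansion(candidates):
    """Assumed ordering: highest relative frequency first."""
    return sorted(candidates, key=relative_frequency, reverse=True)


# Made-up counts, for illustration only.
expansion = [
    Candidate("gene_A", 1500, 2000),
    Candidate("gene_B", 40, 1980),
    Candidate("gene_C", 1990, 1990),
]

for c in order_expansion(expansion):
    print(c.gene, c.times_in_output, f"{relative_frequency(c):.4f}")
```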

Size: 213.90 MiB

Genes & annotation files

The list of 28,013 genes used in our expansions.

Size: 671.94 KiB

Transcriptomic dataset (VESPUCCI v1)

The transcriptomic dataset used to run OneGenE to compute the expansion lists. The grapevine transcriptomic dataset was obtained by pre-processing the publicly available VESPUCCI compendium (29,090 genes and 2,017 contrasts; Moretto et al., 2016). The pre-processing procedure, described in Malacarne et al. (2018), comprised three steps:

removal of contrasts with more than 55% of missing values;
removal of genes with more than 55% of missing values;
for each gene, replacement of the remaining missing values with the median of its contrast values.
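The three steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code used by the authors: genes are rows, contrasts are columns, missing values are represented as None, the steps are applied in the listed order, and the median is taken over the non-missing values of the gene that remain after filtering.

```python
from statistics import median

MAX_MISSING = 0.55  # threshold from the steps above


def missing_fraction(values):
    """Fraction of None entries; an empty sequence counts as fully missing."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def preprocess(matrix):
    """matrix: list of gene rows, each a list of floats or None (one per contrast)."""
    n_cols = len(matrix[0])
    # Step 1: drop contrasts (columns) with more than 55% missing values.
    keep_cols = [
        j for j in range(n_cols)
        if missing_fraction([row[j] for row in matrix]) <= MAX_MISSING
    ]
    rows = [[row[j] for j in keep_cols] for row in matrix]
    # Step 2: drop genes (rows) with more than 55% missing values.
    rows = [row for row in rows if missing_fraction(row) <= MAX_MISSING]
    # Step 3: fill the remaining gaps with the median of the gene's observed values.
    result = []
    for row in rows:
        fill = median(v for v in row if v is not None)
        result.append([fill if v is None else v for v in row])
    return result


# Toy matrix, for illustration only: 3 genes x 5 contrasts.
example = [
    [1.0, None, 3.0, 2.0, None],
    [None, None, None, 7.0, None],
    [4.0, None, 5.0, None, 6.0],
]
cleaned = preprocess(example)
print(cleaned)
```

In the toy matrix, the second and fifth contrasts are removed in step 1, the second gene is removed in step 2, and the single remaining gap in the third gene is filled with the median of its two observed values.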

Size: 108.72 MiB

Pearson correlations

The values of Pearson correlation computed on the transcriptomic dataset above.
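For reference, a Pearson correlation between two expression profiles can be computed as in the sketch below. It assumes that the correlations in this file are computed between pairs of genes across the contrasts of the dataset, which the description above does not state explicitly; the function name and the toy vectors are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy profiles, for illustration only.
profile_a = [1.0, 2.0, 3.0, 4.0]
profile_b = [2.0, 4.0, 6.0, 8.0]   # perfectly correlated with profile_a
profile_c = [4.0, 3.0, 2.0, 1.0]   # perfectly anti-correlated with profile_a

print(pearson(profile_a, profile_b))
print(pearson(profile_a, profile_c))
```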

Size: 2.50 GiB

Pearson correlations minimal

The same Pearson correlation data, limited to the pairs that have a non-zero relative frequency in the OneGenE data.

Size: 213.90 MiB

Download of specific expansion lists

At the link below, the user can submit one or more queries, using 12X.v1 VIT codes, to retrieve the corresponding annotated expansion list(s) and a README file as a compressed .zip archive.

http://ibdm.disi.unitn.it/onegene/v1/onegene-vv.php