Values Above Gene List

BRB-ArrayTools used to provide a gene list containing all genes that are significantly differentially expressed among the classes at the specified p-value threshold (default 0.001). We now provide two other options for defining the gene list. The genes are still ordered by the parametric p-value (smallest first), but you can choose the length of the gene list so as to limit either the number of false discoveries in the list or the proportion of the list that represents false discoveries. A false discovery is a gene included in the list that is not really differentially expressed among the classes being compared, that is, a false positive.

If you specify the maximum number of false discoveries you want in the gene list, say 10, the output indicates where to cut the list so that the median number of false discoveries equals 10, and where to cut it so that there is 95% probability that the number of false discoveries is no greater than 10. You can think of the list with median number 10 as having an expected number of false discoveries of 10; the median is used because it enables us to take into account the correlation among the genes.

You also have the option of specifying the proportion of genes on the gene list that are false discoveries, say 10%. If you request that option, the output indicates where to cut the list so that the median proportion of false discoveries is 10%, and where to cut it so that there is 95% probability that the proportion of false discoveries is no greater than 10%.

A multivariate permutation test is performed to obtain this gene list information. Consequently, the claims about the number and proportion of false discoveries in the gene list do not depend on normal distribution assumptions. The multivariate permutation test utilizes the data more efficiently than the univariate permutation p-values.
For more information about the multivariate permutation tests, see the User’s Guide and the references given there.

Also listed is a p-value for the global test of the hypothesis that the classes do not differ at all with regard to expression profiles. The test statistic is the number of genes with a parametric p-value less than the specified threshold (default 0.001). A permutation analysis is used to compute the p-value for the global test. A global p-value less than 0.05 is considered significant. Since only one global hypothesis is tested, stringent control for multiple comparisons is not necessary for the global test.
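
The global test can be illustrated with a short Python sketch. This is not the BRB-ArrayTools implementation: the function names, the t_crit parameter, the pooled-variance two-sample t statistic, and the use of "greater than or equal to" when comparing permuted counts with the observed count are all assumptions made for illustration. Because the degrees of freedom do not change when class labels are permuted, counting genes with |t| above a critical value is equivalent to counting genes with a parametric p-value below the threshold; the critical value must be looked up for the degrees of freedom of the study.

```python
import random
from statistics import mean, variance


def abs_t(values, labels):
    """Absolute pooled-variance two-sample t statistic for one gene."""
    a = [v for v, g in zip(values, labels) if g == 0]
    b = [v for v, g in zip(values, labels) if g == 1]
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        return float("inf") if mean(a) != mean(b) else 0.0
    return abs(mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def count_significant(data, labels, t_crit):
    """Test statistic: number of genes whose |t| exceeds the critical value."""
    return sum(1 for gene in data if abs_t(gene, labels) > t_crit)


def global_test_p_value(data, labels, t_crit, n_perm=1000, seed=0):
    """Return (observed count, permutation p-value) for the global test."""
    rng = random.Random(seed)
    observed = count_significant(data, labels, t_crit)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the class labels of the samples
        if count_significant(data, shuffled, t_crit) >= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical data: 3 genes (rows) measured on 8 samples, 4 per class.
data = [
    [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0],
    [2.0, 2.1, 1.9, 2.0, 6.0, 6.1, 5.9, 6.0],
    [3.0, 2.5, 3.4, 2.8, 3.1, 2.6, 3.3, 2.9],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
# 5.96 is approximately the two-sided critical value for p = 0.001 with 6 df.
observed, p_global = global_test_p_value(data, labels, t_crit=5.96)
print(observed, p_global)
```

With only 8 samples there are few distinct relabelings, so the smallest attainable global p-value is limited by the sample size rather than by the number of permutations requested.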

 

Values in Gene List

Parametric p-values:

These statistical significance values are based on testing the hypothesis that the gene is not differentially expressed between the classes being compared. The tests are applied separately to each gene. The parametric p-value is based on normal distribution assumptions. The test is a t-test, a paired t-test, an F-test, or a randomized variance version of the t-test or F-test.

FDR:

The False Discovery Rate associated with a row of the table is an estimate of the proportion of the genes with univariate p-values less than or equal to the one in that row that represent false positives. The method of Benjamini and Hochberg (1995) is used for this estimation. The FDR is shown if the univariate test is used for gene selection.
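
As an illustration, the standard Benjamini and Hochberg calculation can be sketched in Python as follows. The function name is hypothetical, and the step that forces the estimates to be non-decreasing in the p-value is the usual convention for this method; whether BRB-ArrayTools applies exactly the same adjustment is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR estimates in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    fdr = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the estimates never decrease
    # as the p-value increases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        fdr[idx] = running_min
    return fdr


# Hypothetical p-values for five genes.
p = [0.0002, 0.001, 0.02, 0.04, 0.5]
for pv, q in zip(p, benjamini_hochberg(p)):
    print(f"p={pv:<7} FDR={q:.4f}")
```

Each estimate is the p-value multiplied by the number of genes tested and divided by the rank of that p-value in the sorted list.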

Local FDR:

The Local FDR (False Discovery Rate) is defined by B. Efron (2001). The local FDR will be shown if the local false discovery option is used for gene selection.

Permutation p-values:

Class labels of the samples are randomly permuted N times (where N is assigned by the user). For each gene and each permutation, the associated p-value for the univariate test is computed. For each gene, the permutation p-value is defined as the proportion of permutations for which the p-value of the univariate test is smaller than the p-value computed for the original labeling. Because the multivariate permutation tests now available in BRB-ArrayTools for controlling the number and proportion of false discoveries are more powerful than the univariate permutation test, we no longer recommend requesting the latter. It is provided, however, as an option.
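
A minimal Python sketch of this definition for a single gene and two classes is given below. It is an illustration under stated assumptions, not the BRB-ArrayTools code: the function names are hypothetical, and a pooled-variance t statistic stands in for the univariate test. Since the degrees of freedom are the same for every permutation, a smaller p-value corresponds exactly to a larger |t|, so the sketch compares |t| values directly.

```python
import random
from statistics import mean, variance


def abs_t(values, labels):
    """Absolute pooled-variance two-sample t statistic for one gene."""
    a = [v for v, g in zip(values, labels) if g == 0]
    b = [v for v, g in zip(values, labels) if g == 1]
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        return float("inf") if mean(a) != mean(b) else 0.0
    return abs(mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def permutation_p_value(values, labels, n_perm=1000, seed=0):
    """Proportion of permutations giving a more extreme result than the original labeling."""
    rng = random.Random(seed)
    observed = abs_t(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly permute the class labels
        # A larger |t| means a smaller parametric p-value.
        if abs_t(values, shuffled) > observed:
            hits += 1
    return hits / n_perm


# Hypothetical log expression values for one gene, 4 samples per class.
values = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(permutation_p_value(values, labels, n_perm=500, seed=1))
```

The strict comparison follows the wording of the definition ("smaller than"); some implementations count ties as well, which gives slightly larger p-values.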

Geometric mean of gene expressions (ratios/intensities):

For each class, the geometric mean of gene expression (ratios for 2-channel data, intensities for single-channel data) is provided. If only two classes are considered, the fold difference between these geometric means is also computed.
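
The calculation can be sketched in Python as follows. The function names and the example values are hypothetical, and the direction of the fold difference (first class divided by second class) is an assumption; check the column header in the output for the direction actually reported.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values, computed on the log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def class_geometric_means(values, labels):
    """Map each class label to the geometric mean of its samples."""
    by_class = {}
    for v, g in zip(values, labels):
        by_class.setdefault(g, []).append(v)
    return {g: geometric_mean(vs) for g, vs in by_class.items()}


# Hypothetical intensities (or ratios) for one gene in two classes.
means = class_geometric_means([2.0, 8.0, 1.0, 4.0], ["A", "A", "B", "B"])
fold = means["A"] / means["B"]  # assumed direction: class A over class B
print(means, fold)
```

Taking the geometric mean of ratios or intensities is the same as averaging the log-transformed values and transforming the result back.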

Description of the genes:

All columns that are given in the gene identifiers sheet are copied to the table.

Annotations:

BRB-ArrayTools searches public databases for information about each gene in the list. If any information is found, the cell in the Annotation column contains a "Click for more information" link. This link opens a pop-up window with URL links to the public databases that contain relevant information about the gene.

 

View clustered heatmap of significant genes

A heatmap of gene expression for the significant genes is generated. Genes are ordered using hierarchical clustering with Euclidean distance and average linkage. For 2-channel data, log ratios are truncated by the truncation parameter defined in Utilities -> Preference before the heatmap is created. For single-channel data, log intensities are not truncated. Missing values are shown in yellow.

 

View dynamic volcano/parallel coordinate plot of significant genes

The dynamic volcano/parallel coordinate plot displays the gene information when the mouse pointer rests over a point on the plot. Internet Explorer users need to enable ActiveX controls. To do this, click Internet Options on the Tools menu and select the Security tab. Make sure that the Internet content zone is selected and click Custom Level. Set "Run ActiveX controls and plug-ins" to Enable. Set "Script ActiveX controls marked safe for scripting" to Enable. Click OK, close all instances of Internet Explorer, and re-open it.