Fisher-exact test download:
- Count.txt file(s)
- StatisticalTests.txt file(s)
The job information submitted to tranSMART. It gives an overview of the input data for the analysis. Under the parameters it shows the jobType, which indicates what analysis and image will be produced, and the selected variables, each displayed with its full concept path name.
In case of high dimensional data this file also contains information on which genes were selected for the independent and dependent variables. The independent variables are used for the row names and the dependent variables for the column names.
binDistributionIndep and binDistributionDep indicate which distribution was used to bin continuous variables. The indication is an abbreviation:
- EDP; Evenly Distribute Population
- ESB; Evenly Spaced Bins
Manual binning is not indicated in the jobInfo.txt.
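The difference between the two binning abbreviations can be illustrated with a small sketch. This is not the tranSMART implementation; the function names, the edge-selection rule and the example values are assumptions made only to show the idea: ESB gives bins of equal width, while EDP gives bins holding roughly the same number of patients.

```python
def evenly_spaced_bins(values, n_bins):
    """ESB sketch: split the value range into n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # Edges lo, lo + width, ..., hi (n_bins + 1 edges in total).
    return [lo + i * width for i in range(n_bins)] + [hi]


def evenly_distributed_population(values, n_bins):
    """EDP sketch: pick edges so each bin holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    # Use the value found at each i/n_bins position of the sorted data as an edge.
    edges = [ordered[min(n - 1, (i * n) // n_bins)] for i in range(n_bins)]
    return edges + [ordered[-1]]


# Hypothetical ages, skewed towards young patients.
ages = [21, 22, 23, 24, 25, 26, 60, 80]
esb_edges = evenly_spaced_bins(ages, 2)             # [21.0, 50.5, 80]
edp_edges = evenly_distributed_population(ages, 2)  # [21, 25, 80]
```

With skewed data the ESB edges put six of the eight values in the first bin, whereas the EDP edges split them four and four. The exact boundaries tranSMART reports may follow a different rule.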
The data with three columns describing the points plotted:
- PATIENT_NUM, Patient identifier
- IND, the independent variable options; in case of a binned numerical variable this indicates the bin boundaries
- DEP, the dependent variable options; in case of a binned numerical variable this indicates the bin boundaries
Optional columns when using numerical variables:
- GROUP; Name of the independent group, to be displayed in the table
- GROUP.1; Name of the dependent group, to be displayed in the table
When high dimensional data is selected, it is possible to select more than one dependent and independent variable. Multiple independent and/or dependent variables will produce one point per GROUP-GROUP.1 combination, and each combination consists of one line per patient identifier. With two independent and two dependent variables, for example, this yields four lines of data points per patient.
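The row count per patient follows from pairing every independent variable with every dependent variable. The sketch below illustrates this with hypothetical probe names and a hypothetical patient identifier; it only models the GROUP and GROUP.1 columns and is not taken from the tranSMART code.

```python
from itertools import product

# Hypothetical probe names; real names are whatever is stored in tranSMART.
independent = ["probe_A", "probe_B"]
dependent = ["probe_C", "probe_D"]


def rows_for_patient(patient_num, independent, dependent):
    """Return one row per GROUP / GROUP.1 combination for a single patient."""
    return [
        {"PATIENT_NUM": patient_num, "GROUP": ind, "GROUP.1": dep}
        for ind, dep in product(independent, dependent)
    ]


rows = rows_for_patient(1001, independent, dependent)
# Two independent times two dependent variables gives four rows for this patient.
```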
The GROUP and GROUP.1 names represent a probe (or gene) name as stored in tranSMART. To find the gene names corresponding to the probes used, the two GROUP columns need to be combined with jobInfo.txt to see what was used as input for the IND & DEP columns. The independent gene name is displayed next to the divIndependentPathwayName item and the dependent gene name is displayed next to the divDependentPathwayName item.
JSON representation of the jobInfo.txt.
Representation of the count table. Independent variables fill the rows and dependent variables the columns.
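As an illustration of how such a count table relates to the per-patient data, the sketch below tallies IND/DEP pairs into rows and columns. The category values are invented for the example, and the alphabetical ordering of the levels is an assumption; the actual file may order them differently.

```python
from collections import Counter


def count_table(rows):
    """Cross-tabulate IND (rows) against DEP (columns)."""
    counts = Counter((r["IND"], r["DEP"]) for r in rows)
    ind_levels = sorted({r["IND"] for r in rows})
    dep_levels = sorted({r["DEP"] for r in rows})
    # One table row per independent level, one column per dependent level.
    table = [[counts[(i, d)] for d in dep_levels] for i in ind_levels]
    return ind_levels, dep_levels, table


# Hypothetical per-patient data with the three documented columns.
rows = [
    {"PATIENT_NUM": 1, "IND": "female", "DEP": "responder"},
    {"PATIENT_NUM": 2, "IND": "female", "DEP": "responder"},
    {"PATIENT_NUM": 3, "IND": "female", "DEP": "non-responder"},
    {"PATIENT_NUM": 4, "IND": "male", "DEP": "non-responder"},
    {"PATIENT_NUM": 5, "IND": "male", "DEP": "responder"},
    {"PATIENT_NUM": 6, "IND": "male", "DEP": "non-responder"},
]
ind_levels, dep_levels, table = count_table(rows)
```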
In case of high dimensional data there will be one count file per independent probe (or gene) name, with the naming convention Count<ProbeName>.txt. If probe data is used as input for the analysis, look in the jobInfo.txt file for the independent gene name to find the gene corresponding to the probe name.
The count file contains one table representation per dependent variable probe (or gene) name. For example, when two probes (or genes) are selected for the dependent variable, this will produce two table representations. Each table is named after its probe (or gene).
Result of the Fisher-exact statistical test. Produces three lines of output: fishp, chis and chip, which are the Fisher-exact test p-value, the Chi-square value and the Chi-square p-value, respectively.
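To make the three values concrete, here is a minimal sketch that computes them for a 2x2 count table using only the Python standard library. It is not the code tranSMART runs: it assumes a 2x2 table with non-zero margins, a two-sided Fisher test, and a Pearson Chi-square without continuity correction, so the numbers can differ from those in the output file. The function names and the example table are hypothetical.

```python
from math import comb, erfc, sqrt


def fisher_exact_2x2(table):
    """Two-sided Fisher exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at most as probable as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-7))


def chi_square_2x2(table):
    """Pearson Chi-square value and p-value (1 degree of freedom, no correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chis = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the Chi-square distribution with 1 degree of freedom.
    chip = erfc(sqrt(chis / 2))
    return chis, chip


table = [[3, 1], [1, 3]]  # hypothetical counts
fishp = fisher_exact_2x2(table)
chis, chip = chi_square_2x2(table)
```

For this example table fishp is 34/70 (about 0.486), chis is 2.0 and chip is about 0.157.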
In case of high dimensional data there will be one statisticalTest file per independent probe or gene, with the naming convention statisticalTest<ProbeName>.txt. If probe data is used as input for the analysis, look in the jobInfo.txt file for the independent gene name to find the gene corresponding to the probe name.
The statisticalTest file contains one representation per dependent variable probe name. For example, when two probes (or genes) are selected for the dependent variable, this will produce two lists with the three output items mentioned above. Each list starts with the probe (or gene) name of the dependent variable.