Files included in export:
The job information submitted to tranSMART. The job information gives an overview of the input data for the analysis. Under parameters, the jobType indicates which analysis and image will be produced, and the selected variables are listed by their full concept path names.
The selected graph type is indicated in the jobInfo.txt:
- MERR: Mean with error bar
- MSTD: Mean with standard deviation
- MEDER: Median with error bar
- IND: Plot individuals
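When post-processing an export, the graph type codes above can be translated back to their descriptions with a small lookup. The sketch below is only an illustration: the helper name describe_graph_type is hypothetical, and how the code is read from the jobInfo.txt is left out because the exact file layout is not described here.

```python
# Graph type codes as listed above. The helper function is hypothetical
# and not part of tranSMART.
GRAPH_TYPES = {
    "MERR": "Mean with error bar",
    "MSTD": "Mean with standard deviation",
    "MEDER": "Median with error bar",
    "IND": "Plot individuals",
}


def describe_graph_type(code: str) -> str:
    """Return the description for a graph type code, or flag it as unknown."""
    return GRAPH_TYPES.get(code.strip().upper(), f"Unknown graph type: {code}")


print(describe_graph_type("MEDER"))
```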
Variables selected: The items dependentVariable and groupByVariable show the full path names of the items selected in tranSMART as input. They correspond to the two input boxes in the user interface, independent and outcome, respectively. Taken together, several items in the file indicate the type of each selected variable and whether or not it was binned.
divDependentVariableType and divGroupByVariableType indicate the type of variable that was used. The value is CLINICAL for categorical and numeric low-dimensional data, or the type of high-dimensional data that was used, for example “mrna” in the case of an mRNA expression dataset.
The dependent variable is always a numerical or high-dimensional data node. The groupByVariable is always a categorical or binned variable.
Binning: There are three different options for binning shown in binDistributionGroupBy:
- EDP; Evenly Distribute Population
- ESB; Evenly Spaced Bins
- Manual binning
In case of EDP or ESB for numerical or high-dimensional variables, the file will contain the following parameters to provide information on which variable was used to bin:
- binningGroupBy; Either True or False
- numberOfBinsGroupBy; Number of bins defined by the algorithm
In case of the Manual binning option, the above items are also included in the jobInfo.txt, but the item manualBinningGroupBy is set to True instead of False, indicating that the bins were defined manually. Additionally, an item named binRangesGroupBy is added to list the manually defined bins. Note that both the numberOfBins and binVariable are still relevant in this case, while binDistributionGroupBy is not.
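The rules above for reading the binning items can be sketched as a small function. This is a minimal illustration under assumptions: the parameters are assumed to be already loaded into a dictionary (for example from the JSON representation), the True/False values are assumed to be stored as text, and the function name summarize_binning is hypothetical. The content of binRangesGroupBy is passed through unchanged because its format is not described here.

```python
# Full names of the two automatic binning options, as listed above.
BIN_DISTRIBUTIONS = {
    "EDP": "Evenly Distribute Population",
    "ESB": "Evenly Spaced Bins",
}


def summarize_binning(params: dict) -> str:
    """Summarize the group-by binning settings from jobInfo-style parameters.

    Assumption: boolean items are stored as the text "True" or "False".
    """

    def is_true(key: str) -> bool:
        return str(params.get(key, "False")).strip().lower() == "true"

    if not is_true("binningGroupBy"):
        return "no binning"

    n_bins = params.get("numberOfBinsGroupBy", "?")

    if is_true("manualBinningGroupBy"):
        # binDistributionGroupBy is not relevant for manual bins.
        ranges = params.get("binRangesGroupBy", "")
        return f"manual binning, {n_bins} bins, ranges: {ranges}"

    code = params.get("binDistributionGroupBy", "?")
    return f"{BIN_DISTRIBUTIONS.get(code, code)}, {n_bins} bins"


# Made-up parameter values, for illustration only.
example = {
    "binningGroupBy": "True",
    "manualBinningGroupBy": "False",
    "binDistributionGroupBy": "ESB",
    "numberOfBinsGroupBy": "4",
}
print(summarize_binning(example))
```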
The data with five columns describing the points plotted:
- PATIENT_NUM; Subject identifier
- VALUE; The value that is either being plotted or used to calculate the mean or median to plot
- GROUP; The variable name/path, indicated as the dependent variable in the jobInfo.txt
- PLOT_GROUP; For high-dimensional data, contains the probe, gene or protein name, depending on the data selected. Otherwise empty.
- GROUP_VAR; The name of the (binned) group that a point belongs to.
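To show how the five columns relate to the plotted points, the sketch below recomputes the mean and median of VALUE for each combination of GROUP and GROUP_VAR. It assumes the data file is tab-delimited with a header row, which is not stated above; the sample rows and the function name summarize_points are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, median

# Made-up sample rows in the five-column layout described above.
# Assumption: the file is tab-delimited and starts with a header row.
ROWS = [
    ["PATIENT_NUM", "VALUE", "GROUP", "PLOT_GROUP", "GROUP_VAR"],
    ["1", "10.0", "Week 0", "", "A"],
    ["2", "14.0", "Week 0", "", "A"],
    ["3", "20.0", "Week 0", "", "B"],
    ["1", "12.0", "Week 4", "", "A"],
    ["2", "18.0", "Week 4", "", "A"],
    ["3", "25.0", "Week 4", "", "B"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS) + "\n"


def summarize_points(text: str) -> dict:
    """Return {(GROUP, GROUP_VAR): (mean, median)} of VALUE for each group."""
    values = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        values[(row["GROUP"], row["GROUP_VAR"])].append(float(row["VALUE"]))
    return {key: (mean(vals), median(vals)) for key, vals in values.items()}


for (group, group_var), (avg, med) in sorted(summarize_points(SAMPLE).items()):
    print(f"{group}\t{group_var}\tmean={avg}\tmedian={med}")
```

For a real export, the sample text would be replaced by the contents of the exported data file. For high-dimensional data, PLOT_GROUP would probably need to be part of the grouping key as well.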
JSON representation of the jobInfo.txt
Image of the plot. The time points are plotted on the X-axis. Each time point corresponds to a variable selected in the Time/Measurement Concepts box. The measurement is plotted on the Y-axis, with the values being displayed given in parentheses (e.g. mean + se). On the right there is a legend indicating which colour corresponds to each group.