Scatterplot download: Files included in the R data:
- LinearRegression.txt file(s)
- ScatterPlot.png image file(s)
The job information submitted to tranSMART. The job information gives an overview of the input data for the analysis. Under parameters, it shows the job type, which indicates what analysis and image will be produced, and the selected variables, each displayed by its full concept path name.
For high dimensional data, this file also lists which genes were selected for the independent and dependent variables. The independent variables are plotted on the X-axis and the dependent variables on the Y-axis.
The data with five columns describing the points plotted:
- PATIENT_NUM; Patient identifier
- X; Value to be plotted on X-axis, the independent variable(s)
- GROUP; X-axis name
- Y; Value to be plotted on the Y-axis, the dependent variable(s)
- GROUP.1; Y-axis name
When high dimensional data is selected, it is possible to select more than one dependent and independent variable. Multiple independent and/or dependent variables will produce one point per GROUP-GROUP.1 combination. Each combination consists of one line per patient identifier. With two dependent and two independent variables, this yields four lines of data points per patient.
The GROUP and GROUP.1 names represent the probe names as stored in tranSMART. To find the gene names corresponding to the probes used, combine the two GROUP columns with jobInfo.txt to see what was used as input for the X and Y axes. The independent gene name is displayed next to the divIndependentPathwayName item and the dependent gene name is displayed next to the divDependentPathwayName item.
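The row layout described above can be checked with a short script. This is only a sketch: the tab separator, the sample values, and the probe names (probeA to probeD) are assumptions made for illustration, and the helper names are hypothetical rather than part of tranSMART.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample in the five-column layout described above.
# Tab separation and all values are assumptions for illustration only.
SAMPLE = "\n".join([
    "PATIENT_NUM\tX\tGROUP\tY\tGROUP.1",
    "1001\t5.1\tprobeA\t7.2\tprobeC",
    "1001\t5.1\tprobeA\t3.3\tprobeD",
    "1001\t6.4\tprobeB\t7.2\tprobeC",
    "1001\t6.4\tprobeB\t3.3\tprobeD",
    "1002\t4.8\tprobeA\t6.9\tprobeC",
    "1002\t4.8\tprobeA\t3.0\tprobeD",
    "1002\t6.0\tprobeB\t6.9\tprobeC",
    "1002\t6.0\tprobeB\t3.0\tprobeD",
])


def read_rows(text):
    """Parse the five-column data into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def rows_per_patient(text):
    """Count how many lines each patient identifier contributes."""
    return Counter(row["PATIENT_NUM"] for row in read_rows(text))


def points_by_combination(text):
    """Group the (X, Y) points by their (GROUP, GROUP.1) combination."""
    groups = defaultdict(list)
    for row in read_rows(text):
        key = (row["GROUP"], row["GROUP.1"])
        groups[key].append((float(row["X"]), float(row["Y"])))
    return dict(groups)


if __name__ == "__main__":
    print(rows_per_patient(SAMPLE))
    for combination, points in points_by_combination(SAMPLE).items():
        print(combination, points)
```

With two independent and two dependent probes, each patient in the sample contributes four lines, one per GROUP-GROUP.1 combination.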
JSON representation of jobInfo.txt.
In the LinearRegression.txt file, there are seven lines per dependent variable selected:
- name, name of the independent variable. In case of a High dimensional data node this is a probe name.
- n, the number of subjects with data for this variable
- intercept, the Y-coordinate where the regression line crosses X=0
- slope, the slope coefficient of the line drawn
- nr2, the residual standard error
- ar2, the adjusted R-squared
- p, the calculated p value for the linear regression.
In case more than one independent variable was selected, the naming convention of the file is: LinearRegression_<DependentVariableName>.txt.
In case more than one dependent variable was selected, the seven lines described above are repeated for each variable selected.
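To make the regression values easier to interpret, the sketch below computes the standard ordinary least squares quantities for a single X/Y pair of columns. It is not the script tranSMART runs; the function name and dictionary keys are hypothetical, and the p value is left out because it needs a t-distribution that the Python standard library does not provide.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Assumes at least three points and that the X values are not all equal.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)

    slope = sxy / sxx
    # The intercept is the fitted Y value at X = 0.
    intercept = mean_y - slope * mean_x

    # Residual sum of squares of the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    r_squared = 1 - ss_res / ss_tot
    return {
        "n": n,
        "intercept": intercept,
        "slope": slope,
        # Residual standard error with n - 2 degrees of freedom.
        "residual_se": math.sqrt(ss_res / (n - 2)),
        # Adjusted R-squared for a model with one predictor.
        "adj_r_squared": 1 - (1 - r_squared) * (n - 1) / (n - 2),
    }


if __name__ == "__main__":
    # Made-up points for illustration only.
    print(linear_fit([1, 2, 3, 4], [2, 4, 5, 8]))
```

Running the function on the X and Y columns of one GROUP-GROUP.1 combination gives values that can be compared with the reported intercept and slope.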
ScatterPlot.png image file(s)
Scatter plot image corresponding to the data. The X-axis shows the independent variable and the Y-axis the dependent variable. The coefficients of the plotted line(s) can be found in LinearRegression.txt, and the coordinates of the plotted points are in output.txt.
When more than one independent variable was selected, the naming convention is: ScatterPlot_<DependentVariableName>.png
If the independent variable is a high dimensional data point, the name will be the probe name. GROUP.1 is used as the default group name for the points that are plotted. To identify the gene names, see the input variable in jobInfo.txt. The independent gene name is displayed next to the divIndependentPathwayName item and the dependent gene name is displayed next to the divDependentPathwayName item.