Clinical data steps
Set loading type
This step allows choosing whether the loading will be incremental or not.
If incremental loading is chosen, a sub-folder is needed to store partial data. Select an existing sub-folder in the list, or create a new one by entering a name that is not already used.
Select raw files
This step allows adding one or several raw files, either by entering their paths separated by a ';' character or by choosing them with the 'Browse' button. For clinical data loading, input files can be of two types:
- A tab delimited text file, with lines representing subjects and columns representing variables. One column must contain unique subject identifiers.
- A file in soft format, which is one of the formats offered on the GEO site for public data sets. In this case, a tab delimited file is created from the soft file by extracting the sample identifiers and the different characteristics given in the form “tag: value”.
The format of the file, tab delimited or soft, has to be indicated in the 'Format' dropdown.
When the button 'Add files' is clicked, the format of the files is checked, and then the files are:
- For a tab delimited file: copied to the workspace.
- For a soft file: parsed to extract the sample identifier from the "^SAMPLE = value" line and the property/value pairs from the "!Sample_characteristics_ch1 = property: value" lines; a tab delimited file is then created in the workspace with this information.
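The conversion of a soft file described above can be sketched as follows. This is only an illustration in Python, not the tool's actual code: the function names `soft_to_rows` and `rows_to_tsv` and the `SAMPLE` column header are assumptions, and only the two line types mentioned in this guide are read.

```python
import csv
import io


def soft_to_rows(soft_text):
    """Collect one dict per sample from soft-formatted text (hypothetical helper)."""
    rows = []
    current = None
    for line in soft_text.splitlines():
        line = line.strip()
        if line.startswith("^SAMPLE"):
            # "^SAMPLE = value" starts a new sample record.
            _, _, value = line.partition("=")
            current = {"SAMPLE": value.strip()}
            rows.append(current)
        elif line.startswith("!Sample_characteristics_ch1") and current is not None:
            # "!Sample_characteristics_ch1 = property: value"
            _, _, content = line.partition("=")
            if ":" in content:
                prop, _, value = content.partition(":")
                current[prop.strip()] = value.strip()
    return rows


def rows_to_tsv(rows):
    """Render the rows as tab delimited text, one line per sample."""
    # Assumed layout: the identifier column first, then properties
    # in the order in which they are first seen.
    columns = ["SAMPLE"]
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t",
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

In this sketch, a sample that lacks a property gets an empty cell in the corresponding column.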
It is also possible to remove one or several files by selecting them in the corresponding list and then clicking on the 'Remove selected files' button.
Set column mapping file (optional):
This step allows choosing a column mapping file, by indicating the path or by choosing it with the 'Browse' button.
When the button 'Add file' is clicked, the format of the file is checked, and then the file is copied to the workspace with the extension '.columns'.
Set word mapping file (optional):
This step allows choosing a word mapping file, by indicating the path or by choosing it with the 'Browse' button.
When the button 'Add file' is clicked, the format of the file is checked, and then the file is copied to the workspace with the extension '.word_mapping'.
Set subject identifiers:
This step allows choosing the column corresponding to the subject identifier. If a column mapping file already exists, the identifiers are retrieved in this file and given as parameters by default. If not, a column mapping file is created.
Set other identifiers (optional):
This optional step allows choosing the columns corresponding to site and visit.
If a column mapping file already exists, the identifiers are retrieved from this file and given as parameters by default.
This step allows replacing terms of raw files by other terms.
A column of a raw file has to be chosen in the dropdown list. When clicking on “Search”, a field is displayed for each term of this column. A new term can be entered in this field. It is possible to map the terms of a property with the terms from another property, by selecting a property in the 'Copy mapping from column' list and clicking on the 'OK' button.
The “Numerical” button can be checked for a numerical column. In this case, “Search” checks that all values are numerical (or empty or dots). If not, it automatically proposes to replace non-numerical values with dots.
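The numerical check described above could look like the following sketch. It is an assumption about the behaviour, not the tool's code: the function names are hypothetical, and "numerical" is taken here to mean anything Python's `float()` accepts, which is broader than plain digits.

```python
def is_numeric_or_blank(value):
    """Return True for a number, an empty value, or a single dot."""
    text = value.strip()
    if text in ("", "."):
        return True
    try:
        # Assumption: float() decides what counts as numerical.
        float(text)
    except ValueError:
        return False
    return True


def propose_replacements(values):
    """Map each non-numerical value of a column to the dot placeholder."""
    return {v: "." for v in values if not is_numeric_or_blank(v)}
```

For example, `propose_replacements(["12", "3.5", "", ".", "N/A"])` proposes replacing only "N/A" with a dot.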
By clicking on the 'OK' button, all the new terms of all columns are saved in a word mapping file.
Set study tree:
This step allows defining the study ontology tree for clinical data, from the study root. A node can be added by selecting the parent on the tree, filling the field 'New node' and clicking on 'Add node'.
A node or a property can be removed by selecting it in the tree and clicking on the button 'Remove a node'.
A property can be added by selecting the parent in the tree, then the property in the dropdown list, and by clicking on the 'Add property' button.
An operation on a numerical property can be added by selecting a free text parent in the tree, then properties and operations in the lists, and by clicking on the 'Add operation' button. An operation can only be set for a numerical property (a column containing only numbers and dot characters, once the word mapping file is taken into account), and is calculated per patient (it is only useful when there are several visits).
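To illustrate what a per-patient operation means when a subject has several visits, here is a minimal sketch. The list of operations offered by the tool is not given in this guide, so the mean used here is only an assumed example, and the function name and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_per_subject(records, operation=mean):
    """Apply an operation to the values of each subject.

    records: iterable of (subject_id, raw_value) pairs, one per visit.
    """
    values = defaultdict(list)
    for subject_id, raw in records:
        if raw.strip() in ("", "."):
            continue  # empty values and dots are treated as missing
        values[subject_id].append(float(raw))
    return {subject: operation(vals) for subject, vals in values.items()}
```

With a single visit per subject the result equals the raw value, which is why such an operation is only useful when several visits exist.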
When the button 'OK' is clicked, the column mapping file is updated.
Set labels ontology:
This step allows choosing labels (node titles) for the study parameters, and optionally mapping these labels with a controlled vocabulary code.
These controlled vocabulary codes can be found on the BioPortal site: http://bioportal.bioontology.org/
Only labels which have been put in the data tree are presented.
This step allows loading clinical data from raw files and mapping files, using a Kettle job. A tree is displayed indicating the location where the study will be loaded, with the study to load in orange, to check that the study tree is well defined.
If security is required for this study, please check the ‘Security required’ line.
If you want to use an ETL server to load the study, check the “Use ETL server” line. A database connection is needed for this step.
This step allows accessing error logs for clinical data loading.
If an error has occurred while the Kettle job was running, this is indicated, and the details are given in an error file saved in the workspace.
If an error has occurred while the stored procedure was running, this error is detailed. A database connection is needed for this step.
This step allows controlling clinical data loading quality.
The number of subjects with loaded data is displayed.
When a subject identifier is chosen in the list, the values for each property are retrieved from the raw files and from the database, and displayed. It is also indicated whether these values are the same in the two cases.
The button “Launch a comparison” allows comparing file and database values for all subjects.
The result of the comparison is written in a file called “QClog.txt”. A database connection is needed for this step.
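The comparison between file and database values can be pictured with the sketch below. It assumes the values have already been read into two dictionaries keyed by (subject, property). The function names, the OK/DIFFERENT labels and the tab-separated log layout are all assumptions, since this guide does not describe the actual content of “QClog.txt”.

```python
def compare_values(file_values, db_values):
    """Compare two mappings of (subject_id, property) -> value string."""
    lines = []
    for key in sorted(set(file_values) | set(db_values)):
        subject, prop = key
        in_file = file_values.get(key)  # None if absent from the raw files
        in_db = db_values.get(key)      # None if absent from the database
        status = "OK" if in_file == in_db else "DIFFERENT"
        lines.append(f"{subject}\t{prop}\t{in_file}\t{in_db}\t{status}")
    return lines


def write_log(lines, path="QClog.txt"):
    """Write the comparison lines to a log file (assumed layout)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
```

A value present on only one side appears as `None` on the other side and is reported as different.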
- ICE User Guide v1.4 https://drive.google.com/file/d/0B8lizkKDeaKhMWZBWnlnODVEQW8/view