Scripts are available to install tranSMART 16.3 using a PostgreSQL database on a clean system (e.g. an Amazon Web Services (AWS) instance) running Ubuntu 14.04 or Ubuntu 16.04.
The instructions below are for Ubuntu 16.04. To install on the older Ubuntu 14.04, simply replace ubuntu16 with ubuntu14 in each example command.
The scripts have been tested on a number of systems but can fail for various reasons beyond the control of the developers. An example is changes to the available R packages in the R installation step. Please contact firstname.lastname@example.org if you encounter problems. The scripts and the underlying installation components can usually be quickly updated to address issues.
All-in-one Install Script, plus Data-Loading
The "all-in-one" instructions require a minimum of human interaction. There are three major steps. First, a set of instructions guides you in setting up a Scripts folder in the home directory of an account with sudo privileges. Second, you run the install script, which installs all of the supporting command-line tools, creates the required support directories, installs the required toolsets for PostgreSQL, R, and Rserve, loads the required R packages, installs and starts SOLR and the SOLR web interface, loads the tranSMART war files into Tomcat 7, and sets up the transmart database. Third, you run a script that loads the demo datasets; optionally, by editing a configuration file, you can select additional datasets to load.
Upon completion of the install you can (optionally) run a set of checking scripts that will check to make sure everything is installed correctly and running properly.
As an optional supplementary step, you can use the tranSMART Foundation PGP keys to verify the downloaded install artifacts. This process is described in this additional document, which gives instructions for downloading the tranSMART Foundation PGP keys and using them to verify the digital signatures of the downloaded artifacts (that is, the zip files of the install scripts, transmart-data, and tranSMART-ETL, and the transmart.war and gwava.war files).
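As a rough sketch of what that verification looks like (the key and signature file names below are assumptions for illustration; the linked document gives the authoritative names and locations):

```shell
# Illustrative only: file names here are assumptions; see the verification
# document for the actual key and signature file names.
if [ -f transmart-foundation-key.asc ] && [ -f transmart.war.asc ]; then
    gpg --import transmart-foundation-key.asc     # import the Foundation public key
    gpg --verify transmart.war.asc transmart.war  # check the detached signature
else
    echo "verification files not present; see the verification document"
fi
```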
Make sure the machine or VM you are installing on meets the Hardware and OS Requirements as specified above.
Set up a transmart account
We assume that you have set up an account with sudo privileges to use in these instructions. For the sake of illustration we will use an account named transmart. If you have not already set up such an account, you can do so with the following commands (assuming, in this case, that you are logged in with the initial admin account, ubuntu).
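A minimal sketch of that account setup follows; the adduser flags shown are one non-interactive way to create the account, not mandated by the install, and you should set a real password afterwards:

```shell
# Create the 'transmart' account without an interactive password prompt,
# then give it sudo rights (both commands need root privileges).
sudo adduser --disabled-password --gecos "tranSMART" transmart \
  || echo "adduser needs root; run from an account with sudo rights"
sudo usermod -aG sudo transmart \
  || echo "usermod needs root; run from an account with sudo rights"
# Set a password and switch to the new account before continuing:
#   sudo passwd transmart
#   su - transmart
```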
Set up the install script
To download the Scripts zip file and unpack its folder, run the following commands in the home directory of the account that you are going to use to run transmart (e.g., in our case, the account named transmart).
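In outline, those commands look like the following; the URL is a placeholder, not the real download location, so substitute the Scripts zip URL published for this release:

```shell
cd ~   # home directory of the install account (e.g. 'transmart')
# PLACEHOLDER URL: replace with the Scripts zip location for the 16.3 release.
SCRIPTS_ZIP_URL="https://example.org/path/to/Scripts.zip"
wget -O Scripts.zip "$SCRIPTS_ZIP_URL" && unzip Scripts.zip \
  || echo "download failed: substitute the real Scripts zip URL"
ls Scripts/install-ubuntu 2>/dev/null || true   # the install and loading scripts
```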
Run the install script
Next, we run the install script. This script, if successful, installs tranSMART. If the script fails, read the final error message carefully; it will usually suggest a fix for the problem. Fix the problem and rerun the script. The script is safe to rerun and will usually then run to completion. To recover error messages, review the install.log file in the install user's home directory.
In the case that the error seems to have left things in an unrecoverable state, delete the folder ~/transmart (and its contents) and restart the install script as below. This will simply reload all the tranSMART artifacts, without having to reload the command-line tools.
Note: during this process, you will be prompted for the account's password, by the 'sudo' command, at several points along the way.
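The invocation looks like this; the script name InstallTransmart.sh is an assumption here, so use the install script actually shipped in your Scripts/install-ubuntu folder:

```shell
cd ~
# Script name is an assumption; use the one in your Scripts/install-ubuntu folder.
./Scripts/install-ubuntu/InstallTransmart.sh 2>&1 | tee install.log
# If the install is left in an unrecoverable state, clear the artifacts and
# rerun; the command-line tools are not reinstalled:
#   rm -rf ~/transmart
#   ./Scripts/install-ubuntu/InstallTransmart.sh 2>&1 | tee install.log
```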
After completing the installation procedure above, you have an 'empty' instance of tranSMART. Now you need to load data, either your own or some of the Curated Data Repository sets provided by the Foundation. These instructions illustrate loading data using scripts that access the curated datasets. We will use the script in the load datasets code block, below, to load datasets into the tranSMART database. These instructions describe how to load a representative sample of curated datasets; however, they can easily be generalized, and you can choose which datasets you wish to load (the select datasets code block).
For complete details of the loadable datasets, and additional details on the loading process, see the instructions at the start of the Curated Data Repository wiki page.
These two files in the Scripts/install-ubuntu directory are involved in this process:
- datasetsList.txt - the list of possible datasets to load, and
- load_datasets.sh - the script to load the datasets.
By default, a representative sample of five datasets is loaded: EtriksGSE2125, PfizerGSE22138, RanchoGSE11903, RanchoGSE4382, and SanofiGSE38642. The Curated Data Repository sets wiki page has details of the source and content of these datasets.
Optionally, you can edit the list of datasets to load. Using a text-file editor, such as vim, as in the detailed instructions below, edit the file datasetsList.txt to uncomment the lines for the datasets that you wish to load and/or comment out the lines for datasets that are already loaded or that you do not want to load. Please note, it is generally not a good idea to load a dataset twice. For each dataset in that list, you can find a description on the Curated Data Repository sets wiki page. The detailed command sequence, below, includes the installation of vim (a visual version of vi) and the editing of the list of datasets. If you are unsure about this editing process, you can run the script without editing the file; doing so will load the 5 representative datasets listed above.
Once you are in the (vi) editor, you can use the following commands:
- i - to switch into "insert" mode (the tag --INSERT-- will appear at the bottom of the screen)
- the arrow keys to move from line to line or in position within a line
- the 'delete' key to delete the character to the left of the cursor
- (when in insert mode) any character key to type that character (for example #, at the start of a line, to comment out that line)
- the ESC (escape) key to get out of insert mode
- ZZ - (two upper-case z characters - when not in insert mode) to exit and save
- :q! - (when not in insert mode) to exit immediately without saving
Edit the list of datasets by adding the # character at the start of a line to comment out those datasets that you do not wish to load, and deleting the # character at the front of the lines corresponding to the datasets that you wish to load.
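If you would rather not edit interactively, the same commenting and uncommenting can be done with sed. The snippet below demonstrates the idea on a throwaway copy, since the exact line format of datasetsList.txt may include more than the dataset name; on a real install, point the sed commands at Scripts/install-ubuntu/datasetsList.txt and match the lines as they actually appear there.

```shell
# Demonstration on a throwaway copy of the list (two example dataset names).
cat > /tmp/datasetsList.txt <<'EOF'
EtriksGSE2125
#PfizerGSE22138
EOF
# Comment out a dataset you do NOT want loaded:
sed -i 's/^EtriksGSE2125$/#EtriksGSE2125/' /tmp/datasetsList.txt
# Uncomment a dataset you DO want loaded:
sed -i 's/^#PfizerGSE22138$/PfizerGSE22138/' /tmp/datasetsList.txt
cat /tmp/datasetsList.txt
```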
Once you have set up the list of datasets you wish to load in datasetsList.txt, run the load_datasets.sh script as indicated here.
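The run itself is a single command; capturing the output in a log file (the log file name here is just a suggestion) makes it easier to diagnose a failed load:

```shell
# Run the loader from the install account's home directory.
~/Scripts/install-ubuntu/load_datasets.sh 2>&1 | tee ~/load_datasets.log
```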
With this, the install is complete (with datasets loaded). Open a browser to http://localhost:8080/transmart/ where you should see the tranSMART Web Application's login page. On the Analyze tab, in the tree-interface on the left, you will see the list of datasets loaded. The default login is admin/admin, and you can change the password of that account at the Admin Page in the interface.
Is the system running?
Finally, at any time, you can test whether the install was successful, that is, whether all the "moving parts" were installed and are running, by using the following commands:
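Separately from the checking scripts, a quick manual probe of the main services can look like this; the process names are the defaults used by this install:

```shell
# Rough liveness check for the stack's moving parts.
for proc in tomcat solr Rserve postgres; do
    if pgrep -f "$proc" >/dev/null 2>&1; then
        echo "$proc: running"
    else
        echo "$proc: NOT running"
    fi
done
# The web front end should also answer at http://localhost:8080/transmart/
```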
Review the log file, checks.log, in your home directory. For details, see the section below on using the web application, towards the end of this document. To restart the needed tool interfaces (after a reboot, for example), see the section below on Running SOLR, Rserve, and Tomcat.