
BIOS data location and access overview

Processed results of the BIOS project are stored on the BIOS VM, a virtual machine running in SURFsara's HPC cloud. It behaves like a normal Linux server and can be accessed through either a command-line terminal or a graphical desktop.

The BIOS Metadatabase can be accessed via a web browser or any client program supporting HTTPS. To get access, please contact Leon Mei or Maarten van Iterson.

Raw files are stored on SURFsara's Grid, an online data storage system. Accessing the Grid requires a certain level of technical skill and completion of a registration procedure.

BIOS virtual machine provided by SURFsara

For safety and privacy reasons, BIOS data (genome, transcriptome, methylome and phenome) is only accessible for downstream analysis on a SURFsara virtual machine (VM). The BIOS VM is managed by Leon Mei and Maarten van Iterson. Since the resources and capacity of this VM are limited, it should only be used for downstream analysis. If you want to work on many BAM files, or run similarly expensive analyses that you would normally use a cluster for, it is probably better to get acquainted with working on the Grid directly. If you are not sure, contact Leon.

The current test VM runs these specs:

  • 64 processors
  • 240 GB RAM
  • 5 TB disk space mounted at /virdir. This is where you can keep your analysis data. Files in /virdir/Scratch are not backed up. Files in the /virdir/Backup folder are backed up only very irregularly. If you have important data and scripts, make sure your scripts are version controlled in an external code repository. For backups of important files, please contact Leon Mei.
  • 2 GB soft limit and 3 GB hard limit per user at /home
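
Since /home has a per-user soft limit, it can help to check your usage before writing large files. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes the 2 GB soft limit means 2 × 1024 × 1024 KiB. The VM's real quota accounting may count differently.

```shell
# Soft limit on /home from the specs above, expressed in KiB.
# Assumption: "2 GB" is taken as 2 * 1024 * 1024 KiB.
HOME_SOFT_LIMIT_KB=$((2 * 1024 * 1024))

# Print the total size of a directory tree in KiB (du -sk is POSIX).
dir_usage_kb() {
    du -sk "$1" | cut -f1
}

# Succeed (exit status 0) while the directory is below the soft limit.
under_soft_limit() {
    [ "$(dir_usage_kb "$1")" -lt "$HOME_SOFT_LIMIT_KB" ]
}
```

For example, under_soft_limit "$HOME" || echo "home directory is over the soft limit" prints a warning once the limit is reached. Large analysis data belongs on /virdir in any case.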

BIOS VM Access

To get access, fill in this data request form. Once it is approved, please send an email to Leon Mei (h.mei[at]lumc.nl) with your public SSH key (instructions).

Please be aware that a firewall is running on this VM and only access from known IP ranges is allowed. The IP ranges of most institutes should already be whitelisted. If you receive an SSH access denied error, please contact us.

For remote access from a Linux or Mac OS X terminal, type:

ssh username@bios-vm.bbmrirp3-lumc.surf-hosted.nl

This assumes that your private SSH key is in the standard location ~/.ssh/id_rsa; if it is elsewhere, specify its path with -i.

For terminal access from Windows, use the PuTTY tool and configure the VM IP address, your username, and your private SSH key (see instructions).
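
If you connect often, an OpenSSH client configuration entry saves retyping the host name and key path. The sketch below prints such an entry; the function name and the alias "bios-vm" are hypothetical choices for this example, and only the host name comes from this page.

```shell
# Print an OpenSSH client config stanza for the BIOS VM.
# $1 = your username on the VM
# $2 = path to your private key (defaults to the standard ~/.ssh/id_rsa)
# The alias "bios-vm" is a made-up short name, not something the VM defines.
bios_ssh_config() {
    bios_user="$1"
    bios_key="${2:-$HOME/.ssh/id_rsa}"
    printf 'Host bios-vm\n'
    printf '    HostName bios-vm.bbmrirp3-lumc.surf-hosted.nl\n'
    printf '    User %s\n' "$bios_user"
    printf '    IdentityFile %s\n' "$bios_key"
}
```

Running bios_ssh_config username >> ~/.ssh/config once (with your own username) should then let you connect with just ssh bios-vm.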

See this page for graphical user interface (desktop layout) access to the VM

Connection troubleshooting page (under construction)

Data Storage

Read Storage in the cloud for details on data storage on the VM and data backup policies.

RStudio server

An RStudio server is running on the BIOS VM: http://bios-vm.bbmrirp3-lumc.surf-hosted.nl/rstudio

You can log in with the same username and password as for your SSH session.

UCSC Genome Browser tracks

To view BIOS data in the UCSC Genome Browser, place the files in the WWW export directory on /virdir and add the exported URLs as custom tracks.

Please note that no privacy-sensitive data should be stored there, as it will be world-readable.

Example sessions:

Grid SRM access from the BIOS VM

If you need access to some raw BIOS data (e.g., RNAseq, methylation), you will have to download it from the Grid SRM storage to the BIOS VM. Here are instructions on how to do it.

Note: Requesting access to the SRM takes quite some time and using the SRM itself is not the easiest thing to learn. As such, if there already is someone at your institute with access and experience using the SRM, it might be faster and easier to ask that person for help.

Grid SRM Access

Get a grid certificate (see the SARA website).

Prepare a proxy

To download data from the Grid SRM to the BIOS VM you'll need a proxy and your keys. To start a proxy you need access to a UI (a grid User Interface machine), for example gb-ui-lumc.lumc.nl, ui.lsg.psy.vu.nl, or a UI at another site.

  • On the UI there should be a .globus folder in your home directory that contains the grid certificate. Copy the .globus directory from your local home folder to your home folder on the UI and make sure the permissions are set as in the listing below: log in to the UI and run chmod 644 usercert.pem and chmod 400 userkey.pem. These files don't need to be renewed.
    .globus/:
    total 8
    -rw-r--r-- 1 mgalen mgalen 1769 Aug 14 16:55 usercert.pem
    -r-------- 1 mgalen mgalen 1751 Aug 14 16:55 userkey.pem
    
  • Start the proxy by logging in to a UI and running startGridSession:
    startGridSession bbmri.nl:/bbmri.nl/RP3
    
  • This creates your own x509 proxy file in the /tmp directory on the UI, which looks something like this:
    -rw------- 1 mgalen   6.1K Aug 27 09:55 x509up_u40208
    
  • You may have to change the permissions of this file using chmod 644 x509up_u40208. Copy this file to the BIOS VM for later use (for example, to /tmp there as well). Make sure you copy the x509 file associated with your username. The proxy is valid for 7 days, so you need to renew it weekly.
  • Now log in to the BIOS VM and use this command to fetch the file you just created on the UI (replace 'mgalen', 'x509up_u1234' and 'uimd.grid.sara.nl' with your username, your proxy file name, and the address of your UI):
    scp mgalen@uimd.grid.sara.nl:/tmp/x509up_u1234 /tmp
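
The permission and renewal steps above can be sketched as two small shell helpers. This is an illustration only: the function names are hypothetical, and the freshness check uses the file's modification time as a stand-in for the proxy's actual expiry, which is an assumption.

```shell
# Apply the permissions listed above to the certificate pair in a
# .globus directory ($1 = path to that directory).
set_globus_perms() {
    chmod 644 "$1/usercert.pem" && chmod 400 "$1/userkey.pem"
}

# Succeed when the proxy file ($1) exists and was modified less than
# N days ago ($2, default 7, the validity period stated above).
# Caveat: scp gives the copy a fresh modification time, so on the VM this
# measures time since the copy, not time since the proxy was created.
proxy_is_fresh() {
    [ -f "$1" ] || return 1
    [ -n "$(find "$1" -mtime "-${2:-7}")" ]
}
```

For example, proxy_is_fresh /tmp/x509up_u40208 || echo "renew the proxy on the UI" gives a reminder when the copied proxy is older than a week.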
    

Downloading files

  • Once these files are in place, you can copy data from the Grid SRM to the BIOS VM using curl. For example, log in to the VM and issue the following command, where -E points to the path where you put the proxy file. Don't forget to redirect the output from curl to a local file.
    mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/README >README
    
  • You can also upload data to the Grid SRM from the BIOS VM using curl. To upload a local file test.txt, use the --upload-file (or -T) argument:
    mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/test.txt --upload-file test.txt
    
    (In practice, use a more appropriate directory on the Grid SRM than the project root.) Instead of specifying the full target name including the filename, you can also specify just the target directory, ending in a /. Curl will then use your local filename on the Grid SRM as well:
    mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/ --upload-file test.txt
    
  • If you want to delete a file from the Grid SRM, use -X DELETE (use this with caution):
    mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/test.txt -X DELETE
    
  • To check whether a file exists without downloading it, use the -I option (a 200 OK response means the file exists; 404 Not Found means it does not):
    mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L -I https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/README
    HTTP/1.1 200 OK
    Date: Tue, 28 Jan 2014 15:26:17 GMT
    ETag: 0000EADAD57CE41D47F5A8F069A7C24F8003_-1773128220
    Last-Modified: Wed, 15 Jan 2014 14:31:15 GMT
    Content-Length: 154
    Server: Jetty(7.3.1.v20110307)
    
    mgalen@cloud-KVM:~$ curl --CApath /etc/grid-security/certificates/ -E /tmp/x509up_u40208 -L -I https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3/nonexisting
    HTTP/1.1 404 Not Found
    Content-Type: text/html
    Transfer-Encoding: chunked
    Server: Jetty(7.3.1.v20110307)
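
The curl commands above all repeat the same certificate, proxy, and URL arguments, so they can be wrapped in a few shell functions. The following is a sketch under stated assumptions: the function and variable names are hypothetical, the base URL and certificate path are copied from the examples above, and the default proxy path is the example file name, which you would replace with your own.

```shell
# Values copied from the examples above; adjust them to your own setup.
SRM_BASE="https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/bbmri.nl/RP3"
SRM_CAPATH="/etc/grid-security/certificates/"
# Assumption: set SRM_PROXY to your own proxy file before using these helpers.
SRM_PROXY="${SRM_PROXY:-/tmp/x509up_u40208}"

# Print the full URL for a path relative to the project root.
srm_url() {
    printf '%s/%s\n' "$SRM_BASE" "${1#/}"
}

# Download one file ($1); the local name ($2) defaults to the remote basename.
# --fail makes curl exit non-zero on HTTP errors instead of saving an error page.
srm_get() {
    curl --fail --CApath "$SRM_CAPATH" -E "$SRM_PROXY" -L \
        -o "${2:-$(basename "$1")}" "$(srm_url "$1")"
}

# Succeed when a header-only request (-I) for the file ($1) returns status 200.
srm_exists() {
    srm_code=$(curl -s -o /dev/null -w '%{http_code}' -I \
        --CApath "$SRM_CAPATH" -E "$SRM_PROXY" -L "$(srm_url "$1")")
    [ "$srm_code" = "200" ]
}
```

With these defined, srm_exists README && srm_get README downloads the file only if it is present. Upload and delete are left out of the sketch on purpose, since they modify the shared storage.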
    
Last modified on Jul 9, 2019 5:33:56 PM