Transferring Files To/From the Helix Systems
There are several secure options for transferring files to and from
Helix and Biowulf. Unlike FTP, these methods encrypt your password
in transit. File transfers to and from the systems should be performed
using one of these more secure services. Detailed setup and usage
instructions for each method are below.
Confused? See the comparison of the different methods at the end of this page.
Disk usage quotas: Click here if you
need more disk storage space on Helix and Biowulf.
Mount Helix Systems Directories To Desktop (Inside NIH Network Only):
This method will allow you to easily drag/drop
files between your local machine and your global Helix Systems
directories. This includes /home, /data, and /scratch. Please see
disks.html
for more information about /home, /data, and /scratch.
This method can only be used for machines that are within the NIH network, including VPN connections. The NCI-Frederick campus is outside the main NIH campus firewall, so users at NCI-Frederick will need to use VPN.
- On your desktop machine, open 'Computer' and choose Tools → Map Network Drive.
- Enter the directory you want to mount as follows:
  - /home/[user]: \\helixdrive.nih.gov\[user]
  - /data/[user]: \\helixdrive.nih.gov\data
  - /scratch: \\helixdrive.nih.gov\scratch
  - Shared group area (e.g. /data/PQRlab): \\helixdrive.nih.gov\name_of_shared_area
  Make sure to replace [user] with your Helix login.
  Because Helix is now authenticated using NIH Login, you should not have to enter your username or password. Click the 'Finish' button.
- You have successfully mapped your Helix Systems directory to
  your desktop machine. You should see a network icon in the My Computer folder. You
  can create a shortcut to this drive on your desktop.
- Please note that the disk usage information is not correct for your /home directory, but
  it is correct for your /data directory.
More about /home, /data, and /scratch directories.
Desktop machines within the NIH network can map Helix directories via Helixdrive, so that you can
easily drag/drop files between your local machine and your Helix /home,
/data and /scratch directories. [More
information about /home, /data, and /scratch].
Note:
Mac users should consider creating (or editing) the following file on their system if they would
like to use mapped network drives:
/etc/sysctl.conf
include this line in the file (it may be the only contents of the file):
net.inet.tcp.delayed_ack=0
After the file is created/appended, reboot your Mac. This will profoundly increase file-transfer performance.
Without this alteration, performance may be bad enough to render Helixdrive shares unusable. If you are unable or
unwilling to set this file, you'll likely want to use sftp or scp rather than Helixdrive.
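The edit described in this note can be sketched as a small shell function that adds the setting only if it is not already present. This is an illustration rather than an official procedure: the function name ensure_line is hypothetical, and writing to /etc/sysctl.conf requires administrator rights on your Mac.

```shell
#!/bin/sh
# Append LINE to FILE unless the exact line is already there.
# Usage: ensure_line FILE LINE   (ensure_line is a hypothetical helper name)
ensure_line() {
    file=$1
    line=$2
    # Nothing to do if the exact line is already present.
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        return 0
    fi
    # If the file is non-empty and does not end in a newline, add one first.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s\n' "$line" >> "$file"
}

# Example, run from a root shell (e.g. after 'sudo -s'):
#   ensure_line /etc/sysctl.conf net.inet.tcp.delayed_ack=0
```

Running it twice leaves a single copy of the line in the file. A reboot is still needed afterwards, as described above.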
This method can only be used for machines that are within the NIH network, including VPN connections.
The NCI-Frederick campus is outside the main NIH campus firewall, so users at NCI-Frederick will need to
use VPN.
- From the main Mac menu, click on Go → Connect to server.
- For 'Server address', enter the Helix directory you want to mount:
- /home/[user]: smb://helixdrive.nih.gov/user
- /data/[user]: smb://helixdrive.nih.gov/data
- /scratch: smb://helixdrive.nih.gov/scratch
- Shared group area (e.g. /data/PQRlab): smb://helixdrive.nih.gov/name_of_shared_area
(Replace 'user' by your Helix username.)
- Click 'Connect' and in the subsequent window, enter your NIH Login username and password. As of 28 Aug 2012, NIH AD usernames and passwords are used to connect to all Helix & Biowulf services.
- The requested area should now be mounted as a shared drive. In your Finder window, you will see 'helixdrive.nih.gov' listed under 'Shared', and can drag and drop files to your Helix directories.
Helixdrive authenticates users with NTLMv2, a protocol that is not available on some (particularly older) Linux distributions. Distributions known to work "as is" are RedHat Enterprise 5, CentOS 5 and Fedora 7 or higher. However, almost any recent Linux distribution should support NTLMv2. RedHat Enterprise 4, CentOS 4 and older are known not to work.
Note that this method is most suitable for transferring small files. Users transferring large amounts of data to and from Helix/Biowulf should continue to use scp or sftp.
This method can only be used for machines that are within the NIH network, including VPN connections. The NCI-Frederick campus is outside the main NIH campus firewall, so users at NCI-Frederick will need to use VPN.
Typical mount commands for accessing a CIFS file system that uses NTLMv2:
To mount Helix /home/[user]:
mount -t cifs -o rw,sec=ntlmsspi,uid=johndoe,domain=NIH.gov //helixdrive.nih.gov/[user] /mnt/helix-home
To mount Helix /data/[user]:
mount -t cifs -o rw,sec=ntlmsspi,uid=johndoe,domain=NIH.gov //helixdrive.nih.gov/data /mnt/biowulf-data
To mount Helix /scratch:
mount -t cifs -o rw,sec=ntlmsspi,uid=johndoe,domain=NIH.gov //helixdrive.nih.gov/scratch /mnt/helix-scratch
To mount a shared group area (e.g. /data/PQRlab):
mount -t cifs -o rw,sec=ntlmsspi,uid=johndoe,domain=NIH.gov //helixdrive.nih.gov/PQRlab /mnt/biowulf_PQRlab
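The mount commands above differ only in the share name and the local mount point, so they can be generated from one template. The sketch below prints the commands instead of running them; the function name helix_mount_cmd and the /mnt/helix-* mount points are illustrative assumptions. It treats uid= as the local account that will own the mounted files, which is how mount.cifs interprets that option.

```shell
#!/bin/sh
# Print the mount command for one Helixdrive share.
# Usage: helix_mount_cmd SHARE MOUNTPOINT LOCAL_USER
# (helix_mount_cmd is a hypothetical helper; options are copied from the examples above)
helix_mount_cmd() {
    share=$1
    mountpoint=$2
    local_user=$3
    printf 'mount -t cifs -o rw,sec=ntlmsspi,uid=%s,domain=NIH.gov //helixdrive.nih.gov/%s %s\n' \
        "$local_user" "$share" "$mountpoint"
}

# Print the commands for the data and scratch areas.
for share in data scratch; do
    helix_mount_cmd "$share" "/mnt/helix-$share" johndoe
done
```

The printed commands normally have to be run as root, and each mount point directory must already exist.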
GUI File Transfer Clients:
Download from winscp.net
Click 'Open'
Select 'Next'
Select 'I Accept' then click 'Next'
Accept the default location or choose one yourself then click 'Next'
Click 'Next' on each of the next four screens.
Click 'Install'
Uncheck the 'Launch WinSCP' box then click 'Finish'.
To open WinSCP, double click on the shortcut on your desktop.
Fill in the host name (helix.nih.gov), your helix user ID and password, select 'SFTP', then click 'Login'.
Click 'Yes'. This window only shows up the first time you use WinSCP.
The left panel shows the directories on your desktop PC and the right panel shows your directories on Helix.
Click on the 'Preferences' icon and browse through the tabs to get an idea of all the options available.
To locate the file source and destination, simply use the two drop down boxes. Drag and drop files or folders to start transfer.
Fugu is a graphical frontend to the commandline Secure File Transfer
application (SFTP). SFTP is similar to FTP, but unlike FTP, the entire
session is encrypted, meaning no passwords are sent in cleartext form,
and is thus much less vulnerable to third-party interception.
Fugu allows you to take advantage of SFTP's security without having to
sacrifice the ease of use found in a GUI.
Fugu also includes support for SCP file transfers, and the ability to
create secure tunnels via SSH.
Download Fugu from the U. Mich. Fugu website.
Double-click on the downloaded Fugu_xxxx.dmg file to open it. A small window with the Fugu icon will appear.
Grab the fish and copy it to your Applications folder, your
Desktop and/or your Dock.
Start Fugu by clicking on the Fugu icon. In the box for 'Connect
to:', enter 'helix.nih.gov' and click 'Connect'. Enter your NIH Login password
when requested. You should now see a window with one pane listing
files on your local desktop machine, and the other pane listing files
in your Helix account space.
You can now transfer files by dragging and dropping between the two panes.
Download Filezilla from sourceforge.net (current version = 3.5.3).
Save the setup.exe to your desktop.
Double-click on the setup.exe icon, and accept the license agreement.
Choose components, install location, and start menu folder. The defaults are almost always acceptable.
Click install. Accept and finish.
Start the Filezilla client.
Select File > Site Manager...
Click New Site and configure for helix as detailed below:
Click connect, and drag and drop files across systems.
Commandline File Transfer:
Both psftp and pscp are run through the Windows console (Command Prompt in the
Start menu), and require that the directory containing the PuTTY executables be included
in the Path environment variable. This can be done transiently through the console,
or permanently through the System Control Panel.
pscp
Secure Copy (pscp) is a command line mechanism for copying files to and from remote systems.
From the console, type 'pscp'. This will bring up a help menu showing all the options for pscp.
PuTTY Secure Copy client
Release 0.58
Usage: pscp [options] [user@]host:source target
pscp [options] source [source...] [user@]host:target
pscp [options] -ls [user@]host:filespec
Options:
-V print version information and exit
-pgpfp print PGP key fingerprints and exit
-p preserve file attributes
-q quiet, don't show statistics
-r copy directories recursively
-v show verbose messages
-load sessname Load settings from saved session
-P port connect to specified port
-l user connect with specified username
-pw passw login with specified password
-1 -2 force use of particular SSH protocol version
-4 -6 force use of IPv4 or IPv6
-C enable compression
-i key private key file for authentication
-batch disable all interactive prompts
-unsafe allow server-side wildcards (DANGEROUS)
-sftp force use of SFTP protocol
-scp force use of SCP protocol
To copy a file from the local Windows machine to a user's home directory on helix, type
C:> pscp localfile user@helix.nih.gov:/home/user/localfile
You will be prompted for your helix password, then the file will be copied.
To do the reverse, i.e. copy a remote file from helix to the local Windows machine, type
C:> pscp user@helix.nih.gov:/home/user/remotefile .
(you must include a '.' to retain the same filename, or explicitly give a name
for the remotefile copy).
psftp
Secure FTP (psftp) allows for interactive file transfers between machines in
the same way as good old FTP (non-secure) did.
From the console, type 'psftp'. This will start an SFTP session, but it will
complain that no connection has been made. To connect to helix, at the psftp prompt type:
psftp> open user@helix.nih.gov
You will again be prompted for a password.
Once a session to helix has been established, the standard FTP commands can be used.
For even more information, see http://the.earth.li/~sgtatham/putty/0.58/htmldoc/
scp is a secure,
encrypted way to transfer files between machines. It is available
on Macs and Unix/Linux machines.
To transfer a file from your local machine to Helix, open a terminal window on your local
machine. In this window type
scp mylocalfile username@helix.nih.gov:/home/username
where 'username' is your Helix username. The scp program will prompt
you for your Helix password before transferring the file.
To download a file from
your Helix account to your desktop machine, use the following command
in a terminal window on your local machine.
scp username@helix.nih.gov:/home/username/myfile .
As before, 'username' is your Helix username, and scp will prompt
you for your Helix password before transferring the file.
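Whole directories can be uploaded the same way by adding scp's -r flag. The sketch below only prints the command it would run, so it can be checked before use; helix_upload_cmd is a hypothetical helper name, and paths containing spaces would need extra quoting.

```shell
#!/bin/sh
# Print an scp command that uploads a file or directory to a Helix home directory.
# Usage: helix_upload_cmd USERNAME SOURCE
# (helix_upload_cmd is a hypothetical helper name, not part of scp)
helix_upload_cmd() {
    user=$1
    src=$2
    if [ -d "$src" ]; then
        # -r copies a directory and everything below it.
        printf 'scp -r %s %s@helix.nih.gov:/home/%s\n' "$src" "$user" "$user"
    else
        printf 'scp %s %s@helix.nih.gov:/home/%s\n' "$src" "$user" "$user"
    fi
}

# Example:
#   helix_upload_cmd username mylocalfile
#   helix_upload_cmd username results_dir
```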
sftp is an interactive file transfer program, similar to ftp(1), which
performs all operations over an encrypted ssh(1) transport. It may also
use many features of ssh, such as public key authentication and
compression. sftp connects and logs into the specified host, then enters an
interactive command mode.
From the user's perspective, sftp works much like ftp. Sample session (user input in
bold):
[mysystem:~] user% sftp helix.nih.gov
Connecting to helix.nih.gov...
Notice to Users
This U.S. Government computer system is provided for authorized use
only. Any and all uses of this system and all files on this system
may be monitored, copied or disclosed by authorized personnel. The
data on the system may be searched at the request of law enforcement
or other persons, as appropriate, and may be disclosed and used for
disciplinary or civil action or criminal prosecution. Use of this
computer system constitutes consent to these policies, which may take
precedence over privacy rights.
user@helix.nih.gov's password:
sftp> get blast_output
Fetching /home/user/blast_output to blast_output
/home/user/blast_output
100% 5520 5.4KB/s 00:00
sftp> put myseqfile
Uploading myseqfile to /home/user/myseqfile
myseqfile
100% 5820 5.4KB/s 00:00
sftp> quit
[mysystem:~] user%
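The same get/put sequence can be run without typing at the sftp prompt by putting the commands in a batch file and passing it with sftp's -b option. This is a sketch under one important assumption: batch mode cannot prompt for a password, so it only works if non-interactive (for example public key) authentication is already set up for your account, which this page does not cover. The helper name make_sftp_batch is hypothetical.

```shell
#!/bin/sh
# Write an sftp batch file that downloads one file and uploads another.
# Usage: make_sftp_batch BATCHFILE REMOTE_FILE LOCAL_FILE
# (make_sftp_batch is a hypothetical helper name)
make_sftp_batch() {
    batch=$1
    remote_file=$2
    local_file=$3
    {
        printf 'get %s\n' "$remote_file"   # download from Helix
        printf 'put %s\n' "$local_file"    # upload to Helix
        printf 'quit\n'
    } > "$batch"
}

# Example:
#   make_sftp_batch transfers.txt blast_output myseqfile
#   sftp -b transfers.txt username@helix.nih.gov
```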
A modified version of the 'scp' file transfer program
called 'scp-hpn' is now available for download. Installing
scp-hpn on your Linux workstation or server will allow
significantly increased file transfer speeds to/from the
Biowulf login node. Testing by the Biowulf staff has
achieved average data transfer rates of over 100 MB/s on
1 Gb/s network paths.
You can download scp-hpn from http://biowulf.nih.gov/hpn.tar.gz,
place the hpn.tar.gz file in the top level of your home directory
on your Linux workstation or server and run the command
"tar xfvz hpn.tar.gz". See the README file included in the
distribution for additional details. Note that this program is
not available for Windows or MacOS.
Downloading data from NCBI:
NCBI makes a large amount of data available through the NCBI ftp site, and also provides most or all of the same data on their Aspera server. Aspera is a commercial package that has considerably faster download speeds than ftp. More details in the NCBI Aspera Transfer Guide.
You can use the Aspera command-line client (ascp) on Helix or Biowulf to download data from NCBI directly into your Helix or Biowulf account space.
ascp is installed on both Helix and Biowulf. Typing 'ascp' without any parameters will give you a summary of ascp options.
Sample session (user input in bold):
helix% cd /data/username
helix% ascp -i /usr/local/aspera/connect/etc/asperaweb_id_dsa.putty -QT -l500M anonftp@ftp-private.ncbi.nlm.nih.gov:/snp/organisms/human_9606/ASN1_flat /data/username/snp_human_9606
ds_flat_ch1.flat.gz 100% 323MB 184Mb/s 00:17
ds_flat_ch10.flat.gz 100% 193MB 196Mb/s 00:20
ds_flat_ch12.flat.gz 100% 186MB 196Mb/s 00:16
ds_flat_ch11.flat.gz 100% 198MB 196Mb/s 00:24
ds_flat_ch18.flat.gz 100% 107MB 196Mb/s 00:12
ds_flat_ch14.flat.gz 100% 123MB 14.8Mb/s 00:15
[...]
If your download stops before completion, you can use the -k2 flag to resume transfers without re-downloading all the data. e.g.
helix% ascp -k2 -i /usr/local/aspera/connect/etc/asperaweb_id_dsa.putty -QT -l500M anonftp@ftp-private.ncbi.nlm.nih.gov:/snp/organisms/human_9606/ASN1_flat /data/username/
ds_flat_ch1.flat.gz 100% 323MB 0.0 b/s 00:03
[...]
ds_flat_chPAR.flat.gz 100% 7742KB 402 b/s 00:01
ds_flat_chUn.flat.gz 100% 39MB 107Mb/s 00:00
ds_flat_chX.flat.gz 100% 104MB 196Mb/s 00:18
ds_flat_chY.flat.gz 100% 14MB 3.3Mb/s 04:59
Completed: 1706213K bytes transferred in 301 seconds
(46432K bits/sec), in 30 files, 1 directory.
In the example above, the client skips over the files that had previously been transferred, and will download only the remaining files.
Typical file transfer rates from the NCBI server are 400 - 500 Mb/s, so '-l500M' is the recommended value.
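Because -k2 lets a restarted transfer skip files that are already complete, an interrupted download can simply be re-run until it succeeds. The wrapper below is a generic sketch: retry is a hypothetical helper name, and it assumes ascp returns a non-zero exit status when a transfer is interrupted, which you should confirm on your system.

```shell
#!/bin/sh
# Run a command up to MAX times, stopping at the first success.
# Usage: retry MAX COMMAND [ARGS...]   (retry is a hypothetical helper name)
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (on Helix), reusing the ascp options shown above:
#   retry 3 ascp -k2 -i /usr/local/aspera/connect/etc/asperaweb_id_dsa.putty -QT -l500M \
#       anonftp@ftp-private.ncbi.nlm.nih.gov:/snp/organisms/human_9606/ASN1_flat /data/username/
```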
Data transfer by this method will be slower than using the command-line client on Helix, but may be more convenient for smaller transfers. You will need to download the free Aspera client browser plugin, install it on your desktop browser, and download the data to a Helix/Biowulf data area that is mapped onto your desktop system.
- Download the Aspera Connect browser plugin from the Aspera website and install on your Mac, Windows, or Linux system.
- Map your Helix /data or /scratch area on your desktop system as described in the section above on Mapped Network Drive.
- Start up Aspera Connect on your Mac, Windows or Linux system. Go to Preferences->Network, and set the connection speed to the maximum value. In our tests, the actual typical download speed to a desktop system is 50 - 100 Mb/s.
- Point your browser to the NCBI Aspera server and select the directory or files you want to download. Select your Helix data or scratch areas as the download target area. You can monitor the download in the Aspera transfer manager window.
By clicking on the icon in the transfer manager window, you can open the Transfer Monitor, which will show a more detailed graph of the transfer rate.
- Start firefox on helix.
- Download the Aspera Connect browser plugin from the Aspera website to your home directory. Click on the download tab and choose the operating system, which is v2.4.7 - Linux x86_64.
- Close firefox
- cd to the directory to which you downloaded the file
- At the helix prompt, type
sh aspera-connect-2.4.7.37118-linux-64.sh
which will create the directories .aspera/connect in your home directory.
To actually use the aspera plugin on helix, you will have to install the NoMachine
NX client specific to your operating system from the NoMachine web site.
Select your client from the Client Products list. If you are using
Windows, also download all the fonts before running the software. Once you
have the client installed, run the "NX Connection Wizard":
Give the session a name such as helix
Host: helix.nih.gov
Internet Connection: LAN
Desktop: Unix Gnome
NOTE: KDE is not installed on helix
You can click through with no changes after that.
- Login as yourself
- Applications -> Accessories -> Terminal
- At the helix prompt, enter
./.aspera/connect/bin/asperaconnect&
An icon that looks like the letter G should show up in the
right hand corner next to the clock. Right-click on the icon and click "Preferences" to set them as requested
on the NCBI 1000genomes page. Note: change the download directory to your /data directory.
- Start firefox on helix again and go to the website to download your
files.
It is also possible to download data from NCBI using ftp. In our tests, the Aspera client gave up to 5x faster transfer speeds than ftp. However, some data may only be available on the NCBI ftp server.
On Helix or Biowulf, use ftp ftp.ncbi.nlm.nih.gov to access the NCBI ftp site. Sample session (user input in bold):
helix% ftp ftp.ncbi.nlm.nih.gov
Connected to ftp.wip.ncbi.nlm.nih.gov.
220-
Warning Notice!
[...]
---
Welcome to the NCBI ftp server! The official anonymous access URL is ftp://ftp.ncbi.nih.gov
Public data may be downloaded by logging in as "anonymous" using your E-mail address as a password.
Please see ftp://ftp.ncbi.nih.gov/README.ftp for hints on large file transfers
220 FTP Server ready.
500 AUTH not understood
500 AUTH not understood
KERBEROS_V4 rejected as an authentication type
Name (ftp.ncbi.nlm.nih.gov:susanc): anonymous
331 Anonymous login ok, send your complete email address as your password.
Password:
230 Anonymous access granted, restrictions apply.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd blast/db/
250 CWD command successful
ftp> get wgs.58.tar.gz
local: wgs.58.tar.gz remote: wgs.58.tar.gz
227 Entering Passive Mode (130,14,29,30,195,228)
150 Opening BINARY mode data connection for wgs.58.tar.gz (983101055 bytes)
226 Transfer complete.
983101055 bytes received in 1.3e+02 seconds (7.7e+03 Kbytes/s)
ftp> quit
221 Goodbye.
helix%
Web Browsers
It is not possible to transfer files in or out of your Helix account
via ftp in a web browser. Such file transfers are inherently
insecure with unencrypted passwords being sent over the network.
FTP is inherently insecure because it sends data and most importantly your password in plain, unencrypted text.
SCP and sFTP use an SSH2 encrypted connection to transfer both data and password information. While security is
good, it comes at the price of slower transfer rates than FTP.
For those who would need the transfer rates of FTP and are not concerned with data insecurity, we provide access to
anonymous FTP on Helix.
The rate of data transfer is only an issue for data amounts greater than 256MB. For amounts less than this,
any application will suffice. To optimize transfer rates for large amounts of data, use less demanding encryption ciphers,
such as blowfish or arcfour, and try to transfer the data when the network is less busy (before 10 am and after 6 pm).
Also use the most appropriate application based on the table below.
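Which ciphers are available depends on the OpenSSH version on both ends; newer OpenSSH releases have dropped blowfish and arcfour, so it is worth checking before passing one to scp with -c. The sketch below picks the first cipher from a preference list that the local client reports as supported. pick_cipher is a hypothetical helper name, and the preference order in the example is only an illustration.

```shell
#!/bin/sh
# Print the first preferred cipher that appears in the supported list read from stdin.
# Usage: ssh -Q cipher | pick_cipher CIPHER [CIPHER...]
# (pick_cipher is a hypothetical helper name)
pick_cipher() {
    supported=$(cat)
    for want in "$@"; do
        # -x: whole-line match, -F: fixed string rather than a pattern.
        if printf '%s\n' "$supported" | grep -qxF -- "$want"; then
            printf '%s\n' "$want"
            return 0
        fi
    done
    return 1
}

# Example: prefer the lighter ciphers, fall back to AES.
#   cipher=$(ssh -Q cipher | pick_cipher arcfour blowfish-cbc aes128-ctr) &&
#       scp -c "$cipher" bigfile username@helix.nih.gov:/data/username
```

The server must also accept the chosen cipher; if it does not, the connection will fail and a different cipher has to be tried.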
The Helix Staff has compared the applications and our results are below. For the most part we recommend using
Filezilla for Windows and Fugu for Macs. scp is the default and best option for Linux/Unix machines.
Platform | Application | Pros | Cons
All platforms | Filezilla v3.0 | Better control over transfer during the process, fewer and simpler controls than WinSCP, fastest transfer rates by sFTP. | scp not an option.
Windows | WinSCP | Much faster transfer rates than PuTTY-pscp/psftp, but slightly faster than Filezilla for uploads using scp (rates were found to vary considerably by cipher used, in the order of Blowfish > AES >> 3DES), highly comprehensive configuration. | Cumbersome user interface for changing local and remote directories.
Windows | pscp/psftp | Direct command line control over process. | Need to run through the command prompt, slowest transfer rates seen.
Windows | Mapped Network Drive | Convenient. | Fairly slow transfer rates, especially very large files.
Macs | Fugu | Easy to configure and use. Same transfer rates as scp. | None.
Macs | Mapped Network Drive | Convenient | Fairly slow transfer rates, especially for large files.
Macs | scp,sftp | Can be used for scripting & automatic file transfers, fastest transfer rates with appropriate ciphers. | non-GUI interface.
Linux/Unix | scp,sftp | Same as for Macs. | Same as for Macs.