Computing Services

ICS-ACI Users’ Guide


1 Introduction

Cyberscience is a fast-growing mode of discovery, alongside traditional theory and experiment, because it provides a unique virtual laboratory for investigating complex problems that are otherwise impossible or impractical to address. Such problems range from understanding the physics of the origins of the universe, the genomic/molecular basis of disease, or the socioeconomic impacts of a digital society, to designing smart structures and nanoscale tailored materials, to developing systems for clean energy or real-time responses to threats. The intellectual strength of computational science lies in its universality: all research domains benefit from it. The expectation is that the Institute for CyberScience will succeed both in facilitating research across a broad spectrum of disciplines and in securing significant external resources for cyberscience-related research for years to come.


1.1 What is ICS-ACI?

The Institute for CyberScience Advanced CyberInfrastructure (ICS-ACI) is our high-performance computing (HPC) infrastructure. The name also refers to the services associated with this system. ICS-ACI provides secure, high-quality advanced computing and storage resources to the Penn State research community.


1.2 What does ICS-ACI do?

ICS-ACI contributes to the ICS mission by providing researchers with the hardware, software, and technical expertise needed to solve problems of scientific and societal importance. ICS-ACI provides a variety of services, including operations, backup, technical consulting, and training. It offers over 1,000 servers with more than 23,000 processing cores, 6 petabytes (PB) of parallel file system disk storage, 12 PB of tape archive storage, high-speed Ethernet and InfiniBand interconnects, and a large software stack. ICS-ACI is also compliant with specific NIH and NIST security controls.


1.3 Our Mission

The mission of the Institute for CyberScience is to build capacity to solve problems of scientific and societal importance through cyber-enabled research. As computation and data science become increasingly vital modes of inquiry, we enable researchers to develop innovative computational methods and to apply those methods to research challenges. Specifically, we:

  • foster a collaborative, interdisciplinary scholarly community focused on the development and application of innovative computational methods;
  • expand participation in interdisciplinary research through strategic investments and effective outreach; and
  • provide a vibrant world-class cyberinfrastructure by maintaining and continually improving hardware and software solutions and technical expertise.

1.4 Our Vision

ICS will expand its role as an international leader in advancing cyberinfrastructure along with computational and data-driven methods and in driving their application to interdisciplinary research. We will use our expertise, coupled with our state-of-the-science research infrastructure, to support cyber-enabled interdisciplinary collaborations and attract the world's best researchers. These researchers will form a vibrant intellectual community empowered to use the latest and most effective computational methods to make transformative discoveries for science and society.


2 ICS-ACI History

In 2011 Penn State established an intra-university Cyberscience Task-Force to develop a strategic and coherent vision for cyberscience at the university. On the recommendations of this task-force, the Institute for CyberScience was established in 2012. ICS is one of five interdisciplinary research institutes under the Office of the Vice President for Research. Peer institutes include the Huck Institutes of the Life Sciences, the Materials Research Institute, the Institutes of Energy and the Environment, and the Social Science Research Institute.

Under ICS’s first director, Padma Raghavan, ICS began a cluster hiring initiative in 2012-2013, in partnership with Penn State colleges and institutes. This initiative, the ICS Co-hire program, brought in promising computational experts from a range of fields.

In 2014 Executive Vice President and Provost Nick Jones and Vice President for Research Neil Sharkey initiated a series of steps designed to help Penn State deliver the broad spectrum of computing and data services that are required to advance research. As part of this ongoing process, ICS continues to develop and sustain advanced cyberinfrastructure, with the goal of accelerating research outcomes by enhancing researcher productivity.


3 System Overview

ICS-ACI is a heterogeneous cluster that consists of multiple node-types connected to a common file system. The primary portions are ACI-B, the batch portion of the cluster; ACI-I, the interactive portion; and the data-manager nodes.


3.1 ACI-B

ACI-B, the batch portion of ICS-ACI, is used to submit jobs to dedicated resources. ACI-B has the hostname

aci-b.aci.ics.psu.edu

and can be logged into using ssh. Users will be placed on a head node, which is not intended for heavy processing. The head node should only be used to submit jobs.

Typically, a job submission script containing the resource requests and the commands to run is submitted. The job scheduler holds the job until dedicated resources are available for it. Jobs are submitted either to the Open allocation, which any Penn State student, faculty, or staff member can use, or to a paid allocation. Jobs are typically submitted with the qsub command:

qsub subScript.pbs
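The contents of a submission script are covered in detail in Section 7.1.1. As a minimal sketch, assuming a one-processor job on the Open allocation, subScript.pbs might contain:

#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
#PBS -A open

# Run from the directory in which qsub was issued
cd $PBS_O_WORKDIR

# Commands to run go here, for example:
echo "Hello from `hostname`"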

3.1.1 Types of ACI-B Nodes

Compute resources are available in four configurations: Basic Memory, Standard Memory, High Memory, and GPU.

Node Type   Specifications
Basic       2.2 GHz Intel Xeon processors, 24 CPUs/server, 128 GB RAM, 40 Gbps Ethernet
Standard    2.8 GHz Intel Xeon processors, 20 CPUs/server, 256 GB RAM, FDR InfiniBand, 40 Gbps Ethernet
High        2.2 GHz Intel Xeon processors, 40 CPUs/server, 1 TB RAM, FDR InfiniBand, 10 Gbps Ethernet
GPU         2.5 GHz Intel Xeon processors, 2 NVIDIA Tesla K80 computing modules/server, 24 CPUs/server, double precision, FDR InfiniBand, 10 Gbps Ethernet


3.2 ACI-I

ACI-I provides a set of interactive cores that are configured as common GUI interactive systems. ACI-I is a shared resource where users are placed on an interactive node with other users. ACI-I has the hostname

aci-i.aci.ics.psu.edu

and can be logged into using ssh or with Exceed onDemand.

ACI-I is often used to develop and test small-scale cases because a graphical user interface is available. Once the model has been developed, it can be submitted as a job to ACI-B to take advantage of the greater computational resources available there.

For example, a researcher might log in to ACI-I to develop a finite element model using the graphical user interface for COMSOL. To test the model, small simulations on a coarse mesh can be run on ACI-I. Then, once the model has been deemed satisfactory, the researcher can log in to ACI-B to submit the COMSOL model using a much finer mesh.

Individual processes are limited to

  • 4 processors
  • 12 CPU hours per process
  • 48 GB resident memory

on ACI-I. Note that the resident-memory limit applies to physical memory; additional memory may still be paged out to virtual memory during times of high usage.


3.3 ICS ACI Open Queue

All Penn State students, faculty and staff are able to run on the open queue on ACI-B for no charge. The current limits per user on the open queue are:

  • 100 jobs pending
  • 100 cores executing jobs at any given time
  • 48-hour job wall-times
  • 24-hour interactive session durations 

Jobs requiring more time or processors than this are required to run on an allocation.

Jobs running on the open allocation are placed on compute nodes that are available because they are not currently being used by the group holding an allocation reservation on them. If that group does require these resources, the running open-queue jobs are preempted. Once the allocation job has completed, your job will continue, provided the code being run supports this.

ACI-I is open to any and all users, regardless of allocation.


3.4 Filesystems

The ICS-ACI system has several filesystems available to users for active and archival storage. Active storage can be used by running jobs, while archival storage is intended for long-term data storage.

Active Storage
All of the active storage is available from all of the ICS-ACI systems. Individual users have home, work, and scratch directories that are created during account creation. The work and scratch directories should have links within the home directory, allowing for easy use. A user's home directory is for personal files and cannot be shared; work and scratch can be shared. Both home and work are backed up. Scratch is not backed up, and files are subject to deletion 30 days after creation. Do not keep important files on scratch.

Group directories can be created to help facilitate research within a group and can be purchased as an allocation. Note that individual allocations will have separate locations within the group directory.

Archival storage
The archival storage is only available on the file manager nodes. Archival storage can be purchased as an allocation.

Space     Location                   Quota         File Limit         Backed Up   File Lifetime Limit
Home      /storage/home/userID       10 GB         500,000            Yes         None
Work      /storage/work/userID       128 GB        1,000,000          Yes         None
Scratch   /storage/scratch/userID    None          1,000,000          No          30 days
Group     /gpfs/group/groupID        5 TB blocks   1,000,000 per TB   Yes         None
Archive   /archive/groupID           5 TB blocks   N/A                Yes         None


3.5 Data Manager

The data manager nodes are dedicated to file transfers, both within ICS-ACI and between ICS-ACI and other systems. They can be used with command-line file-transfer tools, such as rsync, sftp, or scp, as well as with Globus, WinSCP, or FileZilla.

The data manager hostname is

datamgr.aci.ics.psu.edu

For example, to connect to the data manager, you can use the command

ssh datamgr.aci.ics.psu.edu

to log in. After logging in, you can perform your file transfer.


4 System Access

The ICS-ACI systems are available for all users with Penn State access. Non-Penn State members who are collaborating with Penn State researchers are able to get a Penn State SLIM access account and then sign up for an ICS-ACI account.


4.1 Sponsorship

Each non-faculty member signing up for an account must have a sponsor, typically the adviser or course instructor. The request requires the sponsor's Penn State username, not an alias. The Penn State directory can be used to look up the username if only an email alias is known.


4.2 Permissions to use Resources

The users who have access to an allocation are placed in an allocation group. Users can see all of the groups they are in by using the id command.

To gain access to an allocation or group storage, have the PI send an email to the i-Ask center (iask@ics.psu.edu) stating the user IDs (e.g., abc123) and the allocation(s) and group storage space(s) to which they should be added. This explicit permission must be granted before users are allowed access.


4.3 Getting an Account

Users with a Penn State ID can sign up for an account using: https://accounts.aci.ics.psu.edu

Faculty member accounts require no sponsorship. Students and staff require a faculty sponsor, who must be listed by their original Penn State ID rather than by an alias. The sponsor will get an email stating that they were listed as a sponsor. The faculty member can respond to the iAsk center (iAsk@ics.psu.edu) with either explicit approval or a denial. If no denial is given, the student or staff member will be granted implicit approval after two business days. Faculty members can send an email listing multiple users if they will be sponsoring multiple accounts, such as for a class project. After an account has been approved, it can take up to twenty-four hours before the system updates and the user is able to log in.

Users who do not have a Penn State ID but are collaborators from other institutions need to acquire a Penn State SLIM account before they can sign up for an ACI account and Duo two-factor authentication. To request a SLIM account, please follow these instructions.

You will need to wait for your SLIM access account to be created before you can proceed to request your ICS-ACI account or sign up for two-factor authentication.


5 Basics of the ICS-ACI Resources


5.1 System Usage

The ICS-ACI system runs the Red Hat 6.6 Linux operating system and uses an environment-module system to manage software. All users will need to use the terminal to access programs, including Exceed onDemand users on ACI-I.

5.1.1 Shells

Unix/Linux shells are command-line interpreters that allow a user to interact with the operating system through utility commands and programs. The default Unix/Linux shell is BASH (the Bourne-Again SHell), which has extensive online documentation; common or necessary commands are shown in the table below.

Command          Description (for full documentation, use the command 'man <command>' to bring up the manual or find online documentation)
ls               The 'list' (ls) command is used to display all the files in your current directory. Using the '-a' flag will also show any hidden files (typically files beginning with a '.', like .bashrc).
cd               The 'change directory' command. Use this to traverse directories (like 'cd work'). To move up one directory level, use 'cd ..'.
mv               The 'move' command takes two arguments, the first being the file to move and the second being the directory that file should be moved to ('mv file.txt /work/newdirectory/'). Note: 'mv' can also be used to rename a file if the second argument is simply a new file name instead of a directory.
mkdir            Makes directories.
rmdir            Deletes directories ('rm -rf directory' would also work).
touch            Creates files in a similar way to mkdir ('touch test.txt' will create an empty text file named test.txt).
rm               The 'remove' command. As mentioned above, it can be used recursively to delete entire directory trees, or it can be used with no flags and a file as the argument to delete a single file.
locate           Locates files on a system. The '-i' flag makes the query case-insensitive, and asterisks ('*') indicate wildcard characters.
clear            Clears the terminal of all previous output, leaving you with a clean prompt.
history          Shows the previous commands entered.
find             Finds files, typically with the -name flag and the name of the file.
grep             Searches within files.
awk              A programming language typically used for data extraction.
id               Shows all of the groups a user is in.
du               Shows disk usage. Typically used with -h (human readable) and --max-depth=1 to limit output to the directories at that level rather than all files.
env              Prints all of the current environment variables.
less             Views a file.
cp               Copies a file. Note that the -r (recursive) flag can be used to copy directories.
alias            Creates an alias (something short) that is interpreted as something else (a complicated command).
pwd              Prints the current working directory.
chmod            Changes file permissions.
chgrp            Changes the group for a file or directory.
ldd              Shows the shared libraries required by an executable or library.
top              Shows the node usage. Within top, the 'u' option can be used to show a single user's processes.
/usr/bin/time    Shows time and memory statistics for a command being run. Often used with the -v (verbose) flag.
bg               Continues running a paused task in the background.
fg               Brings a background task into the foreground.
Ctrl + c         Kills a process.
Ctrl + z         Suspends a process.
Ctrl + r         Searches through your history for a command that includes the text typed next.


There are also some special characters to be aware of that will be helpful.

  • ~ is your home directory
  • . means the current directory (here)
  • .. means up one directory
  • * is the wildcard: * matches all files, and *.png matches all PNG files
  • | is the pipe (sends the output of one command to another)
  • > writes command output to a file (example: ls > log.ls)

Most commands have a manual page that shows all of the different ways the command can be used. For example,

man ls

shows all of the information for the ls command. You can use the arrow keys to scroll through the manual and press q to quit. Some commands will also print a shortened version of the manual, showing the available flags, if an incorrect flag is used. For example,

mam-list-funds -banana

brings up a list of all of the flags available for mam-list-funds; any invalid flag will trigger this. Note that this does not explain what the flags do, just what the flags are, but it may be enough to remind you of something you had done previously.

All shells use configuration files. For BASH, this is split between two files: ~/.bash_profile and ~/.bashrc.
(NOTE: ~/ in Linux is shorthand for your own home directory!) The .bash_profile file is parsed for login shells, including SSH sessions, while .bashrc is parsed for interactive non-login shells. To connect the two so that .bashrc will always be sourced for a session, make sure this code is included in your ~/.bash_profile:

if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

5.1.2 Alternative Shells

BASH is only the default shell, and it lacks quite a few features that many Linux power users would like to have on the command line. Other common shells include KSH (KornShell), ZSH (Z Shell), and FISH (the Friendly Interactive SHell). These shells all have documentation available online regarding their installation and customization.

5.1.3 Environmental Variables

Environment variables are values that pertain to certain aspects of an operating system’s configurations. These variables are typically used by utilities and programs for things like finding out where the user’s home directory is ($HOME) or where to look for executable files ($PATH). The prompt for BASH is held as the variable PS1.

You can print the environment variable to the screen using the echo command:

echo $HOME

A good way to view environment variables that are set is by using the env command

env

which outputs all of the variables currently in use.

To change the value of an existing variable or to create and set a new variable, we use export. For example, to set a variable called workDir to a directory called here within your home directory, the command would be:

export workDir=$HOME/here

Once this environment variable is set, you can use it. For example, to change to this directory, the command would be:

cd $workDir

For a variable like PATH, where you do not want to overwrite the values that are already stored, you can append values with

export PATH=$PATH:/new/dir/path/

In lists of values, the colon (:) is used as the delimiter. The dollar sign ($) is used to reference variables, so that export command essentially appends the new directory to the list of existing directories searched for executables. It is possible to prepend as well, which may come in handy if you compile a different version of an existing command.
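As a one-line sketch (the directory is just a placeholder), prepending puts the new directory first so that its executables are found before any system versions:

export PATH=/new/dir/path/:$PATH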

For more general reading on environment variables in Linux, see these pages on variables and environmental variables.

The environment variables allow for script portability between different systems. By referencing variables like the home directory ($HOME) you can generalize a script’s functionality to work across systems and accounts.

Variable Name   Description
USER            Your user ID
HOSTNAME        The name of the server on which the script is being run
HOME            Your home directory
WORK            Your work directory
SCRATCH         Your scratch directory
TMPDIR          The directory in which a job's temporary files are placed (created and deleted automatically)
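As a short sketch of this portability (the file and directory names below are hypothetical placeholders), a script fragment that relies only on these variables will work unchanged for any user on any system that defines them:

#!/bin/bash
# Stage an input file from work space into scratch, run there, and copy results back.
cp $WORK/inputs.dat $SCRATCH/
cd $SCRATCH
bash $WORK/run_case.sh inputs.dat > results.log
mkdir -p $WORK/results
cp results.log $WORK/results/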

5.1.4 References

The Linux terminal and job submission are not unique to ICS-ACI, and many training resources are available online. The Linux Foundation offers free training, and great information and tutorials for everyone from beginner to advanced Linux users can be found here. Linux has been around for a long time, so any problem you might be having has probably already been encountered by someone else; it is always worthwhile to look around Stack Exchange to see whether your question has already been answered.


5.2 Module System

ICS-ACI now uses the Lmod environment-module system. Environment modules provide a convenient way to dynamically change a user's environment through modulefiles. This includes easily adding directories to, or removing them from, the PATH, LD_LIBRARY_PATH, MANPATH, and INFOPATH environment variables. A modulefile contains the information needed to allow a user to run a particular application or to provide access to a particular library, and all of this can be done dynamically without logging out and back in. Modulefiles for applications modify the user's path to make access easy; modulefiles for library packages provide environment variables that specify where the library and header files can be found. Learn more about modules on TACC's website.

5.2.1 Module Families

ICS uses module families for compilers and parallelization libraries. Modules that are built with a parent module, such as a compiler, are only available when the parent module is loaded. For example, the version of LAPACK built with the gcc module is only available when the gcc module is loaded.

A good way to illustrate how the module families work is to view the available modules before a family is loaded as well as after. You can do this with the gcc family by inspecting the output of


module purge

module avail

module load gcc/5.3.1

module avail

5.2.2 Using Modules

You can load modules into your environment with the module load command. For example, to load the gcc module, you can use the command:

module load gcc/5.3.1

Note that the version number is not required; each software package has a default module that will be loaded if no version number is provided. However, it is recommended that you include the version number so that you know, and have a record of, what version is being used.

You can view the modules that you currently have loaded using the module list command:

module list

You can also unload modules that you do not need in the same way:

module unload gcc/5.3.1

It is also possible to remove all of your loaded modules at once using purge:

module purge

5.2.3 Querying Modules

You can view the available modules using the command:

module avail

Note that this only looks at the available modules, which may be limited by module family based on which modules are currently loaded. You can search all of the modules using module spider. For example, to search for VASP, you can use the command

module spider vasp

which will search through all module names and module files to return anything related to vasp.

The module system can also be used to give you information about the software via the module show command. For example, the information about the hdf5 module, which was built using the gcc module, can be seen using the commands:


module load gcc/5.3.1

module show hdf5/1.8.18

The output of this is shown:


---------------------------------------------------------------------

/opt/aci/modulefiles/Compiler/gcc/5.3.1/hdf5/1.8.18.lua:

---------------------------------------------------------------------

help([[HDF5 is a unique technology suite that makes possible the management of extremely large and complex data collections. The HDF5 technology suite includes: A versatile data model that can represent very complex objects and a variety of metadata. A completely portable file format with no limit on the number or size of data objects in the collection. A software library that runs on a range of computational platforms, from laptops to massively parallel systems, and implements a high-level API with C, C++, Fortran 90, and Java interfaces. A rich set of integrated performance features that allow for access time and storage space optimizations. Tools and applications for managing, manipulating, viewing, and analyzing the data in the collection.
]])

whatis("Description: HDF5 is a unique technology suite that makes possible the management of extremely large and complex data collections.")

whatis("URL: https://support.hdfgroup.org/HDF5/")

prepend_path("PATH","/opt/aci/sw/hdf5/1.8.18_gcc-5.3.1/bin")

prepend_path("LD_LIBRARY_PATH","/opt/aci/sw/hdf5/1.8.18_gcc-5.3.1/lib64")

prepend_path("C_INCLUDE_PATH","/opt/aci/sw/hdf5/1.8.18_gcc-5.3.1/include")

prepend_path("CPLUS_INCLUDE_PATH","/opt/aci/sw/hdf5/1.8.18_gcc-5.3.1/include")

prepend_path("LIBRARY_PATH","/opt/aci/sw/hdf5/1.8.18_gcc-5.3.1/lib64")

pushenv("HDF5","/opt/aci/sw/hdf5/1.8.18_gcc-5.3.1")

pushenv("HDF","/opt/aci/sw/hdf5/1.8.18_gcc-5.3.1")

Note that this tells you some information about the software, gives a website for more help and shows the environment variables that are modified. The environment manipulation section can be very helpful for users who are compiling codes and linking to libraries as these paths indicate where the relevant objects may be found.

5.2.4 Controlling Modules Loaded at Login

Most shells have a configuration file that allows you to set aliases (nicknames for commands, both simple and complex), set environment variables, and automatically execute programs and commands. In this case we are interested in the last of these features: automating commands at login. For BASH there are two files at play: ~/.bash_profile and ~/.bashrc. To force your .bashrc to be sourced in every opened terminal and SSH session, include this code in your .bash_profile:

if [ -f ~/.bashrc ]; then

. ~/.bashrc

fi

Once that has been done, you can add whatever automated module loads you want in the .bashrc file by including:

module load <module name>/<version>

The version specification is optional; excluding it will cause the default version to be loaded. Other shells have similar configuration methods that are detailed in online documentation.


5.3 Connecting to ACI-B

Users can connect to ACI-B with the hostname

aci-b.aci.ics.psu.edu

using ssh. Users connecting with ssh are encouraged to use the secure X11 forwarding flag (-Y) if X windows will be used during the session. Note that the screen may not show * symbols for each keystroke while your password is being entered.

ssh -Y <username>@aci-b.aci.ics.psu.edu


5.4 Connecting to ACI-I

Users can connect to ACI-I with the hostname

aci-i.aci.ics.psu.edu

using ssh or Exceed onDemand. Note that the screen may not show * symbols for each keystroke when your password is being entered.

5.4.1 Exceed onDemand

Exceed onDemand is a commercial software product that allows a high-fidelity graphical connection to be made to a remote system by means of a dedicated display server. Remote Desktop connections present users with a familiar desktop GUI and do not require manual configuration of an SSH client. The remote GUI applications may be presented as a self-contained remote desktop with its own window management (similar to Remote Desktop) or may be seamlessly integrated into the user's existing desktop.

The information in this document is intended for use only with ACI-I. The graphical interface still requires the use of the terminal program to start the GUIs of individual programs. The terminal can be found using the drop-down menus in the upper-left portion of the screen:

Applications -> System Tools -> Terminal

To install Exceed onDemand on Windows:

  1. Download the latest version of the Exceed onDemand client for Windows: either a 32-bit (x86) or 64-bit (x64) version.
  2. When the download completes, unzip the file and run Setup.exe.
  3. The Open Text Exceed OnDemand Client Setup Wizard will appear. Click the “Next” button.
  4. In the License Agreement step, select the “I accept” radio button, and click “Next.”
  5. Click “Next” several times to proceed through the Customer Information, Destination Folder, Custom Setup, Additional Install Options steps, accepting the default options.
  6. In the “Ready to Install the Product” step, click “Install.” When the product has installed, click “Finish.”
  7. An Exceed onDemand Client shortcut should now appear on your desktop and Start Menu. Double-click the shortcut to launch the Exceed onDemand client. In the window that appears, enter:
    • aci-i.aci.ics.psu.edu in the “Host” box
    • your Penn State Access Account ID in the “User ID” box
    • your Penn State Access Account password in the “Password” box
    • click “Login”
  8. Once you have logged in, a new dialog will appear asking you to choose the Xconfig and Xstart that you would like to use.
    • To have the ACI-i desktop appear in a separate window, as it does with Remote Desktop, choose one of the Desktop_Mode Xconfig options and the Gnome_Desktop.xs Xstart option.

    Once you have chosen your desired Xconfig and Xstart options, click the “Run” button. You will then be connected to ACI-i.

To install Exceed onDemand on Mac OS X:

Note: If you are running OS X Mountain Lion (10.8) or newer, you must first install XQuartz, log out, and log back in so that you have a working X-Server on your machine. Previous versions of OS X included an X-Server, but versions 10.8 and above do not. XQuartz can be downloaded from the XQuartz page.

    1. Download the Exceed onDemand client for Mac: EoDClient8-13.8.9.dmg.
    2. Navigate to your downloads folder, unzip the file, then double-click on the downloaded dmg file, and the Exceed OnDemand setup package will appear. Double-click on the Exceed onDemand Client.pkg
    3. The Open Text Exceed OnDemand Client Setup Installer will appear. Click the “Continue” button.
    4. In the License step, click “Continue,” then “Agree.”
    5. In the Installation Type step, click “Install.” If the installer asks for your password, enter your local password (the one you use to log into your Mac).
    6. In the Summary step, click “Close.”
    7. An Exceed onDemand Client application will now appear in your Applications folder. Navigate there in Finder and double-click on the application.
    8. In the window that appears, enter 
      • aci-i.aci.ics.psu.edu into the “Host” box
      • your Penn State Access Account ID into the “User ID” box
      • your Penn State Access Account password into the “Password” box
      • click “Login.”
    9. Once you have logged in, a new dialog will appear asking you to choose the Xconfig and Xstart that you would like to use.
      • To have the ACI-i desktop appear in a separate window, as it does with Remote Desktop, choose one of the Desktop_Mode Xconfig options and the Gnome_Desktop.xs Xstart option.

      Once you have chosen your desired Xconfig and Xstart options, click the “Run” button. You will then be connected to ACI-i.

To install Exceed onDemand on Linux:

  1. Download the Exceed onDemand client for Linux, either a 32-bit (eodclient8-13.8.9.994-linux-i586.tar) or 64-bit (eodclient8-13.8.9.994-linux-x64.tar) version.
  2. Open a terminal and type “tar xzf <your_downloads_folder>/eodclient-latest-linux-x64.tar.gz”, then type “cd Exceed_onDemand_Client_8”, and finally “./eodxc”
  3. In the window that appears, enter 
    • aci-i.aci.ics.psu.edu into the “Host” box
    • your Penn State Access Account ID into the “User ID” box
    • your Penn State Access Account password into the “Password” box.
  4. Next you must choose the Xconfig and Xstart that you would like to use.
    • To have the ACI-i desktop appear in a separate window, as it does with Remote Desktop, choose one of the Desktop_Mode Xconfig options and the Gnome_Desktop.xs Xstart option.

    Once you have chosen your desired Xconfig and Xstart options, click the “Run” button (green arrow). You will then be connected to ACI-i.

ICS recommends starting with the configuration:

  • Xconfig: Desktop_Mode_1280x1024.cfg
  • Xstart: Gnome_Desktop.xs

Some users may find other settings, such as full-screen mode, to be more conducive for their research needs, but these do not consistently work well across all operating systems and monitor types.

Users are only allowed to have one session open at a time. A session can be resumed later by quitting the client rather than logging out. Resumed sessions can be problematic, though, so please save your work even if you leave a session open.


5.5 Transferring Data to and from ICS-ACI

There are many different ways to transfer data to and from ICS-ACI.

5.5.1 Command line File Transfer

There are two main command-line SSH commands to transfer files: scp and sftp. scp is a non-interactive command that takes a set of files to copy on the command line, copies them, and exits. sftp is an interactive command that opens a persistent connection through which multiple copying commands can be performed.

scp
To copy one or more local files up to the ICS-ACI server, the scp syntax would be:

scp local_file <username>@datamgr.aci.ics.psu.edu:<target_directory>

The default port for scp is 22. If you use this port, you will automatically be directed to Duo Push authentication for 2FA.

For user abc123 to copy the local files foo.c and foo.h into their home directory via the host datamgr.aci.ics.psu.edu, the following command would be used:

[abc123@local ~]$ scp foo.c foo.h abc123@datamgr.aci.ics.psu.edu:~/.

Users are able to choose which 2FA setting is used by utilizing port 1022.

[abc123@local ~]$ scp -P 1022 foo.c foo.h abc123@datamgr.aci.ics.psu.edu:~/.

The -r (recursive) flag can be used to transfer directories.

[abc123@local ~]$ scp -r dirA abc123@datamgr.aci.ics.psu.edu:~/.

Users can also copy files from ACI onto their own computer using

[abc123@local ~]$ scp abc123@datamgr.aci.ics.psu.edu:~/fileA .

sftp
sftp is an interactive command that uses the same syntax as a standard command-line ftp client. It differs from a standard ftp client in that the authentication and the data transfer happen through the SSH protocol rather than the FTP protocol. The SSH protocol is encrypted whereas the FTP protocol is not.

There are a number of basic commands that are used inside of sftp:

  • put filename: uploads the file filename
  • get filename: downloads the file filename
  • ls: lists the contents of the current remote directory
  • lls: lists the contents of the current local directory
  • pwd: returns the current remote directory
  • lpwd: returns the current local directory
  • cd directory: changes the current remote directory to directory
  • lcd directory: changes the current local directory to directory

The syntax for calling sftp is:

sftp username@hostname

To choose between different 2FA options, you have to set the port to 1022 using the -P flag, as with scp.
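For example (a sketch; depending on the sftp version installed, the port may instead need to be supplied as -oPort=1022):

sftp -P 1022 abc123@datamgr.aci.ics.psu.edu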

An example sftp session, with both the inputs and outputs, would be:

[abc123@local ~]$ sftp abc123@aci-b.aci.ics.psu.edu
Connecting to aci-b.aci.ics.psu.edu...
Password: <user abc123 password>
# Duo Push authentication
Connected to aci-b.aci.ics.psu.edu.
sftp> pwd
Remote working directory: /storage/home/abc123
sftp> lpwd
Local working directory: /home/abc123
sftp> cd work/depot
sftp> pwd
Remote working directory: /storage/work/abc123/depot
sftp> lcd results
sftp> lpwd
Local working directory: /home/abc123/results
sftp> ls -l
-rw-r--r--    1 root    root    5 Mar  3 12:08 dump
sftp> lls -l
total 0
sftp> get dump
Fetching /storage/work/abc123/depot/dump to dump
/storage/work/abc123/depot/dump    100%    5    0.0KB/s    00:00
sftp> lls -l
total 4
-rw-r--r-- 1 abc123 abc123 5 Mar  3 12:09 dump
sftp> put data.txt
Uploading data.txt to /storage/work/abc123/depot/data.txt
data.txt    100%    15    0.0KB/s
sftp>

rsync
rsync is a utility that can be used to copy files and to keep files the same on different systems, acting as a rudimentary synchronization tool. The benefit of using rsync over scp is that if an scp is stopped for any reason (poor wireless connection, large files, etc.), the restart will begin as if no files were copied over, whereas rsync will only copy the files that were not successfully transferred by the previous command. Once you have SSH access between two machines, you can synchronize the folder dir1 (located in the home directory in this example) from the local machine to a remote computer using the syntax:

rsync -a ~/dir1 username@remote_host:destination_directory

where remote_host is the ACI hostname, as in the scp command. If dir1 were on the remote system instead of the local system, the syntax would be:

rsync -a username@remote_host:/home/username/dir1 place_on_local_machine

If you are transferring files that have not been compressed yet, like text files, you can reduce the network transfer by adding compression with the -z option:

rsync -az source_dir username@remote_host:target_dir

The -P flag is very helpful. It combines the flags --progress and --partial. The former gives you a progress bar for the transfers, and the latter allows you to resume interrupted transfers:

rsync -azP source_dir username@remote_host:target_dir

In order to keep two directories synchronized, it is necessary to delete files from the destination directory if they are removed from the source. rsync does not delete anything from the destination directory by default. To change this behavior, use the --delete option:

rsync -a --delete source_dir username@remote_host:target_dir

If you wish to exclude certain files or directories located inside a directory you are syncing, you can do so by specifying them in a comma-separated list following the --exclude= option:

rsync -a --exclude=pattern_to_exclude source_dir username@remote_host:target_dir

One common pitfall that can affect users transferring files between systems with different usernames and groups is the permissions assigned to the files being rsync-ed. The --chmod option can be used both to set the permissions for the user, group, and others independently, and to set directory permissions so that files created within a directory after the transfer inherit them. Multiple settings can be strung together using commas. For example, the following provides full permissions for the user, read and execute permissions for the group, and causes future files created within any transferred directories to inherit the directory's group:

rsync -a --chmod=u=rwx,g=rx,Dg+s source_dir username@remote_host:target_dir

5.5.2 Graphical File Transfer

WinSCP and FileZilla provide free secure FTP (SFTP) and secure copy (SCP) clients with a graphical interface for Windows, Linux, and Mac using SSH, allowing users to transfer files to and from our cluster file system using a drag-and-drop interface. Please use either the SCP or SFTP protocol on port 22 with the data manager nodes

datamgr.aci.ics.psu.edu

to transfer files. Please note that two-factor authentication is required.

For more information, please visit the WinSCP homepage or the FileZilla homepage.

You can see the connection process in this ICS tutorial video.

It is also possible to use the online interface for either Box or Dropbox within Firefox on ACI-I for users who have logged on with Exceed onDemand. It is not currently possible to sync these services to your storage space on ICS-ACI.

5.5.3 Web-based Services

Globus is one of the recommended methods for transferring very large amounts of data. Most HPC centers have endpoints set up, allowing for optimized transfers between large centers. Users can also install personal endpoints on their own machines using Globus Connect, a piece of software installed on a personal computer that allows files to be transferred through a web interface.

Users must sign up for a Globus account, as this is a separate service from ICS-ACI; if you do not have a Globus account, create one here by setting up a free Globus Online account using your PSU ID. Transfers take place between endpoints, where an endpoint is a location to or from which files are transferred. There are two types of endpoints: server endpoints (like ACI) and personal endpoints (such as your laptop). To define a personal endpoint, go to the Manage Endpoints section of the Globus web page (you must be logged in) and click “add Globus Connect Personal.” In Step 1, define the name for your personal endpoint, click Generate Setup Key, and copy the key to the clipboard. In Step 2, download and install the Globus Connect Personal app appropriate for your OS; to finish the installation, you will be asked to paste the generated key that you copied. Each time you want to transfer files, Globus Connect Personal must be running on your local computer. More information on installing the Globus Connect Personal tool for Linux can be found here; instructions for Mac and Windows are also available.

On the Globus Start Transfer page, choose one of three ACI end points (for example PennState_ICS-ACI_DTN_EndPnt_01) as the end point on one side. This will pull up an authentication window. Use your Penn State ID for the user name, and your Penn State ID password for the pass phrase. This will set up an authenticated session.

Select and authenticate with the other endpoint for your transfer, and initiate your transfer. You may need to click “refresh list” if you cannot see the transferred files even though the transfer has completed.

There are other specialized data-transfer software available for specific needs, such as Aspera. Contact the iAsk center if you have any questions regarding using one of these specialized tools on ACI.

5.5.4 File Permissions

File permissions can be seen using the -l flag for ls:

ls -l

The letters at the beginning indicate the file or folder permissions, while the third and fourth columns show the owner and the group associated with the file. The letters used are typically rwx, for read, write, and execute. These are grouped in sets of three: the first set for the owner, the second for the group, and the third for everyone else. Users may change permissions using the chmod command. An excellent overview of how to change permissions using chmod can be found here.

Users may also want to change the group of their files using the chgrp command.
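As a short sketch (the file, directory, and group names below are placeholders):

# Owner: read/write/execute; group: read/execute; others: no access
chmod u=rwx,g=rx,o= myscript.sh

# Hand a directory tree to a research group and make it group-readable
chgrp -R my_group shared_data
chmod -R g+rX shared_data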


6 Application Development


6.1 Version Control

Version control is a way to track multiple versions of a code. It is useful during development, primarily when adding new features while still using the original code or when multiple developers are involved, and when the code has minor variants for reasons such as slightly different input/output data types or use on different compute resources. One popular version control tool is git, which uses a distributed approach that allows for many development points. The basic git workflow is to

  • Modify files – create new code, fix bugs, etc.
  • Stage the files – explicitly state what will be deposited
  • Commit your files – store a snapshot

Your repository will have a master branch, where the current production code usually exists, and other branches that may be for any other purpose, such as development or variations. Branches can either be merged back to the master branch as features are added and execution is validated, or kept separate if the usage requires multiple working versions of the code. It is up to the user to define how their repository is set-up as well as to keep non-local versions of the repository as up-to-date as desired. There are great online resources for git including excellent documentation and tutorials.
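A minimal sketch of that workflow on the command line (the file name, branch name, and commit message are placeholders):

# One-time setup of a new repository
git init

# Modify files (edit solver.c in your editor), then stage them
git add solver.c

# Commit the staged files: store a snapshot with a message
git commit -m "Fix boundary-condition bug"

# Develop a new feature on its own branch without touching master
git checkout -b development
# ... edit, add, and commit as above ...

# Merge the branch back into master once it has been validated
git checkout master
git merge development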


6.2 Basic Compilation

You can compile your own code for running on ACI. A basic compilation might look like

gcc -O2 -lm -o hello.out hello.c

where the GNU compiler is used to compile the C code in the file hello.c.

  • gcc Compiler being used
  • -O2 Optimization Flag
  • -lm Link to the math library
  • -o hello.out Output file
  • hello.c Input file

It’s possible to link to pre-compiled libraries that you have created or that already exist on the system. The module show command can be very useful in determining the locations of the libraries and header files required to compile codes. You can link in a variety of ways, as illustrated after the list below.

  • -L (as in Love) Path to a directory containing a library
  • -l (as in love) Library name
  • -I (as in Iowa) Path to header files
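For example, a compilation that links against the hdf5 library from the module show output above might look like the following sketch, where hello_hdf5.c is a placeholder source file that includes hdf5.h:

gcc -O2 -I/opt/aci/sw/hdf5/1.8.18_gcc-5.3.1/include \
    -L/opt/aci/sw/hdf5/1.8.18_gcc-5.3.1/lib64 \
    -o hello_hdf5.out hello_hdf5.c -lhdf5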

Complicated compilations can also be done using a build automation software package such as make, which is available without a module, or cmake, which is available as a module. The general build automation process involves using a Makefile that has:

  • Outputs – the executable/library being created
  • Dependencies – what each output relies on
  • Instructions – how to make/find each dependency

and can set environment variables. The nomenclature for repeated sections can use pattern matching and so can become very complicated. Information about these tools can be found in online references, such as the make and cmake manuals. Some common pitfalls and tips that the iAsk center sees for using make are:

  • Makefiles require tabs and not spaces at the beginning of indented lines
  • The -j flag and an integer can be used to compile on multiple processors
  • The -f flag can be used to specify the name of the makefile if not Makefile
  • Some makefiles are configured for the compute environment. You may need to run ./configure first if there is no Makefile but a configure script exists.
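To illustrate the outputs, dependencies, and instructions described above, a minimal hypothetical Makefile for the hello.c example might look like the following (remember that the indented command lines must begin with a tab):

CC = gcc
CFLAGS = -O2

# Output: the executable hello.out is built from hello.o
hello.out: hello.o
	$(CC) $(CFLAGS) -o hello.out hello.o -lm

# Dependency: hello.o is built from hello.c
hello.o: hello.c
	$(CC) $(CFLAGS) -c hello.c

# 'make clean' removes the generated files
clean:
	rm -f hello.out hello.o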

You can either add the location of the output file to your PATH so that it can be called from anywhere, or execute it directly from the directory in which it lives. For the hello example above, this can be done by running ./hello.out from the directory containing the executable.


6.3 Libraries

ICS offers many optimized libraries for users to link to. Please see the most common libraries listed below.

6.3.1 MKL

Intel Math Kernel Library (MKL) consists of commonly used mathematical operations in computational science. The functions in MKL are optimized for use on Intel processors. More information can be found here.

The MKL module can be loaded using the command

module load mkl

6.3.2 LAPACK

LAPACK (Linear Algebra Package) is a software library used for numerical linear algebra. It can handle many common numerical algebra computations such as solving linear systems, eigenvalue problems, and matrix factorization. It depends on BLAS. More information can be found on the website.

You can load the LAPACK module using the commands:

module load gcc/5.3.1
module load lapack/3.6.0

6.3.3 BLAS

BLAS (Basic Linear Algebra Subprograms) is a collection of low level matrix and vector operations such as vector addition, scalar multiplication, matrix multiplication, etc. For more information, refer to this link.

The BLAS module can be loaded with the command

module load blas

6.3.4 ScaLAPACK

ScaLAPACK (Scalable LAPACK) is a library consisting of a subset of LAPACK routines that have been modified for use in parallel computations. Like LAPACK, its purpose is to perform linear algebra operations in a high-performance computing environment. It is provided by Univ. of Tennessee; Univ. of California, Berkeley; Univ. of Colorado Denver; and NAG Ltd. More information can be found on the website.

6.3.5 Boost

Boost is a C++ library that contains many useful functions covering a wide range of applications such as linear algebra and multithreading. More information can be found here.

You can load Boost with the command

module load boost

6.3.6 PETsc

The Portable, Extensible Toolkit for Scientific Computation (PETSc, pronounced PET-see) is a suite of data structures and routines for the scalable, parallel solution of partial differential equations and sparse matrix problems. It was developed by Argonne National Laboratory.

You can load the PETsc module using the command

module load petsc/3.7.3

More information on features, tutorials, manuals, etc can be found on the website.


7 Running Jobs on ACI-B

Jobs are submitted from the head nodes of ACI-B and will run when dedicated resources are available on the compute nodes. ICS-ACI uses Moab and Torque as the scheduler and resource manager. Jobs can be run in either batch or interactive mode; both are submitted using the qsub command.


7.1 Requesting Resources

Both batch and interactive jobs are required to provide a list of requested resources to the scheduler in order to be placed on a compute node with the correct resources available. These are given either in the submission script or on the command line. If they are given in a submission script, they must come before any non-PBS command.

Typical PBS directives are:

#PBS -l walltime=HH:MM:SS
    This specifies the maximum wall time (real time, not CPU time) that a job should take. If this limit is exceeded, PBS will stop the job. Keeping this limit close to the actual expected time of a job can allow a job to start more quickly than if the maximum wall time is always requested.

#PBS -l pmem=SIZEgb
    This specifies the maximum amount of physical memory used by any processor ascribed to the job. For example, if the job would run on four processors and each would use up to 2 GB (gigabytes) of memory, then the directive would read #PBS -l pmem=2gb. The default for this directive is 1 GB of memory.

#PBS -l mem=SIZEgb
    This specifies the maximum amount of physical memory used in total for the job. This should be used for single-node jobs only.

#PBS -l nodes=N:ppn=M
    This specifies the number of nodes (nodes=N) and the number of processors per node (ppn=M) that the job should use. PBS treats a processor core as a processor, so a system with eight cores per compute node can have ppn=8 as its maximum ppn request. Note that unless a job has some inherent parallelism of its own through something like MPI or OpenMP, requesting more than a single processor on a single node is usually wasteful and can impact the job start time.

#PBS -l nodes=N:ppn=M:O
    This specifies the node type (node type = O). You can only specify the node type when using the Open Queue. Node types available on ICS-ACI:

    Node Type (O)              CPU                            RAM
    basic                      Intel Xeon E5-2650v4 2.2 GHz   128 GB total
    lcivybridge, scivybridge   Intel Xeon E5-2680v2 2.8 GHz   256 GB total
    schaswell                  Intel Xeon E5-2680v3 2.5 GHz   256 GB total
    himem                      Intel Xeon E7-4830v2 2.2 GHz   1024 GB total

#PBS -A allocName
    This identifies the account to which the resource consumption of the job should be charged (SponsorID_collab). This flag is necessary for all job submissions. For jobs being submitted to a system's open queue you should use -A open.

#PBS -j oe
    Normally when a command runs it prints its output to the screen. This output is often normal output and error output. This directive tells PBS to put both normal output and error output into the same output file.

7.1.1 Sample Batch Submission Script

The following is a submission script for a Matlab job that will run for 5 minutes on one processor using the open
queue.


#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=5:00
#PBS -A open

# Get started
echo "Job started on `hostname` at `date`"

# Load in matlab
module purge
module load matlab/R2016a

# Go to the correct place
cd $PBS_O_WORKDIR

# Run the job itself - a matlab script called runThis.m
matlab-bin -nodisplay -nosplash < runThis.m > log.matlabRun

# Finish up
echo "Job Ended at `date`"

This script would be submitted using the command

qsub subScript.pbs

from the directory containing the submission and matlab scripts.


7.2 Interactive Compute Sessions on ACI-B

Interactive jobs may be submitted to ACI-B using the -I (for interactive) flag. Interactive jobs require resource requests and an allocation. An interactive job can be submitted using a command similar to:

qsub -A open -l walltime=1:00:00 -l nodes=1:ppn=2 -I

The job will be given a job ID and your session will wait until this job has the resources to start. You will then be placed on the compute node and given a usable terminal session within your current session. For example a user submitting an interactive job may see

[abc123@aci-lgn-001 ~]$ qsub -I -l nodes=1:ppn=1 -l walltime=4:00:00 -A open

qsub: waiting for job 2449840.torque01.util.production.int.aci.ics.psu.edu to start

qsub: job 2449840.torque01.util.production.int.aci.ics.psu.edu ready

[abc123@comp-bc-0267 ~]$

Note that the node the user is on changes from log-in node (aci-lgn-001) to a basic core compute node (comp-bc-0267) when the job starts. You can ask for x-windows to be displayed using the -X flag with the qsub command, as long as you have logged into ACI-B using the -Y flag with ssh. Note that some users experiencing difficulty with interactive x-windows on ACI-B jobs will often use Exceed onDemand to connect to ACI-I, and then ssh with the -Y flag to ACI-B from ACI-I.

It is recommended that you compile your code using an interactive job on the nodes that your job will run.


7.3 PBS Environmental Variables

Jobs submitted will automatically have several PBS environment variables created that can be used within the job submission script and scripts within the job. A full list of PBS environment variables can be viewed in the output of

env | grep PBS > log.pbsEnvVars

run within a submitted job.

Variable Name    Description
PBS_O_WORKDIR    The directory in which the qsub command was issued.
PBS_JOBID        The job's ID.
PBS_JOBNAME      The job's name.
PBS_NODEFILE     A file listing the hostnames of all nodes assigned to the job.
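As a sketch of how these variables might be used inside a submission script (the echo messages are only illustrative):

# Run from the directory in which qsub was issued
cd $PBS_O_WORKDIR

# Record which job produced this output
echo "Job $PBS_JOBID ($PBS_JOBNAME) started on `hostname`"

# Count the processor slots assigned to the job
NPROCS=$(wc -l < $PBS_NODEFILE)
echo "Running on $NPROCS processor slot(s)"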

7.3.1 Viewing and Deleting Jobs

There are several ways to view existing jobs. The qstat command can give some basic information about your own queued and running jobs.

qstat

Some helpful flags are -u (user), -s (status), -n (to show the nodes running jobs are placed on) and -f to show more information for a specified job. For example, to view more information about job 536, you can use the command

qstat -f 536

Common job statuses are Q (queued), R (running), E (ending), H (held), and C (complete).

You can also view all of the jobs running, waiting and being held using the showq command:

showq

It may be helpful for you to view all of the jobs running on an allocation. For example, if you are a member of the abc123_a_g_sc_default allocation, you can view the running and queued jobs using the command:

showq -w acct=abc123_a_g_sc_default

You may delete your jobs using the qdel command. For example, the job 546 may be deleted using the command:

qdel 546

Jobs that are not responding may require being purged from the nodes. You can do this with the -p flag:

qdel -p 546

Note that you are only able to delete your own jobs, not other users' jobs.

7.3.2 Additional Job Information

You can use the checkjob command to view some additional information about queued and running jobs. For example, to give very verbose information about job 548, you can use the command:

checkjob 548 -v -v


7.4 GReaT Allocations

All jobs submitted to an allocation that has available resources are guaranteed to start within 1 hour. Note that the resources include both available core-hours and the requested hardware. For example, a group that has a 40-core allocation on two standard-memory nodes is limited to the RAM and CPUs on those two nodes. Single-processor jobs that request most of the memory on a node may block other jobs from running, even if CPUs are idle.

Users submitting to an allocation can run in ‘burst’ mode. Your group may use a number of cores up to four times your Core Limit (referred to as your 4x Core Limit). When your group submits jobs that exceed your Core Limit, you are considered to be ‘bursting,’ and your jobs will run on our Burst Queue. Bursting consumes your allocation faster than normal. How much you can burst is determined by your 90-day sliding window.

How much you can use ICS-ACI is governed by the size of your allocation and how much you have used the system in the past 90 days. In any given 90-day period, you may use up to your Core Limit times the number of hours in 90 days (2160). The amount of core-hours you have available is governed by a 90-day sliding window, such that the core-hours you use in any given day become available again after 90 days.

Example: If you have a 20-core allocation, you can consume 43,200 (20 * 2160) core-hours within any given 90-day period. Your average rate of usage in any 90-day period cannot exceed 20 cores per hour. Core-hours you use on the first day of your allocation will become available again on the 91st day; core-hours you use on the second day become available again on the 92nd day; and so on. If you never burst, you can use all your cores continuously.

Example: With your 20-core allocation, you run jobs requiring 20 cores continuously. In any given 90-day period, you will use 43,200 core-hours, and your average rate of usage is 20 cores per hour.

Day   Core-Hours Available   Usage on this Day     Core-Hours Used This Day
1     43,200                 20 cores x 24 hours   480
2     42,720                 20 cores x 24 hours   480
3     42,240                 20 cores x 24 hours   480
4     41,760                 20 cores x 24 hours   480

Usage continues at the same rate of 20 cores, 24 hours per day.

Day | Core-Hours Available | Usage on this Day    | Core-Hours Used This Day
90  | 480                  | 20 cores x 24 hours  | 480
91  | 480                  | 20 cores x 24 hours  | 480
92  | 480                  | 20 cores x 24 hours  | 480

Note that on Day 91, the core-hours used on Day 1 become available again; on Day 92, the core-hours used on Day 2 become available again; and so on.

Example: Bursting above your allocation may lead to days with 0 hours available.

Day | Core-Hours Available | Usage on this Day    | Core-Hours Used This Day
1   | 43,200               | 20 cores x 24 hours  | 480
2   | 42,720               | 80 cores x 24 hours  | 1,920
3   | 40,800               | 80 cores x 24 hours  | 1,920
4   | 38,880               | 80 cores x 24 hours  | 1,920

Usage continues at the same rate of 80 cores, 24 hours per day.

Day | Core-Hours Available | Usage on this Day    | Core-Hours Used This Day
24  | 480                  | 60 cores x 8 hours   | 480
25  | 0                    | 0                    | 0
26  | 0                    | 0                    | 0

At this point, no core-hours are available, and no jobs can be run against the allocation until Day 91, when the core-hours used on Day 1 become available again.

Day | Core-Hours Available | Usage on this Day    | Core-Hours Used This Day
91  | 480                  | 20 cores x 24 hours  | 480
92  | 1,920                | 0                    | 0
93  | 3,840                | 20 cores x 24 hours  | 480

Identifying Allocation Usage:

Users can view their allocations and balances using the mam-list-funds command. It is typically used with the -h flag to show the allocation and balance in hours:

mam-list-funds -h

The allocation topology, end date and node-type can be shown using the mam-list-accounts command.

mam-list-accounts

Note that this shows you expired allocations as well. The second column (Active) will show True for active allocations and False for expired allocations.

Users interested in their own usage may want to investigate several of the other mam commands:

mam-list-usagerecords
mam-list-transactions

8 Software Stack

ICS has a software policy that explains the details of how software can be used. The policy can be read on our policy page.


8.1 User Stack

Users can install software into their own home and work directories as well as into group spaces. ICS strongly recommends that research groups who compute in multiple locations install their own copies of all their software so that versions remain consistent across platforms.
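As a minimal sketch of a self-install (assuming a hypothetical autotools-style package called mytool; adapt the steps to your package's build system), a from-source installation into your home directory might look like:

# unpack the source and build it with an install prefix you own
tar -xzf mytool-1.0.tar.gz
cd mytool-1.0
./configure --prefix=$HOME/sw/mytool
make
make install
# make the installed binaries visible in your shell
export PATH=$HOME/sw/mytool/bin:$PATH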

The i-ASK Center can provide guidance on installing many software packages.


8.2 System Stack

Many commonly used applications are built and maintained centrally on the system stack.

8.2.1 System Stack Requests

Requests to add software to the system stack can be made through the i-ASK Center. Users requesting software should state the reason for the request, typically a licensing requirement or a broad user base across campus.


8.3 System Stack Applications

Because of the module families, it can be difficult to view all of the available software at once. The full software list can be found on the software stack webpage or by looking in the directory where the software modules are installed:

ls /opt/aci/sw/
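You can also check whether a particular package is available, and which versions are installed, with module avail; for example:

module avail comsol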

8.3.1 COMSOL

To open COMSOL, first log into ACI-I using Exceed onDemand; more information on how to do this can be found in section 5.4.1. Next, open a terminal from the menu in the top-left corner by clicking

Applications -> System Tools -> Terminal

In the terminal window type the following commands:

module load comsol
comsol

The graphical user interface for COMSOL should now open, and COMSOL can be used as usual. However, it is worth noting that ACI-I is intended only for short jobs. Researchers often use ACI-I to develop and test their COMSOL models before submitting them as jobs on the more computationally powerful ACI-B cluster. Running a COMSOL model on ACI-B is straightforward: first create your model (often done using the GUI on ACI-I), then log into ACI-B and submit your job to the scheduler. For information on submitting a job to ACI-B, see section 7.

An example of a PBS script to submit a COMSOL job:

#!/bin/bash
#PBS -l nodes=1:ppn=4
#PBS -l walltime=12:00:00
#PBS -A open
#PBS -o ComsolPBS.output
#PBS -e ComsolPBS.error
#PBS -m abe
#PBS -M abc1234@psu.edu
#PBS -N myComsolJob

# Get started
echo " "
echo "Job started on `hostname` at `date`"
echo " "

# Load in Comsol
module purge
module load comsol

# Go to the correct place
cd $PBS_O_WORKDIR

# Run the job itself
comsol batch -inputfile inputFile.mph -outputfile /path/to/output/outputFileName.mph -batchlog log.txt

# Finish up
echo " "
echo "Job Ended at `date`"
echo " "

More information on the options for submitting COMSOL jobs from the command line can be found by typing the commands:

module load comsol
comsol -h

8.3.2 Julia

Julia is a high-level, high-performance dynamic programming language for numerical computing. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Julia’s Base library, largely written in Julia itself, also integrates mature, best-of-breed open source C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing.

The system Julia module is compiled with the GCC compiler. Using Julia requires the gcc module to be loaded:

$ module load gcc
$ module load julia
$ julia

Example Julia Code:

Pkg.add("Winston")

using Winston

# optionally call figure prior to plotting to set the size
figure(width=600, height=400)

# plot some data
pl = plot(cumsum(rand(500) .- 0.5), "r", cumsum(rand(500) .- 0.5), "b")

# display the plot (not done automatically!)
display(pl)

# by default display will not wait and the plot will vanish as soon as it appears
# using readline is a blunt way to let the user choose when to continue
# println("Press enter to continue: ")
# readline(STDIN)

# save the current figure
savefig("winston.svg")

# .eps, .pdf, & .png are also supported
# we used svg here because it respects the width and height specified above

An example PBS submission script for a Julia job:

#!/bin/bash
#PBS -l procs=1
#PBS -l walltime=240:00:00
#PBS -l pmem=1000mb
#PBS -N jobName
#PBS -m ea
#PBS -M PSU1234@psu.edu
#PBS -j oe

# Get started
echo " "
echo "Job started on `hostname` at `date`"
echo " "

module load gcc
module load julia

cd $PBS_O_WORKDIR

julia jobName.jl

# Finish up
echo " "
echo "Job Ended at `date`"
echo " "

8.3.3 Matlab

Matlab is a widely used programming environment and language. The GUI can be accessed on ACI-I using the following commands:

module load matlab
matlab

Matlab can also be run in batch mode, either directly on the command line (see the sketch at the end of this subsection) or submitted as a job. Jobs run in batch mode require a *.m script file. An example that writes a random matrix to a .csv file:

% This Matlab script makes a random matrix and outputs a csv file of it.
% This was made as a simple example to demonstrate how to submit batch Matlab codes.
% Created by i-ASK at ICS of Penn State
% June 27, 2017

%% Create Random Matrix
RandomMatrix = rand(5);

%% Export csv file
csvwrite('output.csv',RandomMatrix)

This can be saved as Example.m and submitted to ACI-B using the following submission script:

#!/bin/bash
#PBS -S /bin/bash
#PBS -l nodes=1:ppn=1,walltime=00:05:00
#PBS -N MyJobName
#PBS -e error.txt
#PBS -o output.txt
#PBS -j oe
#PBS -A open
#PBS -m abe
#PBS -M abc1234@psu.edu

# Get started
echo " "
echo "Job started on `hostname` at `date`"
echo " "

# Load in matlab
module purge
module load matlab

# Go to the directory the job was submitted from
cd $PBS_O_WORKDIR

# Run the job itself (exit so Matlab terminates when the script finishes)
matlab -nodisplay -nosplash -r "Example; exit" > logfile.matlabExample

# Finish up
echo " "
echo "Job Ended at `date`"
echo " "
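The same Example.m script can also be run in batch mode directly from an ACI-I command line without the scheduler (a minimal sketch; keep such runs short, since ACI-I is intended only for short jobs):

module load matlab
matlab -nodisplay -nosplash -r "Example; exit" > logfile.matlabExample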

For more information about Matlab, please refer to the Matlab website.

8.3.4 Mathematica

Mathematica builds in powerful algorithms across many areas, many of them created at Wolfram using unique development methodologies and the capabilities of the Wolfram Language. Mathematica provides industrial-strength, robust, and efficient algorithms capable of handling large-scale problems, with parallelism, GPU computing, and more. It offers a progressively higher-level environment in which as much as possible is automated so you can work as efficiently as possible.

The Mathematica module can be loaded with the command

module load mathematica

A sample Mathematica code for printing random numbers into a text file:

Accumulate[RandomReal[{-1, 1}, 1000]] >> "output.txt"
Quit[]

More examples of Mathematica code can be found here.

A sample shell script for the batch system:

#!/bin/bash
#PBS -N jobname
#PBS -l nodes=1:ppn=1
#PBS -l mem=2gb,walltime=00:10:00
#PBS -A open
#PBS -o samplePBS.output
#PBS -e samplePBS.error

# Get started
echo " "
echo "Job started on `hostname` at `date`"
echo " "

# Load in Mathematica
module purge
module load mathematica

# Go to the correct place
cd $PBS_O_WORKDIR

# Run the job itself
math -noprompt -run '<<samplecode.nb'

# Finish up
echo " "
echo "Job Ended at `date`"
echo " "

An additional PBS submission script sample for Mathematica is given here:

#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=5:00
#PBS -A open
#PBS -o MathematicaPBS.output
#PBS -e MathematicaPBS.error

# Get started
echo " "
echo "Job started on `hostname` at `date`"
echo " "

# Load in Mathematica
module purge
module load mathematica

# Go to the correct place
cd $PBS_O_WORKDIR

# Run the job itself
math -noprompt -run '<<input.m'

# Finish up
echo " "
echo "Job Ended at `date`"
echo " "

8.3.5 Stata

Stata is a powerful statistical package with smart data-management facilities, a wide array of up-to-date statistical techniques, and an excellent system for producing publication-quality graphs. It is widely used by businesses and academic institutions, especially in the fields of economics, sociology, political science, biomedicine, and epidemiology. Stata is available for Windows, Linux/Unix, and Mac computers. There are four versions of Stata:

  • Stata/MP for multiprocessor computers (including dual-core and multicore processors). Stata/MP is licensed based on the maximum number of cores that an individual job can use. ICS-ACI licenses Stata/MP16, which can run on up to 16 cores.
  • Stata/SE for large databases
  • Stata/IC, the standard version
  • Small Stata, a smaller student version for educational purchase only

For parallel processing in Stata, invoke stata-mp (rather than stata) at the bottom of your PBS script. You must also request the number of processors (up to 16) in the PBS script and match it in your do-file: include the line "set processors n" in your do-file, where n is the same number of processors requested in the PBS script. A serial sample PBS script is given below, followed by a sketch of the changes needed for a parallel run.

Setup:

In Linux, load the module with the following command before you start working with Stata:

module load stata

Note that this command will load the current default version. Other available versions can be listed with the following command:

module avail stata

Usage: To start Stata, type:

stata-mp

Use ACI-I only for interactive jobs. If you are connecting to our systems remotely via Exceed onDemand, we recommend using the GUI version of Stata:

xstata-mp

Batch usage: A sample PBS script is given below:

#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:15:00
#PBS -A open
#PBS -N jobName
#PBS -M user123@psu.edu
#PBS -m abe
#PBS -j oe

# Get started
echo " "
echo "Job started on `hostname` at `date`"
echo " "

# Load in Stata
module purge
module load stata

# Go to the correct place
cd $PBS_O_WORKDIR

# Run the job itself
stata -b do filename

# Finish up
echo " "
echo "Job Ended at `date`"
echo " "

You can use Stata/MP by substituting the stata command with the following:

stata-mp -b do jobName.do
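For a parallel Stata/MP run, a minimal sketch of the lines that change relative to the sample script above (assuming 8 processors and a do-file named jobName.do that itself contains the line set processors 8) is:

#PBS -l nodes=1:ppn=8
# ... remaining directives and setup as in the sample script above ...
stata-mp -b do jobName.do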

8.3.6 Python

Python is a general-purpose programming language used in a wide variety of fields. It can be run in batch mode on ACI-I or used in jobs submitted to ACI-B.

An example python script, named jobName.py:

import sys

jobName = ["Hello", "World"]

for i in jobName:
    print i

sys.exit(0)
# end of jobName.py

This script can be submitted as a job on ACI-B with the following script:

#!/bin/bash -l
#PBS -N jobName
#PBS -l nodes=1:ppn=12
#PBS -l walltime=00:05:00
#PBS -j oe
#PBS -M abc123@psu.edu

# Get started
echo " "
echo "Job started on `hostname` at `date`"
echo " "

# Load in python
module purge
module load python/2.7.1

# Go to the correct work directory
cd $PBS_O_WORKDIR

python jobName.py

# Finish up
echo " "
echo "Job Ended at `date`"
echo " "

For more information, refer to the Python website.

An excellent resource for the various plotting methods available in Python is the matplotlib gallery.


9 Policies

By requesting an ICS-ACI user account, users acknowledge that they have read and understood all ICS-ACI and applicable Pennsylvania State University policies and agree to abide by said policies. All policies can be found at our policies page.


9.1 Authentication and Access Control

This policy governs the lifecycle of ICS accounts created under the current state and configuration of ICS-ACI. It specifies criteria for creating a user account, using an account, and terminating a user account.


9.2 Data Protection and Retention

This policy outlines the protection of data that is created, collected, or manipulated by personnel who fall within the scope of ICS-ACI, and it applies to any person who uses ICS-ACI resources or handles data managed by ICS-ACI. It specifies data retention policies and resources, as well as the responsibilities of the principal investigator, the University, and ICS-ACI.


9.3 Software Acceptable Use

This policy explains how software is introduced, installed, and maintained on the ICS computer system. The policy details how users can install their own software in their user or group spaces, as well as how ICS will regularly update the ICS-ACI software stack. Information on how to request new software for the ICS-ACI software stack is included. The policy also discusses who is responsible for licensing and usage agreements in various circumstances, and the rights that ICS reserves to make changes to installed software in order to keep ICS-ACI systems safe, up-to-date, and in compliance with University and government regulations.


9.4 SLA Terms and Conditions

This terms and conditions document outlines the basic provisions that guide the working relationships between researchers and ICS-ACI.


10 For Further Assistance

The i-ASK Center provides prompt, expert assistance for any issues researchers might encounter on ACI and is the point of contact for the ICS technical staff. Its help desk web portal lets you submit and track tickets and features a helpful list of FAQs. The i-ASK Center also provides timely system alerts regarding maintenance and other events that impact the system.