The SDSC Encryption / Authentication (SEA) System

Wayne Schroeder
San Diego Supercomputer Center
University of California at San Diego
April, 1998

This is a preprint of an article accepted for publication in Concurrency: Practice and Experience - Special Issue - Seamless Computing Frameworks, Copyright (1999) John Wiley & Sons Ltd.

The published version is available in the online version of the journal at www.interscience.wiley.com.

Abstract

As part of the Distributed Object Computation Testbed project (DOCT) [1] [2] and the Data Intensive Computing initiative of the National Partnership for Advanced Computational Infrastructure (NPACI) [3], the San Diego Supercomputer Center has designed and implemented a multi-platform encryption and authentication system referred to as the SDSC Encryption and Authentication, or SEA, system. The SEA system is based on RSA and RC5 encryption capabilities and is designed for use in an HPC/WAN environment containing diverse hardware architectures and Operating Systems (including Cray T90, Cray T3E, Cray J90, SunOS, Solaris, AIX, SGI, HP, NextStep, and Linux). The system includes the SEA library, which provides reliable, efficient, and flexible authentication and encryption capabilities between two processes communicating via TCP/IP sockets, and SEA utilities/daemons, which provide a simple key management system. It is currently in use by the SDSC Storage Resource Broker (SRB) [4] [5], as well as by user interface utilities to SDSC's installation of the High Performance Storage System (HPSS) [6]. This paper presents the design and capabilities of the SEA system and discusses future plans for enhancing this system.

Introduction

As often occurs in various fields of computational technology, the current authentication and privacy infrastructure consists of multiple competing and evolving systems, developed under various corporate alliances and rivalries and characterized not only by standardization and interoperation, but also by competition, differentiation, and rapidly developing new requirements and new solutions. Kerberos, DCE, SSH, PGP, SSL, underlying encryption technologies such as RSA and RC5, and other systems compete with, and sometimes complement, each other. Each system has its own features, strengths, and weaknesses, and thus varying applicability to different authentication and encryption tasks [7], [8], [9], [10], [11], [12], [13], [14], [15], [16].

Within NPACI*, we are interested in supporting heterogeneous distributed systems. Thus, we need solutions that cover a wide range of platforms, including the SGI/Cray PVP (C90/T90), Cray SMP (T3E), workstations and clusters of workstations by DEC, HP, IBM, SGI, Sun, and Windows NT PCs. We need solutions that integrate well into an HPC environment, which includes long-running jobs, checkpoint/restart capabilities, batch queuing systems, and the unique Cray PVP architecture.


* SDSC is now the leading-edge site of the National Partnership for Advanced Computational Infrastructure (NPACI). As part of this new NSF program, SDSC and its partners are developing software infrastructure to link the highest performance computers, data servers, and archival storage systems to enable easier and even more effective use of the aggregate computing power and information.

The SDSC Encryption / Authentication (SEA) System was developed as part of the Distributed Object Computation Testbed (DOCT) research project to address the requirement of providing authentication and encryption capability between two processes communicating via TCP/IP sockets in a distributed environment. The intent was to investigate and develop an authentication/encryption system to suit the needs of DOCT, as well as to serve the immediate needs of other SDSC/NPACI research projects.

The SEA system is being utilized as an optional component of the SDSC Storage Resource Broker (SRB) and of the High Performance Storage System (HPSS) Interface utility (HSI) [17]. The SRB is client-server middleware implemented at SDSC to provide a uniform access interface to heterogeneous, distributed storage resources/devices. SRB, in conjunction with the SDSC Metadata Catalog database [4] [5], provides a means for accessing data sets and resources through querying their attributes instead of knowing their physical names and/or locations. As part of the SRB client and server, SEA can be used for secure authentication and/or encryption. The HSI utility is being developed by SDSC to provide an advanced user interface to the HPSS archival storage system and to do so, optionally, in a non-DCE environment. Integrated as part of the HSI client and server, SEA provides secure, remote, password-less access into HPSS.

Requirements

As explained in the DOCT Authentication Mechanisms white paper [18], one of the authentication capabilities we were interested in investigating for DOCT was self-introduction. The DOCT project developed prototype electronic filing for possible use in the US Patent and Trademark office. In this environment, it is feasible for the system to allow the patent applicant to initially introduce himself/herself to the system. There would be no benefit for anyone to misidentify themselves at this stage, since legal rights would be subsequently conferred to that identity. While the final DOCT prototype employed Certificate Authorities in the patent submission process, the self-introduction capabilities of SEA, extended with password or 'Trusted Agent' functions, have been useful in other applications.

In addition to self-introduction, we were particularly interested in an authentication and encryption system that would operate in a distributed, batched computing environment (e.g., non-expiring tickets), operate on a wide variety of architectures (including Cray PVP), be easily integrated with the SRB and other applications, and be easily installed on independent hosts.

Of the existing authentication systems, Kerberos was the best alternative to developing our own system. We had already ported Kerberos to the Cray architecture [15] and it provides much of the authentication and encryption functionality required for the DOCT project. However, Kerberos had no self-introduction feature, had limited batch support (it would have required user involvement in the regeneration of tickets over time), and it was not easily installed on independent hosts. Inter-realm authentication also typically requires trust between N administrators in an N^2 pattern.

Using existing standards like RSA and RC5 as the lower-level building blocks, we were able to build an effective solution with the features required. Alternative mechanisms are discussed more thoroughly in the 'Comparison with Other Security Systems' section of this paper. See the SEA software design document [19] and the Authentication Mechanisms paper [18] for additional information.

Features

The features of the SEA system include the following:

Components

The SEA System is built upon RSAREF 2.0 and RC5. RSA is used to exchange a random session key, which is then used with a more efficient symmetric key algorithm, in this case RC5. Other building blocks include an SDSC-developed random number generator and middle-level routines. The entire SEA package, excluding RSAREF, is about 7,000 lines of 'C' code. RSAREF 2.0 is about 4,750 lines of 'C' code.

RC5

Sample source code from "Applied Cryptography" by Bruce Schneier [7] was used as the basis for our RC5 implementation. To this we added two important features: porting the software to the C90/T90 and Cipher Block Chaining.

Porting the RC5 code to the Cray C90 required a number of changes to deal with the longer word size. Since 32-bit integers are not available on the C90, 32-bit masks were needed in many operations to return the 64-bit computed quantities to 32 bits as needed in the algorithm. The lack of a 32-bit addressable unit is a significant and common problem when porting systems-related software to the Cray PVP architecture. Since the hardware cannot directly address units smaller than a word (8 bytes), the compiler allocates 8 bytes for all integers. The sizes of shorts, ints, and longs are the same. Most modern packages are designed for various word sizes but usually assume that there is some definition (usually an int or long) that can allocate 32-bit integers. The algorithms are then designed to work with 32-bit units.
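The masking technique described above can be sketched as follows. This is a minimal illustration, not the SEA source: the names word, MASK32, add32, and rotl32 are hypothetical, and unsigned long stands in for the Cray's 64-bit integer.

```c
/* Sketch: 32-bit arithmetic emulated in a wider machine word.
   All names here are hypothetical; this is not the SEA source. */
typedef unsigned long word;        /* stands in for a 64-bit Cray integer */

#define MASK32 0xFFFFFFFFUL

/* Add two 32-bit quantities, discarding any carry above bit 31. */
static word add32(word a, word b)
{
    return (a + b) & MASK32;
}

/* Rotate a 32-bit quantity left by n bits, keeping the result in 32 bits.
   RC5 relies on rotations of this kind. */
static word rotl32(word x, unsigned n)
{
    n &= 31;
    x &= MASK32;
    if (n == 0)
        return x;
    return ((x << n) | (x >> (32 - n))) & MASK32;
}
```

Without the final mask, bits shifted above bit 31 would survive in a 64-bit word and corrupt later rounds, which is why the mask has to follow every operation that can overflow 32 bits.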

As acquired, the RC5 algorithm would encrypt any 8-byte block of plain-text to the same cipher-text, regardless of its location in the input stream, presenting a weakness that could be exploited. To harden the implementation, a Cipher Block Chaining (CBC) algorithm was implemented. In CBC, each plain-text block is mixed (typically XORed) with the preceding cipher-text block before it is encrypted, ensuring variation in the cipher-text even with recurring plain-text patterns.
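A minimal sketch of the chaining idea follows. The block cipher here is a deliberately trivial placeholder, not RC5 and not secure, and the function names and the use of an explicit initialization vector (iv) are assumptions made for illustration; the paper does not describe how SEA seeds the chain.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK 8   /* RC5 block size used by SEA, in bytes */

/* Placeholder block cipher (NOT RC5, not secure): invertible byte mixing. */
static void toy_encrypt_block(unsigned char *b, const unsigned char *key)
{
    for (int i = 0; i < BLOCK; i++)
        b[i] = (unsigned char)((b[i] ^ key[i]) + key[(i + 1) % BLOCK]);
}

static void toy_decrypt_block(unsigned char *b, const unsigned char *key)
{
    for (int i = 0; i < BLOCK; i++)
        b[i] = (unsigned char)((unsigned char)(b[i] - key[(i + 1) % BLOCK]) ^ key[i]);
}

/* CBC encryption in place; len must be a multiple of BLOCK. */
static void cbc_encrypt(unsigned char *buf, size_t len,
                        const unsigned char *key, const unsigned char *iv)
{
    unsigned char prev[BLOCK];
    memcpy(prev, iv, BLOCK);
    for (size_t off = 0; off + BLOCK <= len; off += BLOCK) {
        for (int i = 0; i < BLOCK; i++)
            buf[off + i] ^= prev[i];          /* mix in previous cipher-text */
        toy_encrypt_block(buf + off, key);
        memcpy(prev, buf + off, BLOCK);       /* becomes the next chain value */
    }
}

/* CBC decryption in place; the inverse of cbc_encrypt. */
static void cbc_decrypt(unsigned char *buf, size_t len,
                        const unsigned char *key, const unsigned char *iv)
{
    unsigned char prev[BLOCK], cur[BLOCK];
    memcpy(prev, iv, BLOCK);
    for (size_t off = 0; off + BLOCK <= len; off += BLOCK) {
        memcpy(cur, buf + off, BLOCK);        /* save cipher-text for chaining */
        toy_decrypt_block(buf + off, key);
        for (int i = 0; i < BLOCK; i++)
            buf[off + i] ^= prev[i];
        memcpy(prev, cur, BLOCK);
    }
}
```

With chaining, two identical 8-byte plain-text blocks in the same stream no longer encrypt to the same cipher-text, because each block is first combined with a different preceding cipher-text block.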

RSAREF 2.0

RSAREF 2.0 was downloaded from RSA Data Security Inc. This was ported to the C90, which involved numerous changes in various modules to deal with the 64-bit word size. This is the only known port of RSAREF 2.0 to the C90 and is also being used in SDSC's Legion [20] C90/T90 porting effort.

It was found that the MD5 implementation within RSAREF 2.0 conflicted with that available in libnsl on Solaris systems, so names of the key MD5 routines were changed to keep them separate.

Random number generation

Random numbers are needed for various purposes: a seed for RSA Encryption, the RC5 session key, padding for seaWrite buffers (for RC5, buffers must be a multiple of 8 bytes long, so libsea pads them), and challenge values for authentication.

The implemented getRandomNoise() routine generates pseudo-random numbers for use in libsea by generating an MD5 hash of a series of buffers. The MD5 hash provides a chaos-producing effect; that is, small variations in the input values produce radically differing MD5 output. The input to the MD5 hash includes the output of an fstat call on stdin, an fstat on stdout, a stat on "/", a gettimeofday, and an internal floating point number that is incremented on each call. We believe that this provides sufficiently random values.
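The general shape of such a routine can be sketched as below. This is an illustration under stated assumptions, not the getRandomNoise() source: the FNV-1a hash is a stand-in for MD5 (it is not cryptographically strong and is used only to keep the example self-contained), and the function name get_noise_sketch is hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/time.h>

/* FNV-1a: a simple stand-in for MD5 in this sketch; NOT suitable for security. */
static uint64_t fnv1a(uint64_t h, const void *data, size_t n)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hash the inputs listed in the text: fstat of stdin and stdout,
   stat of "/", the time of day, and a counter bumped on each call. */
static uint64_t get_noise_sketch(void)
{
    static double counter = 0.0;
    struct stat st_in, st_out, st_root;
    struct timeval tv;
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */

    /* Zero first so padding bytes and failed calls hash deterministically. */
    memset(&st_in, 0, sizeof st_in);
    memset(&st_out, 0, sizeof st_out);
    memset(&st_root, 0, sizeof st_root);
    memset(&tv, 0, sizeof tv);

    (void)fstat(0, &st_in);
    (void)fstat(1, &st_out);
    (void)stat("/", &st_root);
    (void)gettimeofday(&tv, NULL);
    counter += 1.0;

    h = fnv1a(h, &st_in, sizeof st_in);
    h = fnv1a(h, &st_out, sizeof st_out);
    h = fnv1a(h, &st_root, sizeof st_root);
    h = fnv1a(h, &tv, sizeof tv);
    h = fnv1a(h, &counter, sizeof counter);
    return h;
}
```

The counter guarantees that the hash input changes between calls even when the clock and file status do not.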

Middle layer

Various middle-level routines were developed as the interface between the API routines and the lower-level encryption routines. These included key management routines to read and write encrypted and semi-encrypted public and private RSA keys. Like the encryption routines, these were developed such that they would function identically on all architectures, so that encrypted keys could be accessed from each architecture.

Utilities/Daemons

Four programs were developed, layered on the SEA library, to handle the management of the RSA keys: 1) a key generation and registration utility, seaauth, 2) a key management daemon for accepting and storing registered public keys, 3) a 'Trusted Agent', and 4) a separate key generation utility. These are described in further detail in later sections.

The encryption and authentication functions are provided by separate APIs. An application can utilize either one or both together. A stronger exchange of the session key is accomplished if an application performs user authentication first. This is described in further detail in the "Man-in-the-Middle" discussion in the following section.

SEA Encryption

After the application client and server establish communication via a TCP/IP socket, the following routines can then be called by the server and client, respectively, to establish encryption: seaBeginEncryptionServer(fd, authUser) and seaBeginEncryptionClient(fd). The fd is a socket connecting the client and server. The routines perform a handshake, using RSA to exchange a random key for use with this session. Upon success, each returns zero. If an error occurs, a message is printed to standard output, and a negative value is returned.

The authUser argument is optional. If non-null, and not a pointer to a null string, it should be the userid that has been authenticated earlier in this session. libsea uses this to accomplish a stronger session key exchange (countering "man-in-the-middle" attacks).

Once encryption is established, two routines, seaWrite and seaRead, are called instead of write and read to send and receive encrypted data on the socket.

All arguments and return values are equivalent to those of read and write. This includes the return value, which is the positive length for a successful read or write, zero when disconnected, and negative upon error. To reduce memory requirements and improve performance, seaWrite encrypts the buffer in place. Applications that need subsequent access to data written via seaWrite must copy the data.

The application can establish encryption at any point in the session. The only requirement is that the client and server call the seaBeginEncryptionClient/Server routines at the same point in the communications exchange. As mentioned above, the application calls seaRead and seaWrite to exchange encrypted information. If data does not need to be encrypted, the application can use read and write, instead of seaRead and seaWrite, even after encryption is established on the socket. Multiple sockets can be encrypted via multiple calls to the seaBeginEncryptionServer/Client routines.

RC5 encryption requires buffers to be integer multiples of 8 bytes in length. If the user passes a buffer to seaWrite that is not a multiple of 8 bytes, the SEA routines pad the buffer with random data, saving and restoring the contents of the 0 to 7 bytes of storage following the data. This means, however, that a call such as seaWrite(fd,"Message",7) will fail, since the storage area for constants is not writable. Application programs must copy such data to a writable buffer first: char buf[20]; strcpy(buf,"Message"); seaWrite(fd,buf,7);
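The padding arithmetic and the copy-to-writable-buffer rule can be sketched as follows. The helper names padded_len and stage_message are hypothetical and are not part of libsea; the sketch only checks that a writable buffer is large enough for the padded length before the data is handed to a routine such as seaWrite.

```c
#include <stddef.h>
#include <string.h>

#define SEA_BLOCK 8   /* RC5 block size in bytes (macro name is hypothetical) */

/* Round len up to the next multiple of the block size. */
static size_t padded_len(size_t len)
{
    return (len + SEA_BLOCK - 1) / SEA_BLOCK * SEA_BLOCK;
}

/* Copy a possibly read-only message into a writable buffer that has room
   for the padding bytes. Returns 0 on success, -1 if dst is too small. */
static int stage_message(unsigned char *dst, size_t cap,
                         const void *src, size_t len)
{
    if (padded_len(len) > cap)
        return -1;
    memcpy(dst, src, len);
    return 0;
}
```

For the 7-byte "Message" example, padded_len(7) is 8, so any writable buffer of at least 8 bytes (such as the 20-byte buffer in the text) leaves room for the one padding byte.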

Also, the buffer passed to seaRead and seaWrite must be aligned on a 4-byte boundary so that the encryption routines can perform the integer arithmetic on 4-byte items. For some compilers, this means that character arrays need to be defined as multiples of 4 bytes in length (e.g., char buf[10000];, not char buf[10002];).

Note that seaWrite and seaRead will transfer data without encryption, if the seaBeginEncryptionClient/Server routines have not been called. We are considering adding a seaEndEncryption routine as well.

The following is a brief description of the Encryption protocol:

As mentioned earlier, after establishing a communications socket, the client calls seaBeginEncryptionClient(fd) (where fd is the file descriptor for the socket), and the server calls seaBeginEncryptionServer(fd, authUser).
  1. seaBeginEncryptionServer reads the public key file (or alternative, see below) and sends this data to the client.
  2. seaBeginEncryptionClient reads the message, randomly generates a 64-bit session key, encrypts it with the public key, and sends it to the server.
  3. seaBeginEncryptionServer reads the message, reads the private key file, and decrypts the data.
If an error occurs, an error message is first exchanged. This provides information and also prevents one side from waiting indefinitely for the next message in the sequence. Also with errors, a message is displayed, and the function returns with a negative value.

At this point, both sides have a secret random key for use in the RC5 encryption that has been exchanged securely.
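Steps 2 and 3 of the exchange can be illustrated with textbook RSA on very small numbers. This is a toy sketch only: the key pair (n = 61 * 53 = 3233, e = 17, d = 2753) is the standard classroom example, there is no padding, and the function names are hypothetical. SEA itself uses RSAREF 2.0 with 512-bit keys by default.

```c
#include <stdint.h>

/* Modular exponentiation by repeated squaring: (base^exp) mod m.
   Safe from overflow as long as m < 2^32. */
static uint64_t modpow(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

/* Toy key pair (insecure, for illustration): n = 61 * 53. */
enum { TOY_N = 3233, TOY_E = 17, TOY_D = 2753 };

/* Step 2: the client encrypts its random session key with the public key. */
static uint64_t client_wrap_key(uint64_t session_key)
{
    return modpow(session_key, TOY_E, TOY_N);
}

/* Step 3: the server recovers the session key with the private key. */
static uint64_t server_unwrap_key(uint64_t wrapped)
{
    return modpow(wrapped, TOY_D, TOY_N);
}
```

Only the holder of the private exponent can recover the session key, so an eavesdropper who sees the public key and the wrapped key learns nothing useful when real key sizes are used.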

To guard against "man-in-the-middle"* attacks, the SEA library can optionally use the user public/private keys to exchange an additional portion of the random session key. In addition to the basic exchange, if authentication has previously been accomplished on this session, the Server will generate a random byte string (as an additional section of the session key), encrypt it with the user's public key, and send it to the Client. Only the real Client (with access to its private key) will be able to decrypt that portion of the session key. On the Server-side, the authenticated user name is passed to the SEA library in the authUser argument. On the Client-side, the name is maintained internally to the library. The argument is needed on the Server-side since, for the SRB, one process does the authentication, and a separate spawned process sets up the encryption.


* In a "man-in-the-middle" attack, a process inserts itself between the Client and Server. One way to do this is to fake a DNS entry so that the Client actually connects to the attacker's process when trying to connect to the Server. The community is in the process of developing a secure DNS system, but it is not expected to be completed for about a year and a half.

The Server-side of this exchange needs access to a public/private RSA key pair. There are three methods of providing these:

  1. The location of these key pair files is defined in the seaInternal.h file at compile time (see seaInternal.h comments for details). These keys can be generated with the SEA genkey utility.
  2. If these keys are not available, the SEA library recognizes this and will perform encryption setup using a key-pair that is built into the source. The SEA script mkinit.pl can be used to install a new pair of built-in keys.
  3. The third method is via a call to seaMakeEncryptionKey(). This generates a new pair of RSA keys, holds them in memory, and sets the SEA library to use them. It usually takes 15 to 30 seconds to generate keys. If seaMakeEncryptionKey() is called and succeeds, the generated keys will be used instead of the file or built-in key pairs. This method should be used in long-running daemons as it provides very secure and changing RSA keys. seaMakeEncryptionKey returns zero on success, -1 on failure.
See the source, elib.c in particular, for additional information (see the Availability section).

Encryption Performance

Any secure encryption algorithm can potentially reduce overall communications performance considerably. Instead of primarily moving buffers (or, usually, pointers) and orchestrating I/O, substantial logical and arithmetic operations must be performed on each data item.

The RC5 implementation in SEA, however, is quite efficient and is comparable to SSH's encryption mechanisms (using rough timing estimates, using the default settings of 512-bit RSA keys, 14 RC5 rounds, and SSH's IDEA algorithm).

In cases where the security constraints are less severe, the SEA system can be configured with milder encryption. If the SEA library is configured to use 7 RC5 rounds, SEA encryption is almost twice as fast.

In addition to this, RC5 rounds can be adjusted by users with an environment variable. On the client side, if environment variable SEA_LEVEL is "Low" or "low" (starts with 'l' or 'L'), rounds is set to 7; Medium or medium (starts with 'm' or 'M'), rounds is set to 14; High or high ('h' or 'H'), rounds is set to 22. If it is set to a numeric value between 1 and 200, rounds is set to that value. This rounds value is sent to the Server side when encryption is established.
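The SEA_LEVEL mapping described above can be sketched as a small parsing function. The function name rounds_from_level is hypothetical, and falling back to 14 rounds when the variable is unset or invalid is an assumption based on the default of 14 rounds mentioned earlier.

```c
#include <stdlib.h>

#define DEFAULT_ROUNDS 14   /* assumed fallback: the default round count */

/* Map a SEA_LEVEL string to an RC5 round count, per the rules in the text. */
static int rounds_from_level(const char *level)
{
    char *end;
    long n;

    if (level == NULL || *level == '\0')
        return DEFAULT_ROUNDS;

    switch (level[0]) {
    case 'l': case 'L': return 7;    /* Low */
    case 'm': case 'M': return 14;   /* Medium */
    case 'h': case 'H': return 22;   /* High */
    default: break;
    }

    /* Otherwise accept a whole number between 1 and 200. */
    n = strtol(level, &end, 10);
    if (end != level && *end == '\0' && n >= 1 && n <= 200)
        return (int)n;

    return DEFAULT_ROUNDS;
}
```

A client would typically call this as rounds_from_level(getenv("SEA_LEVEL")) before encryption is established and then send the result to the server.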

SEA's encryption performance on an UltraSparc1 using 14 RC5 rounds is comparable to that of SSH's copy function, scp. SSH has very efficient encryption implementations (in an SDSC study done a few years ago, SSH outperformed Kerberos for comparable types of encryption). By setting SEA_LEVEL to Low, the performance is almost two times faster while still providing good security against casual eavesdroppers:

SSH scp using IDEA encryption (default at SDSC): about .61 MB/sec
SSH scp using 3DES encryption: about .45 MB/sec
Test programs using SEA RC5 using 7 rounds: about 1.14 MB/sec
Test programs using SEA RC5 using 14 rounds: about .64 MB/sec
Test programs using SEA RC5 using 28 rounds: about .33 MB/sec

SEA Authentication

Two routines, seaAuthClient and seaAuthServer, are called to authenticate a user or process. The following is a brief description of the Authentication protocol:

The client and server processes first establish a communications socket. Then the client calls seaAuthClient, and the server calls seaAuthServer.

  1. seaAuthClient sends the id string which is to be authenticated (e.g., 'schroede@sdsc') to the Server.
  2. seaAuthServer receives the id string and attempts to load the corresponding public key from the defined directory. It then generates a random challenge 8-byte quantity, encrypts this with the client's public RSA key, and sends the cipher-text to the Client.
  3. seaAuthClient receives the challenge, loads the local private key (decrypting it in the process), decrypts the challenge, and sends the decrypted challenge back to the Server.
  4. The Server receives the challenge response (or error message) and compares it to the original challenge pattern. If they match, the Authentication is successful, as the Client has proven that it has access to the private key that corresponds to the registered public key.
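The four steps above can be sketched with toy textbook RSA. This is an illustration only: the tiny key pair (n = 3233, e = 17, d = 2753) is the standard classroom example, the challenge is a small integer rather than an 8-byte quantity, and all type and function names are hypothetical rather than taken from libsea.

```c
#include <stdint.h>

/* (base^exp) mod m by repeated squaring; safe for m < 2^32. */
static uint64_t toy_modexp(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t r = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            r = r * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return r;
}

struct toy_pubkey  { uint64_t n, e; };   /* registered with the server */
struct toy_privkey { uint64_t n, d; };   /* held only by the client */

/* Step 2: the server encrypts a random challenge with the client's public key. */
static uint64_t server_encrypt_challenge(uint64_t challenge, struct toy_pubkey pub)
{
    return toy_modexp(challenge, pub.e, pub.n);
}

/* Step 3: the client decrypts the challenge with its private key. */
static uint64_t client_answer(uint64_t cipher, struct toy_privkey priv)
{
    return toy_modexp(cipher, priv.d, priv.n);
}

/* Step 4: the server accepts only if the response equals the original challenge. */
static int server_verify(uint64_t challenge, uint64_t response)
{
    return challenge == response;
}
```

The private key never leaves the client; the server learns only that the client could decrypt something encrypted with the registered public key.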
There are also two support functions which can be used by the application to determine if it needs to prompt for a password:

See the "User private key files" section below for a related description.

Introduction Trust Models

There are multiple distinct authentication environments.

In the DOCT environment, we may wish to allow users to introduce themselves to the system, and from then on it is sufficient that the system knows that the same person is communicating with the system (i.e., they have the private key). In this case, we can simply have the software generate public/private keys and send the public key to the key manager (see below), in effect saying, "here's my public key, from now on you can identify me with it."

In the NPACI/SDSC environment, a preferred method is to confirm that the user is actually running on one of our trusted hosts. SDSC provides passwordless pftp access to HPSS from the T90 via a daemon that confirms that the client is running as the claimed user. In a similar way, the SEA system just needs to confirm that the user is logged onto the T90 as a particular user to acquire the access privileges of that user. Since that host is well secured, further authentication is not needed, and the benefits of simplified registration outweigh the risks. So for the introduction function (when initial user public/private keys are established), the system has to confirm that the user is actually logged in on a trusted host (e.g., the T90) and running as that user. Once that is done, the SEA authentication system can be confident that a user registered as user@sdsc actually is that SDSC user. This is accomplished by the SEA 'Trusted Agent.'

In a third environment, positive control is required, but the hosts involved may not be well-secured trusted hosts. In this case, a password is required to confirm user identity during registration.

For an initial introduction/registration, the seaauth utility generates a pair of RSA keys, stores the private key locally (encrypted), and sends the public key to the key manager to be recorded for future use. Once this is done, the user can authenticate to the programs using SEA (the SEA library accesses these keys to securely authenticate).

The SEA system provides for three "Trust Models" for the initial introduction/registration.

Higher-level functions

Two utilities (seaauth and keygen) and two daemons (keyd and the 'Trusted Agent', TA) provide the key generation and management functions. These make use of the SEA library routines for encryption and authentication themselves and also make use of various SEA library routines for key file access and management and other common functions.

The one user utility is seaauth, used to set up and modify authentication with the SDSC Encryption/Authentication (SEA) system. The command line is of the form seaauth reg | auto | noauto | passwd | rereg | unreg [objectname], where one of the six commands is required and is optionally followed by an objectname. Each command performs the following function.

By default, all of the above operate on one's SEA user id, which is username@domain, where username is the Unix login name and domain is predefined; for example 'schroede@sdsc.'

Other entities can also be authenticated. In this case, the object name is specified on the command line.

At SDSC, one's encrypted private keys are stored in the home directory. For most SDSC workstations, home directories are shared (NFS mounted), but when they are not (e.g., for the T90), users need to move their encrypted private key between hosts.

Users should not attempt to move the unencrypted private keys. They will not function on another host (they are designed to be host-specific), and the transfer could expose the private key information on the network. Unencrypted private keys are stored on local disk (/tmp) so as not to traverse the network (i.e., via NFS). (These private keys are actually encrypted, but only via determinate values such as hostname.) It is the user's responsibility to keep the private key private.

As configured at SDSC, the private key file is $HOME/.SEAusername@domain, for example, /users/sy/schroede/.SEAschroede@sdsc. The unencrypted private key will be stored as /tmp/.SEAusername@domain.hostname, for example, /tmp/.SEAschroede@sdsc.c90. Both of these files will be owned by the user and stored without group or other access (mode 600).

The applications that utilize the SEA system will access the keys to authenticate the user. If the user has no unencrypted private key, the application will prompt for the private key password.

In addition to these, seaauth also contains some test and debug options. These include -n to display the network messages, -d to display the network messages and the unencrypted messages, as well as test1 through test6, and a -mmessage to be used in conjunction with some of the tests. These test and demonstrate libsea capabilities. test1 connects to the key daemon, establishes encryption, performs authentication, and optionally sends the -mmessage. test2 does the same except without establishing encryption. And test3 does the same except without authentication.

seaauth connects to the key daemon (keyd) to register the public key (and also for the tests). This is an encrypted session, using the normal libsea encryption functions (although encryption would not be required). keyd receives the public key from seaauth and stores it in the public key directory via libsea routines. The directory is defined in seaInternal.h. The request to store a new key will be rejected if it already exists (for that user or process name).

Write access to the public key directory must be carefully controlled. Read access to the public key data is not a concern, and the seaAuthServer routine, in fact, requires read access. The key daemon carefully manages this area. The administrator must ensure that only trusted users can update this directory. Normally, this is only the administrator who is running the SEA system and perhaps root. The login password for this account should be well protected, as the SEA trust hierarchy rests on the security of the Unix file permissions on the keyd host. The plain-text password for this login account should therefore not be sent across the network (e.g., via telnet); secure mechanisms (such as SSH or Kerberos ktelnet/krlogin) should be used instead.

User Private Key Files

User private key files need to be available to the authentication library on the Client side. For batch jobs, they are normally stored in a format such that passwords are not needed to make use of them.

At SDSC, we have configured SEA to store user-password-encrypted private keys in the users' home directories (e.g., an NFS-mounted file system) and unencrypted private keys in a local file system (so they won't cross a network). Users are able to create unencrypted private keys from their encrypted private keys via the seaauth auto command. These can be used for batch jobs or interactive sessions (i.e., the library is able to automatically authenticate using them).

The "unencrypted" private keys are encrypted but only in a "semi-secure" manner. The SEA routines encrypt and decrypt the user private key data as it is being written and read from the private key files. This algorithm takes constant information (such as UID and host), generates a key (via MD5), and uses this key to encrypt. This provides a little additional security, but only because the algorithm is not widely known. Other than this, as with Kerberos, DCE, and other systems, we are relying on the security of the Operating System to protect the secrecy of these private key files.

The private key storage location is a configurable option defined in the seaInternal.h file. A file system that is local (e.g., not NFS-mounted) could be used for the unencrypted private keys. On the T90, for example, the home directories could be used for the unencrypted private keys, as they are not NFS-mounted. On SDSC workstations, we store them in /tmp, since that file system is local and home directories are not.

User Public Key Files

Each computer system which supports the SEA Server-side authentication needs access to the public key files. If the public key file directory is NFS-mounted, they are readily available in a secure manner (their contents need not be kept private). If that is not available, a second function of the 'Trusted Agent' (TA) can be utilized.

The SEA 'Trusted Agent,' in addition to its function of confirming user logins and public keys, also synchronizes a replica public key directory. These two functions work well together, as both are needed for each system that has an independent file system.

The Key Daemon periodically sends a summary of the key file directory to each TA. This is a blank-delimited list of each key file name with a checksum value. The TA compares this with a summary of its local copy. If a local file does not exist in the master list, it is removed. If a file is missing, or has a differing checksum (has changed), the TA requests the file from the Key Daemon and receives and stores it. The Key Daemon repeats this process quickly if the TA requests a file and when an update (registration, unregistration, or reregistration) occurs. Since these files seldom change and are quite small, this works well and is expected to scale well to fairly large directories.
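The comparison the TA performs can be sketched as two small decision functions. This is an illustration under assumptions: the struct layout, the enum, and the function names are hypothetical, and the checksum is treated as an opaque number since the paper does not say how it is computed.

```c
#include <stddef.h>
#include <string.h>

/* One entry of a key directory summary: file name plus checksum. */
struct key_entry {
    const char *name;
    unsigned long checksum;
};

enum sync_action { SYNC_KEEP, SYNC_FETCH, SYNC_REMOVE };

/* Linear search for a name in a summary list; NULL if absent. */
static const struct key_entry *find_entry(const struct key_entry *list,
                                          size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i].name, name) == 0)
            return &list[i];
    return NULL;
}

/* For a file in the master list: fetch it if it is missing locally
   or its checksum differs; otherwise keep the local copy. */
static enum sync_action action_for_master(const struct key_entry *m,
                                          const struct key_entry *local,
                                          size_t nlocal)
{
    const struct key_entry *l = find_entry(local, nlocal, m->name);
    if (l == NULL || l->checksum != m->checksum)
        return SYNC_FETCH;
    return SYNC_KEEP;
}

/* For a local file: remove it if the master list no longer contains it. */
static enum sync_action action_for_local(const struct key_entry *l,
                                         const struct key_entry *master,
                                         size_t nmaster)
{
    return find_entry(master, nmaster, l->name) ? SYNC_KEEP : SYNC_REMOVE;
}
```

A TA would apply action_for_master to every entry of the received summary and action_for_local to every file in its replica; because only names and checksums travel in the summary, an unchanged directory costs one short message per cycle.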

Beyond this, it may be feasible to provide a daemon to return the public key data instead of accessing files. Since it would be critical to authenticate this daemon reliably, we could use SEA for this, storing the Public Key Daemon's key locally, connecting and authenticating it, and receiving the desired public key. This adds additional overhead/delay (up to two authentications to achieve one) as well as some complexity (the existing type of authentication plus the daemon logic). So, at least for now, the synchronization of public key files is preferable.

Web of Trust

The SEA system implements a chaining of trust that begins with the SEA administrator, extends via the Trusted Agent (run by the administrator), and then goes to Users and Processes. Users can register with the SEA system (via the TA or password and Key Daemon) and can in turn, if included in the privileged list, register new keys under arbitrary names.

Each new registration is logged, along with the authenticating agent. This authenticating agent is first authenticated as part of the registration process and is either a Trusted Agent or a User. Thus a chain of trust is created, providing secure authenticating functionality distributed to a wide set of people.

Another web of trust involves the interrelationships between SEA and the Operating System. SEA depends on the normal Unix file permissions to protect both the public and private keys. This is a reasonable assumption, since if the OS can't be trusted, the host cannot be secure. As is the case with other authentication mechanisms, if the root user login is compromised, forged authentication/access will be possible, and there is little that can be done about this. Computer security involves not only secure authentication/encryption, but also installing vendor patches, system monitoring, configuration control, and related activities, all of which SDSC actively pursues.

Integration of SEA with SRB

The SRB is client-server middleware that allows clients in a wide area network to access heterogeneous storage resources using a uniform interface. In the previous versions of the SRB, a plain-text ASCII string within the connect message was utilized to identify the client user. This was adequate for testing and prototype purposes, but was not secure. Thus, SEA has been integrated into the SRB Version 1.1 to provide secure authentication and optional encryption.

SRB User Access Control

Once a user is identified, access to data is controlled via that id. Access Control Lists maintained in the Metadata Catalog (MCAT) [4] [5] specify who has what type of access to each particular dataset.

Communication Changes

The SRB communication routines had to be modified to incorporate SEA. These routines used file stream I/O and did so via multiple subroutine packages. While the use of file stream I/O was a reasonable design choice, it prevented access to the buffers, which an add-on encryption scheme needs.

Thus, the communication routines (two sets) were modified to use sockets and new buffering routines. For example, calls to putc were changed to calls to commPutc, which stores data into a small buffer and calls seaWrite when full (similarly, calls to fflush were changed to a new routine that calls seaWrite).
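The buffering pattern can be sketched as follows. This is an illustration in Python, not the SRB source: the class name and buffer size are hypothetical, and the write callback stands in for seaWrite.

```python
class BufferedCommWriter:
    """Collect single bytes and hand them to a write callback in blocks."""

    def __init__(self, write_fn, size=8):
        self._write = write_fn  # stand-in for seaWrite (assumption)
        self._size = size       # hypothetical small buffer size
        self._buf = bytearray()

    def put_byte(self, byte):
        """Analogue of commPutc: store one byte, write when the buffer is full."""
        self._buf.append(byte)
        if len(self._buf) >= self._size:
            self.flush()

    def flush(self):
        """Analogue of the fflush replacement: write whatever is buffered."""
        if self._buf:
            self._write(bytes(self._buf))
            self._buf.clear()
```

Routing every write through one callback is what gives the encryption layer a single place to see complete buffers.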

The read system call and seaRead behave differently from stream file I/O reads in that they will return data that is currently available rather than waiting for the buffer to fill. To present the same stream-like interface to the higher-level communications routines, the mid-level routines loop on reads until the buffer fills or the connection ends.
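A minimal sketch of such a read loop is shown below. The function name is hypothetical, and read_fn stands in for seaRead or a socket receive call; it is assumed to return an empty result once the connection has ended.

```python
def read_fully(read_fn, nbytes):
    """Call read_fn until nbytes have arrived or the connection ends.

    read_fn(n) may return fewer than n bytes (whatever is currently
    available) and is assumed to return b"" at end of connection.
    """
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = read_fn(remaining)
        if not chunk:  # connection ended before the buffer filled
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With a Python socket, for example, read_fn could simply be the socket's recv method.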

SRB Encryption

With these communications changes in place, calls to seaBeginEncryptionClient and seaBeginEncryptionServer cause libsea to encrypt communications data. In the SRB communication scheme, both control information and data are transferred on the same socket, so when encryption is enabled, it is performed on both.

A new flag was introduced in the SRB connect message to request encryption. The client user can control this with the SEA_OPT environment variable. When the flag is set, the client side calls seaBeginEncryptionClient, and the server side calls seaBeginEncryptionServer to establish encryption.

In some cases, to service an SRB client request, an SRB server may need to communicate with one or more other SRB servers. In this case, if the SRB client has requested encryption, then encryption is also requested on the second (or subsequent) connections, thus extending the protection of data. Since one SRB server calls another via the SRB client routines, adding encryption is only a matter of propagating the encryption flag. However, this does require the SRB in the middle to decrypt the data and re-encrypt it (under a different key).

SRB Authentication

Currently, similar to the SEA encryption flag, a new flag is used in the SRB connect message to specify that SEA authentication is to be used. Eventually, we plan to phase this out and require SEA authentication by default.

The SRB server and client make the SEA authentication calls to confirm the identity of the client.

SRB-to-SRB Authentication

As mentioned above, there are cases where an SRB will make client calls to another SRB. This can occur when a particular storage type (an Illustra DB, for example) is available through a second SRB but not the first. In this case, the SRB-to-SRB authentication is implemented using the SEA user authentication capabilities and a list of privileged user ids (for example, "srb@sdsc"). It would also be possible to authenticate to an alternative process or object name but, for now, authenticating to the SRB userid is sufficient. If the SEA-authenticated user is privileged, then the username as passed by the connect message is used. In this way, the SRB-to-SRB connections can proxy for the original user. The list of SRB names (users with privileges) is maintained in the Metadata Catalog (MCAT) and retrieved by each SRB at startup. This provides substantial flexibility in the configuration of multiple SRBs and meshes well with the MCAT's core function as the SRB data and metadata repository.

We decided to modify the connect message to contain two fields, the clientUser and the proxyUser, to clarify the use of two types of user ids. This is similar to Unix's model of Real and Effective User IDs. SEA is used to confirm the identity of the proxyUser field (real UID). (An initial integration of SRB and SEA kept the format of the SRB connect message unchanged, except for the addition of new flags, and used SEA to confirm the user id.) With SEA authentication from SRB clients, the proxyUser field is compared with the SEA-authenticated name as determined by the seaAuthServer routine. If the proxyUser matches an entry in the privileged user list, then the clientUser field is allowed to vary from the proxyUser name. For non-privileged users, the clientUser field must match the proxyUser. It is the clientUser name that is used for access control.
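The rule for combining the two fields can be sketched as follows. This is an illustration of the logic described above, not SRB code: the function name and the user names other than "srb@sdsc" are hypothetical, and sea_name stands for the name returned by seaAuthServer.

```python
def resolve_access_user(sea_name, proxy_user, client_user, privileged):
    """Return the user id to use for access control, or None to reject.

    sea_name    -- identity confirmed by SEA authentication
    proxy_user  -- proxyUser field of the connect message
    client_user -- clientUser field of the connect message
    privileged  -- set of privileged user ids (e.g. SRB server names)
    """
    if proxy_user != sea_name:
        return None  # proxyUser must be the SEA-authenticated name
    if proxy_user in privileged:
        return client_user  # a privileged user may act for another user
    if client_user == proxy_user:
        return client_user  # ordinary users may only act as themselves
    return None
```

For example, a connection authenticated as "srb@sdsc" may carry a different clientUser, while an ordinary user's connection is rejected if the two fields differ.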

A second (or later) SRB in a chain of SRBs (for a particular connection) trusts the information passed to it when that connection has been identified as from a privileged user. The user of the original SRB client is authenticated to the first SRB, and the second SRB then "knows" that the user has authenticated. This provides a secure, flexible, and simple authentication mechanism that meets the needs of the SRB infrastructure.

Server Identification

SEA could be used to authenticate the SRB server to the client, although some method of reliably providing the SRB public key to the clients would be needed.

Integration of SEA with HSI/HPSS

HPSS, the High Performance Storage System, is a hierarchical archival storage system being developed by IBM Government Systems and five DOE laboratories (Los Alamos, Lawrence Livermore, Lawrence Berkeley (NERSC), Oak Ridge, and Sandia). Other collaboration partners and early deployment sites include CERN (European Laboratory for Particle Physics), RUS (Rechenzentrum Universitaet Stuttgart), SLAC (Stanford Linear Accelerator Center), NASA Langley Research Center, San Diego Supercomputer Center, Cornell Theory Center, Maui High Performance Computing Center, Fermi National Accelerator Laboratory, Caltech/Jet Propulsion Laboratory, and the University of Washington. The HPSS architecture is based on the IEEE Mass Storage Reference Model, version 5, and is network-centered. The control network uses DCE's Remote Procedure Call technology. In implementation, the control and data transfer networks may be physically separate or shared. Storage devices include large tape library (robot) systems and high performance disk. SDSC is currently operating the largest HPSS system in production (currently containing about 48.2 Terabytes of data in about 4 million files).

An important feature of HPSS is its support for both parallel and sequential input/output (I/O) and standard interfaces for communication between processors (parallel or otherwise) and storage devices. In typical use, clients direct a request for data to an HPSS server. The HPSS server directs the network-attached storage devices to transfer data directly, sequentially or in parallel, to the client node(s) through the high-speed data transfer network.

HSI is an HPSS Interface utility being developed by SDSC to provide an advanced user interface to HPSS [17]. Features include basic FTP and Unix commands, recursive operations, HPSS-specific commands, directory convenience features, command-line and interactive modes, and startup command files. Users can easily and quickly utilize the basic FTP-like interface and move on to a large set of advanced features as needed.

HPSS is DCE-based and HPSS client user interface utilities (pftp, HSI) use DCE authentication. However, there is also a need for non-DCE clients, and SEA has been selected as one major alternative.

HSI optionally includes five modes of authentication: none (for debug), local password (on the storage system host), DCE, Kerberos, and SEA. Multiple authentication systems can be built into HSI simultaneously. HSI first attempts to use the default authentication method (SEA as configured at SDSC), but provides a startup option to allow selection of an alternative method (DCE at SDSC).

HSI uses SEA routines for authentication, prompting for a password to decrypt the private key if necessary. Multiple principals are supported; a flag can be utilized to access alternative SEA key files. We plan to use SEA to encrypt transmitted passwords when SEA authentication is not being used.

By running seaauth auto to create a semi-encrypted private key, users can store and retrieve HPSS files via HSI from batch jobs without placing passwords into scripts. This is also useful in interactive sessions where multiple HSI sessions are initiated.

We currently do not allow general access to HPSS via remote FTP utilities due to the danger of passwords being compromised as they cross the wide-area network. With HSI/SEA, we'll be able to provide secure remote access, even in a passwordless manner.

Comparison with Other Security Systems

The SEA system provides an efficient, flexible, relatively simple and effective authentication/encryption capability and is currently being used as part of multiple SDSC research and development projects. In the longer term, however, additional development is required to meet our needs, either to extend SEA, extend alternative systems to replace it, or in some way blend the two approaches.

While SEA supplements existing login mechanisms, other standard systems, such as Kerberos or SSH, must be used to log into SDSC hosts. In addition, the Globus [21] and Legion [20] metacomputing systems, which are becoming a part of the NPACI infrastructure, have their own authentication mechanisms. A Single-Sign-On environment, in which users could authenticate once and then access multiple computational resources, archival storage, and data intensive subsystems, is a long-term goal. SEA, or its replacement, needs to interoperate with existing major standards.

Also, SEA would not scale well to multiple cooperating realms or to multi-thousands of users. Currently there is no mechanism to exchange trusted public keys between administrative realms (separate key daemons), although such a capability could be developed (based on Trusted Agent functions). The current file- and directory-based public key management system would scale to a few thousand, but beyond that would require some redesign/development.

There are many alternatives for extending or replacing SEA, with varying sets of tradeoffs. New capabilities for blending these systems may become available.

DCE could be an option, but it is too expensive, especially for an academic environment. The computing community has shown some interest in DCE for many years, but it is still not a widely deployed, let alone universal, system. This contrasts sharply with SSL, which very quickly has become a ubiquitous web solution.

Kerberos is becoming part of the NPACI infrastructure. It is capable of providing the needed authentication and encryption functionality on our primary platforms, including the Crays, and has improved in recent releases. However, like DCE, Kerberos is not expected to become universal in the NPACI environment, as some sites do not plan to run it. Also, integration into applications is somewhat involved and there is limited batch support. Some batch support is provided via post-dated and renewable tickets, but use of this method still requires some awareness and ticket management operations by users.

SSH is an excellent product for providing secure interactive access and file transfer. It is easily installed on independent hosts and is available on the needed platforms, including the Cray. As NPACI phases out plain-text password access, SSH and Kerberos are the primary replacement mechanisms. Currently, however, SSH does not provide interprocess authentication; that is, there is no SSH library and no API.

When we started the SEA project, we investigated converting some of the SSH software into a library, but found that it would be difficult to do so. SSH is well-structured software, but is designed to be an integrated whole.

There are various ways in which SSH keys used for login could, eventually, be used in a role that SEA is serving:

For all of these, there would be some confusion of function for the keys. In SSH login sessions, the user's SSH private key is available on his/her workstation and the public key in his/her home directory on the host being remotely accessed. To use this same system to authenticate on the accessed host, either a second pair of keys would be needed or the private key would need to be moved to the accessed host. In the latter case, there is no SSH mechanism to automatically do this, and it would reduce security somewhat by exposing the encrypted private key on the network and on another host.

The third major option is SSL/X.509. SSLeay is an excellent SSL/X.509 package, and SDSC recently ported it to the Cray T90 with relative ease. Globus will be using SSL/X.509 as its primary security infrastructure, and interoperability between SEA applications and Globus would be useful. SDSC has recently started to run a Certificate Authority, and we expect that SSL/X.509 will be useful as part of the NPACI Interaction Environments (visualization, etc.) as well as Data Intensive applications. An SSL/X.509 capability would enable the development of secure Web-based computation and data access.

Future Directions

Security is an important aspect of any computer operation. Since there is greater demand for computing resources than we have available at SDSC, we need to ensure that these resources are only being used by the researchers who have received allocations via the NSF peer-review process. Also, many of the users' data files represent intellectual property, both academic and commercial, and need to be made available only to authorized individuals.

Secure, remote, Web-based access to supercomputing and storage resources is an exciting prospect. Users will be able to interact with data and the NPACI computational infrastructure via Web pages, pull-down menus, and CGI and Java scripts. A secure authentication mechanism that interoperates with SSL/X.509 makes this practical, providing end-to-end inter-process authentication as part of the distributed computational environment. A number of NPACI/SDSC research/development projects will be exploring this over the coming years.

New capabilities in each of the major systems, SSL/X.509, Kerberos, and SSH, will allow us to create a more cohesive authentication infrastructure over the coming months and years.

SSH now has Kerberos capabilities that can be utilized. First, if a Kerberos ticket exists (and SSH has been built with the Kerberos libraries), the SSH client can authenticate to the server via Kerberos tickets (and optionally forward that ticket), instead of using the SSH mechanisms. Second, a Kerberos-enabled SSH daemon can create a Kerberos ticket-granting-ticket on behalf of the user. This performs a function similar to a kinit to automatically prepare to authenticate to Kerberos services.

We are configuring Kerberos at the various NPACI sites to utilize the SDSC KDC for the 'npaci' domain (e.g., 'user@npaci.edu'). This will provide NPACI authentication services across the country without the 'N squared' problem (in which inter-realm authentication would be required between each of the N realms).
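The 'N squared' growth can be made concrete with a short calculation. This is only illustrative arithmetic: it counts the distinct pairs of realms that would each need an inter-realm trust relationship if every site ran its own realm, and the site counts used are hypothetical.

```python
def cross_realm_pairs(n_realms):
    """Number of distinct realm pairs needing inter-realm authentication."""
    return n_realms * (n_realms - 1) // 2


# Hypothetical site counts, for illustration only.
for n in (2, 5, 10, 20):
    print(n, "realms ->", cross_realm_pairs(n), "inter-realm relationships")
```

With a single shared realm, each site only has to be configured for the one KDC, so the administrative effort grows linearly with the number of sites rather than quadratically.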

There is also an Internet Draft, "Public Key Cryptography for Initial Authentication in Kerberos" (PKINIT) [24], proposing extensions to Kerberos to provide a method for using public key cryptography during initial authentication. When this becomes available, users could use X.509 certificates to authenticate into a Kerberos realm.

Over the coming months, we may be able to remove the encrypted password entries in the passwd files on all of the SDSC Unix platforms and utilize the Kerberos database as the central master for all of them.

We are working toward the goal of providing a cohesive SDSC/NPACI authentication system. Utilizing the GSS-API would be an important step in this direction, whether using Kerberos, SSL/X.509, or DCE underneath. For NPACI, some combination of SSL/X.509 and Kerberos should be available in the coming months for most of our users. For all three of these, a GSS-API interface is available. Thus we would need to convert our applications from the simple SEA API to the more elaborate GSS-API, perhaps devising a SEA API library as a simple interface to the GSS-API calls. For the SRB and related research projects, it remains an open question whether we will build upon Kerberos tickets, X.509 certificates, or both.

Conclusion

By utilizing and augmenting available encryption implementations, RSA and RC5, we have created an encryption/authentication system designed to meet specific requirements of the SDSC environment. The SDSC Encryption/Authentication system provides a relatively simple, yet strong, easily installed and easily integrated, supplemental authentication/encryption system for use in practical, real-world, distributed applications. It is the foundation for some of SDSC/NPACI's current and future distributed computing and security research.

Using SEA, SDSC is augmenting existing applications, in a relatively unobtrusive manner, to provide secure remote access to the SDSC data-intensive and archival storage resources.

Future plans include merging and/or interoperating with additional standards, either through further development of SEA or through its eventual replacement with ported and augmented existing systems as they evolve. Various standards-based systems, and interoperating standards-based systems, are expected in the near future, and SDSC plans to make use of these while maintaining some of the advantages of the SEA system.

Availability

Since SEA uses encryption technology (RC5 and RSAREF2.0 (including Cray versions)), we can only release it to domestic U.S. sites. Contact Wayne Schroeder, schroede@sdsc.edu, to obtain a copy. See the SEA software design document [19] for installation instructions.

References

[1] Baru, C.K., Moore, R.W., Rajasekar, A., Schroeder, W., Wan, M., "A Data Handling Architecture for a Prototype Federal Application," Proceedings of the IEEE Conference on Mass Storage Systems, College Park, MD., March 23-27, 1998.

[2] Distributed Object Computation Testbed (DOCT) Project Home Page, http://www.sdsc.edu/DOCT, June 1996.

[3] National Partnership for Advanced Computational Infrastructure Home Page, http://www.npaci.edu.

[4] Baru, C.K., Marciano, R., Moore, R.W., Rajasekar, A., Wan, M., "Metadata to Support Information-Based Computing Environments," http://www.sdsc.edu/~baru/MD97_Paper.html.

[5] SRB Technical Information Page, http://www.npaci.edu/Research/DI/srb/.

[6] High Performance Storage System, http://www.sdsc.edu/hpss/.

[7] Schneier, Bruce, "Applied Cryptography, Second Edition, Protocols, Algorithms, and Source Code in C", John Wiley and Sons, Inc., 1996.

[8] Schiller, Jeffrey I., "Secure Distributed Computing", Scientific American, Volume 271 Number 5, pp. 72-76, November 1994. [Kerberos]

[9] Kerberos Users' Frequently Asked Questions, http://www.veritas.com/common/f/97042301.htm, and new version at http://www.nrl.navy.mil/CCS/people/kenh/kerberos-faq.html, March 1998.

[10] RSA Home Page, http://www.rsa.com/ .

[11] International PGP Home Page, http://www.ifi.uio.no/pgp/.

[12] SSH Home Page, http://www.cs.hut.fi/ssh/ .

[13] SSH at NPACI, http://www.sdsc.edu/projects/ssh/ssh.html.

[14] International Cryptography Pages, http://www.cs.hut.fi/crypto.

[15] Schroeder, Wayne, "SDSC's Installation and Development of Kerberos," Proceedings, Thirty-sixth Semiannual Cray User Group Meeting, Fairbanks, Alaska, September 1995, http://www.sdsc.edu/~schroede/kerberos_cug.html .

[16] Schroeder, Wayne, "Kerberos/DCE, the Secure Shell, and Practical Internet Security," Proceedings, Thirty-eighth Semiannual Cray User Group Meeting, Charlotte, North Carolina, October 1996, http://www.sdsc.edu/~schroede/ssh_cug.html.

[17] Gleicher, Michael, HPSS Interface (HSI) Hypertext Manual, http://www.sdsc.edu/Storage/hsi/.

[18] Jakobsson, Markus, DOCT White Paper, "Authentication Mechanisms", July 1997, http://www.sdsc.edu/~schroede/authmech.html.

[19] Schroeder, Wayne, DOCT White Paper, "Software Design Document for an MDAS/NPACI/DOCT Authentication/Privacy Mechanism", October 1997, http://www.sdsc.edu/~schroede/auth.html.

[20] Legion, http://www.cs.virginia.edu/~legion/ .

[21] Globus, http://www.globus.org .

[22] libSSH, http://csel.cs.colorado.edu/~kohno/projects/ssh.html. Also, a Work in Progress briefing at the Usenix Security Symposium January 26-29, 1998, San Antonio, Texas.

[23] SSLeay, http://www.psy.uq.edu.au:8080/~ftp/Crypto/ .

[24] PKINIT Internet Draft "Public Key Cryptography for Initial Authentication in Kerberos", Clifford Neuman, John Wray, Brian Tung, J. Trostle, M. Hur, A. Medvinsky, March 1998, ftp://ietf.org/internet-drafts/draft-ietf-cat-kerberos-pk-init-06.txt, this is a link from http://www.ietf.org/ids.by.wg/cat.html.

Biography

Mr. Schroeder is a Research Programmer/Analyst in the Data Intensive Computing group at the San Diego Supercomputer Center (SDSC), specializing in security systems, archival storage, networking, and HPC systems. He has over twenty years of experience, most of which has been in support of high-performance scientific computing environments in operating systems, utilities, applications, networking, archival storage, security, and databases. Additional information is available at http://www.sdsc.edu/~schroede.

Acknowledgements

Michael Gleicher integrated SEA into HSI.

Michael Wan assisted with the integration of SEA with the SRB.

The SDSC/NPACI Kerberos/SSH infrastructure is being enhanced by Tom Perrine and SDSC and NPACI systems staff.

The DOCT project was funded jointly by DARPA and the USPTO under Project F19628-96-C-0020.