
Recommendations for an Advanced Research Infrastructure Supporting the Computational Science Community
Report from the Post vBNS Workshop
March 4-5, 1999
On behalf of the institutions and organizations in the Partnerships for Advanced Computational Infrastructure (PACI) program, we urge the National Science Foundation to commit itself and its directorates to a vision of an advanced research infrastructure responsive to the computational and networking requirements of tomorrow's research community.
PACI Vision for the Research and Education Infrastructure
During a workshop held March 4-5, 1999 in La Jolla, California, 31 researchers and technologists representing 19 institutions discussed the role of high performance networking within the context of the Partnerships for Advanced Computational Infrastructure (PACI) program. Leaders of the National Partnership for Advanced Computational Infrastructure (NPACI), the National Computational Science Alliance (the Alliance), and related research organizations discussed the problems and potential solutions associated with implementing a national infrastructure for scientific research. They agreed that networks are of central importance to PACI, both in furthering the development of new mechanisms and techniques for conducting scientific research and in attaining PACI's goal of providing "the foundation for meeting the expanding need for high end computation and information technologies required by the U.S. academic community".
The rapidly evolving Internet has altered both the manner in which communities of researchers interact and the way in which science itself is pursued. From the 56Kbps ARPAnet, to the T1 and T3 NSFnet, to the OC48 very-high-performance Backbone Network Service (vBNS), Abilene, and CalREN networks, to NTON's OC192 backbone, a digital infrastructure is emerging as the foundation of our research and education (R&E) system. This broad research infrastructure integrates the people and vital institutional resources supported by NSF and other Federal agencies. At the core of this infrastructure, persistent, robust, high performance networks link supercomputers, large instrument facilities, and data repositories into a cohesive national research and education resource.
As this infrastructure becomes accessible throughout the more than 250 institutions and organizations associated with the PACI program, it will engender a radical transformation in high-end applications and in the manner in which researchers and educators around the globe access and manipulate R&E resources. In turn, as early developers and adopters of network-based information technologies, PACI partners will serve as critical role models for scientific and education communities throughout our nation.
Relationship of PACI and Advanced Networking
In the preparatory workshop survey, one participant wrote that we must:
... recognize that PACI is an `environment' with the flow of computing and storage between central sites and remote sites dependent on advanced networks -- where the bigger and faster the networks are, the more rapid the cyclic evolution in which new capabilities are alternately centralized or distributed.
Unlike some segments of the education community, PACI's success depends on the presence of high speed, high performance networks. PACI achievements associated with gigabyte to terabyte level file transfers among data storage and computational centers (e.g., the Human Genome project, physics computations) and interactive use of remote astronomical, oceanographic, and telemicroscopy resources all depend on reliable, high performance networks.
PACI's visions - virtual national machine room connections of linked computer resources, Grid research infrastructure testbeds, and geographically distributed researchers in collaboratories interacting via advanced audio, visual, data processing, and data manipulation capabilities - are reflected in the enabling tools under development within the Alliance and NPACI partnerships. These tools, aimed at facilitating distributed computing, remote manipulation of instruments, and distributed collaborations, initially will be rolled out within the context of PACI. Once successfully deployed, their use should expand as they are adopted by the broader community, including the Computer Science and Computational Science Research Grant recipients and others with whom PACI interacts. In turn, the availability of these tools and the advanced networks associated with them directly leverages NSF's other investments in the research of its grantees by enabling researchers to work more efficiently and to accomplish feats that might otherwise be unattainable.
NSF's Role in an Advanced Research Infrastructure
In acknowledging its role in supporting the nation's advanced research infrastructure, NSF needs to consider the following:
During the workshop, short presentations were given by a dozen individuals covering the role of high performance networks in supporting specific PACI initiatives. By the second day, these discussions had evolved to a focus on the themes of end-to-end issues, middleware, and evolving paradigms. The following sections describe the outcomes and conclusions of those discussions.
Workshop Discussions
1. Enabling New Paradigms within Computational Science
One of the more important discussion areas at the workshop was the question of how an advanced research infrastructure - particularly an advanced networking infrastructure - can change the fundamental character of research. The consensus was that networks will enable persistent information structures independent of physical location and serve as "creators of communities".
Participants cited numerous examples of potentially revolutionary approaches to scientific research and data analysis, including cases involving remote instrumentation and virtual machine rooms. These efforts are preliminary, yet they promise to be widely applicable and replicable across scientific disciplines. PACI's responsibility as an early adopter extends to providing leadership and guidance to less technologically advanced science disciplines as they incorporate information technologies into their own communities.
Since we are engaged in a dynamic cycle of advanced infrastructure leading to advanced research, requiring even more advanced commercial and R&E infrastructures, it is critical that linkages be established with private sector technology users and suppliers. This will help with steering commercial offerings toward directions appropriate to the R&E community's needs and also will assist in driving down costs by establishing early demand for certain technologies and services such as wireless and multicast networking capabilities. Augmenting campus infrastructures with new technologies through partnerships with commercial firms is another way to provide real-world prototypes that will assist in moving emerging technologies into the marketplace.
Evolution toward new paradigms for doing science within PACI and the R&E community as a whole is dependent on the presence of persistent, robust high performance networks and efficient, relatively transparent middleware to coordinate the networks and the resources they connect. PACI is committed to facilitating this access in support of CISE's goals, specifically to:
2. Improving End-to-End Networking for PACI Researchers
a. Encourage the development of enhanced campus infrastructures - Participants concluded that continuation of an NSF High Performance Connection program was in the best interest of the nation's R&E sector, but that any new program should include consideration of ...
Local campuses often have complex organizational and political environments. Researchers involved in PACI may require assistance in educating campuses and their primary funding agencies about the reasons for, and opportunities associated with, upgrading their campus' data infrastructures. It would help to have PACI leadership identify areas in which the program could serve as an advocate or broker for researchers, in order to:
Campuses and funding agencies also need to focus on enhancing the "soft" side of the campus networking infrastructure, and to develop programs and innovative solutions to address critical labor shortage problems while augmenting the general support infrastructure (LANs, WANs, applications) available to researchers.
b. Expand the range of facilities connected via advanced networks - PACI's focus on distributed systems and remote control of instruments is changing the ways science is conducted. Astronomers' remote use of the Very Large Array radio telescope and neuroscientists' use of the NCMIR electron microscope for real-time steering of instruments, compute-intensive calibration, deconvolution, mosaicing, visualization, and scientific analysis are illustrative of these new opportunities. Examples of similar remote instrumentation cases exist across PACI and include disciplines such as environmental science and oceanography. Distributed collection and processing of data have similar demands, as illustrated by regional atmospheric research monitoring/modeling efforts at the University of Oklahoma and the monitoring/analysis efforts of the Long-Term Ecological Research (LTER) community. Given the emerging opportunities provided through networking these facilities with remote users, NSF should take steps to enhance researchers' access to other critical facilities via the network, including datacenters, large instrumentation facilities, cluster resources, and museums.
c. Develop and deploy tools for end-to-end management - Although tools exist for measuring and managing the operation of the network backbone, there is a dearth of tools or feedback mechanisms for analyzing end-to-end performance across the network. Users are particularly in need of tools and mechanisms that give indications of performance before and during execution of major applications or activities, such as transfer of a massive file across the network.
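As an illustration of the kind of pre-execution indication described above, the following is a minimal sketch, not a tool discussed at the workshop. It assumes that a few recent end-to-end throughput samples for the path are already available from some measurement probe; the function names and the sample values in the usage note are hypothetical. The line rates are the standard nominal rates for the circuit types named earlier in this report.

```python
from statistics import median

# Nominal line rates in bits per second for the circuit types named in this report.
LINE_RATE_BPS = {
    "T1": 1.544e6,
    "T3": 44.736e6,
    "OC48": 2488.32e6,
    "OC192": 9953.28e6,
}


def estimate_transfer_seconds(file_bytes, throughput_samples_bps):
    """Estimate transfer time from recent end-to-end throughput samples.

    The median is used so that one unusually fast or slow probe
    does not dominate the estimate.
    """
    if not throughput_samples_bps:
        raise ValueError("need at least one throughput sample")
    typical_bps = median(throughput_samples_bps)
    if typical_bps <= 0:
        raise ValueError("throughput must be positive")
    return file_bytes * 8 / typical_bps


def fraction_of_line_rate(throughput_samples_bps, link_name):
    """Fraction of a link's nominal rate that the measured path delivers."""
    return median(throughput_samples_bps) / LINE_RATE_BPS[link_name]
```

For example, `estimate_transfer_seconds(10**9, [80e6, 100e6, 120e6])` estimates how long a one-gigabyte file would take on a path that recently delivered about 100 Mbps end to end. A low value from `fraction_of_line_rate` suggests the bottleneck lies somewhere other than the backbone, for instance on the campus network or at the end host.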
Active and passive traffic measurement tools are a priority, and should include appropriate feedback loops for
Tools providing this level of granularity can assist in establishing baselines for expected performance, identifying the location of poor performance, and assisting with validation of Service Level Agreements (SLA). If possible, measurement initiatives should...
Performance-tuning tools, as another class of requisite end-to-end management tools, should be made more widely available, and campus network engineers should be trained to use them. Emerging tools should aim to simplify the task of tuning sophisticated applications and hardware for optimal traffic performance, with the goal of making this effort as transparent to the users as possible.
End-to-end trouble tracking persists as a problem for the end-user. Individual points of authority and responsibility need to be identified. The responsible party should assist in diagnosing the problem and identifying its point of origin (e.g., backbone, local access provider, or campus). This party should then track the problem until it is resolved, providing feedback to users during the process as appropriate. NLANR (www.nlanr.net) is funded by NSF to assist in this type of user-oriented service within the High Performance Connection community. However, campus network officials should continue to have primary responsibility for intra-campus issues.
d. Other end-to-end issues - Other discussions focused on techniques for improving accessibility of scientific data by users. Deployment of cache hierarchies, for example, was a technique recommended to deal with issues of scale relating to massive scientific data sets. Intelligent access and temporary storage of key data will enhance their accessibility by remote researchers and reduce the load on backbone networks.
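The effect of a cache hierarchy on backbone load can be sketched as follows. This is an illustrative model only, assuming a two-level arrangement (campus caches in front of a shared regional cache) that the workshop did not specify; the class name, the `fetch_from_origin` callback, and the absence of any eviction policy are simplifying assumptions.

```python
class CacheLevel:
    """One level in a cache hierarchy, e.g. a campus or regional cache."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # next level toward the origin, or None
        self.store = {}        # key -> data held at this level (no eviction)
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch_from_origin):
        """Return the data for key, filling this level on the way back."""
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        if self.parent is not None:
            data = self.parent.get(key, fetch_from_origin)
        else:
            data = fetch_from_origin(key)  # the only step that crosses the backbone
        self.store[key] = data
        return data
```

If two campus caches share one regional parent, a data set requested first at one campus and then at the other is fetched from the origin repository only once; the second campus is served by the regional cache, and repeat requests at either campus are served locally.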
Efforts should be made to raise academia's overall sensitivity to the critical role that campus networks play in connecting researchers to the broader R&E community. Suggestions for consideration by NSF included:
3. Expanding PACI's Middleware Offerings
For the purposes of these discussions, "middleware" is defined as the software below actual applications and above the transport protocols of a specific network. Development and use of middleware that enhances user control over applications, while making the network itself relatively transparent to users, is vital to realizing the potential of advanced networks for supporting disciplinary research.
PACI metasystems thrusts, particularly the Globus and Legion software initiatives and the Grid, are important steps in this direction. Further efforts are also needed to make the geographical locations of specific resources, such as supercomputers, datacenters, and instruments, transparent to users. Scheduling software, which permits users to access remote resources as system capabilities change on a minute-to-minute basis, is critically important. Ideally, economic models should be incorporated into these metasystems wares, permitting users to trade off use of multiple remote resources depending on availability, cost, or other considerations.
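The trade-off among remote resources described above can be sketched with a simple scoring rule. This is a hypothetical illustration and does not represent how Globus, Legion, or any PACI scheduler works; the `Resource` fields, the `choose_resource` function, and the `minutes_per_unit` exchange rate between waiting time and cost are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    available: bool
    wait_minutes: float    # estimated queue wait before the job starts
    cost_per_hour: float   # charge in arbitrary allocation units


def choose_resource(resources, job_hours, minutes_per_unit=1.0):
    """Pick the available resource with the lowest combined score.

    score = cost of the job + waiting time converted to cost units,
    where minutes_per_unit is how many minutes of waiting the user
    would accept to save one unit of cost.
    """
    candidates = [r for r in resources if r.available]
    if not candidates:
        return None

    def score(r):
        return r.cost_per_hour * job_hours + r.wait_minutes / minutes_per_unit

    return min(candidates, key=score)
```

A user in a hurry would set `minutes_per_unit` low, so that a short queue outweighs a higher charge; a user with a limited allocation would set it high and accept a longer wait on a cheaper machine. Because availability and wait estimates change from minute to minute, such a choice would need to be re-evaluated at submission time.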
Many of the challenges inherent in middleware development are technical; others are sociological and cultural. The latter should be addressed within the framework of national, state, and local campus policy dialogues in order to foster convergence and consensus. This is particularly important and timely given the increasing priority on campus-specific middleware and services such as DNS, multicast, and security. PACI should assume the role of computational sciences advocate and proactively foster relationships among designers of network components and designers of middleware and applications, as well as with important communities, such as the Digital Libraries community. NSF could assist in this process by facilitating campus development, by encouraging invention and technology sharing and transfer programs among universities, and by helping to resolve policy issues that may inhibit campuses' abilities to take full advantage of opportunities associated with emerging information technologies.
The Partnerships for Advanced Computational Infrastructure (PACI) program was established in 1996 within the Advanced Computational Infrastructure and Research (ACIR) division of the National Science Foundation (NSF) Computer and Information Science and Engineering (CISE) directorate. Both PACI and CISE's Advanced Networking Infrastructure and Research (ANIR) NSFnet Program are intended to support cross-disciplinary, cross-directorate infrastructure requirements - with PACI focused on computational science infrastructure and ANIR focused on the networking infrastructure. Together, these programs provide the underpinning for many of the nation's scientific research endeavors. As of early 1999, PACI's National Computational Science Alliance (Alliance) and National Partnership for Advanced Computational Infrastructure (NPACI) programs consisted of 250 academic and commercial organizations and almost 600 projects.
The workshop, Future Scenarios - NSF Networking Research and Associated Infrastructure Support - PACI, was held at the Sea Lodge Hotel, La Jolla, CA on March 4-5, 1999. Dr. Sid Karin, Director, SDSC and NPACI, and Dr. Larry Smarr, NCSA and Alliance, chaired the event. Thirty-one (31) individuals from 19 institutions participated in the discussions.
George Brett, NLANR (ghb@nlanr.net)
James Brunt, UNM/LTER (jbrunt@sevilleta.unm.edu)
Charlie Catlett, NCSA (catlett@ncsa.uiuc.edu)
Andrew Chien, UCSD (achien@cs.ucsd.edu)
Dick Crutcher, U of Illinois (crutcher@astro.uiuc.edu)
Tom DeFanti, UIC (tom@eecs.uic.edu)
Roscoe C. Giles, Boston U (roscoe@bu.edu)
Mark Ellisman, UCSD (mhellisman@ucsd.edu)
Andrew Grimshaw, UVA (grimshaw@virginia.edu)
Douglas Van Houweling, Internet2 (DVH@Internet2.edu)
Ron Hutchins, Georgia Tech (ron.hutchins@oit.gatech.edu)
David Jahns, University of Oklahoma (djahn@ou.edu)
Lennart Johnsson, UH (johnsson@cs.uh.edu)
Sid Karin, SDSC (skarin@sdsc.edu)
Ken Klingenstein, CU (ken.klingenstein@colorado.edu)
Bill Lennon, LLNL (wjlennon@llnl.gov)
Tracie Monk, SDSC (tmonk@caida.org)
Klara Nahrstedt, UIUC (klara@cs.uiuc.edu)
Larry Smarr, NCSA (pls@ncsa.uiuc.edu)
Bill St. Arnaud, Canarie (bill.st.arnaud@canarie.ca)
Peter Taylor, SDSC (taylor@sdsc.edu)
Doug Toussaint, U. of Arizona (doug@physics.arizona.edu)
Michael Vildibill, SDSC (mikev@sdsc.edu)
Glen Wheless, Old Dominion (wheless@ccpo.odu.edu)
Paul Woodward, U. of Minnesota (paul@lcse.umn.edu)
NSF:
Bob Borchers (rborchers@nsf.gov)
Javad Boroumand (jborouma@nsf.gov)
Aubrey Bush (abush@nsf.gov)
Bill Decker (wdecker@nsf.gov)
Steve Elbert (selbert@nsf.gov)
Don Mitchell (dmitchel@nsf.gov)
Additional information on PACI is available from NSF at http://www.cise.nsf.gov/acir/. Details on the Alliance program are available on the NCSA homepage, http://www.ncsa.edu/; details on the NPACI program are available at http://www.npaci.edu/. NSF's Advanced Networking Infrastructure division projects are described at http://www.cise.nsf.gov/anir/. The agenda and presentations from this meeting are posted at http://www.npaci.edu/post-vBNS/PACI/
Last updated 2 April 1999
For comments or questions, contact tmonk@caida.org