Thursday, November 19, 2009

Thursday Keynote

Today, the thrust area highlighted in the keynote is Sustainability. The speaker will be former Vice President Al Gore. Right now we are hearing from the SC '10 chair inviting us all to New Orleans next year.

Technology thrusts for '10 include Climate Simulation, Heterogeneous Computing (e.g. combining x86 processors with GPU processors), and Data Intensive Computing.

SC '10 will have more exhibitor space than ever and 300+ Gbps of network connectivity.


Now up, Al Gore.

"I used to be the next President of the United States." *laughter* "I don't think that is funny."

Al is a "recovering politician", talking about working to help establish funding for supercomputing centers and the Internet during his political career. AL GORE INVENTED SUPERCOMPUTERS. Al Gore was actually present at the first SC conference in 1988 in Orlando, FL.

After a few more funny stories about life after politics (people telling him he looks just like Al Gore at restaurants), and an apology because he was supposed to attend in 1997 but had to fill in for President Clinton at some other event, he is now talking about how important the work we are doing is. Technology is important for the discoveries it enables, but also so that developing countries can skip the pollution-intensive phase developed countries have gone through. More urban development will occur in the next 35 years than has occurred in the history of civilization. Modeling will help make these new urban areas more sustainable and less automobile-centric.

Supercomputing was a transformative development, similar to the invention of the telescope or other revolutionary scientific instruments.

90 million tons of global warming pollution go into the atmosphere every day, and we now know how thin our atmosphere really is. It is not as vast as it appears from our perspective; images from space have confirmed it is a very thin shell and not impervious to our actions. Oil has been the dominant source of energy since the world's first oil well was drilled in Pennsylvania in 1859. In the last 100 years the world's population has quadrupled, and it should stabilize at ~9 billion around 2050.

The climate change crisis is actually the second major climate crisis; the first was the effect of CFCs on the ozone layer and the large hole that developed over Antarctica. One year after the discovery of the hole, a treaty was developed to phase out the chemicals responsible. It was criticized as too weak, but tougher and tougher standards were set as alternatives were developed, and now it has been a success. The same holds true for climate change: any politically acceptable treaty will not be enough, but it will be a start. We need to introduce systematic changes that produce much higher levels of efficiency, and supercomputing will be key to replacing terribly inefficient technology such as the internal combustion engine. If you measure the energy used to move a person with an automobile, the car is about 99.92% inefficient once you compare how much energy actually goes into moving the passenger with how much goes into moving the vehicle or is lost to other inefficiencies. As these technologies are developed it will start to make business sense to move to less polluting technology. Coal-fired generation is also inefficient: only about a third of the potential energy becomes electricity, and the rest is lost mostly as waste heat. Solar and wind need supercomputing too.
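
To put that efficiency figure in context, here is a rough back-of-envelope sketch. The engine efficiency and masses below are illustrative assumptions of mine, not numbers from the keynote:

```cpp
#include <cstdio>

// Rough back-of-envelope: what fraction of fuel energy actually moves the
// passenger? The numbers below are illustrative assumptions, not Gore's.
int main() {
    const double engine_efficiency = 0.20;   // assume ~20% of fuel energy reaches the wheels
    const double passenger_mass_kg = 80.0;   // assumed passenger mass
    const double vehicle_mass_kg   = 1800.0; // assumed curb weight of a typical car

    // Fraction of the moving mass that is actually the passenger.
    const double passenger_fraction =
        passenger_mass_kg / (vehicle_mass_kg + passenger_mass_kg);

    // Fraction of fuel energy spent moving the passenger.
    const double useful_fraction = engine_efficiency * passenger_fraction;

    std::printf("Moving the passenger: %.2f%%\n", useful_fraction * 100.0);
    std::printf("Wasted:               %.2f%%\n", (1.0 - useful_fraction) * 100.0);
    // Prints roughly 0.85% useful / 99.15% wasted; counting idling, braking,
    // and other losses pushes the number toward the ~99.9% range quoted above.
    return 0;
}
```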

Moore's Law was a self-fulfilling expectation, not a law. It was anticipated how much of a revolution this growth in computational power would be, so R&D resources were allocated to sustain the growth. By next year there will be more than 1 billion transistors per person in the world. We can transform our energy system by recognizing the potential and making the necessary R&D investments. As long as we send hundreds of billions of dollars every year to other countries for energy, we are vulnerable. We need to stay the course; after the drop in fuel prices, people no longer see it as much of a crisis. They are in a temporary trance.

Supercomputing vastly expands our ability to understand complex realities. How do we, as human beings, individually relate to the powerful tool of supercomputing? The 3D internet (or, as the CTO of Intel told us, more accurately the 3D Web) will revolutionize the interface between humans and incredibly powerful machines. Biocomputing will revolutionize the treatment of disease. Modeling will revolutionize alternative energy.

Climate modeling will help develop political consensus to act quickly to solve the crisis. Without action, climate change can threaten human civilization. It is a planetary emergency.

short advertisement:
repoweramerica.org, 100% of proceeds from the sale of his new book will be donated to environmental causes

The environmental crisis does not trigger our instinctive fear responses to a threat; it requires reasoning. Since the distance between causes and consequences is so long, we need to make the global scale and speed of this crisis obvious. Climate modeling and supercomputing will be critical.

We are not operating in a sustainable manner; understanding that and responding to it is the challenge of our time.

Solar is promising; the key is driving scalability and reductions in cost (a Moore's Law for solar technology). We need a distributed energy architecture. Solar, wind, geothermal, and biomass are all important, as is new nuclear technology, though its global scalability is limited by the need to control nuclear proliferation. Electric vehicles can act as a distributed battery for peak electricity loads.

Reducing deforestation is also an important challenge. Deforestation is responsible for significant CO2 emissions, and the loss of genetic information (biodiversity) matters as well. Selling rain forest for wood is like selling a computer chip for the silicon: the real value is the biodiversity (drugs, biofuels, etc.).

We are challenged to apply our skills to build a consensus to move forward quickly to solve the climate crisis.

Wednesday, November 18, 2009

Data Challenges in Genome Analysis

COMING SOON

The 3D Internet

Apologies Dave/Keith - I am definitely not as thorough blogging these things as Gregg.


One of the "thrust areas" for SC '09 was "the 3D internet", and this was the focus of the opening address Tuesday morning. The speaker was Justin Rattner (CTO Intel), but before Justin could take the stage we were addressed by the SC '09 chair, Wilf Pinfold, gave us a run down of the conference.

Despite some worries about the economic downturn impacting the conference, the numbers were still strong. Wilf informed us that all 350+ booths were sold and that there are 265,000+ square feet of exhibition space. 204 miles of fiber were run throughout the conference center, and with the support of local internet providers, 400 Gbps of internet connectivity was provided to the conference center for the duration of SC '09. This year there was a 22% acceptance rate for technical papers, highlighting that this is a premier conference in high performance computing. This SC also made an effort to be more sustainable: it is the first SC held in a LEED-certified conference center, all plastic is plant-based and biodegradable, and plastic water bottles have been replaced by water coolers. Also, the conference center has recycling bins throughout, which appears to be the norm for Portland.

So on to Intel CTO Justin Rattner and the 3D Web. Justin noted how revenue in HPC has been growing very slowly (almost flat?), and that the field relies mostly on government funding for R&D and on "trickle-up" technology from the consumer space (e.g. Intel/AMD processors, GPUs, ...). He thinks that the 3D Web can be the "killer app" for HPC and really help propel it to the next level. The 3D Web will be continuously simulated, multi-view, immersive, and collaborative. A demo was given of ScienceSim, specifically Utah State's FernLand, which is a fern lifecycle and population genetics simulation. ScienceSim is based on OpenSim, and the idea is to standardize this technology so we have interoperability between virtual worlds. After a virtual interaction with the FernLand researcher in ScienceSim, Justin welcomed Shenlei Winkler of the Fashion Institute of Technology on stage.

Shenlei discussed how the fashion industry is fairly unique in that it is a >$1 trillion industry yet has avoided heavy computerization. At FIT they are exploring using OpenSim to replace the traditional workflow, where a designer comes up with a design, it is sent to a factory in another country where a prototype run is made and shipped back for evaluation, and this is repeated as the design is refined. With the 3D Web they can do this iterative process in a virtual world, cutting time, cost, and environmental impact. She said one of the important developments would be cloth simulation, so designers can see how different fabrics drape and flow. Justin showed some impressive demonstrations of cloth physics simulations that were computationally intensive (~6 minutes per frame on a small cluster; much more HPC power will be needed for real-time cloth physics in virtual worlds).
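
For a sense of how far that offline demo is from real-time interaction, here is a quick calculation. The 30 frames-per-second target is my assumption, not a figure from the talk:

```cpp
#include <cstdio>

// How far is ~6 minutes per frame from real-time cloth simulation?
// The 30 fps target is an assumption, not a figure from the talk.
int main() {
    const double seconds_per_frame_now = 6.0 * 60.0; // ~6 minutes/frame on the demo cluster
    const double target_fps = 30.0;                   // assumed real-time target
    const double seconds_per_frame_target = 1.0 / target_fps;

    const double required_speedup = seconds_per_frame_now / seconds_per_frame_target;
    std::printf("Speedup needed for real-time: ~%.0fx\n", required_speedup);
    // Prints ~10800x, ignoring parallel-scaling overheads and quality trade-offs.
    return 0;
}
```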

Justin wrapped things up with a demonstration of a system with an Intel "Larrabee" coprocessor. Matrix mathematics was offloaded to this coprocessor, achieving 1 teraflop on a single over-clocked chip; supposedly this is the first time 1-teraflop performance has been achieved on a single chip. Programming tools also need to improve to take advantage of this type of architecture, and Ct was discussed. Ct is a high-throughput, C++-based language that takes some of the effort of programming this type of architecture off the shoulders of the programmer.
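
I can't show real Ct or Larrabee code here, but to give a flavor of the kind of kernel being offloaded and what a teraflop buys you, here is a plain single-threaded C++ sketch of a naive matrix multiply with a FLOP count. This is just an illustration; it is not Ct syntax and does not run on Larrabee:

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Naive single-threaded matrix multiply, just to put "1 teraflop" in context.
// Plain C++ for illustration only; this is not Ct and not Larrabee code.
int main() {
    const int n = 512;
    std::vector<float> a(n * n, 1.0f), b(n * n, 1.0f), c(n * n, 0.0f);

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i)
        for (int k = 0; k < n; ++k) {
            const float aik = a[i * n + k];
            for (int j = 0; j < n; ++j)
                c[i * n + j] += aik * b[k * n + j];
        }
    const auto stop = std::chrono::steady_clock::now();

    const double seconds = std::chrono::duration<double>(stop - start).count();
    const double flops = 2.0 * n * n * n; // one multiply + one add per inner iteration
    std::printf("Naive CPU matmul: %.2f GFLOP/s\n", flops / seconds / 1e9);
    // This 512x512 multiply is ~0.27 GFLOP of work; a chip sustaining 1 teraflop
    // could in principle finish it in well under a millisecond. The point of a
    // language like Ct is to express the data parallelism so the compiler and
    // runtime can map it onto that kind of hardware.
    return 0;
}
```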

Tuesday, November 17, 2009

FOR THOSE THAT DOUBT ME


Here I am with the father of Beowulf computing, Donald Becker. Don wrote a lot of the early Ethernet drivers for Linux and invented the concept of Beowulf computing. This would be like having your picture taken with Henry Ford if you were in the automotive business. I think I was supposed to hold him up for a keg stand of some special beer that a "secret" Portland-based HPC-related company (cough *Portland Group* cough) commissioned for the Beobash. The only problem is that Josh from Penguin Computing, the other party involved in this task, had to leave.

Anyway, we are having a TORQUE meeting tomorrow after the Top 500 announcement. If for some crazy reason anyone reading this is at SC '09, meet me at the Top 500 BoF on Tuesday, November 17th, and you can talk about TORQUE with me and other users/developers at a venue to be determined.

Monday, November 16, 2009

SC '09

I'm here at SC '09. I got into my hotel last night after a long trip from Bangor, Maine. Today I stopped by the convention center and took care of my registration, and then checked out a little bit of downtown via the MAX light rail (the convention center and downtown area are in the "free zone", so it is very convenient). The large exhibition (~255,000 square feet according to the website) opens tonight at 5:00 PM, and the technical program kicks off tomorrow morning, so I should have some more interesting things to blog about soon. I still need to plan my schedule for the next few days to make sure I get to all of the talks that look interesting.

On a side note, I know Portland, ME is much smaller than Portland, OR and this wouldn't really be feasible, but how cool would it be to have light rail connecting the Old Port, Downtown, PWM, the Amtrak station, the Maine Mall, etc., with spurs out to surrounding communities (South Portland, Scarborough, Saco, Westbrook, Yarmouth, Freeport, etc.)? It would make greater Portland a very "green" city.

Monday, June 15, 2009

A Rational Design Process

In grad school I took a software engineering course that was based largely on reading classic software engineering papers and discussing them. Occasionally we also implemented some of the example systems described in the papers (for example, we implemented the simulator described in Leslie Lamport's Time, Clocks, and the Ordering of Events in a Distributed System and the KWIC index system described in David Parnas's On the Criteria To Be Used in Decomposing Systems into Modules), but actual coding was not the focus of the course.

One of the classic papers that I remember most from this course is A Rational Design Process: How and Why to Fake It by David Parnas and Paul Clements. This is one of the first classic software engineering papers (before agile programming, extreme programming, or whatever the latest buzzword fad is) in which the authors acknowledge that a formal software design process will always be an idealization. We had spent much of the semester reading about the idealized process, including the aforementioned On the Criteria... paper, and this paper really brought out some interesting thoughts during our discussions.



Among the many examples Parnas and Clements give of why the process is an idealization is that, in most cases, the people who request the software system in the first place do not know exactly what they want and are unable to communicate everything that they do know. I think most software engineers with any experience trying to use a formal development process can attest to this observation.

The authors also note that even if the engineers knew all of the requirements up front, other details needed to implement a system that satisfies the requirements do not become known until the implementation is already underway. Had the engineers known this information up front, it is likely that the design would be different. One of the most important points made is that even if we did know all of the facts up front, human beings are unable to fully understand all of the details that must be taken into account in order to design and build a correct (and ideal) system.

The process of designing software, according to Parnas and Clements, is a process by which we attempt to separate concerns so that we are working with manageable amounts of information at one time. However, the catch-22 is that until we have separated concerns, we are bound to make errors due to the complexity of the system. Furthermore, as long as there are humans involved there will be human error, even after the concerns have been separated by the design process.

Parnas and Clements make several other points including the fact that even the most trivial projects are subject to change due to external reasons, and these changes can invalidate design decisions or make decisions non-ideal given the new circumstances.

Sometimes we use a project to try out a new idea - an idea that may not be derived from the requirements in a rational process.

However, despite the fact that the rational process is unrealistic, it is still useful. It can be used to drive the project, but we should expect deviations and backtracking. In the end, the authors argue that we should "fake it" and produce quality documentation written as if the final design of the system had been reached by following the ideal rational process. We identify what documents would have been produced had we followed the ideal process, and attempt to produce those documents in the proper order. Any time information is not known, we make a note of it in the documentation where that information would have gone and move on, treating it as information that is expected to change. Any time a design error is found, all of the documentation (including documents made in "previous steps" of the process) must be corrected. No matter how many times we backtrack, or how often some external influence changes the requirements, the documentation is updated, and in the end the final documents are rational and accurate.

Documentation plays a major role in the process, yet many times we don't seem to get it right.

Most programmers regard documentation as a necessary evil, written as an afterthought only because some bureaucrat requires it. They don't expect it to be useful.

This is a self fulfilling prophecy; documentation that has not been used before it is published, documentation that is not important to its author, will always be poor documentation.

Most of that documentation is incomplete and inaccurate but those are not the main problems. If those were the main problems, the documents could be easily corrected by adding or correcting information. In fact, there are underlying organisational problems that lead to incompleteness and incorrectness and those problems ... are not easily repaired [1]


I think this classic paper is a must-read for anyone trying to implement a strict waterfall process (or trying to impose such a process on their underlings). Despite being written over 20 years ago, it seems like there are still lessons to be learned.

[1] Parnas, D. L. and Clements, P. C. 1986. A rational design process: How and why to fake it. IEEE Trans. Softw. Eng. 12, 2 (Feb. 1986), 251-257.