Internet Traffic Flow Study Underway

January 9, 1998

By Jon Bashor, jbashor@lbl.gov

For all the Internet users who wonder why e-mail sometimes bogs down or why a favorite website takes an eternity to load up, a new effort by computer scientists in Berkeley and Pittsburgh may start turning up some answers.

Funded by the National Science Foundation, the year-long pilot study of the Internet's structure and performance could serve both to troubleshoot problems and to eventually give users a means to rate Internet services and providers. The project is led by Vern Paxson of the Lab's Computing Sciences Network Research Group and by scientists at the Pittsburgh Supercomputing Center.

Researchers will place computers at various locations on the Internet to automatically measure network performance between stations. This effort is intended to provide the foundation for a much larger study of the Internet infrastructure.

Berkeley Lab alone participates in about 500,000 connections every day, as scientists rely increasingly on computer networks for scientific collaboration. For example, Internet traffic has grown from a mere 45 web connections in October 1992 to more than ten million last month, Paxson notes.

One of the biggest problems with the Internet today is congestion. The net's worst problem, however, also reflects one of its main strengths. The myriad data connections in the network all share the available capacity, rather than each reserving capacity individually, as telephone connections do. This sharing makes for a much more efficient use of the network, but it also leads to traffic overload.

The data for a particular connection travels along a series of links, each step being termed a "hop," with twenty or more hops in an end-to-end chain. "The network is very good at hiding the individual hops a message takes from the end users," Paxson says. "Links go down all the time and packets have to be re-routed, which happens automatically."
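To make the idea of a "hop" concrete, here is a minimal Python sketch in the spirit of a traceroute: it sends probes with a steadily increasing time-to-live, so each router along the path identifies itself when it discards an expired probe. This is illustrative only, not the project's measurement software; it requires root privileges for the raw ICMP socket, and the destination host is just an example.

# Minimal traceroute-style sketch (illustrative, not the Lab's tool).
# Each router that discards an expired probe reports back via ICMP,
# revealing one "hop" on the path. Requires root for the raw socket.
import socket

def trace_hops(dest_name="www-nrg.ee.lbl.gov", max_hops=30, port=33434):
    dest_addr = socket.gethostbyname(dest_name)
    for ttl in range(1, max_hops + 1):
        recv_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW,
                                  socket.getprotobyname("icmp"))
        send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                                  socket.getprotobyname("udp"))
        send_sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        recv_sock.settimeout(2.0)
        recv_sock.bind(("", port))
        send_sock.sendto(b"", (dest_addr, port))
        try:
            _, addr = recv_sock.recvfrom(512)  # ICMP "time exceeded" or "port unreachable"
            hop = addr[0]
        except socket.timeout:
            hop = "*"                          # no reply within the timeout
        finally:
            send_sock.close()
            recv_sock.close()
        print(f"{ttl:2d}  {hop}")
        if hop == dest_addr:                   # reached the destination host
            break

if __name__ == "__main__":
    trace_hops()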

Within the Internet research community, the web is known as a "success disaster"--something that was so useful that it grew faster than the system could accommodate. The underlying problem is that the web, a huge hypertext database, is poorly designed to run on the global Internet, Paxson says. As the popularity of the web grows, the problems loom ever larger.

"We can build systems with hundreds of things and they work fine, but when we scale them up to hundreds of thousands, they break down, and it's hard to predict where they will fail." he says. "Users may have the perception that something is overloaded when it's actually misengineered."

Without measurement, it can be impossible to tell where the problem lies. One of the basic measurements provided by Paxson and his Pittsburgh colleagues is the rate at which data packets are discarded, or "dropped," by the network--something that directly affects performance.
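The sketch below shows, in broad strokes, how such a drop rate can be estimated between two cooperating measurement stations: one side sends numbered UDP probes, the other counts which sequence numbers never arrive. It is a hypothetical illustration, not the study's actual code; the port, probe count and pacing are made-up parameters.

# Hypothetical sketch of loss measurement between two stations: the sender
# transmits numbered UDP probes, the receiver estimates the drop rate from
# the sequence numbers that never arrive. Port and probe count are made up.
import socket
import sys
import time

PORT, N_PROBES = 9999, 1000

def send_probes(receiver_host):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for seq in range(N_PROBES):
        sock.sendto(str(seq).encode(), (receiver_host, PORT))
        time.sleep(0.01)        # pace the probes so we measure the path, not ourselves
    sock.close()

def receive_probes(duration=30.0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", PORT))
    sock.settimeout(1.0)
    seen = set()
    deadline = time.time() + duration
    while time.time() < deadline:
        try:
            data, _ = sock.recvfrom(64)
            seen.add(int(data.decode()))
        except socket.timeout:
            continue
    lost = N_PROBES - len(seen)
    print(f"received {len(seen)}/{N_PROBES} probes; "
          f"estimated drop rate {lost / N_PROBES:.1%}")

if __name__ == "__main__":
    if len(sys.argv) > 2 and sys.argv[1] == "send":
        send_probes(sys.argv[2])   # run "send <receiver-host>" on one station
    else:
        receive_probes()           # run with no arguments on the other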

Information on the net is sent via IP--the Internet Protocol. When too many data packets come in, routers hold them in buffers until traffic thins out. When traffic is too heavy, the buffers fill up and the router drops packets. IP provides the basic building block, which says, "You hand me a packet and I'll do my best to deliver it," Paxson explains. "But no promises--it may arrive as expected, it may get duplicated along the way, it may be corrupted or it may arrive out of order. Although that sounds like a weakness, it's actually a strength. It's very cheap to provide connectivity because you promise very little."

On top of the IP layer is the Transmission Control Protocol, or TCP. This is the mechanism that ensures that messages do in fact get there.
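The contrast shows up directly in how programs use the two layers. In the Python sketch below (the host name is only a placeholder), a UDP socket hands datagrams straight to IP's best-effort delivery with no promises, while a TCP socket carries the same kind of packets but adds acknowledgment, retransmission and in-order delivery underneath a single send call.

# A brief sketch of how the two layers look to an application. The host
# and port are placeholders, not part of the study.
import socket

HOST, PORT = "example.org", 80

# UDP: the datagram is handed to IP and may be lost, duplicated or reordered;
# the sender is never told either way.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"hello, best effort", (HOST, PORT))
udp.close()

# TCP: the bytes still travel as IP packets, but the protocol retransmits
# dropped packets and reassembles them in order, so the application sees a
# reliable byte stream (or an explicit error).
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect((HOST, PORT))
tcp.sendall(b"HEAD / HTTP/1.0\r\nHost: example.org\r\n\r\n")
reply = tcp.recv(4096)
tcp.close()
print(reply.decode(errors="replace"))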

"On the web, when you do a data transfer and you get the desired item, the transfer is finished," Paxson says. "When you're using a browser like Netscape or Explorer and it stops, that means a packet was dropped. But TCP isn't giving up, it's really backing off because there's a problem. When you get a message back saying something can't be found, your packets did get through, but there's a problem with another part of the system."

Paxson's ultimate goal is to give Internet users a method whereby they could click on a button and find out the root of the problem, and in the process contribute to the measurement database. While most users know about the typical problems, Paxson expects to find the unexpected.

Lab helps build a better Internet

For more than a decade, computer scientists at Berkeley Lab have analyzed problems with the Internet, then come up with technological solutions to fix them. In fact, many of the technical standards now used to keep the Internet working smoothly were invented here at the Lab.

In 1985, ARPANET (the forerunner of the Internet) was collapsing under data congestion. Berkeley Lab computer scientists developed several Transmission Control Protocol algorithms to prevent such collapses. The algorithms became mandatory standards in 1989.

In 1986, the network was plagued by routing instability among its many interconnected systems, and many transmissions were lost as a result. Lab experts developed a tool to trace the route data packets take through the network, allowing routing problems to be pinpointed and corrected.

As the Internet and the web gained wider exposure, users began to look for new applications. The idea of using the Net to transmit both audio and video was dismissed by many as impossible. The Network Research Group at Berkeley Lab saw it as a challenge and helped create Mbone, a multicast virtual network running over the Internet. Today, more than 500,000 people around the world routinely use the technology to conduct meetings, seminars and workshops. Protocols developed here at the Lab are now Internet standards.

For more information about the Lab's Network Research Group, visit its website at: http://www-nrg.ee.lbl.gov/.
