How Packet Data Built the Internet
Those of a certain age may remember the “information superhighway” as a once-popular metaphor for the internet. Dr. Robert Kahn, the inventor who pioneered that metaphor, called it the “National Information Infrastructure” before the press decided on a pithier formulation. His colleague Dr. Vint Cerf coined the term “internetting” to describe the interconnection of computer networks. Together, Kahn and Cerf invented not just the terminology of the internet but the foundational communications protocol that makes it work.
During the height of the Cold War, the United States military wanted to interconnect its computer systems into a master network that could continue functioning even if some systems were destroyed in a nuclear strike. In 1969, the Advanced Research Projects Agency Network (ARPANET) emerged from this ambition as the first wide-area computer network. Key to establishing the ARPANET was providing a reliable system of sending data across local networks.
Within a local network, data is communicated between computers by segmenting it into smaller chunks called “frames,” and then labeling the frames with the address of the intended destination. Other stations in the local network will pass along each frame, but the frame will be ignored by all except the intended target with that address.
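The frame mechanism described above can be sketched in a few lines of Python. This is a toy model rather than a real link-layer implementation: the `Frame` and `Station` classes, the eight-byte frame size, and the addresses are all assumptions made up for illustration.

```python
from dataclasses import dataclass

FRAME_PAYLOAD_SIZE = 8  # bytes per frame; kept tiny so the split is visible


@dataclass
class Frame:
    destination: str  # hardware address of the intended recipient
    payload: bytes


def make_frames(data, destination, size=FRAME_PAYLOAD_SIZE):
    """Split data into chunks, each labeled with the destination address."""
    return [Frame(destination, data[i:i + size]) for i in range(0, len(data), size)]


class Station:
    def __init__(self, address):
        self.address = address
        self.received = []

    def on_frame(self, frame):
        # Every station sees the frame, but only the addressed one keeps it.
        if frame.destination == self.address:
            self.received.append(frame.payload)


def broadcast(frames, stations):
    """Show every frame to every station on the (hypothetical) local network."""
    for frame in frames:
        for station in stations:
            station.on_frame(frame)


# Made-up hardware addresses for three stations on one local network.
stations = [Station("02:00:00:00:00:0a"),
            Station("02:00:00:00:00:0b"),
            Station("02:00:00:00:00:0c")]
message = b"hello, local network"
broadcast(make_frames(message, "02:00:00:00:00:0b"), stations)
```

After the broadcast, only the second station holds the pieces of the message; the other two saw every frame and discarded each one.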
Extending this system to connect local networks into a larger whole was possible only as long as all of the nodes in the network were known and a master control system administered the identification of each station. As a governmentally deployed system of research and military networks, ARPANET had that necessary master control. Opening the doors of that network to a wider array of civilian networks of unpredictable design and reliability, however, would require something entirely new.
In 1973, Vint Cerf, a former IBM systems engineer then teaching at Stanford University, tried sketching the problem, literally doodling on the back of an envelope while attending a conference in San Francisco. For years, Cerf had been engaged in stress testing the ARPANET to see how it performed, and failed, when connected to unruly outside networks. Knowing where the points of failure were, he now needed to figure out a solution. Cerf’s 1973 doodle laid out how known network nodes needed to connect to unknown, and potentially unknowable, nodes. He called it “internetting.”
What distinguishes the internet from a local area network is routing. On a local area network, every station is connected to every other, and every frame is visible to all of them. Within this environment, data frames move through networked devices identified by the media access control (MAC) addresses assigned to each device by the hardware manufacturer. Each device is known to the system and identified by a hardware-based identifier that stays with it wherever it goes.
For internet traffic, though, packetized data will pass through multiple network nodes—think of them as data intersections—where routing decisions will have to be made about which direction to go next on the way to the destination. Although numerous computers may be involved in the sequence of hops from one end to the other, the data does not pass through every computer in the world.
Internet traffic identifies its intended destination not by MAC address but by an Internet Protocol (IP) address. Unlike MAC addresses, IP addresses are not unique identifiers permanently given to a machine at the time of manufacture. Each one is assigned, often temporarily, to a network connection point, drawn from blocks of addresses allocated through central registries.
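Because an IP address says which network a connection point belongs to, a router can choose a direction by asking which known network contains the destination. The sketch below uses Python's standard `ipaddress` module; the forwarding table, the next-hop labels, and the `next_hop` function are hypothetical, and the "most specific network wins" rule shown here is a simplification of what production routers do.

```python
import ipaddress

# Hypothetical forwarding table: destination network -> where to send the packet.
# The address ranges are ones reserved for documentation and examples.
FORWARDING_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "deliver locally",
    ipaddress.ip_network("198.51.100.0/24"): "router B",
    ipaddress.ip_network("0.0.0.0/0"): "default gateway",  # matches everything
}


def next_hop(destination):
    """Return the next hop for the most specific network containing the address."""
    address = ipaddress.ip_address(destination)
    matches = [network for network in FORWARDING_TABLE if address in network]
    # The longest prefix is the narrowest, and therefore most specific, match.
    best = max(matches, key=lambda network: network.prefixlen)
    return FORWARDING_TABLE[best]


local = next_hop("192.0.2.17")       # falls inside the router's own network
remote = next_hop("198.51.100.5")    # belongs to a neighboring network
elsewhere = next_hop("203.0.113.9")  # unknown network, so use the default
```

Nothing in the table mentions an individual machine. The router only needs to know which network an address falls in and which neighbor is one step closer to it.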
IP addresses allow for the identification of different local area networks and can be used to direct traffic between them. Internet traffic is chunked into “packets” labeled with the IP addresses of the sender and the destination. At each node, a router reads the IP addresses on the packet to determine what is, at that very moment, the most efficient route. That will not necessarily be the same route for every packet in the same data set. At the far end, sequence numbers carried alongside the destination IP address tell the recipient how to reassemble the packets in order. The protocol also checks for errors, so that a lost or corrupted packet is detected and resent and the recipient ends up with the complete, correct data.
Collectively, that process is known as the Transmission Control Protocol (TCP), which Cerf and Kahn developed over the course of six months based on Cerf’s doodle.
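The sequence of steps described above (split, label, deliver in any order, check, resend, reassemble) can be imitated in a short Python sketch. It is not TCP itself: real TCP numbers bytes rather than packets, uses acknowledgments to discover losses, and relies on a different checksum. The `Packet` class, the ten-byte packet size, the CRC-32 check, and the `resend` callback are all assumptions chosen to keep the example small.

```python
import random
import zlib
from dataclasses import dataclass, replace


@dataclass
class Packet:
    source: str
    destination: str
    sequence: int   # position of this chunk in the original data
    payload: bytes
    checksum: int   # CRC-32 of the payload, computed by the sender


def packetize(data, source, destination, size=10):
    """Split data into numbered, addressed, checksummed packets."""
    packets = []
    for sequence, start in enumerate(range(0, len(data), size)):
        chunk = data[start:start + size]
        packets.append(Packet(source, destination, sequence, chunk, zlib.crc32(chunk)))
    return packets


def is_intact(packet):
    return zlib.crc32(packet.payload) == packet.checksum


def receive(arrived, total, resend):
    """Reassemble in order, asking resend(sequence) for anything missing or damaged."""
    good = {p.sequence: p.payload for p in arrived if is_intact(p)}
    for sequence in range(total):
        if sequence not in good:  # lost or corrupted in transit
            good[sequence] = resend(sequence).payload
    return b"".join(good[sequence] for sequence in range(total))


message = b"Packets may arrive out of order, damaged, or not at all."
sent = packetize(message, "192.0.2.1", "198.51.100.7")

in_transit = [replace(p) for p in sent]   # independent copies of each packet
random.Random(0).shuffle(in_transit)      # different routes, different arrival order
in_transit.pop()                          # one packet never arrives
damaged = in_transit[0]
damaged.payload = bytes([damaged.payload[0] ^ 0x01]) + damaged.payload[1:]  # one flipped bit

received = receive(in_transit, len(sent), resend=lambda sequence: sent[sequence])
```

Despite the shuffling, the dropped packet, and the flipped bit, `received` ends up identical to `message`, because the sequence numbers restore the order and the checksum exposes the damaged packet so it can be requested again.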
A few years later, another approach to packetizing data, the User Datagram Protocol (UDP), was developed to address circumstances like video or audio streaming, where the time-sensitive nature of the transmission leaves no room for TCP’s error-checking and resending of packets.
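To show how little ceremony UDP involves, here is a minimal sketch using Python's standard `socket` module. It sends three datagrams between two sockets on the same machine; the loopback address, the "audio frame" payloads, and the one-second timeout are assumptions for illustration only.

```python
import socket

# Receiver: bind to an operating-system-chosen port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(1.0)
address = receiver.getsockname()

# Sender: no connection setup, no acknowledgment, no retransmission.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for frame_number in range(3):
    sender.sendto(f"audio frame {frame_number}".encode(), address)

received = []
try:
    while len(received) < 3:
        datagram, _ = receiver.recvfrom(1024)
        received.append(datagram.decode())
except socket.timeout:
    pass  # a missing datagram is simply skipped, never requested again
finally:
    sender.close()
    receiver.close()
```

On a single machine all three datagrams will normally arrive, but nothing in the code depends on that. If one went missing, the receiver would carry on with whatever it had, which is the trade-off a live audio or video stream prefers over waiting for a resend.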
Cerf and Kahn’s first iteration of the TCP standard was published in December 1973. Implementing the idea at scale revealed various bugs, requiring revisions over subsequent years. By 1976, the standard was set in the form we still use today.
The views and opinions expressed in this article are those of the author and do not necessarily reflect the opinions, position, or policy of Berkeley Research Group, LLC or its other employees and affiliates.
ThinkSet magazine, a BRG publication, provides nuanced, multifaceted thinking and expert guidance that help today’s business leaders adopt a more strategic, long-term mindset to prepare for what’s next.