Introduction to TCP/IP
Since the internet emerged, nearly every program has become a network program; very few applications still run entirely standalone.
Computer networks connect various computers, allowing them to communicate with one another. Network programming involves implementing communication between two computers within a program.
For instance, when you use a browser to access Sina.com, your computer connects to a specific server of Sina via the internet, and Sina's server transmits the webpage content as data back to your computer.
Since your computer may have multiple applications running, such as browsers, QQ, Skype, Dropbox, and email clients, each program connects to different computers. Therefore, more accurately, network communication refers to the communication between two processes on two different computers. For example, the browser process communicates with a web service process on Sina's server, while the QQ process communicates with a process on Tencent's server.
In summary, network communication is the communication between two processes.
The principles of network programming are the same in every programming language, and Python is no exception. In Python network programming, the Python program is itself a process, and it communicates by connecting to the port on which another process, typically a server, is listening.
In this chapter, we will provide a detailed introduction to the concepts of Python network programming and the two main types of network programming.
Although everyone is now familiar with the internet, computer networks emerged long before the internet.
For computers to connect, a communication protocol must be established. In the early days of computer networking, each manufacturer had its own set of protocols. IBM, Apple, and Microsoft all had their own incompatible network protocols. This is similar to a group of people speaking different languages: those who speak the same language can communicate, but those who speak different languages cannot.
To connect every kind of computer worldwide, a single universal protocol had to be agreed on. The Internet Protocol Suite is that common standard, and it is what makes the internet possible. The word "Internet" combines "inter" and "net", meaning a network of networks: any private network that supports the protocol suite can join the internet.
The internet protocol suite includes hundreds of protocol standards, but the two most important are TCP and IP, which is why the internet protocols are often referred to as the TCP/IP protocols.
To communicate, both parties must know each other's identifiers, just as you need someone's email address to send them an email. On the internet, each computer is identified by its IP address, such as 123.123.123.123. A computer connected to two or more networks at once, such as a router, has two or more IP addresses. Strictly speaking, then, an IP address identifies a network interface of the computer, typically a network card, rather than the computer itself.
The IP protocol is responsible for sending data from one computer across the network to another. The data is split into small chunks, and each chunk is sent as an IP packet. Because internet links are complex and there are often many routes between two computers, routers decide how each IP packet is forwarded. Packets are sent one at a time and pass through multiple routers along the way, but IP guarantees neither that a packet arrives nor that packets arrive in order.
An IPv4 address is really a 32-bit integer. The string form, such as 192.168.0.1, simply splits that integer into four 8-bit groups and writes each group as a decimal number for readability.
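The relationship between the dotted string and the 32-bit integer can be sketched in a few lines of Python. The helper names `ipv4_to_int` and `int_to_ipv4` are made up for this illustration; the standard library's `ipaddress` module performs the same conversion and is used here only as a cross-check.

```python
import ipaddress


def ipv4_to_int(text):
    """Pack the four decimal groups of a dotted IPv4 string into one integer."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a dotted IPv4 address: {text!r}")
    value = 0
    for p in parts:
        # Shift what we have so far left by 8 bits and append the next group.
        value = (value << 8) | p
    return value


def int_to_ipv4(value):
    """Split a 32-bit integer back into four 8-bit groups."""
    groups = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(g) for g in groups)


n = ipv4_to_int("192.168.0.1")
print(n)                # 3232235521
print(int_to_ipv4(n))   # 192.168.0.1

# Cross-check against the standard library.
print(int(ipaddress.IPv4Address("192.168.0.1")) == n)  # True
```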
An IPv6 address is a 128-bit integer. IPv6 is the successor to IPv4, which is still in widespread use. Its string form consists of eight groups of four hexadecimal digits, for example 2001:0db8:85a3:0042:1000:8a2e:0370:7334.
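The same idea applies to IPv6, only with 128 bits and 16-bit hexadecimal groups. A minimal sketch using the standard `ipaddress` module and the example address from the text:

```python
import ipaddress

addr = ipaddress.IPv6Address("2001:0db8:85a3:0042:1000:8a2e:0370:7334")

# The address is a single integer that fits in 128 bits.
as_int = int(addr)
print(as_int.bit_length() <= 128)  # True

# Split the integer into eight 16-bit groups, most significant first.
groups = [(as_int >> shift) & 0xFFFF for shift in range(112, -1, -16)]
print(":".join(f"{g:04x}" for g in groups))
# 2001:0db8:85a3:0042:1000:8a2e:0370:7334

# The shorter form drops leading zeros within each group.
print(addr.compressed)  # 2001:db8:85a3:42:1000:8a2e:370:7334
```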
The TCP protocol is built on top of the IP protocol. TCP is responsible for establishing a reliable connection between two computers and for making sure data arrives in order. It sets up the connection with a handshake, then numbers the data it sends so that the receiver can put the packets back in the order they were sent. If a packet is lost, TCP automatically retransmits it.
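To see why numbering the data is enough to restore order, here is a toy model in pure Python. It is not how a real TCP stack is implemented: the function names are invented for this sketch, and the "network" is just a shuffled list that also delivers one segment twice.

```python
import random


def split_into_segments(data, size):
    """Cut data into chunks, labelling each with the byte offset where it starts."""
    return [(offset, data[offset:offset + size])
            for offset in range(0, len(data), size)]


def reassemble(segments):
    """Restore the original order by offset; duplicate segments collapse to one."""
    unique = dict(segments)
    return b"".join(unique[offset] for offset in sorted(unique))


def missing_offsets(segments, total_length, size):
    """Offsets the receiver has not seen yet, i.e. what would need retransmitting."""
    received = {offset for offset, _ in segments}
    return [o for o in range(0, total_length, size) if o not in received]


message = b"Hello, TCP/IP! This message is split into several segments."
segments = split_into_segments(message, 8)

# Simulate an unreliable network: reorder segments and deliver one of them twice.
arrived = segments + [segments[2]]
random.Random(0).shuffle(arrived)
print(reassemble(arrived) == message)  # True

# Simulate a lost segment: the receiver can tell exactly which offset is missing.
lossy = [seg for seg in segments if seg[0] != 8]
print(missing_offsets(lossy, len(message), 8))  # [8]
```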
Many commonly used higher-level protocols are built on the foundation of the TCP protocol, such as the HTTP protocol used by browsers and the SMTP protocol for sending emails.
Besides the data being transmitted, a TCP packet carries a source port and a destination port, and the IP packet that wraps it carries the source IP address and destination IP address.
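These four values can be observed on a live connection. The sketch below opens a TCP connection over the loopback address 127.0.0.1 so that it needs no external network. Binding to port 0, which asks the operating system to pick any free port, is a choice made for this illustration.

```python
import socket

# A listening socket on the loopback interface; port 0 means "any free port".
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_addr = server.getsockname()

# Connect a client socket to it and accept the connection on the server side.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)
conn, _ = server.accept()

# Each end sees the same connection as (local IP, local port) -> (remote IP, remote port).
client_local, client_remote = client.getsockname(), client.getpeername()
server_local, server_remote = conn.getsockname(), conn.getpeername()
print("client side:", client_local, "->", client_remote)
print("server side:", server_local, "->", server_remote)

for s in (conn, client, server):
    s.close()
```

The client's local address is the server's remote address and vice versa, so the two lines printed are mirror images of each other.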
What are ports for? When two computers communicate, knowing only the IP address is not enough, because several network programs may be running on the same computer. When a TCP packet arrives, the operating system uses the port number to decide whether to hand it to the browser or to QQ. Each network program requests its own port number from the operating system, so two processes on two computers need both their IP addresses and their port numbers to establish a connection.
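A short sketch of how the operating system hands out ports, again using only the loopback address. The helper `listen_on_free_port` is hypothetical, and the conflict shown at the end is the default behavior on common systems when no socket options such as `SO_REUSEADDR` are set.

```python
import socket


def listen_on_free_port():
    """Open a listening TCP socket and let the OS choose the port (port 0)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    s.listen(1)
    return s


# Two "programs" asking for a port each receive different numbers.
first = listen_on_free_port()
second = listen_on_free_port()
port_a = first.getsockname()[1]
port_b = second.getsockname()[1]
print(port_a != port_b)  # True

# Asking for a port that another listening socket already holds is refused.
third = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    third.bind(("127.0.0.1", port_a))
    conflict = False
except OSError as exc:
    conflict = True
    print("bind failed:", exc)
finally:
    third.close()

first.close()
second.close()
```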
A process may also establish connections with multiple computers simultaneously, so it may request multiple port numbers.
Having understood the basic concepts of the TCP/IP protocol, as well as the concepts of IP addresses and ports, we can now begin our journey into network programming.