TCP Programming
A socket is an abstraction in network programming. Typically, a socket represents "an opened network connection." To open a socket, you need to know the target computer's IP address and port number, and you need to specify the protocol type.
Client
Most connections are reliable TCP connections. When a TCP connection is created, the initiating party is called the client, while the party responding to the connection is called the server.
For example, when we access Sina.com in a browser, our computer acts as the client, and the browser actively initiates a connection to Sina's server. If everything goes well, Sina's server accepts the connection, establishing a TCP connection, and subsequent communication involves sending the webpage content.
To create a socket based on a TCP connection, you can do the following:
python
# Import the socket library:
import socket
# Create a socket:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Establish connection:
s.connect(('www.sina.com.cn', 80))
When creating a socket, AF_INET specifies the IPv4 protocol; to use IPv6 instead, specify AF_INET6. SOCK_STREAM selects the stream-oriented TCP protocol. At this point a socket object has been created, but no connection has been established yet.
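As a small sketch of the point above, the socket object exists as soon as the constructor returns, before any connection is attempted. The variable names are only illustrative, and the IPv6 attempt is guarded because some hosts have IPv6 disabled.

```python
import socket

# IPv4 + TCP: the socket object exists, but no connection has been made yet.
s4 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
family_v4, type_v4 = s4.family, s4.type
s4.close()

# IPv6 + TCP differs only in the address family. Creation can fail on hosts
# without IPv6 support, so the attempt is guarded.
try:
    s6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    family_v6 = s6.family
    s6.close()
except OSError:
    family_v6 = None

print(family_v4, type_v4, family_v6)
```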
The client must actively initiate the TCP connection and needs to know the server's IP address and port number. The IP address of Sina can be automatically resolved from the domain name www.sina.com.cn, but how do we know the port number of Sina's server?
The answer is that the port number is fixed by the service the server provides. Since we want to access a webpage, we connect to port 80 on Sina's web server, the standard port for HTTP web services. Other services have their own standard port numbers; for example, SMTP uses port 25 and FTP uses port 21. Port numbers below 1024 are reserved for standard Internet services, while port numbers from 1024 upward can be used freely.
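The standard ports mentioned above can usually be looked up from the operating system's services database with socket.getservbyname(). The sketch below assumes that database may be missing (for example in a minimal container), so it falls back to a small table; the names WELL_KNOWN_PORTS and standard_port are hypothetical helpers, not part of the socket module.

```python
import socket

# Fallback table (illustrative) for hosts without a services database.
WELL_KNOWN_PORTS = {'http': 80, 'smtp': 25, 'ftp': 21}

def standard_port(service):
    """Return the standard TCP port for a service name."""
    try:
        # Ask the operating system's services database first.
        return socket.getservbyname(service, 'tcp')
    except OSError:
        # Service (or database) not found: use the fallback table.
        return WELL_KNOWN_PORTS[service]

for name in ('http', 'smtp', 'ftp'):
    print(name, standard_port(name))
```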
Thus, our connection to the Sina server is coded as follows:
python
s.connect(('www.sina.com.cn', 80))
Note that the parameters are a tuple containing the address and port number.
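The standard library also provides socket.create_connection(), which accepts the same (host, port) tuple, creates the socket, and connects in one call. The following is a minimal sketch; it connects to a throwaway local listener instead of a real website so that it runs without Internet access, and both the listener and the 5-second timeout are assumptions made for illustration.

```python
import socket

# Throwaway listener on the loopback interface; port 0 lets the OS pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
host, port = listener.getsockname()

# create_connection() takes the same (host, port) tuple as connect().
# The with-statement closes the socket automatically.
with socket.create_connection((host, port), timeout=5) as client:
    peer = client.getpeername()
    print('Connected to %s:%s' % peer)

listener.close()
```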
After establishing the TCP connection, we can send a request to the Sina server to return the homepage content:
python
# Send data:
s.send(b'GET / HTTP/1.1\r\nHost: www.sina.com.cn\r\nConnection: close\r\n\r\n')
A TCP connection creates a bidirectional channel, allowing both parties to send data to each other simultaneously. However, who sends first and how to coordinate depends on the specific protocol. For example, the HTTP protocol requires the client to send a request to the server first, and the server sends data back after receiving the request.
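To illustrate the bidirectional channel, the sketch below uses socket.socketpair(), which returns two sockets that are already connected to each other. Note the assumption: on most systems this pair is a local stream socket rather than a real TCP connection, but it behaves the same way for this purpose, since each end can send and receive independently.

```python
import socket

# socketpair() returns two already-connected stream sockets for local experiments.
a, b = socket.socketpair()

# Both ends send without waiting for the other side first.
a.sendall(b'ping from a')
b.sendall(b'ping from b')

# Each end receives what the other end sent.
got_at_b = b.recv(1024)
got_at_a = a.recv(1024)
print(got_at_b, got_at_a)

a.close()
b.close()
```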
The text format sent must comply with the HTTP standard. If the format is correct, we can then receive data returned from the Sina server:
python
# Receive data:
buffer = []
while True:
    # Receive up to 1k bytes at a time:
    d = s.recv(1024)
    if d:
        buffer.append(d)
    else:
        break
data = b''.join(buffer)
When receiving data, we call the recv(max) method, which receives at most the specified number of bytes per call. Therefore, we receive data repeatedly in a while loop until recv() returns empty data, which means the server has closed the connection and no more data will arrive, so the loop can exit.
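The receive loop can be wrapped in a small helper. This is a sketch under assumptions: recv_all is a hypothetical name, and the demonstration uses a local socketpair() instead of a real server so that it runs anywhere.

```python
import socket

def recv_all(sock, chunk_size=1024):
    """Read from sock until the peer closes its side, then return all bytes."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # b'' means the peer has closed the connection
            break
        chunks.append(chunk)
    return b''.join(chunks)

# Local demonstration with a connected pair of sockets.
sender, receiver = socket.socketpair()
sender.sendall(b'x' * 3000)  # more than a single 1024-byte chunk
sender.close()               # closing signals the end of the data
received = recv_all(receiver)
receiver.close()
print(len(received))  # 3000
```

Because recv() may return fewer bytes than requested, the only reliable end-of-data signal in this pattern is the empty result that follows the peer closing the connection.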
After receiving all the data, we call the close() method to close the socket, thus completing a full network communication:
python
# Close connection:
s.close()
The received data includes the HTTP headers and the webpage itself. We only need to separate the HTTP headers from the webpage content, print the headers, and save the webpage content to a file:
python
header, html = data.split(b'\r\n\r\n', 1)
print(header.decode('utf-8'))
# Write the received data to a file:
with open('sina.html', 'wb') as f:
    f.write(html)
Now, by opening this sina.html file in a browser, you can view Sina's homepage.
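The header block separated by the split above can be broken down further into a status line and individual header fields. The sketch below works on a canned response standing in for the bytes received from the socket; the sample bytes and the split_response helper are hypothetical and only illustrate the layout of an HTTP response.

```python
# A canned response standing in for the bytes collected from the socket.
sample = (b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: text/html\r\n'
          b'Connection: close\r\n'
          b'\r\n'
          b'<html><body>Hello</body></html>')

def split_response(raw):
    """Split raw HTTP response bytes into (status_line, headers, body)."""
    # The first blank line separates the header block from the body.
    head, _, body = raw.partition(b'\r\n\r\n')
    # iso-8859-1 maps every byte to a character, so decoding cannot fail.
    lines = head.decode('iso-8859-1').split('\r\n')
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body

status_line, headers, body = split_response(sample)
print(status_line)
print(headers['content-type'])
print(len(body))
```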
Server
Compared to client programming, server programming is a bit more complex.
The server process must first bind to a port and listen for incoming connections from clients. When a client connects, the server establishes a socket connection with that client, and subsequent communication relies on this socket connection.
Thus, the server opens a fixed port (e.g., 80) to listen. For each client connection, a socket connection is created. Since a server may have many connections from clients, it needs to distinguish which socket connection is associated with which client. A socket is uniquely identified by four elements: server address, server port, client address, and client port.
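The four identifying elements can be observed directly on the loopback interface. In this sketch, two clients connect to the same server address and port, and the accepted sockets differ only in the client port. Binding to port 0 (letting the operating system choose a free port) and the 5-second timeout are assumptions made so that the example runs anywhere.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))  # port 0: let the OS choose a free port
listener.listen(5)
listener.settimeout(5)           # avoid waiting forever if something goes wrong
server_addr = listener.getsockname()

# Two clients connect to the same server address and port.
c1 = socket.create_connection(server_addr, timeout=5)
c2 = socket.create_connection(server_addr, timeout=5)
conn1, peer1 = listener.accept()
conn2, peer2 = listener.accept()

# Server side of each connection: same local address, different remote address.
local1, local2 = conn1.getsockname(), conn2.getsockname()
client_addrs = {c1.getsockname(), c2.getsockname()}
print(local1, peer1)
print(local2, peer2)

for sock in (c1, c2, conn1, conn2, listener):
    sock.close()
```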
The server must also respond to multiple client requests simultaneously, so each connection needs a new process or thread to handle it; otherwise, the server can only serve one client at a time.
Let's write a simple server program that accepts client connections, prefixes each received string with "Hello", and sends it back.
First, create a socket based on IPv4 and TCP protocol:
python
import socket
import threading
import time

# Create an IPv4 TCP socket:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
Next, we bind to the listening address and port. A server may have multiple network interfaces and can bind to the IP address of a specific interface, use 0.0.0.0 to bind to all network addresses, or use 127.0.0.1 to bind to the local address. 127.0.0.1 is a special IP address that refers to the local machine; if the server binds to this address, clients must run on the same machine to connect, meaning external computers cannot connect.
The port number must be pre-specified. Since our service is not a standard service, we will use port 9999. Note that port numbers less than 1024 require administrator privileges to bind:
python
# Listen on port:
s.bind(('127.0.0.1', 9999))
Next, call the listen() method to start listening on the port. The parameter specifies the maximum number of connections that may wait in the queue to be accepted:
python
s.listen(5)
print('Waiting for connection...')
The server program then enters a perpetual loop to accept connections from clients. The accept() method waits for a client connection and returns it:
python
while True:
    # Accept a new connection:
    sock, addr = s.accept()
    # Create a new thread to handle the TCP connection:
    t = threading.Thread(target=tcplink, args=(sock, addr))
    t.start()
Each connection must create a new thread (or process) to handle it; otherwise, a single-threaded server cannot accept connections from other clients while processing one.
Here's how the tcplink function looks. In the actual script, it must be defined before the accept loop that uses it:
python
def tcplink(sock, addr):
    print('Accept new connection from %s:%s...' % addr)
    sock.send(b'Welcome!')
    while True:
        data = sock.recv(1024)
        time.sleep(1)
        if not data or data.decode('utf-8') == 'exit':
            break
        sock.send(('Hello, %s!' % data.decode('utf-8')).encode('utf-8'))
    sock.close()
    print('Connection from %s:%s closed.' % addr)
Once the connection is established, the server sends a welcome message, then waits for client data, prefixing it with "Hello" before sending it back. If the client sends the string "exit", the server closes the connection.
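The thread-per-connection pattern written by hand above is also packaged in the standard library's socketserver module. The following is an alternative sketch, not the approach used in this tutorial: ThreadingTCPServer starts one thread per accepted connection and calls the handler's handle() method. The class name HelloHandler, the port 0 binding, and the simplified single request-and-reply exchange are assumptions for illustration.

```python
import socket
import socketserver
import threading

class HelloHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # self.request is the connected socket for this client.
        data = self.request.recv(1024)
        self.request.sendall(b'Hello, ' + data + b'!')

# ThreadingTCPServer handles each accepted connection in its own thread.
server = socketserver.ThreadingTCPServer(('127.0.0.1', 0), HelloHandler)
server.daemon_threads = True
threading.Thread(target=server.serve_forever, daemon=True).start()

# A client in the same process, connecting to the port the OS selected.
with socket.create_connection(server.server_address, timeout=5) as c:
    c.sendall(b'Michael')
    reply = c.recv(1024)
print(reply)

server.shutdown()      # stop the serve_forever() loop
server.server_close()  # release the listening socket
```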
To test this server program, we also need to write a client program:
python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Establish connection:
s.connect(('127.0.0.1', 9999))
# Receive welcome message:
print(s.recv(1024).decode('utf-8'))
for data in [b'Michael', b'Tracy', b'Sarah']:
    # Send data:
    s.send(data)
    print(s.recv(1024).decode('utf-8'))
s.send(b'exit')
s.close()
You need to open two command line windows: one to run the server program and the other to run the client program to see the effect:
Server window:
$ python echo_server.py
Waiting for connection...
Accept new connection from 127.0.0.1:64398...
Connection from 127.0.0.1:64398 closed.

Client window:
$ python echo_client.py
Welcome!
Hello, Michael!
Hello, Tracy!
Hello, Sarah!
It's important to note that the client program exits after running, while the server program continues running indefinitely and must be terminated with Ctrl+C.
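Since the server is stopped with Ctrl+C, the accept loop can catch KeyboardInterrupt and close the listening socket before exiting. This is one possible sketch, assuming a hypothetical serve() helper that takes the listening socket and a handler such as tcplink; it is not part of the server program shown earlier.

```python
import threading

def serve(listener, handle):
    """Accept clients until Ctrl+C, then close the listening socket.

    listener is a bound, listening socket; handle is a function such as
    tcplink(sock, addr), run in its own thread for each connection.
    """
    try:
        while True:
            sock, addr = listener.accept()
            # Daemon threads do not keep the process alive after Ctrl+C.
            threading.Thread(target=handle, args=(sock, addr), daemon=True).start()
    except KeyboardInterrupt:
        print('Shutting down...')
    finally:
        # Release the port whether the loop ended normally or by an error.
        listener.close()
```

With this helper, the server's main loop would be replaced by a call such as serve(s, tcplink).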
Summary
Socket programming using the TCP protocol in Python is quite straightforward. For the client, you actively connect to the server's IP and specified port; for the server, you first listen on the designated port and create a thread or process to