Basics of Network Programming

Before diving into Java network programming, let's first understand what a computer network is.

A computer network consists of two or more connected computers. Any two computers on the same network can communicate directly because they all follow the same network protocol.

What is the Internet, then? The Internet is a network of networks: it connects many computer networks into a single global network.

One computer network might use protocol ABC while another uses protocol XYZ. Networks that do not share a common protocol cannot be interconnected to form the Internet, so every network that joins the Internet must use the TCP/IP protocol.

The term "TCP/IP protocol" generally refers to a set of Internet protocols, with TCP and IP being the two most important ones. Only computers using the TCP/IP protocol can connect to the Internet, while those using other network protocols (e.g., NetBIOS, AppleTalk) cannot.

IP Address

On the Internet, an IP address uniquely identifies a network interface. A computer connected to the Internet has at least one IP address and may have several.

There are two types of IP addresses: IPv4 and IPv6. IPv4 uses a 32-bit address, like 101.202.99.12, whereas IPv6 uses a 128-bit address, like 2001:0DA8:100A:0000:0000:1020:F2F3:1428. IPv4 has a total of 2^32 addresses (about 4.3 billion), while IPv6 has 2^128 addresses (about 340 undecillion), making IPv6 addresses practically inexhaustible.
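The difference in address size can be checked from Java. The following is a minimal sketch, not part of the original tutorial: the class name AddressSize and the helper bits are made up for illustration, and it relies on java.net.InetAddress accepting a literal IP address without performing a DNS lookup.

```java
import java.math.BigInteger;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class AddressSize {
    // Returns the length of a literal IP address in bits (hypothetical helper).
    static int bits(String literal) {
        try {
            // For a literal address, getByName only parses the text; no lookup is made.
            return InetAddress.getByName(literal).getAddress().length * 8;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + literal, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bits("101.202.99.12"));                            // 32
        System.out.println(bits("2001:0DA8:100A:0000:0000:1020:F2F3:1428")); // 128
        System.out.println(BigInteger.ONE.shiftLeft(32));  // 4294967296
        System.out.println(BigInteger.ONE.shiftLeft(128)); // 340282366920938463463374607431768211456
    }
}
```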

IP addresses are also categorized into public and private IP addresses. Public IP addresses are accessible from the outside, while private IP addresses can only be accessed within a local network. Examples of private IP addresses are:

  • 192.168.x.x
  • 10.x.x.x

There is a special IP address known as the loopback address, which refers to the local machine itself. For IPv4 it is conventionally 127.0.0.1.
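Java's InetAddress class can report which of these categories an address falls into. The sketch below is illustrative only: the class name AddressKind and the labels it returns are assumptions, and "other" simply means "neither loopback nor private" rather than a guarantee that the address is publicly reachable. Note that isSiteLocalAddress also treats the 172.16.x.x to 172.31.x.x range as private.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class AddressKind {
    // Classifies a literal IP address as "loopback", "private", or "other".
    static String kind(String literal) {
        InetAddress address;
        try {
            address = InetAddress.getByName(literal); // literal input: parsed, not resolved
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + literal, e);
        }
        if (address.isLoopbackAddress()) {
            return "loopback";
        }
        if (address.isSiteLocalAddress()) {
            return "private";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(kind("192.168.1.10"));  // private
        System.out.println(kind("10.0.2.15"));     // private
        System.out.println(kind("127.0.0.1"));     // loopback
        System.out.println(kind("101.202.99.12")); // other
    }
}
```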

An IPv4 address is essentially a 32-bit integer. For example:

1707762444 = 0x65ca630c
           = 65 ca 63 0c
           = 101.202.99.12
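The conversion above can be reproduced with a few bit operations. This is a small sketch with hypothetical names (Ipv4Int, toDotted, toInt); it handles only well-formed dotted-decimal IPv4 strings.

```java
public class Ipv4Int {
    // Splits a 32-bit integer into four 8-bit groups, highest byte first.
    static String toDotted(int value) {
        return ((value >>> 24) & 0xFF) + "."
                + ((value >>> 16) & 0xFF) + "."
                + ((value >>> 8) & 0xFF) + "."
                + (value & 0xFF);
    }

    // Packs four decimal octets back into one 32-bit integer.
    static int toInt(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four octets: " + dotted);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toDotted(1707762444));                         // 101.202.99.12
        System.out.println(toInt("101.202.99.12"));                       // 1707762444
        System.out.println(Integer.toHexString(toInt("101.202.99.12"))); // 65ca630c
    }
}
```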

If a computer has only one network interface card (NIC) and is connected to the network, it will have a loopback address of 127.0.0.1 and another IP address, such as 101.202.99.12, to access the network.

If a computer has two NICs, it can have two IP addresses (besides the loopback address) and connect to two different networks. Devices that join networks, such as routers (or layer-3 switches), connect at least two networks and have at least two IP addresses, one on each network.
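To see which interfaces and addresses a particular machine has, one option is java.net.NetworkInterface. The sketch below uses a made-up class name, ListInterfaces. Its output depends entirely on the machine it runs on, so no expected output is shown.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

public class ListInterfaces {
    // Returns one "interface -> address" line per address bound to a local interface.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return lines; // some platforms report no interfaces this way
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                    lines.add(nic.getName() + " -> " + address.getHostAddress());
                }
            }
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

On a typical machine the list includes a loopback interface alongside any physical or virtual NICs.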

If two computers are on the same network, they can communicate directly because the first part of their IP addresses, known as the network number, is the same. The network number is obtained by applying the subnet mask to the IP address with a bitwise AND.

If a computer's IP is 101.202.99.2 and the subnet mask is 255.255.255.0, then the network number is:

IP = 101.202.99.2
Mask = 255.255.255.0
Network = IP & Mask = 101.202.99.0
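The same calculation can be sketched in Java by AND-ing the address bytes with the mask bytes. The class and method names here (NetworkNumber, network) are hypothetical, and the sketch assumes both arguments are literal IPv4 addresses.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class NetworkNumber {
    // Parses a literal IP address into its raw bytes (no DNS lookup for literals).
    static byte[] bytesOf(String literal) {
        try {
            return InetAddress.getByName(literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + literal, e);
        }
    }

    // Network = IP & Mask, computed byte by byte.
    static String network(String ip, String mask) {
        byte[] ipBytes = bytesOf(ip);
        byte[] maskBytes = bytesOf(mask);
        if (ipBytes.length != maskBytes.length) {
            throw new IllegalArgumentException("address and mask differ in length");
        }
        byte[] result = new byte[ipBytes.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = (byte) (ipBytes[i] & maskBytes[i]);
        }
        try {
            return InetAddress.getByAddress(result).getHostAddress();
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // only thrown for an illegal byte length
        }
    }

    public static void main(String[] args) {
        System.out.println(network("101.202.99.2", "255.255.255.0")); // 101.202.99.0
    }
}
```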

Every computer must be configured with a correct IP address and subnet mask. If two computers compute the same network number, they are on the same network and can communicate directly. Otherwise, they must communicate indirectly through a gateway, a network device such as a router that connects the networks.

The role of a gateway is to connect multiple networks and forward packets from one network to another. This process is known as routing.

Thus, a computer's network interface card has three key settings:

  • IP address, e.g., 10.0.2.15
  • Subnet mask, e.g., 255.255.255.0
  • Gateway IP address, e.g., 10.0.2.2
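These three settings are enough to decide where an outgoing packet goes first. The following sketch illustrates that decision under simplifying assumptions: a single interface, IPv4 only, and one default gateway. The names NextHop and nextHop are invented for this example and are not an operating system API.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.ByteBuffer;

public class NextHop {
    // Converts a literal IPv4 address into a 32-bit integer (big-endian byte order).
    static int toInt(String literal) {
        try {
            byte[] bytes = InetAddress.getByName(literal).getAddress();
            if (bytes.length != 4) {
                throw new IllegalArgumentException("IPv4 address expected: " + literal);
            }
            return ByteBuffer.wrap(bytes).getInt();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + literal, e);
        }
    }

    // Same network number: deliver directly. Otherwise: hand the packet to the gateway.
    static String nextHop(String localIp, String mask, String gateway, String destination) {
        int maskBits = toInt(mask);
        boolean sameNetwork = (toInt(localIp) & maskBits) == (toInt(destination) & maskBits);
        return sameNetwork ? destination : gateway;
    }

    public static void main(String[] args) {
        // Uses the example settings listed above.
        System.out.println(nextHop("10.0.2.15", "255.255.255.0", "10.0.2.2", "10.0.2.99"));     // 10.0.2.99
        System.out.println(nextHop("10.0.2.15", "255.255.255.0", "10.0.2.2", "101.202.99.12")); // 10.0.2.2
    }
}
```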

Domain Names

Since remembering IP addresses is difficult, we usually use domain names to access specific services. Domain Name System (DNS) servers translate domain names into corresponding IP addresses, allowing clients to access servers using these IP addresses.

The nslookup command can be used to check the IP address corresponding to a domain name:

$ nslookup liaoxuefeng.com
Server:  xxx.xxx.xxx.xxx
Address: xxx.xxx.xxx.xxx#53

Non-authoritative answer:
Name:    liaoxuefeng.com
Address: xxx.xxx.xxx.xxx

There is a special domain name, localhost, which always corresponds to the loopback IP address 127.0.0.1.
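A Java program can perform the same lookup through InetAddress.getAllByName, which asks the system resolver for every address registered for a name. This is a minimal sketch with a hypothetical class name, Resolve. Results for real domain names depend on the network and DNS configuration, so no output is shown for them.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class Resolve {
    // Returns every IP address the system resolver reports for the given host.
    static List<String> resolve(String host) {
        try {
            List<String> addresses = new ArrayList<>();
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
            return addresses;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("cannot resolve " + host, e);
        }
    }

    public static void main(String[] args) {
        // Pass a domain name as the first argument; defaults to localhost.
        String host = args.length > 0 ? args[0] : "localhost";
        resolve(host).forEach(System.out::println);
    }
}
```

Passing a literal IP address instead of a name returns that address unchanged, without contacting a DNS server.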

Network Models

Computer networks are complex at every level, from low-level transmission to high-level software design, so they are designed using a layered model in which each layer handles its own set of operations. The OSI (Open Systems Interconnection) model, defined by ISO, is a standard reference model for computer networking. It is a conceptual framework that simplifies the work of each layer and provides standard interfaces between layers, which makes implementation and maintenance easier. From top to bottom, the model consists of the following layers:

  1. Application layer: Provides communication between applications.
  2. Presentation layer: Handles data formatting, encryption, etc.
  3. Session layer: Responsible for establishing and maintaining sessions.
  4. Transport layer: Provides end-to-end transmission between the communicating parties, optionally with reliability.
  5. Network layer: Chooses routes for data transmission based on destination addresses.
  6. Data link layer: Packages data into frames for transfer across a single link.
  7. Physical layer: Performs the physical transmission over a medium such as wireless or fiber optics.

The Internet uses a TCP/IP model that does not exactly match the OSI's seven layers but roughly corresponds to a four-layer model:

OSI                   TCP/IP
Application layer     Application layer
Presentation layer    Application layer
Session layer         Application layer
Transport layer       Transport layer
Network layer         IP layer
Data link layer       Network interface layer
Physical layer        Network interface layer

Common Protocols

The IP protocol is a packet-switched protocol that does not guarantee reliable transmission: it delivers individual packets without guaranteeing their order or accuracy. TCP is a connection-oriented protocol built on top of IP. A connection must be established before data is sent and closed afterward. TCP ensures reliable transmission through mechanisms such as acknowledgments and retransmissions, and it supports two-way communication, so both parties can send and receive data.

TCP is the most widely used transport protocol and forms the basis for many higher-level protocols such as HTTP and SMTP.
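The connect, exchange, and disconnect cycle of TCP can be sketched with java.net.ServerSocket and java.net.Socket. The example below is only an illustration: the class name TcpLoopback, the "echo: " reply format, and the use of the loopback address with an automatically chosen port are all assumptions made to keep it self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class TcpLoopback {
    // Sends one line to a local echo server over TCP and returns the reply.
    static String roundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread echo = new Thread(() -> {
                // Server side: accept one connection, read one line, answer, disconnect.
                try (Socket peer = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8));
                     PrintWriter out = new PrintWriter(
                             new OutputStreamWriter(peer.getOutputStream(), StandardCharsets.UTF_8), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            echo.start();
            // Client side: connect first, then both directions of the stream are usable.
            try (Socket client = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8))) {
                out.println(message);
                String reply = in.readLine();
                echo.join();
                return reply;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // echo: hello
    }
}
```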

UDP (User Datagram Protocol) is a connectionless protocol that does not guarantee reliable transmission. It is more efficient than TCP because no connection has to be established before sending data. UDP is suitable for applications where some data loss is acceptable, such as certain voice and video communications.
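For comparison, a UDP exchange needs no connection: the sender addresses each datagram individually. The sketch below uses java.net.DatagramSocket over the loopback address. The class name UdpLoopback, the 1024-byte buffer, and the two-second timeout are assumptions. Delivery over loopback is practically certain, but in general a UDP datagram may never arrive, which is why the receiver sets a timeout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class UdpLoopback {
    // Sends one datagram to a local receiver and returns the text that arrived.
    static String sendAndReceive(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback); // port 0: any free port
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // give up after 2 seconds instead of waiting forever

            // No connection setup: the destination is written on the packet itself.
            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(payload, payload.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            receiver.receive(packet);
            return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                    StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("ping")); // ping
    }
}
```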

Summary

The basic concepts of computer networking include:

  • Computer Network: A network composed of two or more computers.
  • Internet: A network of connected networks.
  • IP Address: The unique identifier for a computer's network interface (typically a network card).
  • Gateway: A device connecting multiple networks and forwarding data between them, usually a router.
  • Network Protocol: The Internet uses the TCP/IP protocol suite.
  • IP Protocol: A packet-switched transmission protocol.
  • TCP Protocol: A connection-oriented, reliable transmission protocol.
  • UDP Protocol: A connectionless, unreliable transmission protocol.