Network

Table of contents

Determine the network types the system has.
Determine if data moves from on-premise to cloud and vice versa.
Determine what sort of network topology the system has.
Determine whether to geographically restrict data.
Design a highly-available VPC.
Determine the private and public subnets.
Configure SGs for each instance.
Configure NACLs for each subnet.
Configure the routing tables and gateways.
Determine extra network considerations.
Decide how to route traffic to the system.
Choose good TTLs for DNS records.

Network layer

Determine the network types the system has.

In a Virtual Private Cloud (VPC) on a cloud platform like AWS, GCP, Azure etc.
In an on-premise network, a dedicated private data centre
Hybrid, both in the cloud and on premise

Determine if data moves from on-premise to cloud and vice versa.

Data is transferred from on-premise to the cloud
Data is transferred from the cloud to on premise

Determine what sort of network topology the system has.

For example, if multiple VPCs and on premise networks communicate with each other.

Hub-and-spoke topology (all VPCs and on-premise networks communicate via a central hub)
Mesh topology (VPCs and on-premise networks communicate point-to-point)
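One way to compare the two topologies is to count the connections each one needs as the number of networks grows. The following is a minimal sketch; the function names are illustrative and it assumes a full mesh, where every pair of networks is linked directly.

```python
def hub_and_spoke_links(networks: int) -> int:
    """Each network attaches once to the central hub."""
    return networks


def full_mesh_links(networks: int) -> int:
    """Every pair of networks needs its own point-to-point link."""
    return networks * (networks - 1) // 2


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(f"{n} networks: hub={hub_and_spoke_links(n)} mesh={full_mesh_links(n)}")
```

The mesh count grows quadratically, which is usually the reason to move to a hub once more than a handful of networks are involved.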

Determine whether to geographically restrict data.

For example, because of regulatory standards.

Data needs to be stored only in certain regions.
Data can be stored anywhere in the world.

Design a highly-available VPC.

Determine if your VPC supports IPv4 or IPv6 addresses
Specify the range of IP addresses for the VPC as a Classless Inter-Domain Routing (CIDR) block
Divide this IP address space equally across multiple availability zones (AZs)
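The split across AZs can be sketched with Python's standard `ipaddress` module. The CIDR block `10.0.0.0/16` and the AZ names are assumptions chosen for illustration. Because a CIDR block can only be halved, the AZ count is rounded up to the next power of two and any leftover block stays in reserve.

```python
import ipaddress


def split_vpc_by_az(vpc_cidr, az_names):
    """Give each AZ an equally sized, non-overlapping slice of the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # Number of extra prefix bits needed to get at least len(az_names) blocks.
    extra_bits = (len(az_names) - 1).bit_length()
    blocks = vpc.subnets(prefixlen_diff=extra_bits)
    # zip() stops at the shorter input, so spare blocks are simply not assigned.
    return {az: str(block) for az, block in zip(az_names, blocks)}


if __name__ == "__main__":
    # Hypothetical VPC range and AZ names.
    for az, block in split_vpc_by_az("10.0.0.0/16", ["az-a", "az-b", "az-c"]).items():
        print(az, block)
```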

Determine the private and public subnets.

There are three reasons to create multiple subnets:

To separate publicly-facing hosts (e.g. web servers) from the private ones (e.g. databases)
To support multiple availability zones, since a single subnet cannot span multiple AZs
To group some IP addresses (e.g. databases with PII) under the same Network Access Control List (NACL)

Create the public subnets
Create the private subnets
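As a rough sketch of the two steps above, each AZ block can be carved into one public and one private subnet. The AZ blocks and the `/20` subnet size are assumptions, and the remaining address space in each block is left unallocated for later growth.

```python
import ipaddress


def plan_az_subnets(az_blocks, new_prefix=20):
    """Take the first two subnets of each AZ block as public and private."""
    plan = {}
    for az, block in az_blocks.items():
        candidates = ipaddress.ip_network(block).subnets(new_prefix=new_prefix)
        public = next(candidates)
        private = next(candidates)
        plan[az] = {"public": str(public), "private": str(private)}
    return plan


if __name__ == "__main__":
    # Hypothetical per-AZ blocks inside a 10.0.0.0/16 VPC.
    print(plan_az_subnets({"az-a": "10.0.0.0/18", "az-b": "10.0.64.0/18"}))
```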

Configure SGs for each instance.

Security Groups (SGs) are the first layer of defence.

Security Groups are stateful, meaning that responses to allowed traffic are permitted automatically. For example, if you allow inbound traffic on a certain port, the responses to that traffic are allowed out regardless of the outbound rules. Security Groups only support Allow rules.

Security Groups are applied to individual instances (i.e. virtual servers) in a subnet.

Important protocols and ports to consider in a typical web application:

TCP port 80 for IPv4 and/or IPv6 (HTTP)
TCP port 443 for IPv4 and/or IPv6 (HTTPS)
TCP port 3306 (MySQL) or 5432 (PostgreSQL) etc. for databases
TCP port 22 for SSH (Linux) or 3389 for RDP (Windows)
ICMP (ping)
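The allow-only behaviour of a Security Group can be modelled in a few lines. This is a simplified sketch, not a cloud provider API: `AllowRule`, `inbound_allowed`, and the admin range `203.0.113.0/24` are all hypothetical, and only single ports are handled.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AllowRule:
    protocol: str          # "tcp" or "icmp"
    port: Optional[int]    # None for protocols without ports, such as ICMP
    source: str            # CIDR block the traffic may come from


# Hypothetical inbound rules for a public web server.
WEB_SERVER_INBOUND = [
    AllowRule("tcp", 80, "0.0.0.0/0"),        # HTTP from anywhere
    AllowRule("tcp", 443, "0.0.0.0/0"),       # HTTPS from anywhere
    AllowRule("tcp", 22, "203.0.113.0/24"),   # SSH from an assumed admin range only
]


def inbound_allowed(rules, protocol, port, source_ip):
    """Traffic passes if any rule allows it; there are no Deny rules."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.port == port
        and addr in ipaddress.ip_network(rule.source)
        for rule in rules
    )
```

Anything that no rule matches is dropped, which is why database ports such as 3306 or 5432 stay closed to the Internet unless a rule opens them.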

Configure NACLs for each subnet.

Network Access Control Lists (NACLs) are the second layer of defence.

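NACLs are stateless, meaning that return traffic is not allowed automatically: inbound and outbound rules are evaluated independently. For example, if you allow inbound traffic on a certain port, you must also add an outbound rule for the responses. NACLs support both Allow and Deny rules, evaluated in order of rule number. A custom NACL denies all traffic until you add rules; in AWS, the default NACL of a VPC allows all traffic.

NACLs are applied to all instances in a subnet.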

Configure inbound rules
Configure outbound rules
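The ordered, stateless evaluation of a NACL can be illustrated with a small model. This sketch matches on port only and ignores protocol and CIDR for brevity; `NaclRule`, `evaluate`, and the rule numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int      # lower numbers are evaluated first
    action: str      # "allow" or "deny"
    port_from: int
    port_to: int


def evaluate(rules, port):
    """Return the action of the first matching rule, or deny if none match."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port_from <= port <= rule.port_to:
            return rule.action
    return "deny"


# Inbound: allow HTTPS requests to reach the subnet.
INBOUND = [NaclRule(100, "allow", 443, 443)]
# Outbound: responses leave towards the client's ephemeral port, so that
# range has to be opened explicitly because the NACL keeps no connection state.
OUTBOUND = [NaclRule(100, "allow", 1024, 65535)]
```

With an empty outbound list, the HTTPS request would still arrive but its response would be dropped on the way out.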

Configure the routing tables and gateways.

The Internet Gateway connects the VPC to the Internet. Without one, no Internet traffic comes in or goes out.

The Network Address Translation (NAT) Gateway is like an apartment building with a doorman. Residents send their mail out through the doorman, so it carries the address of the building rather than the apartment number, and the doorman hands each reply back to the right apartment. Mail that nobody asked for is turned away. A NAT Gateway lets instances in a private subnet initiate connections to the Internet without exposing their private IP addresses, while blocking connections initiated from outside.

Configure the main routing table (default for each subnet)
Configure the routing tables for private and public subnets
Configure a route to the Internet Gateway
Configure a route to the NAT Gateway
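A routing table picks the most specific route that contains the destination address. The sketch below models that lookup; the targets `igw-example` and `nat-example` and the `10.0.0.0/16` VPC range are placeholders, not real resource identifiers.

```python
import ipaddress

# Public subnets send Internet-bound traffic to the Internet Gateway.
PUBLIC_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-example"}
# Private subnets send Internet-bound traffic to the NAT Gateway instead.
PRIVATE_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-example"}


def next_hop(routes, destination_ip):
    """Return the target of the longest-prefix route matching the destination."""
    addr = ipaddress.ip_address(destination_ip)
    best_prefix = -1
    best_target = None
    for cidr, target in routes.items():
        network = ipaddress.ip_network(cidr)
        if addr in network and network.prefixlen > best_prefix:
            best_prefix = network.prefixlen
            best_target = target
    return best_target


if __name__ == "__main__":
    print(next_hop(PUBLIC_ROUTES, "10.0.5.9"))         # traffic inside the VPC
    print(next_hop(PRIVATE_ROUTES, "93.184.216.34"))   # traffic to the Internet
```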

Determine extra network considerations.

Enable VPC flow logs to capture information about inbound and outbound traffic
Set up VPC endpoints to route traffic from your web servers through the private network to other cloud services, e.g. in AWS to S3 or DynamoDB

Domain Name System (DNS)

🙃 Suspicious system failure? Blame the DNS.

Decide how to route traffic to the system.

Simple routing: route traffic to a single resource, for example a web server or LB.
Failover routing: route traffic in an active/passive failover configuration
Geolocation routing: route traffic based on the location of the users
Geoproximity routing: route traffic based on the location of your resources
Latency-based routing: route traffic to the region with the best latency
Weighted routing: route traffic to multiple resources based on percentages
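Weighted routing, the last policy in the list, can be sketched as a mapping from a random point to a target in proportion to its weight. The function name `pick_target` and the `blue`/`green` targets are illustrative assumptions; a managed DNS service implements this internally.

```python
import bisect
import random
from itertools import accumulate


def pick_target(weights, point):
    """Map a point in [0, 1) to a target in proportion to its weight."""
    names = list(weights)
    cumulative = list(accumulate(weights[name] for name in names))
    threshold = point * cumulative[-1]
    # bisect_right finds the first cumulative weight strictly above the threshold.
    return names[bisect.bisect_right(cumulative, threshold)]


if __name__ == "__main__":
    # Hypothetical 90/10 split, e.g. for a gradual rollout.
    weights = {"blue": 90, "green": 10}
    print(pick_target(weights, random.random()))
```

Passing the point in explicitly keeps the function deterministic, so the split can be checked without relying on random numbers.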

Choose good TTLs for DNS records.

Time-To-Live (TTL) controls how long the DNS Resolver caches the IP address associated with your domain. Caching reduces latency by avoiding the DNS Resolver → Root Nameserver → Top-Level Domain Nameserver → Authoritative Nameserver → DNS Resolver round-trip.

Longer caching results in faster responses and less DNS traffic, but it delays how quickly clients see changes, which hinders scalability. For example, a Load Balancer (LB) provides one or more IP addresses to a client. When the LB scales, it remaps these IP addresses. The TTL determines how long the client waits for the DNS cache to expire before it sees the new IP addresses and can send traffic to them.
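The effect of the TTL on stale addresses can be sketched with a toy resolver cache. The class `DnsCache`, the domain `app.example.com`, and the documentation-range IP addresses are assumptions; time is passed in explicitly so the behaviour is easy to follow.

```python
class DnsCache:
    """A toy resolver cache that keeps each answer for ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._entries = {}  # name -> (ip, expiry time)

    def resolve(self, name, authoritative, now):
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh: no round-trip to the nameservers
        ip = authoritative[name]  # expired or missing: ask the authoritative source
        self._entries[name] = (ip, now + self.ttl)
        return ip


if __name__ == "__main__":
    records = {"app.example.com": "192.0.2.10"}
    cache = DnsCache(ttl_seconds=300)
    print(cache.resolve("app.example.com", records, now=0))
    records["app.example.com"] = "192.0.2.20"  # the LB remaps its address
    print(cache.resolve("app.example.com", records, now=299))  # still the old address
    print(cache.resolve("app.example.com", records, now=300))  # the new address
```

A shorter TTL narrows the window in which clients keep sending traffic to the old address, at the price of more lookups.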

AWS offers Global Accelerator (GA), which can be placed in front of an Elastic Load Balancer (ELB). The GA provides two static IPv4 addresses that stay fixed while the ELB behind it scales, which sidesteps the DNS TTL issue. This comes at an additional cost.