How network access works

Working in a technical field means you'll eventually encounter network issues that a simple restart can't fix. Understanding the fundamentals of network access is crucial for troubleshooting these problems. This guide breaks down the process of accessing a website, especially within a corporate environment with strict network policies. My main objective with this write-up is to keep it simple to follow, so that it is accessible to anyone who is new to these concepts or has found them unclear.

Understanding the Basics of Network Access

When you type a domain name like google.com into your browser, the first step is to translate that name into an Internet Protocol (IP) address. This is handled by the Domain Name System (DNS). Your computer asks a configured DNS server for the IP address corresponding to the domain. In a corporate setting, the company's DNS server may refuse to resolve certain external domains, which prevents direct access to those websites.
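To see this first step in isolation, here is a minimal Python sketch that asks the system's configured resolver for the addresses behind a name. The function name resolve is my own choice for illustration; the lookup itself uses the standard library's socket.getaddrinfo.

```python
import socket


def resolve(hostname):
    """Return the unique IP addresses the system resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    except socket.gaierror as err:
        # This is what a refused or failed DNS lookup looks like to a program.
        print(f"DNS lookup failed for {hostname}: {err}")
        return []
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # "localhost" resolves without any network access.
    print(resolve("localhost"))
```

Swapping "localhost" for an external name such as google.com on a restricted network is a quick way to check whether the corporate DNS server answers for it at all.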

Even if you manage to find the correct IP address, you still need a route to reach it. A route is a path that directs network traffic from your system to the destination. Without a valid route, your system can't send data to the IP address you've found. You can check the routing table on your system using commands like ip route or netstat -rn.

Example:

To see the specific route to a destination IP, you can use the command:

Linux:

ip route get <ip>

macOS:

route get <ip>

Windows (PowerShell):

Find-NetRoute -RemoteIPAddress "<ip>"
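The commands above ask the operating system directly. As a rough cross-platform alternative, the sketch below uses a common trick: calling connect() on a UDP socket sends no packets, but it makes the kernel pick a route and a source address. The function name source_address_for is hypothetical, and the sketch assumes IPv4 only.

```python
import socket


def source_address_for(dest_ip, port=9):
    """Return the local IP the OS would use to reach dest_ip, or None if no route."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # connect() on a UDP socket sends nothing; it only asks the
            # kernel to choose a route and a source address.
            sock.connect((dest_ip, port))
        except OSError:
            # Typically "Network is unreachable": no matching route.
            return None
        return sock.getsockname()[0]


if __name__ == "__main__":
    print(source_address_for("127.0.0.1"))
```

Note that a result only means the local routing table has a matching entry (often just the default route). It does not prove that the destination is reachable.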

Accessing Websites Through a Proxy

Since direct internet access is often restricted in corporate networks, you'll likely need to use a proxy server. An HTTP proxy acts as an intermediary, forwarding your requests to the internet on your behalf. Keep in mind that an HTTP proxy only handles HTTP and HTTPS traffic.

To use a proxy, your applications, like a web browser, must be configured to send their requests through it.

Proxy Configuration

Modern web browsers like Firefox and Chrome can be configured to use a proxy through their settings or by using a Proxy Auto-Configuration (PAC) file. A PAC file contains logic that automatically selects the appropriate proxy server based on the destination URL.

For other applications, you can typically configure a proxy by setting environment variables such as http_proxy and https_proxy.

Example:

To set an HTTP proxy using an environment variable, you would use:

export http_proxy=http://127.0.0.1:3128
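To check what an application actually picks up from the environment, here is a small Python sketch based on urllib.request.getproxies from the standard library. The helper name proxy_for is my own, and the proxy address simply mirrors the example above.

```python
import os
import urllib.request


def proxy_for(scheme):
    """Return the proxy URL Python's urllib would use for a scheme, or None."""
    return urllib.request.getproxies().get(scheme)


# Set the variable for this process only, mirroring the shell example.
os.environ["http_proxy"] = "http://127.0.0.1:3128"

print("http  ->", proxy_for("http"))
print("https ->", proxy_for("https"))
```

Each scheme has its own variable, so setting http_proxy alone does not configure a proxy for https:// URLs.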

When your client application is configured with a proxy, it sends its request (e.g., for google.com) to the proxy server instead of directly to the website. The proxy server then fetches the content and returns it to your client, provided that the request complies with the network's security policies, if any.
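It can help to see what "sends its request to the proxy" looks like on the wire. The sketch below builds the first line a client typically sends to an HTTP proxy: for plain HTTP the full URL goes into the request line, and for HTTPS the client asks the proxy to open a tunnel with CONNECT. The function name first_line_sent_to_proxy is hypothetical, and real clients send further headers after this line.

```python
from urllib.parse import urlsplit


def first_line_sent_to_proxy(url):
    """Return the request line a client typically sends to an HTTP proxy."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # HTTPS: ask the proxy for a tunnel; TLS then runs inside it.
        port = parts.port or 443
        return f"CONNECT {parts.hostname}:{port} HTTP/1.1"
    if parts.scheme == "http":
        # Plain HTTP: the absolute URL goes in the request line.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        return f"GET http://{parts.netloc}{path} HTTP/1.1"
    raise ValueError(f"not an HTTP or HTTPS URL: {url}")


if __name__ == "__main__":
    print(first_line_sent_to_proxy("http://google.com"))
    print(first_line_sent_to_proxy("https://google.com"))
```

Either way, the proxy sees the destination host before deciding whether to forward the request, which is where the network's policies get applied.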

Bypassing the Proxy

Sometimes, you need to access resources on your internal network that do not require a proxy. For these cases, you can use the no_proxy environment variable. This setting tells your client applications to bypass the proxy for specific hostnames or domains.

Example:

If you need to access a private server like private-daily-checkin.company.net directly, you can set no_proxy to avoid sending that request to the proxy server.
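To make the matching concrete, here is a simplified Python sketch of how a client might decide whether a host is covered by no_proxy. The function name bypasses_proxy and the exact rules (comma-separated entries, exact or domain-suffix match, "*" for everything) are assumptions for illustration; real tools differ in the details.

```python
def bypasses_proxy(host, no_proxy):
    """Return True if host matches an entry in a comma-separated no_proxy value."""
    host = host.lower()
    for entry in no_proxy.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry == "*":
            # A lone "*" conventionally means "never use the proxy".
            return True
        entry = entry.lstrip(".")
        # Match the host itself or any subdomain of the entry.
        if host == entry or host.endswith("." + entry):
            return True
    return False


if __name__ == "__main__":
    no_proxy = "localhost,private-daily-checkin.company.net"
    print(bypasses_proxy("private-daily-checkin.company.net", no_proxy))
    print(bypasses_proxy("google.com", no_proxy))
```

With that value, the private server is reached directly while google.com still goes through the proxy.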

Keep in mind that some applications may only recognise the lowercase versions of these variables (http_proxy, https_proxy, no_proxy), while others may also support uppercase (HTTP_PROXY, HTTPS_PROXY, NO_PROXY). This variability is common and is an important detail to remember when troubleshooting.
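When troubleshooting, a quick first step is to list which of these variables are set and in which letter case. A minimal Python sketch follows; the names PROXY_VARS and proxy_variables are my own.

```python
import os

PROXY_VARS = ("http_proxy", "https_proxy", "no_proxy")


def proxy_variables(environ=None):
    """Return every proxy variable that is set, in either letter case."""
    environ = os.environ if environ is None else environ
    found = {}
    for name in PROXY_VARS:
        for variant in (name, name.upper()):
            if variant in environ:
                found[variant] = environ[variant]
    return found


if __name__ == "__main__":
    for name, value in proxy_variables().items():
        print(f"{name}={value}")
```

Environment variable names are case-insensitive on Windows, so the distinction matters mainly on Linux and macOS.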

What to Expect with Network Commands

Given a scenario where your company laptop can only access an external website via the corporate HTTP proxy, let's analyze what happens with various network commands. You want to troubleshoot connectivity to a server hosted by a sister company, located at server.sistercompany.net.

Consider the following commands:

ssh <user>@server.sistercompany.net
telnet server.sistercompany.net 22
telnet server.sistercompany.net 80
nc server.sistercompany.net 22
nc server.sistercompany.net 80
curl -v telnet://server.sistercompany.net
curl -v telnet://server.sistercompany.net:22

The commands ssh, telnet, and nc (netcat) do not inherently support HTTP proxies. They are designed for direct connections to a specific IP address and port. Since your network requires an HTTP proxy for external access, these commands will likely fail because they will NOT use the necessary proxy to reach the destination.
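What these tools do at the network level can be sketched in a few lines of Python: open a TCP connection straight to the host and port, without consulting any proxy setting. The function name can_connect and the 3 second timeout are assumptions for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try a direct TCP connection, as telnet or nc would, ignoring any proxy."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, unreachable networks
        # and timeouts alike.
        return False
```

In the scenario described here, calling can_connect("server.sistercompany.net", 22) from the company laptop would be expected to return False, because nothing in this code path knows about the proxy.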

The curl commands are a bit different. By default, curl uses proxy settings when they are provided (e.g., command-line options or environment variables). However, curl selects a proxy per URL scheme: http_proxy applies to http:// URLs and https_proxy to https:// URLs, so neither is consulted for a telnet:// URL. With only those two variables set, curl attempts a direct connection, and these commands are also likely to fail. An HTTP proxy is meant for HTTP/HTTPS traffic, not for generic TCP connections like those initiated by ssh or telnet.

Once you have the basic idea, you can better appreciate that the connectivity methods mentioned above are in fact distinct protocols. Do take a look at the specification documents, or at least skim through them. Do not be afraid :)

HTTP protocol RFC: Hypertext Transfer Protocol -- HTTP/1.1 (RFC 2616, since superseded by RFC 9110 and RFC 9112)
SSH protocol RFC: The Secure Shell (SSH) Transport Layer Protocol (RFC 4253)
TELNET protocol RFC: TELNET PROTOCOL SPECIFICATION (RFC 854)