---
title: HTTP Server & Proxy
description: Raw sockets, no libraries
section: craft
tags: [project, systems-programming]
genre: reference
stability: stable
lastUpdated: 2026-04-19
url: https://fardiniqbal.com/docs/craft/projects/http-server-and-proxy
---


HTTP web server and caching proxy built from scratch with raw sockets.
No external libraries, no frameworks, no HTTP parsers — just Python's
standard `socket` module and hand-rolled HTTP/1.1 parsing.

## What it is [#what-it-is]

Two cooperating Python programs that implement the HTTP/1.1
request/response cycle at the socket layer. `webserver.py` is an origin
server on `127.0.0.1:6789` that parses `GET` requests, resolves paths
against the working directory, detects MIME types, and streams file
bytes back. `proxyserver.py` is a forwarding HTTP proxy on
`127.0.0.1:8888` with an on-disk cache: it accepts absolute-URI `GET`
requests, opens a fresh TCP socket to the origin on port 80, caches the
response under `cache/`, and serves later requests for the same URL
from disk without touching the network. Everything lives below the
`http.server` / `urllib` abstractions: request lines are tokenized by
hand, response headers are concatenated strings, and sockets are opened
and closed explicitly.
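
As a rough illustration of what tokenizing a request line by hand can look like, here is a minimal sketch. The function name `parse_request_line` and its error handling are assumptions for illustration, not code taken from `webserver.py`.

```python
def parse_request_line(raw: bytes) -> tuple[str, str, str]:
    """Split the first line of a raw HTTP request into (method, target, version)."""
    # ISO-8859-1 maps every byte to one character, so decoding never fails.
    text = raw.decode("iso-8859-1")
    # The request line is everything before the first CRLF.
    request_line = text.split("\r\n", 1)[0]
    parts = request_line.split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {request_line!r}")
    method, target, version = parts
    # Only GET is supported, matching the description above.
    if method != "GET":
        raise ValueError(f"unsupported method: {method}")
    return method, target, version
```

For a request beginning with `GET /index.html HTTP/1.1`, this returns the three tokens as a tuple; anything else raises `ValueError`, which a caller could translate into an error response.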

## By the numbers [#by-the-numbers]

| Metric                | Value                            |
| --------------------- | -------------------------------- |
| Servers               | 2 (origin + forwarding proxy)    |
| External dependencies | 0 (standard library only)        |
| Total LOC             | \~390 (176 server + 218 proxy)   |
| Recv buffer           | 1024 B (server) / 4096 B (proxy) |
| Upstream timeout      | 10 s                             |
| Listen backlog        | 5                                |
| Term                  | Feb - Mar 2025                   |
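
The socket parameters in the table map onto a handful of standard `socket` calls. The sketch below shows one plausible way to wire them up; the constant and function names (`BACKLOG`, `make_listener`, `connect_upstream`) are hypothetical and not taken from the project source.

```python
import socket

BACKLOG = 5              # listen backlog from the table
SERVER_RECV_BYTES = 1024 # per-recv buffer size for the origin server
PROXY_RECV_BYTES = 4096  # per-recv buffer size for the proxy
UPSTREAM_TIMEOUT_S = 10  # seconds before an upstream socket gives up


def make_listener(host: str, port: int) -> socket.socket:
    """Create a listening TCP socket that can rebind quickly after a restart."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(BACKLOG)
    return listener


def connect_upstream(hostname: str, port: int = 80) -> socket.socket:
    """Open a fresh TCP connection to an origin, with a timeout on all I/O."""
    upstream = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    upstream.settimeout(UPSTREAM_TIMEOUT_S)
    upstream.connect((hostname, port))
    return upstream
```

A server loop would then call `listener.accept()`, read with `conn.recv(SERVER_RECV_BYTES)`, and reply with `conn.sendall(...)` before closing the connection.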

## Architecture [#architecture]

```
Client (curl / browser)
   |
   |  HTTP/1.1 GET
   v
+--------------------+         +----------------------+
|  webserver.py      |         |  proxyserver.py      |
|  127.0.0.1:6789    |         |  127.0.0.1:8888      |
|                    |         |                      |
|  recv -> parse ->  |         |  recv -> parse ->    |
|  resolve file ->   |         |  cache lookup ->     |
|  mimetype ->       |         |    HIT: serve cached |
|  200 OK + bytes    |         |    MISS: forward ->  |
|   (or 404)         |         |      connect :80 ->  |
|                    |         |      recv -> cache  -|----> origin
+--------------------+         +----------------------+       (port 80)
```
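
The origin server's half of the diagram (resolve file, detect MIME type, send `200 OK` or `404`) can be sketched as a single function. This is an illustration under assumptions: the name `build_response`, the `index.html` default, the 404 body, and the check that rejects paths escaping the root are all choices made here, not confirmed details of `webserver.py`.

```python
import mimetypes
import os


def build_response(target: str, root: str) -> bytes:
    """Return a complete HTTP/1.1 response for a GET target served from root."""
    # Assumption: "/" maps to index.html; the leading slash is stripped.
    relative = target.lstrip("/") or "index.html"
    root_real = os.path.realpath(root)
    path = os.path.realpath(os.path.join(root_real, relative))
    # Assumption: refuse any path that resolves outside the served directory.
    inside = path.startswith(root_real + os.sep)
    if inside and os.path.isfile(path):
        with open(path, "rb") as f:  # binary mode keeps images intact
            body = f.read()
        status = "200 OK"
        content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    else:
        body = b"<h1>404 Not Found</h1>"
        status = "404 Not Found"
        content_type = "text/html"
    headers = (
        f"HTTP/1.1 {status}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + body
```

The returned bytes can be passed straight to `sendall`, since the headers and body are already joined by the blank line that HTTP requires.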

## Key features [#key-features]

* **Raw socket programming** — `AF_INET` + `SOCK_STREAM` with
  `SO_REUSEADDR`, explicit `bind` / `listen(5)` / `accept` loop,
  `sendall` for complete response delivery, and `settimeout(10)` on
  upstream connections to prevent hangs.
* **Hand-rolled HTTP/1.1 parsing** — splits requests on `\r\n`,
  tokenizes the request line into method + URI + version, validates
  `GET`, and constructs response headers as formatted strings with
  `Content-Type`, `Content-Length`, and `Connection: close`.
* **MIME type detection** — `mimetypes.guess_type` maps extensions to
  types (HTML, text, PNG, JPEG), falling back to
  `application/octet-stream`. Files are read in binary mode (`rb`) so
  images pass through unmodified.
* **On-disk proxy cache** — cache key derived from URL with `/`
  replaced by `_`. A cache HIT reads the stored response from disk and
  skips the upstream fetch; a cache MISS fetches from the origin,
  persists the raw bytes, and forwards them to the client. A request
  carrying `Cache-Control: no-cache` bypasses the cache.
* **Upstream forwarding** — rewrites absolute-URI requests as
  `GET <path> HTTP/1.0\r\nHost: <hostname>\r\n...`, sets
  `Accept-Encoding: identity` to avoid compressed responses, and sends
  a custom `User-Agent: CSE310-Proxy`.
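
The cache and forwarding behavior described in the last two bullets can be sketched as follows. Everything named here (`split_absolute_uri`, `cache_path`, `fetch`, and the `fetch_origin` callback) is hypothetical, and two details are assumptions: the `http://` scheme is stripped before the key is derived, and the upstream headers are sent in the order shown.

```python
import os

CACHE_DIR = "cache"


def split_absolute_uri(url: str) -> tuple[str, str]:
    """Split 'http://host/path' into (hostname, path) without urllib."""
    rest = url[len("http://"):] if url.startswith("http://") else url
    hostname, slash, path = rest.partition("/")
    return hostname, (slash + path if slash else "/")


def cache_path(url: str, cache_dir: str = CACHE_DIR) -> str:
    """Derive the cache file path: drop the scheme, then replace '/' with '_'."""
    rest = url[len("http://"):] if url.startswith("http://") else url
    return os.path.join(cache_dir, rest.replace("/", "_"))


def build_upstream_request(url: str) -> bytes:
    """Rewrite an absolute-URI request as an origin-form HTTP/1.0 request."""
    hostname, path = split_absolute_uri(url)
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {hostname}",
        "Accept-Encoding: identity",  # ask the origin not to compress
        "User-Agent: CSE310-Proxy",
        "",
        "",                           # blank line terminates the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def wants_fresh_copy(request_text: str) -> bool:
    """True when the client request carries Cache-Control: no-cache."""
    for line in request_text.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "cache-control" and "no-cache" in value.lower():
            return True
    return False


def fetch(url, request_text, fetch_origin, cache_dir=CACHE_DIR):
    """Return (response_bytes, was_cache_hit) for one proxied GET."""
    path = cache_path(url, cache_dir)
    if not wants_fresh_copy(request_text) and os.path.isfile(path):
        with open(path, "rb") as f:   # HIT: serve from disk, no network
            return f.read(), True
    # MISS (or bypass): fetch_origin sends the request and returns raw bytes.
    response = fetch_origin(build_upstream_request(url))
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as f:       # persist raw bytes for next time
        f.write(response)
    return response, False
```

Passing the network step in as `fetch_origin` keeps the hit/miss branching testable without a live origin; in a real proxy that callback would open the upstream socket, `sendall` the request, and `recv` until the origin closes the connection.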

## What makes it stand out [#what-makes-it-stand-out]

* **Below the abstraction** — no Flask, no `http.server`, no `urllib`,
  no `requests`. The wire format is visible in every line.
* **End-to-end cache semantics** — key derivation, hit/miss branching,
  disk persistence, and `no-cache` bypass, all implemented by hand.
* **Validated against real traffic** — tested with Chrome, Firefox, and
  `curl -x` against `gaia.cs.umass.edu/wireshark-labs/`, the same
  HTTP-only targets used in the Kurose & Ross Wireshark labs.

## Stack [#stack]

| Layer          | Technology                       |
| -------------- | -------------------------------- |
| Language       | Python 3                         |
| Networking     | `socket` (standard library)      |
| File I/O       | `os` (standard library)          |
| MIME detection | `mimetypes` (standard library)   |
| HTTP parser    | Hand-rolled string splitting     |
| Cache storage  | Flat files in `cache/` directory |

Built for CSE 310: Computer Networks at Stony Brook University
(Spring 2025). The assignment required implementing the HTTP
request/response cycle without any HTTP abstraction libraries.

## Links [#links]

* **Source:** [https://github.com/FardinIqbal/Web-Server-and-Proxy-Server-](https://github.com/FardinIqbal/Web-Server-and-Proxy-Server-)
