HTTP Proxy in Python: A Guide to Building One

Written by: Maria Kazarez

You might have used a proxy server before, either for online privacy or to access content that is blocked in your country. But have you paid attention to what type of proxy you used? Proxies can, in fact, be divided into HTTP and SOCKS proxies. HTTP proxies work over the HTTP protocol and are essential for users who want to retrieve information through a web browser. SOCKS proxies, by contrast, are more general-purpose: they are not limited to a particular set of network protocols, which gives their users more flexibility.

In this article, we take a closer look at HTTP proxies: how they work and what they are good for.

In simple terms, an HTTP proxy creates a tunnel between the client (browser) and a web server. The client can connect to a server anonymously and securely through the proxy server, which helps ensure privacy and security on the web. Imagine Bill, a business owner who wants to know what his competitors are up to. He wants to do it quickly and seamlessly, so he uses proxies. Bill sends his request for competitors’ sales data to a proxy server, which in turn uses other existing IPs to send multiple requests to many targeted websites at a time and then returns the collected data to Bill.


In this scenario, the targeted websites are unaware that it was Bill who received all that data. In addition, the proxy server creates a secure channel of communication for Bill, where hackers cannot easily intercept the data sent and received.

Let’s look at other benefits of using proxies. 

What are the Reasons for Using Proxies?

To perform confidential tasks anonymously

As mentioned earlier, an HTTP proxy hides the real IP address of a user. The anonymity HTTP proxies provide means that a website cannot obtain the client’s real IP address. Therefore, a client/user can perform sensitive tasks without worrying about a potential spy keeping an eye on them. 

For example, suppose a spy wants to learn what your company is developing by tracking your web traffic. That would be difficult or impossible if your company is using a proxy.

To control the process of Internet usage by employees

Companies can control the networks and websites their employees visit by using proxy servers. When a network is accessed through a proxy, network administrators control which devices have access to the network and which sites those devices can visit. They can record which websites are being accessed and block undesirable content, as well as any sites that management doesn’t want employees to use on company time. Many security officers use this to monitor for potential illegal activity or security breaches.

To increase corporate security

Hackers are another growing concern for many companies. Every company worries about data breaches because they are costly in terms of both monetary loss and public image.

A proxy server reduces the chances of a breach because it adds a layer of security between your servers and outside traffic. Proxy servers also act as a buffer: they face the internet and relay requests from IPs outside the network.


Even if hackers gain access to your proxy server, they will still have a hard time reaching the servers/computers running your website because of the additional layer of security created by your proxy server.

To save bandwidth and increase speeds

Due to all the extra work accomplished in the background by the proxy servers, most people assume they slow down internet speeds. However, this is not always true.

Proxy servers can easily be used to increase speed and save bandwidth on a network by compressing traffic, caching files and web pages accessed by multiple users, as well as stripping ads from websites. This frees up precious bandwidth on busy networks, so your team can access needed resources quickly and easily. 

Steps of Building an HTTP Proxy in Python

An HTTP proxy can simply be purchased from a company like SOAX to give you all the above benefits. But if you are interested in building your own, here is a simple step-by-step method for building one in Python:

Import the Libraries

Let’s start by importing the following libraries: an HTTP server module, a socket server module, and urllib. In Python 3 these are http.server, socketserver, and urllib.request (SimpleHTTPServer, SocketServer, and urllib in Python 2). The two server modules listen for incoming requests, while urllib fetches the target webpage.

Let’s also initialize and get the port. 
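As a minimal sketch of this step, assuming Python 3, the imports and port setup might look like the block below. The PROXY_PORT environment variable and the default of 9097 are hypothetical choices made for illustration, not values taken from the article.

```python
import http.server      # provides SimpleHTTPRequestHandler
import os
import socketserver     # provides the TCP server classes
import urllib.request   # fetches the target webpage

# Assumption: the port is read from a hypothetical PROXY_PORT environment
# variable and falls back to an arbitrary default.
PORT = int(os.environ.get("PROXY_PORT", "9097"))
```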


Get the Requests 

Then we inherit from SimpleHTTPRequestHandler to create our own proxy handler. Its do_GET method is called for every GET request.
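A skeleton of such a handler, assuming Python 3, could look like this. The class name MyProxy is hypothetical, and for now the handler only echoes the requested path back to the client so that you can see what do_GET receives.

```python
import http.server


class MyProxy(http.server.SimpleHTTPRequestHandler):
    """Skeleton handler; MyProxy is a hypothetical class name."""

    def do_GET(self):
        # http.server dispatches every GET request to this method.
        # self.path holds the request target exactly as the client sent it.
        body = f"requested: {self.path}\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

Requesting /http://example.com/ from a server that uses this handler returns the path unchanged, including its leading slash.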


Remove the URL Slash

The path that do_GET receives from the browser starts with a slash (/), so we need to remove it to recover the target URL.
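One way to sketch this step is a small helper; target_url is a hypothetical name used only for illustration. Inside a handler, the same effect is usually achieved by slicing the path directly.

```python
def target_url(request_path: str) -> str:
    """Drop the single leading slash that precedes the target URL."""
    # "/http://example.com/" becomes "http://example.com/"
    if request_path.startswith("/"):
        return request_path[1:]
    return request_path


print(target_url("/http://example.com/"))  # → http://example.com/
```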


Send the Headers

The next step is to send the headers, as browsers need them to report a successful fetch with the HTTP status code 200. Finally, the urllib library fetches the URL, and the handler’s copyfile function writes the result back to the browser.
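Put into a handler, this step might look like the sketch below, assuming Python 3 and the hypothetical class name MyProxy. Two details are additions that the article does not describe: a scheme check, because urlopen also accepts file:// URLs and would otherwise expose local files, and a 502 response when the fetch fails.

```python
import http.client
import http.server
import urllib.request


class MyProxy(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        url = self.path[1:]  # strip the leading slash

        # Addition: only proxy web URLs, never local files.
        if not url.startswith(("http://", "https://")):
            self.send_error(400, explain="Only http:// and https:// URLs are supported")
            return

        try:
            upstream = urllib.request.urlopen(url, timeout=30)
        except (OSError, ValueError, http.client.HTTPException):
            # Addition: report a failed fetch instead of an empty 200.
            self.send_error(502, explain="Could not fetch the target URL")
            return

        with upstream:
            self.send_response(200)  # tell the browser the fetch succeeded
            self.end_headers()
            # copyfile streams the fetched page back to the browser.
            self.copyfile(upstream, self.wfile)
```

Opening the upstream connection before sending the status line is a deliberate ordering choice here: once 200 has been sent, a fetch failure can no longer be reported to the client.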


Use the ForkingTCPServer Mode

To run the proxy, we create a ForkingTCPServer and pass it our handler class, which then handles the incoming requests.
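A sketch of the server setup, assuming Python 3 on a POSIX system (ForkingTCPServer relies on os.fork and is not defined on Windows). The names ForkingProxyServer and serve and the port 9097 are hypothetical, and the stock SimpleHTTPRequestHandler only stands in for the proxy handler class.

```python
import http.server
import socketserver

PORT = 9097  # hypothetical port, not taken from the article


class ForkingProxyServer(socketserver.ForkingTCPServer):
    """ForkingTCPServer handles each connection in a forked child process."""

    # Lets the port be rebound immediately after a restart.
    allow_reuse_address = True


def serve(handler=http.server.SimpleHTTPRequestHandler, port=PORT):
    """Serve until interrupted with Ctrl+C; pass the proxy handler as `handler`."""
    with ForkingProxyServer(("", port), handler) as httpd:
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            print("Shutting down")
```

Calling serve with the proxy handler class starts the server; it keeps running until it is interrupted.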


The whole code put together looks like this:
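What follows is a Python 3 sketch assembled from the steps described in this article, not the author’s exact listing. The names MyProxy, ForkingProxyServer, and main, the port 9097, the scheme check, and the 502 fallback are all assumptions made for illustration.

```python
import http.client
import http.server
import socketserver
import urllib.request

PORT = 9097  # hypothetical port


class MyProxy(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        url = self.path[1:]  # remove the leading slash
        # Addition: urlopen() also accepts file:// URLs, so restrict schemes.
        if not url.startswith(("http://", "https://")):
            self.send_error(400, explain="Only http:// and https:// URLs are supported")
            return
        try:
            upstream = urllib.request.urlopen(url, timeout=30)
        except (OSError, ValueError, http.client.HTTPException):
            self.send_error(502, explain="Could not fetch the target URL")
            return
        with upstream:
            self.send_response(200)
            self.end_headers()
            self.copyfile(upstream, self.wfile)


class ForkingProxyServer(socketserver.ForkingTCPServer):
    allow_reuse_address = True  # rebind the port quickly after a restart


def main(port=PORT):
    """Run the proxy until interrupted with Ctrl+C (POSIX only)."""
    with ForkingProxyServer(("", port), MyProxy) as httpd:
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            pass
```

After calling main(), and assuming the default port above, opening http://localhost:9097/http://example.com/ in a browser should return the example.com page through the proxy. Note that this sketch does not forward response headers such as Content-Type, so the browser has to guess the content type.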


This is great as a learning exercise but, unfortunately, it is easy to see that the proxy server itself is prone to getting blocked because it uses a single IP. If your ultimate goal is a proxy capable of handling thousands of fetches daily, you should consider using a professional rotating proxy service. Otherwise, your IP will constantly be blocked by automatic location, usage, and bot detection algorithms.

SOAX’s rotating proxy service, with a pool of more than 191 million IPs worldwide (excluding the State of Texas, USA), can solve all IP-blocking concerns and give you an edge over the competition.
