Cloudflare scrape script

by Faded - 05-08-2015, 01:34 PM
A simple Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented as a Requests adapter. Cloudflare changes their techniques periodically, so I will update this repo frequently.
This can be useful if you wish to scrape or crawl a website protected by Cloudflare. Cloudflare's anti-bot page currently just checks whether the client supports JavaScript, though they may add additional techniques in the future.

Because Cloudflare keeps changing and hardening its protection page, cloudflare-scrape now uses PyExecJS, a Python wrapper around multiple JavaScript runtime engines. This lets the script impersonate a regular web browser without explicitly parsing and converting Cloudflare's JavaScript obfuscation techniques.
The only supported JavaScript engines at this time are Node.js and V8 (with or without the PyV8 module). This is due to potential security concerns with the other engines.
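As a rough illustration of the engine requirement above, here is a minimal sketch of how a script might check up front that a supported engine is installed. This is not cloudflare-scrape's own code: the function name `find_supported_engine`, the executable names `node` and `nodejs`, and the injectable `which` / `has_module` parameters are assumptions made for the example. It also only looks for Node.js and the PyV8 module, not a standalone V8 install.

```python
import importlib.util
import shutil


def find_supported_engine(which=shutil.which, has_module=None):
    """Return the name of the first supported JavaScript engine found, or None.

    Hypothetical helper: `which` and `has_module` are injectable so the
    lookup can be exercised without touching the real system.
    """
    if has_module is None:
        def has_module(name):
            return importlib.util.find_spec(name) is not None

    # Assumption: Node.js is on PATH as "node" (or "nodejs" on some distros).
    for exe in ("node", "nodejs"):
        if which(exe):
            return "Node.js"

    # Fall back to V8 through the PyV8 module, if it is importable.
    if has_module("PyV8"):
        return "V8 (PyV8)"

    return None


if __name__ == "__main__":
    print(find_supported_engine() or "no supported engine found")
```

Failing early with a clear message is usually friendlier than letting the first request die somewhere inside the JavaScript evaluation step.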

Note: This only works when the regular Cloudflare anti-bot page is enabled (the "Checking your browser before accessing..." loading page). If there is a reCAPTCHA challenge, you're out of luck. Thankfully, the JavaScript check page is much more common.
For reference, this is the default message Cloudflare uses for these sorts of pages:

    Checking your browser before accessing
    This process is automatic. Your browser will redirect to your requested content shortly.
    Please allow up to 5 seconds...
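To make the distinction between the two page types concrete, here is a minimal sketch that sorts a response body into "JavaScript check", "CAPTCHA", or "normal". It is only an illustration, not how cloudflare-scrape detects the page: the function name `classify_response` is made up, the challenge marker is simply the default message quoted above, and treating any body containing the string "recaptcha" as a CAPTCHA page is an assumption.

```python
# Default challenge text, as quoted in the post above.
CHALLENGE_MARKER = "Checking your browser before accessing"


def classify_response(body):
    """Classify an HTML body as 'captcha', 'js_challenge', or 'normal'.

    Hypothetical helper; the "recaptcha" substring test is an assumption
    about what a CAPTCHA challenge page contains.
    """
    text = body.lower()
    if "recaptcha" in text:
        return "captcha"        # cannot be solved automatically
    if CHALLENGE_MARKER.lower() in text:
        return "js_challenge"   # the page this module is meant to handle
    return "normal"


if __name__ == "__main__":
    page = "<h1>Checking your browser before accessing example.com.</h1>"
    print(classify_response(page))
```

A site can customize the challenge text, so a check like this would miss pages that do not use the default message.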
Any script using cloudflare-scrape will sleep for 5 seconds on its first visit to any site with the Cloudflare anti-bot page enabled. No delay occurs on requests after the first.
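The first-visit behaviour can be pictured with a small sketch like the one below. It is an assumption about the general shape, not the module's actual implementation: the class name `FirstVisitDelay`, tracking by hostname, and the injectable `sleep` function are all invented for the example, and the caller is expected to invoke it only for sites that actually show the anti-bot page.

```python
import time
from urllib.parse import urlsplit


class FirstVisitDelay:
    """Sleep once per host, then never again (hypothetical helper)."""

    def __init__(self, delay=5.0, sleep=time.sleep):
        self.delay = delay
        self._sleep = sleep            # injectable so the wait can be skipped in tests
        self._cleared_hosts = set()    # hosts whose first visit has already been delayed

    def wait_if_first_visit(self, url):
        """Return True if this call slept, False if the host was already seen."""
        host = urlsplit(url).hostname
        if host in self._cleared_hosts:
            return False
        self._sleep(self.delay)
        self._cleared_hosts.add(host)
        return True


if __name__ == "__main__":
    gate = FirstVisitDelay(sleep=lambda seconds: print("sleeping", seconds))
    gate.wait_if_first_visit("http://example.com/page1")  # sleeps
    gate.wait_if_first_visit("http://example.com/page2")  # no delay
```

In practice this means a crawl over many pages of one site pays the 5 seconds once, while a crawl over many different protected sites pays it once per site.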
