“How reliably can your site exclude headless browsers?”
Many IT professionals don’t know the answer to this question. Even worse, many believe that their networks are secure against these bots, when the opposite is actually true.
A ‘headless browser’ is a web browser that runs programmatically, often from a script like the example code above (and without a GUI, therefore ‘headless’). Almost always, a headless browser does not represent a real human visitor.
If your site cannot reliably detect headless browsers, then it cannot reliably exclude them. And that means your site is open to these forms of attack:
- Content scraping
- Brute force login attacks
- Click fraud
…and many other hostile activities.
Note that headless browsers have many legitimate uses (such as web application testing). And some (such as search engine crawlers) are probably very welcome on your site.
But what about the rest? What about the bots that were sent to your site for malicious purposes?
Modern headless browsers are sophisticated, and some can mimic human web visitors convincingly. Unfortunately, many web security products have not kept up with these advances and still rely on obsolete detection methods.
Previous-generation approaches to headless browser detection relied on inspecting the incoming request. For example, the server would verify that the User-Agent string was valid, that the HTTP headers were in the proper order, that certain known bot APIs were not exposed, and so on.
However, most of these detection techniques can be evaded by spoofing and other forms of obfuscation.
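To make the limitation concrete, here is a minimal Python sketch of request inspection. It is an illustration only, not any product's actual implementation: the function name `inspect_request` and the `EXPECTED_HEADERS` list are assumptions made for this example, and header-order checking is left out for brevity. The `HeadlessChrome` and `PhantomJS` markers are substrings that those tools include in their default User-Agent strings.

```python
from typing import List, Tuple

# Substrings found in the default User-Agent of some headless browsers.
HEADLESS_UA_MARKERS = ("HeadlessChrome", "PhantomJS")

# Hypothetical list: headers this sketch expects a regular browser to send.
EXPECTED_HEADERS = ("accept", "accept-language")


def inspect_request(headers: List[Tuple[str, str]]) -> List[str]:
    """Return the reasons a request looks automated (empty list if none)."""
    reasons = []
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers}

    user_agent = lowered.get("user-agent", "")
    if not user_agent:
        reasons.append("missing User-Agent")
    elif any(marker in user_agent for marker in HEADLESS_UA_MARKERS):
        reasons.append("headless marker in User-Agent")

    for name in EXPECTED_HEADERS:
        if name not in lowered:
            reasons.append(f"missing {name} header")

    return reasons


# A headless browser that keeps its default settings is caught.
naive_bot = [
    ("User-Agent", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                   "(KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36"),
    ("Accept", "*/*"),
]

# The same bot with spoofed headers passes every check.
spoofed_bot = [
    ("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                   "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"),
    ("Accept", "text/html,application/xhtml+xml"),
    ("Accept-Language", "en-US,en;q=0.9"),
]
```

In this sketch the naive bot is flagged twice, while the spoofed bot produces no findings at all. Changing a few header values is all it takes, which is why request inspection alone is not enough.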
Reblaze has pioneered several new methods of detecting and excluding unwanted bots. Legacy techniques like request inspection are still used, but we attack the problem from multiple angles simultaneously.
Along with verifying the validity of requests and interrogating the agent making the request, the Reblaze platform goes much further. For example, it analyzes and tracks the behavior of the requester over time, including traffic parameters, pace, diversity of MIME types, and more.
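As a rough illustration of behavioral analysis in general, the Python sketch below tracks request pace and MIME type diversity per client. It does not represent the Reblaze implementation: the `BehaviorTracker` class, its thresholds, and the client identifiers are all hypothetical, chosen only to show the idea that a visitor who requests pages far faster than a person could, or who never loads anything but one content type, stands out over time.

```python
from collections import defaultdict, deque


class BehaviorTracker:
    """Keeps a sliding window of recent requests for each client."""

    def __init__(self, window=20, max_rate=5.0, min_mime_types=2):
        # All three thresholds are hypothetical values for this sketch.
        self.window = window                  # requests remembered per client
        self.max_rate = max_rate              # requests per second tolerated
        self.min_mime_types = min_mime_types  # distinct MIME types expected
        self._events = defaultdict(lambda: deque(maxlen=window))

    def record(self, client_id, timestamp, mime_type):
        """Store one request: when it arrived and what content type it fetched."""
        self._events[client_id].append((timestamp, mime_type))

    def is_suspicious(self, client_id):
        """Flag clients whose pace is too fast or whose MIME mix is too narrow."""
        events = self._events.get(client_id)
        if events is None or len(events) < self.window:
            return False  # not enough history to judge

        elapsed = events[-1][0] - events[0][0]
        rate = (len(events) - 1) / elapsed if elapsed > 0 else float("inf")
        mime_types = {mime for _, mime in events}
        return rate > self.max_rate or len(mime_types) < self.min_mime_types


tracker = BehaviorTracker()

# A scraper: 20 HTML pages, one every 50 milliseconds.
for i in range(20):
    tracker.record("scraper", i * 0.05, "text/html")

# A human-like visitor: one request every 2 seconds, mixed content types.
for i in range(20):
    mime = ("text/html", "text/css", "image/png")[i % 3]
    tracker.record("visitor", i * 2.0, mime)
```

Here `tracker.is_suspicious("scraper")` is true and `tracker.is_suspicious("visitor")` is false. Unlike a header check, this signal cannot be spoofed by editing a request. The bot would have to slow down and fetch the same mix of resources a real browser does, which makes it far less efficient.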
Increasingly, our customers are encountering bots that would not be detectable using older methods alone. If your web security isn’t using the full arsenal of identification techniques, you have some significant vulnerabilities.