BreakingWAF Technical Analysis

Author: Threat Research Team

Published on December 3, 2024

Threat Research

Summary

Zafran’s research team discovered a pervasive misconfiguration bug in the implementation of popular web application firewall (WAF) services by Akamai, Cloudflare, Fastly, Imperva and others. The impacted WAF vendors are responsible for 90% of protected web applications worldwide.

The misconfiguration can allow threat actors to bypass WAF protections and directly target web applications and load balancers over the Internet. By doing so, attackers may perform distributed denial-of-service (DDoS) attacks on exposed web applications, or exploit vulnerabilities in the apps themselves that would otherwise have been identified or blocked by the WAF. The misconfiguration stems from an architectural weakness of WAF providers that also act as CDN (content delivery network) providers. In the architecture of such CDN/WAF services, protected web applications are expected to validate that the Internet traffic routed to them originates from the CDN/WAF provider. Failure to do so may lead to the discovered bypass.

To assess the pervasiveness of this type of misconfiguration, this research focused on assessing a large sample of domains – the domains of Fortune 1000 companies.

We first identified 700K domains that we were able to map to ~700 of the Fortune 1000 companies. Through a combination of Internet-wide scans and novel fingerprinting techniques, we identified more than 140K domains that were protected by a CDN/WAF provider. From these, we were able to map 8K domains to 36K backend servers which are directly accessible over the Internet and consequently exposed to DDoS attacks.  These findings prove the pervasiveness of the misconfiguration, as the impacted domains are distributed across high percentages of the Fortune 1000 companies (our original sample set). The impacted domains are owned by nearly 40% of Fortune 100 companies, and 20% of Fortune 1000.

This blog offers a technical deep dive into how the research behind these stats was conducted. For a higher-level overview of the impact and mitigation recommendations for this bypass technique, please refer to the Breaking WAF overview blog.

Background - What is a CDN?

Unlike traditional load balancers and web application firewalls (WAFs) that are deployed as physical or virtual applications on the customer’s premises, CDNs are designed as a large network of Internet servers that handle web traffic close to the edge (to the end user), and ultimately route traffic to a web application over the Internet. A majority of web applications today do not expose their front end servers directly to the Internet, and will generally use a CDN service in front of their own servers.

There are a number of companies offering CDN services, the most popular ones being Akamai, Cloudflare, Fastly and Imperva. The main functionality of a CDN is to maintain data centers in diverse geographic locations in order to provide lower latency and higher download speeds for the static content of customer web applications.

However, while the primary service of these companies is CDN, they also offer protective features, which a customer can choose to activate (or not). These are usually:

DDoS protection: DDoS protection features are used to stop DoS traffic at the edge network location, well before it has a chance to hit the actual front end servers. CDN providers gather intelligence about suspicious IPs across their entire network, challenge suspicious requests with CAPTCHAs, and block traffic when requests are deemed malicious.
Web Application Firewall (WAF) protection: WAF capabilities are used to filter requests at the edge network before they have a chance to hit the customer’s servers. These capabilities traditionally search for patterns of common types of web attacks, such as XSS, SQLi, SSRF, etc. They can also include rules to detect or block emerging exploits of critical CVEs.


Bypassing CDN/WAFs - How Does It Work?

A CDN is typically configured for a domain or subdomain as follows: 

The DNS A record for the customer’s domain directs traffic to a reverse proxy server within the CDN’s edge network. This setup can be implemented by either using CNAME records or by updating the NS record to a DNS server provided by the CDN service.
The reverse proxy reads the target domain name from the TLS SNI field (or the HTTP Host header if TLS is not used) in the incoming request. When configuring the domain with the CDN provider, the customer supplies a valid TLS certificate and key, allowing the CDN provider to manage the external TLS handshake.
The reverse proxy then applies CDN logic, potentially serves cached static content, and forwards any remaining requests to the Origin Server specified by the customer for that domain. The IP and port of the origin server — often the customer’s front-end server for their web app — remain hidden from the user throughout this process, and they are considered a ‘secret’ part of the CDN/WAF network architecture.

The communication between the reverse proxy and the origin servers needs to be TLS encrypted as well. However, the type of certificate used is the customer’s choice, as CDN providers support multiple options.

More importantly, if the IP/port of the origin server leaks, an attacker can target the origin server directly, bypassing the CDN protections entirely.

CDN/WAF Bypass Mitigations

To mitigate this issue, beyond keeping the origin server IP address secret, CDN providers recommend best practices that usually include the following steps:

Add IP filtering for incoming connections on the Origin Servers’ firewall, only allowing the IP ranges of the CDN provider to issue requests to the Origin Server.
Use a custom HTTP Header that contains a pre-shared secret, known only to the CDN and the Origin Server, set by the CDN and validated by the Origin Server.
Use mTLS (Mutual TLS authentication) between the reverse proxy and the Origin Server. The customer configures a CA for client-side certificate validation on their Origin Server. This CA is provided by the CDN service, but the customer is responsible for ensuring that the Origin Server actually verifies client-side certificates.
Cloudflare calls this Authenticated Origin Pulls, and the “Cloudflare Origin Server Certificate” option will likely be used for the TLS connection between the proxy and the origin server.
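
The first two steps above can be sketched as origin-side checks. This is a minimal illustration, not any provider's reference implementation: the IP range, the header name `X-Origin-Auth` and the secret value are all hypothetical placeholders, and a real deployment would load the provider's published ranges and keep the secret in a secrets manager.

```python
import hmac
import ipaddress

# Hypothetical values for illustration only: real CDN ranges must come from
# the provider's published list, and the secret from a secrets manager.
CDN_RANGES = [ipaddress.ip_network("203.0.113.0/24")]
SECRET_HEADER = "X-Origin-Auth"
SHARED_SECRET = "example-pre-shared-secret"


def is_from_cdn(remote_ip: str) -> bool:
    """Return True if the peer address falls inside an allowed CDN range."""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in CDN_RANGES)


def has_valid_secret(headers: dict) -> bool:
    """Compare the pre-shared secret header in constant time."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {k.lower(): v for k, v in headers.items()}
    supplied = normalized.get(SECRET_HEADER.lower(), "")
    return hmac.compare_digest(supplied.encode(), SHARED_SECRET.encode())


def accept_request(remote_ip: str, headers: dict) -> bool:
    """Accept only requests that pass both the IP filter and the header check."""
    return is_from_cdn(remote_ip) and has_valid_secret(headers)
```

Combining both checks matters: the IP filter alone would still admit traffic relayed through another account on the same CDN, whereas the secret header is specific to the customer's own configuration.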

While the above mitigations are generally good solutions, prior research has shown that in certain cases these methods can be bypassed as well. For example, it has been demonstrated that by creating an attacker-controlled Cloudflare account that is configured to talk to the same origin server as the victim, an attacker can masquerade as ‘Cloudflare’, using Cloudflare IPs and Cloudflare certificates while reaching out to the victim’s origin server. At the same time, the attacker can control the DoS and WAF protection settings on the malicious account and simply turn them off. Using a custom HTTP header with a pre-shared secret defeats this sophisticated bypass.

In any case, it is important to stress that the additional mitigations that prevent an attacker from talking to an Origin Server directly depend on configuration changes that customers are required to implement on each of their Origin Servers. Unsurprisingly, our scans have shown that these measures are rarely implemented, as we’ll show below.
This leaves the secrecy of the Origin Server IP/port as the primary (and potentially only) line of defense against this kind of bypass.

Cloudflare Origin CA

An interesting tidbit is the Cloudflare option of using a Cloudflare-issued Origin Server Certificate for the origin server TLS connections.

Censys finds more than 200K hosts on the Internet responding with such a certificate. Each of these is an exposed Origin Server that an attacker can discover and talk to directly, unless the mTLS mitigation is used.

Out of a random sample of 1,000 such origin servers with port 443 open, 780 returned a successful response status code. Only 130 answered with a 400, a 403 or a TLS handshake error, which would be expected from an mTLS failure. It therefore seems that only about 13% of these origin servers implement Authenticated Origin Pulls, and even this figure is probably an overestimate, as there could be other reasons for these HTTP error codes.

Scanning the Internet for ‘Breaking WAF’ exposures

As explained above, to assess the pervasiveness of the CDN/WAF bypass, this research focused on a large sample of domains: the domains of Fortune 1000 companies. The first task at hand was to gather the list of all the domains owned by Fortune 1000 companies.

Domain harvesting

Attackers and security researchers have a number of ways to discover domains belonging to a target organization, as part of the normal recon steps taken at the start of an evaluation. One of the best and easiest ways to do this is to use Certificate Transparency (CT) logs.

Companies such as Censys provide access to CT logs, but so does https://crt.sh, which is a free (yet slow) service. Nowadays, a CT log entry is generated whenever a browser-trusted CA issues a certificate. CT logs were designed as a way to enable domain owners to notice rogue certificates being issued in their name. However, services such as crt.sh listen to those CT logs and then store them forever. This makes it possible to discover domains for organizations such as IBM, since the string “IBM” appears in the Subject field of their issued certificates.

Additionally, one can take the results returned by querying crt.sh for a company name (such as “IBM”), gather the unique second-level domains that belong to the organization, and then recursively query for additional subdomains under each second-level domain. This discovers additional domains, since a TLS certificate can list multiple Subject Alternative Names (SANs), the additional domains the certificate is valid for. SANs can therefore be used to link domains to other domains, as it is unlikely that a single certificate will have SANs that belong to different organizations. Almost always, a single organization owns all of the domains in a single certificate.
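
The SAN-linking idea can be sketched offline as a fixed-point expansion over certificate SAN lists. This is an illustrative sketch only: it works on SAN lists that have already been collected rather than querying crt.sh, and `base_domain` is a hypothetical helper that naively takes the last two labels, which is an assumption that breaks for suffixes such as co.uk.

```python
def base_domain(name: str) -> str:
    """Naive second-level domain: the last two labels of a DNS name.

    Assumption: this ignores multi-label public suffixes such as co.uk;
    a real pipeline would consult the Public Suffix List instead.
    """
    name = name.lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]
    return ".".join(name.split(".")[-2:])


def expand_domains(seeds: set, cert_sans: list) -> set:
    """Grow a set of base domains by following shared certificates.

    cert_sans is a list of SAN lists, one per certificate. Whenever a
    certificate names a domain already attributed to the organization,
    every other base domain on that certificate is attributed to it too.
    """
    known = set(seeds)
    changed = True
    while changed:  # repeat until no certificate adds anything new
        changed = False
        for sans in cert_sans:
            bases = {base_domain(n) for n in sans}
            if bases & known and not bases <= known:
                known |= bases
                changed = True
    return known
```

Because attribution spreads transitively, a single shared or mis-issued certificate can pull in unrelated domains, which is one way the “garbage domains” mentioned later can arise.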


Resolving and classifying Fortune 1000 domains

As we’d like to discover all CDN-related domains that belong to Fortune 1000 companies, with a huge list of domains in hand we now need to check each of them and classify whether the domain points to a CDN server, and to which one.
We used a few techniques to identify and classify domains that point to CDN servers:

Resolve the domain (via DNS), and check whether the ASN of the IP (the autonomous system that announces the IP range) is owned by a CDN provider (not all of the ASNs that host CDN servers are owned by the CDN providers, but many are).
Use the results from a DNS resolution of the domain in additional ways – for instance, it’s possible to look at substrings of the domains in the CNAME chains (like *.akamaiedge.net from the dig example in the section below, given for images.jpmorganchase.com).
Send an HTTP request to the suspected CDN server, and fingerprint its response.

The third technique above proved to be the most effective, and it is detailed in the next section.
Together, these techniques resulted in a list of about 700K domains that resolve successfully, of which 140K are CDN-related. For each of those, the identifier of the CDN provider was noted and classified.
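
The CNAME-based technique can be illustrated with a small suffix lookup. This is a sketch under stated assumptions: only `akamaiedge.net` appears in the text above, while the other suffixes in `CNAME_HINTS` are illustrative entries that would need to be verified against each provider before use.

```python
from typing import Optional

# Suffix-to-provider hints. Only akamaiedge.net is mentioned in the article;
# the other entries are illustrative assumptions that need verification.
CNAME_HINTS = {
    "akamaiedge.net": "Akamai",
    "edgekey.net": "Akamai",
    "cdn.cloudflare.net": "Cloudflare",
    "fastly.net": "Fastly",
    "incapdns.net": "Imperva",
}


def classify_cname_chain(chain: list) -> Optional[str]:
    """Return the first CDN provider hinted at by any name in a CNAME chain."""
    for name in chain:
        host = name.lower().rstrip(".")  # drop the trailing dot of FQDNs
        for suffix, provider in CNAME_HINTS.items():
            if host == suffix or host.endswith("." + suffix):
                return provider
    return None
```

Matching on a label boundary (`"." + suffix`) avoids false hits on look-alike names such as `notfastly.net`.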

It should be noted that the initial acquisition of domain names from the CT logs includes a significant amount of “garbage domains” as well. By “garbage”, we refer to domains that do not actually belong to the organization they were discovered for. However, following this discovery stage, there are multiple verification steps for the domains that disproportionately filter out these “garbage” domains.

Fingerprinting CDN servers through an HTTP request

As detailed above, the most effective technique we found to fingerprint a CDN server is to observe a suspected server’s response to an HTTP request. Let’s first observe the domain resolution of a CDN-related domain (images.jpmorganchase.com, protected by Akamai):

When a request is sent to a CDN-related domain, the process is usually transparent to the user. The request is either processed at the edge (at the CDN server) or proxied to the Origin Server at the back end (in the case of an API request, for example). As such, it would be somewhat hard to tell whether an A record points to a CDN proxy or not. However, when it comes to Akamai or Cloudflare, the same proxy servers are used for many different domains. To demonstrate:

How does the Akamai proxy server know the hostname/domain of a request if requests for different domains arrive at the same server?

In a TLS connection, a “TLS SNI” (server name indication) field should be present in the Client Hello message. The proxy server will use that to grab a matching server certificate from its database of customer configurations, and then also forward the traffic to the matching origin server.
In a plain HTTP connection, the Host header is used for the same purpose.

As such, a CDN proxy server will appear to behave in a peculiar way if a request is sent to it without an SNI or Host header, and this can be identified:

While regular (non-proxy) servers may also depend on TLS SNI to know which certificate to serve, the plain HTTP request to the same IP is a dead giveaway. By making such a request to the A record IP of cloud.oracle.com, for example, we can tell for certain that it uses Akamai.
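
A rough sketch of such a probe is shown below. The exact response signature the research relied on is not reproduced in this text, so checking the `Server` header against the names “AkamaiGHost” and “Cloudflare” (which appear in the verification step later in this article) is an assumption made for illustration.

```python
import socket

# Server header values named in the verification step of this article.
CDN_SERVER_NAMES = ("akamaighost", "cloudflare")


def server_header(raw_response: bytes) -> str:
    """Extract the Server header value from a raw HTTP response, or ''."""
    head = raw_response.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "server":
            return value.strip()
    return ""


def looks_like_cdn(raw_response: bytes) -> bool:
    """Heuristic: does the Server header name a known CDN proxy?"""
    value = server_header(raw_response).lower()
    return any(name in value for name in CDN_SERVER_NAMES)


def probe_without_host(ip: str, port: int = 80, timeout: float = 5.0) -> bytes:
    """Send a bare HTTP/1.0 request with no Host header and return the reply."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
        chunks = []
        total = 0
        while total < 65536:  # cap the read; only the headers matter here
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
            total += len(data)
    return b"".join(chunks)
```

A typical use would be `looks_like_cdn(probe_without_host(ip))` for each resolved A record, treating the result as one signal alongside the ASN and CNAME checks rather than as proof on its own.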

Scanning IPv4 with zmap and zgrab2

Having a list of CDN-related domains in hand, we now need to actually find the directly exposed Origin Servers that serve the same domains.

As described in the previous section, CDN servers use the TLS SNI field to route requests based on the server name it carries. However, a typical front end server, or even a load balancer (LB), belongs to a single app or organization, and does not typically need SNI for routing. The easy and reasonable way to configure TLS certificates on such a server is to either:

Serve all requests with a single TLS certificate that has SANs (Subject Alternative Names) for all the domains that are used
Have multiple certificates, chosen according to SNI, with one of them as the default.

In both of these common cases, sending an HTTPS request directly to the IP of a front end server, without any SNI, will present us with a default server certificate. This certificate will reveal which domains are being served by this server.
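
For a single host, the no-SNI certificate grab can be approximated with Python's standard `ssl` module. This is a stand-in sketch and not the zgrab2-based tooling used in the research; certificate verification is deliberately disabled because the goal is only to read the default certificate.

```python
import socket
import ssl


def no_sni_context() -> ssl.SSLContext:
    """TLS client context that accepts any certificate.

    Verification is disabled on purpose: the goal is to read whatever
    default certificate the server presents, not to trust it.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False  # must be cleared before verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch_default_cert_pem(ip: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect by IP without SNI and return the server certificate as PEM."""
    ctx = no_sni_context()
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        # Omitting server_hostname means no SNI extension is sent.
        with ctx.wrap_socket(sock) as tls:
            der = tls.getpeercert(binary_form=True)
    return ssl.DER_cert_to_PEM_cert(der)
```

The returned PEM still has to be parsed to extract the Subject and SANs, for example with an X.509 library or the `openssl x509` command; that step is left out here.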

We chose the naive approach of scanning only for TLS certificates on port 443, as scanning non-standard ports would have been time consuming. This means the results below represent only a subset of the full scope of the WAF bypass, since many origin servers might be bound to a non-standard port.

The scan is a classic one in which all Internet IPv4 addresses are probed for an open port 443. This resulted in about 45M servers, which is close to the Censys estimate.

Following this, zgrab2 was used to obtain the TLS certificates served by all of these servers:

After processing the obtained TLS certificates, it is possible to reduce them to about 45M entries such as the one shown below:

Matching domains to servers

At this point we can match the 140K CDN-related domains to these newly obtained 45M Internet servers. The matching is done by comparing the domain names while also respecting wildcards that can appear in the SANs.
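
A minimal sketch of wildcard-aware matching is shown below. It assumes the common convention that a leading `*.` covers exactly one left-most label; the function names and the IP-to-SAN-list input shape are hypothetical and chosen for illustration.

```python
def san_matches(domain: str, san: str) -> bool:
    """Return True if a certificate SAN entry covers the given domain.

    A leading "*." is treated as matching exactly one left-most label,
    so "*.example.com" covers "a.example.com" but neither "example.com"
    nor "a.b.example.com".
    """
    domain = domain.lower().rstrip(".")
    san = san.lower().rstrip(".")
    if not san.startswith("*."):
        return domain == san
    head, _, rest = domain.partition(".")
    return bool(head) and rest == san[2:]


def match_domains(domains: list, servers: dict) -> dict:
    """Map each domain to the server IPs whose SAN list covers it.

    servers maps an IP address to the SAN list of its default certificate.
    """
    matches = {}
    for domain in domains:
        ips = [ip for ip, sans in servers.items()
               if any(san_matches(domain, san) for san in sans)]
        if ips:
            matches[domain] = ips
    return matches
```

The nested loop is fine for a demonstration but would not scale to 140K domains against 45M certificates; at that size one would more likely index SANs by exact name and by wildcard parent and look each domain up directly.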

This initial mapping resulted in 36K matched servers, across 8.1k unique Fortune 1000 related domains! Each of these 36K matches is potentially a directly accessible Origin Server that belongs to a Fortune 1000 company. These numbers are quite significant, yet they do include false positives. Additional verification and filtering steps need to be performed, such as:

Making sure the server IP isn’t actually a CDN IP according to the ASN (“CLOUDFLARENET”, “Akamai International”, etc).
By performing an HTTP request to the server IP, making sure the returned Server header isn’t a CDN server name (“AkamaiGHost”, “Cloudflare”, etc).

At this point, the final step of verification is left. For each domain/origin server pair, we need to compare the HTTP response from the CDN server, to the response from the newly found Origin Server for the same domain. This ensures that the same web-app is being served from both locations, making it highly likely that we found a bypass.

In the above example, a bypass finding is displayed. Both requests, to the CDN and to the newly discovered origin server, return what appears to be the same webpage. The domain agreements.apple.com resolves to an Akamai CDN proxy server (ASN is “AKAMAI-AS”), while the server at 17.32.214.239 belongs to the ASN “APPLE-ENGINEERING” and returns an “Apple” server header.

The above method was automated via a verification script that performs the following steps:

Validate that the HTTP status codes of both responses are identical or equivalent.
If the response has a body, perform a textual diff, and make sure the bodies are at least 90% identical (sometimes CDNs inject a script into the page).
Certain HTTP headers such as Content-Security-Policy and Location can be quite indicative. They often contain domains and URLs that can be unique enough to judge the responses to be identical (for cases of responses with no body to compare).
Compare the names of cookies, such as JSESSIONID, SA_SESSIONID in the example above. The set of cookie names can also be unique enough to confirm that the responses match.
Try to ignore most error responses except 404. If both responses are 403, for example, and do not have a strong body text match, they should not be considered equivalent. This is done in order to avoid false positives, as in the case of mTLS protected servers.
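
A simplified version of these checks might look like the sketch below. It is not the research team's script: it uses `difflib`'s similarity ratio as a stand-in for the “90% identical” textual diff, requires strictly equal status codes, and leaves out the header comparison, all of which are simplifying assumptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Response:
    status: int
    body: str = ""
    cookie_names: frozenset = frozenset()


def body_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two response bodies."""
    # autojunk=False keeps the ratio predictable on repetitive HTML.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def same_app(cdn: Response, origin: Response, threshold: float = 0.9) -> bool:
    """Decide whether two responses likely come from the same web app."""
    if cdn.status != origin.status:
        return False
    strong_body = (
        bool(cdn.body and origin.body)
        and body_similarity(cdn.body, origin.body) >= threshold
    )
    # Error responses other than 404 only count with a strong body match.
    if cdn.status >= 400 and cdn.status != 404:
        return strong_body
    if cdn.body or origin.body:
        return strong_body
    # No body to compare: fall back to the set of cookie names.
    return bool(cdn.cookie_names) and cdn.cookie_names == origin.cookie_names
```

The threshold below 100% tolerates small injected differences, such as a script tag added by the CDN, while two unrelated 403 pages fall well under it and are rejected.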

Findings

After mapping, verifying and cleaning the data, we were able to find interesting results about the vulnerability of Fortune 1000 companies to the WAF bypass misconfiguration.  

Of the Fortune 1000, we mapped the domains of 670 companies. For these companies, 2,367 exposed Origin Servers were identified. Among the findings, 2,028 domains belonging to 135 different Fortune 1000 companies were assessed with high confidence to be impacted, meaning that each of these companies has at least one impacted domain/server.

It is worth noting that the WAF bypass bug seems to be of particular concern for the very large companies at the top of the Fortune 1000 list. For example, the first 100 companies on the list represent 35% of the potentially impacted ones. Likewise, companies with over $50 billion in annual revenue are over-represented among the affected organizations.

Looking at impacted companies by industry, Financial Services were the most prevalent, representing over a third of all impacted companies.

Furthermore, our findings show that Akamai’s CDN might be slightly more impacted than its competitors, such as Cloudflare and others: while Akamai represents only 42% of the CDN-related domains operated by Fortune 1000 companies (at least among the 140K domains we were able to identify and map), it accounts for 59% of the companies impacted by the WAF bypass misconfiguration. This does not mean that Akamai is inherently more vulnerable than other CDNs, just that companies using Akamai more often tend to “forget” to mitigate this risk.

Notable findings

Below is a list of some notable domains that were found to be affected by the ‘Breaking WAF’ bypass:

False negatives

While the results above appear significant, many potential cases of exposed origin servers were not included in this analysis:

300+ companies were not included, as we didn’t find their domains automatically.
Non-standard ports (other than 443) were not scanned. This is particularly significant, as we believe that a large share, if not the majority, of Origin Servers listen on a non-standard port.
IPv6-only servers were not included in our Internet-wide scans, and it is becoming increasingly common to have an IPv6-only Origin Server behind a CDN proxy (that is available to users over IPv4).

False positives

Even after the automatic verification that resulted in our list of 2.3k exposed servers, it’s still likely some of these are false positives. In our manual analysis, we observed some of the following cases:

“Private” Akamai proxy servers, that appear to be Akamai servers, yet serve only one large customer, such as some of the top US banks. These may have IPs belonging to the ASN of the customer, and are difficult to classify automatically.
Domains of some large companies where the A record points to IPs that belong to ASNs such as “AKAMAI-AS”, but that do not otherwise appear to be Akamai servers. While they could still be Akamai servers, this is likely some custom “large customer” setup that is again hard to identify and classify.

While we don’t know the extent of the potential false-positive cases above, it is unlikely that they outweigh the more significant number of false-negative cases.


Realities of DoS and DDoS

To assess the potential impact of the bypass technique detailed above, let’s consider which denial-of-service protections are offered by popular CDN providers.

From the documentation of various CDN providers, we can learn which advantages their offerings provide to customers:

High capacity. Their edge networks can tolerate very high volume DDoS attacks due to their size and geographic distribution
Localized DDoS/DoS prevention rules (referred to as “dosd” in the diagram above). This is actually similar to what a classic solution could do. However, their offering gains from their familiarity with actual traffic patterns and attacks, as they have constant visibility of essentially the entire web.
Global DDoS prevention rules and signatures. This is the key advantage. Global rules can be generated due to real time events and intelligence. Due to the sheer number of websites they protect, they can gather massive amounts of intelligence.

Basically, a large CDN provider has extensive visibility on large amounts of traffic on the web. In addition, they have the ability to take immediate action in the form of new rules that can be applied within their edge network. This allows them to mitigate even the largest DDoS attacks, if not immediately, then in relatively short order.


Local DoS protections on Linux servers

If one chooses not to use dedicated DDoS protection for one's servers, what could they expect from their standard issue Linux/nginx server in terms of attack resilience? There are a few features that are designed for this purpose:

Kernel SYN flood protection. SYN floods are a classic transport layer DoS attack, mitigated at the kernel level by “SYN cookies”. This will not help against application layer DoS attacks, or against an attacker that has control over lots of IP addresses.
Iptables rate limiting. IP blocks can be rate limited at the firewall level. A configuration of this type is very hard to get right manually, without harming normal user experience. It’s not realistic to rely on this.
Nginx connection rate limiting. Better than at the firewall level, but still requires the application designer to consider the rate limits very carefully.
Fail2ban. An application that looks at traffic logs and dynamically bans IPs that misbehave or DoS the server. This is reasonably good at thwarting a DoS attack with a small number of source IPs. It’s unlikely to handle a significant DDoS from a large pool of IPs, however.

Notably, for a default configuration Linux/nginx server that does not expect a DoS attack to hit it, there is very little to no protection. Origin Servers that expect to be protected by the CDN, are not likely to run fail2ban or implement any custom measures. Therefore, they are bound to be quite vulnerable to DoS if targeted directly.
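
To illustrate why per-IP banning of the kind Fail2ban performs struggles against a distributed attack, here is a toy sliding-window banner. It is a conceptual sketch only, not how Fail2ban is implemented; the class name, thresholds and the explicit `now` parameter are assumptions made for the example.

```python
from collections import defaultdict, deque


class SlidingWindowBanner:
    """Ban an IP once it exceeds max_hits requests within window seconds."""

    def __init__(self, max_hits: int = 100, window: float = 10.0):
        self.max_hits = max_hits
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.banned = set()

    def allow(self, ip: str, now: float) -> bool:
        """Record a request at time `now`; return False if the IP is banned."""
        if ip in self.banned:
            return False
        recent = self.hits[ip]
        recent.append(now)
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) > self.max_hits:
            self.banned.add(ip)
            return False
        return True
```

A single noisy source is cut off as soon as it crosses the threshold, but the same total volume spread across thousands of sources, each staying below the per-IP limit, passes through untouched.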


Harmful server configurations

Some common configurations on typical Linux servers can even be harmful for DoS resilience. To illustrate this and the above points, consider the following example of a typical iptables configuration:

The problem here lies with the conntrack rule that allows established connections through the firewall. First of all, this line is not needed for this configuration. Since the OUTPUT policy is ACCEPT, it has no effect, as no outgoing traffic will be blocked anyway. Second, conntrack has limitations in the kernel, specifically:

The number of tracked connections may be limited to a very low value by default! In this extreme example, 15K connections can be opened by a single-IP DoS node. Once this number of tracked connections is reached, the kernel will start dropping legitimate connections, resulting in a hard DoS.
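
On a Linux host with the nf_conntrack module loaded, the limit and the current usage are exposed under /proc. The sketch below reads both and reports the remaining headroom; the helper names are hypothetical, and the paths are passed as parameters so the logic can be exercised without the module being present.

```python
from pathlib import Path

# procfs locations on Linux when the nf_conntrack module is loaded.
CONNTRACK_MAX = Path("/proc/sys/net/netfilter/nf_conntrack_max")
CONNTRACK_COUNT = Path("/proc/sys/net/netfilter/nf_conntrack_count")


def read_int(path: Path) -> int:
    """Read a single integer value from a procfs-style file."""
    return int(path.read_text().strip())


def conntrack_headroom(max_path: Path = CONNTRACK_MAX,
                       count_path: Path = CONNTRACK_COUNT) -> dict:
    """Report the conntrack table limit, current usage and free entries."""
    limit = read_int(max_path)
    used = read_int(count_path)
    return {
        "limit": limit,
        "used": used,
        "free": limit - used,
        "utilization": used / limit if limit else 0.0,
    }
```

When `free` approaches zero, new connections start being dropped regardless of who opens them, which is exactly the failure mode described above.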

DoS testing of real Origin Servers

At this point we had a list of verified, directly addressable Origin Servers and their corresponding CDN-related domains. To validate our results dynamically, we decided to carefully DoS some affected servers for a brief interval, while measuring the impact on the CDN-related domain through the CDN. If an impact is measurable through the CDN while DoS’ing the Origin Server, the bypass is 100% confirmed.

Our chosen DoS method is a simple CPU/connection pool exhaustion attempt, carried out by establishing a massive number of TLS connections from a single IP to the target server using zgrab2. While the DoS TLS connections were made directly to the origin server IP, the measurement connections were made to the real CDN-related domain, through the CDN, just as a normal user would connect.

The measurement performed was done from a second measurement machine with a different IP, in a different region.
The actual metric being measured was simply the round-trip-time (RTT), or latency, of a full HTTPS request.

Importantly, the test endpoint chosen on each server was something like “/dummy1234”, with the number being random and unique to every request. This makes sure that the CDN has to forward the request to the origin server every time. A 404 response should be affected the same as 200 with regards to RTT.
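
The measurement side can be sketched as follows. This is an illustrative approximation rather than the harness used in the tests: the function names are hypothetical, the path is random and therefore only very likely (not guaranteed) to be unique, and the default timeout of 1 second mirrors the value reported for the measurements.

```python
import random
import time
import urllib.error
import urllib.request
from typing import Optional


def random_probe_path() -> str:
    """Build a path such as /dummy48213907 that no cache will have seen."""
    return f"/dummy{random.randrange(10**8):08d}"


def measure_rtt(base_url: str, timeout: float = 1.0) -> Optional[float]:
    """Return the seconds taken by one full request, or None on failure."""
    url = base_url.rstrip("/") + random_probe_path()
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except urllib.error.HTTPError:
        pass  # a 404 is still a completed round trip to the origin
    except OSError:
        return None  # timeout, refused connection, DNS failure, ...
    return time.perf_counter() - start
```

Calling `measure_rtt("https://<cdn-domain>")` in a loop from a separate machine, before and during the test window, yields the latency series; `None` values mark requests that did not complete within the timeout.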

Measurement results

The results above are quite clear. The timeout for the RTT measurement was set to 1 second, so the last 3 instances are more likely to be caused by a connection pool exhaustion of some kind, such as the conntrack issue discussed above.

The above tests are the ‘worst-case scenario’ of the discovered bypass, as the targeted servers are single IPs that handle all traffic of a certain web application. That said, it is quite common for web applications to be served by only a handful of servers, so an attacker would only need to scale up an attack linearly. A well-organized attacker can also create a botnet, harvesting the bandwidth, the geo-location, and the CPUs of tens of thousands of machines, to carry out simple but much more powerful DDoS attacks even against the largest web application setups available, wherever a CDN is bypassed.

Finally, we performed a similar DoS test on a BHHC domain, and demonstrated how a user accessing the domain would experience such an attack:

Summary

This research uncovered how widespread misconfigurations are in the security tools considered to be the best type of protection for web applications: CDN-based WAFs.

While this research included some novel techniques for identifying Origin Servers and mapping them to CDN-protected domains, the weakness of CDN-based WAFs has actually been well known in the industry for almost 10 years. An article by Imperva from 2015 (!!!) details methods of protecting Imperva customers from this type of bypass.

Nevertheless, it seems the architectural weakness of CDN/WAFs has created a long-lasting misconfiguration issue that doesn’t seem to be going away anytime soon.

Misconfigurations of security tools can have an extremely serious effect, as enterprises walk around with a false sense of security while gaping holes might be lurking in their ‘defense-in-depth’ strategy. Further analysis of similar systemic issues is required to strengthen the walls.
