Credential Stuffing — how it’s done and what to do with it?
Updated: Nov 13, 2019
Credential stuffing is a peculiar attack on web applications. Technically it is a very basic, easy-to-understand attack closely related to brute force techniques, yet it is frighteningly effective. What is more, administrators and SecOps teams sometimes do not even consider it a security breach! It gets worse, because the attack targets individual users, not the application or the underlying business, at least not directly. The implication is simple: when a credential stuffing attack is relatively small, the organization might not have enough motivation to respond.
So, the first victim of a credential stuffing attack is always an individual user, who bears most of the consequences. The privacy of the account is violated, funds stored on the account can be stolen, and harmful or unwanted actions can be taken on the user’s behalf. Infrequent credential stuffing and account takeover incidents are handled reactively by administrators, through actions such as password resets. Sometimes administrators compensate users for their losses, but that depends on the application and the business. And usually that’s it: if the business is not seriously affected, no further action is taken. But the business will be affected eventually, because credential stuffing leads to more dangerous attacks. It is a prerequisite, a building block, of account takeover and account aggregation attacks, and these can be very dangerous. In particular, a successful credential stuffing attack facilitates further automated bot-based attacks: data scraping, skewing, scalping, spamming and denial of inventory, to name just a few. These attacks have a solid impact on the business and on security in general.
How is it done?
Disclaimer: In this part, I will describe some techniques used by attackers. I will limit myself to well-known techniques that are easy to find anywhere on the internet. I will also use some YouTube video tutorials as examples, selected on the same principle. The goal is to provide context for those concerned with security, not to provide any valuable knowledge to script-kiddie crackers.
Credential stuffing is in principle very simple. You take a bunch of credentials and try to log in to a web application using them. When a login succeeds, you have a cracked account. That’s it. In practice, the attacker has to take care of the following aspects.
Credentials: The attacker needs access to a list of leaked credentials (that’s the main difference between credential stuffing and credential cracking). Leaked lists of username-password or email-password pairs are called “combo lists” in cracking communities, and the individual pairs are called “combos”. The most valuable are freshly leaked combos. Some combos are easy to find, but the valuable ones are usually sold or offered in closed cracking communities. A quite common pattern is the following: a database of credentials is stolen and decrypted by a skilled attacker. Then the credentials are sold or given away, depending on motivation, to less experienced attackers, who perform credential stuffing on web applications of their choice.
Automation: Checking huge combo lists requires automation. You need a piece of software that checks all credentials against an application and detects the valid ones. Sometimes the software needs to bypass captcha protection or other security mechanisms. It is not uncommon for wannabe crackers to lack the knowledge to write their own software or even to extend existing cracking tools. So extensions tailored to a specific web application are also offered on the “black market” by more skilled crackers; some are free and others can be bought easily. I will describe the cracking market in more detail later.
Parallelization and anonymity: When you perform login attempts, or just send a large number of requests from one machine to a server, it’s easy to draw attention. Crackers use lists of proxies and zombie computers to perform the login attempts from those machines. This lets them parallelize the account checking, and it hides their IP address, providing some anonymity. Using proxies is often a requirement for automation, because many web applications monitor IP addresses and would detect numerous login attempts from one machine.
How are crackers born?
Many techniques and cracking tutorials are available on the mainstream internet. Twitter and YouTube are full of tutorials and tips on credential stuffing. Fortunately, the effective ones, and the ones that show successful cracking, are removed from YouTube quite quickly. To give you a feel for credential cracking communities and techniques, let’s take a look at some video tutorials. You don’t have to watch them in detail; a short glimpse gives you a good insight into the community and its script-kiddie techniques. Here’s a tutorial from one of many forums.
At the time of writing, it had 31K views and 27 comments. The tutorial goes through SentryMBA installation and an introduction to “configs”, which are extensions/configurations that make it possible to attack a particular website. Then the authors present how to use proxies and advertise their paid configs and proxy lists. After that, the tutorial shows how to load these files into SentryMBA. For the attack, they used 156 proxies with approximately 50 bots. The video shows the attack and ends with no “hits”, meaning that no valid credentials were found. This video is not very technical and mainly promotes paid goods offered by the community.
The next tutorial, with 58k views and 87 comments, is another SentryMBA presentation. It shows how to use the “Magic Wand” functionality to create a basic login configuration. It also shows the “analyze login page” functionality, which tries to investigate the login mechanism automatically and, in a perfect scenario, creates a ready-to-use configuration. In this case, the tool created an HTTP POST with the needed data fields. The tutorial also briefly shows the Cookie refresh function, which is required for making a login attempt (it grabs the current session id). Next, failure and success keys are defined. Keys are just words or regular expressions that SentryMBA uses to inspect a response from an application. When a key is found in a response, SentryMBA flags the combo as a success (hit) or a failure. Then the program is configured to follow HTTP redirections, which apparently is also needed to automate the login in this case. Then the attacker checks how the cracking works without using proxies! The program performed 1000+ login attempts from one IP address and the web application still did not block the address. The attacker concluded that it was OK to continue without proxies. In the next steps, the cracker’s goal was to check, upon successful login, whether the cracked account had a positive balance. The video ends with a cracking run using 50 bots and no proxies. This video showed some more interesting options of SentryMBA (which are still very basic from the HTTP layer and virtual browser perspective).
Another example is a similar video, with 16k views and 70 comments, in which an attacker tries to log in to PlayStation Network. I will not describe it, as it follows a pattern very similar to the previous videos.
There are at least hundreds of other videos for SentryMBA and other cracking tools. SentryMBA is quite dated, but it still seems to be the most popular tool out there. There are, however, some newer programs that do very similar things. They offer better stability and a text-based configuration interface, which is more intuitive. The most popular currently is SNIPR; a tutorial from one community is available on YouTube as well.
STORM is another popular tool with tutorials available as well.
The number of popular cracking tools is actually small; their configurability and extensions are what make them attractive to crackers. I will return to the subject later.
As you probably observed, these video tutorials are of low quality. Most of them are created like that: by a teenager, on a Windows computer with some game icons on the desktop, etc. Some of them are really hard to watch. The popular attack targets that I observed seem to reflect the broad script-kiddie cracking community: gaming platforms (Steam, Minecraft, PSN etc.), file sharing services, porn sites, proxy services (to crack more?), streaming services (Netflix, Spotify), food delivery (Dominos Pizza), but also Skype and Paypal. I have to mention, though, that everything is on their radar; some targets just seem more popular than others.
The range of available tools is not extensive. The ones that are popular and promoted on forums are limited to the three already mentioned: SentryMBA, STORM and SNIPR.
I stumbled upon a few other names, but not many. The Backtrack Linux distribution comes with Hydra/Medusa, but this software is quite old and so primitive that it makes little sense to analyze it. Potentially, there are other tools that are hard to get and unused or unknown in mainstream cracking communities. But in the end, a cracker with solid programming skills would write their own tool anyway… So this is the tools catalog I will cover.
Although quite old, SentryMBA is still the most popular credential stuffing tool. If you watched the videos I included, you already have a sense of the software: buggy and badly written, but with all the needed functionality. I will not describe it in detail, as plenty of descriptions prepared by cybersecurity researchers are available. You can also just install the software in a secure environment and explore it yourself. If you’re looking for a report, I liked the one by Danna Thee from cyberint; it covers all the basics.
As I already mentioned, credential stuffing software consists of the software itself and a configuration. The software is a framework for issuing requests to a web application, but a configuration or extension is needed to target a particular website. This division is well thought out; otherwise, there would have to be one program for each website. Storm separates these functionalities and comes with two programs. One is a configuration creator, where you can create configurations in the UI. A configuration is just a text file in which you define templates for requests, variables and behaviors. It is logically divided into stages. Documentation for the tool was hard for me to find, or maybe it’s not available at all. To understand some of the functionality, I had to do a little reverse engineering. For that reason, I don’t want to include a detailed description, but you can easily find Storm configurations and learn the syntax by reading them. The other part of the software is the STORM credential stuffing program. It lets you define a number of threads and load combo lists and proxy lists, very much like SentryMBA. Because of its fully text-based configuration files, I find this program more mature and easier to use, though SentryMBA has some functionality that seems to be missing in Storm.
Snipr is the latest addition to the community. It provides more community integration (similar to open source projects) and comes with built-in configurations, proxy scrapers, etc. Stability and performance seem to be greatly improved over SentryMBA. Snipr seems to provide all the functionality of STORM and SentryMBA. Hackers can write plugins and then distribute and sell them on this platform. Normally, a script kiddie has to look for combo lists, then scrape some proxy lists, then find or buy a “configuration” for a particular website. Here all these steps are nicely integrated into one platform, and payment is easy. The tool offers free plugins for some web applications as well, but it has an integrated paid market. This is very dangerous, as it establishes a common market for knowledge exchange. It means less work for script kiddies and motivates more skilled hackers to create and sell “configurations”. Smart. Fortunately, in 2019, getting this tool for free is hard, at least for me.
Why is this still happening?
Browsing through the forums or looking at the Snipr tool shows that the credential stuffing market is vibrant. Skilled hackers make a profit preparing configurations; crackers profit from compromised accounts by selling them or exploiting them in other ways.
The fuel for credential stuffing is leaked credentials. Unfortunately, leaks happen constantly: very skilled individuals exploit software vulnerabilities and human errors. On the other side of the spectrum are web applications, where in most cases massive numbers of login attempts are simply allowed.
This is actually quite surprising if you think about it. Let me use an analogy. Let’s say you build a nice apartment building (the application). You want only your tenants to access the building, so you design a door that can be opened only by people who have the appropriate keys.
You want it to be secure, so you create a solid front door. Let’s imagine that the door has many different smart locks protecting against different thieves’ techniques (SQL injection, CSRF, XSS, and session id hijacking).
Also, every person has a special, individually assigned PIN code. You spend a lot of effort teaching your builders how to build proper doors to protect your apartments. You hire professional locksmiths (pentesters) to test your doors and find any weak spots.
And then you distribute keys to all tenants and provide a way to issue keys to new tenants. And you observe how people use the door. Most of them open it on the first attempt; some have problems inserting the key correctly, so they enter on the second or third attempt. From time to time someone loses their keys, so you give them new ones. Nothing exceptional.
And one day, suddenly, you see a huge number of unknown people lining up in a queue in front of the door. Each of them tries to open it with a wrong key and then leaves or returns to some random place in the queue. Then another person approaches, tries to open the door, fails, and goes to the end of the queue. Would you consider that normal? Would you not act?
This is what happens during credential stuffing. And that kind of network activity is sometimes ignored, especially if it’s not high in volume. Why is that?
I see two reasons. The first is that we humans have a great ability to detect anomalous behavior, so in the analogy we immediately know that something’s not right with all the people trying to enter and failing. We analyze behaviors and detect patterns naturally; we would have to teach our protection system to do the same. The second reason is the common division within organizations between applications/developers and firewalls/administrators. Applications usually lack context, or it’s simply not their job to monitor traffic patterns. Firewalls, on the other hand, do not know the application’s authorization details, or it’s simply not the job of a firewall to monitor logins to one particular application; its goal is more generic, for example protecting the whole subnet. It seems that neither administrators nor developers have the full context to approach the issue the way a person can by just observing the door.
Also, I must admit, the analogy oversimplifies advanced credential stuffing techniques in some respects. Thieves in the queue will try to disguise themselves as tenants and to outsmart the observer. And if the number of tenants entering every day is huge, it’s easier to miss the thieves. But even with these obstacles, there are still things to look for: there would still be a lot of people acting strangely, trying keys, failing and leaving. How does that translate to possible detection techniques?
Getting back to the analogy introduced earlier, the core idea behind detection is to be aware when a huge queue of unknown people tries to enter the building by trying different keys and failing. The next step is to pick out the malicious people who are not tenants. Maybe they wear caps with red feathers? Or maybe they smile maliciously or hide behind a pair of cool blue shades? Maybe we can just look at their behavior of leaving and re-entering the queue?
In this section, we will try to identify possible means for credential stuffing detection. I will try to show a couple of different perspectives on how to look at the subject. I will try to maintain a high level of abstraction and not dive too deep into technical aspects and implementations.
Let’s start by going back to the cracking tools described earlier. Their configurations are fixed. How does that translate to HTTP requests? Well, the request is constant; only specially configured fields like username, password and session id change. Are HTTP requests generated by the tools recognizable? Well, it depends. SentryMBA comes with a default configuration that is rarely changed, and its footprint is easily recognizable by looking at user-agent strings (UAS). We can simply look for patterns in the strings of incoming requests and flag requests based on that. In this approach, we treat our patterns much like an antivirus signature database.
This is a very basic technique that is recommended in some articles: blacklist certain patterns. But the approach is primitive and very easy for attackers to bypass; they just have to change the request a bit. Also, tools like Snipr and STORM do not have a predefined configuration. So let’s say we look at the existing configurations available on the market. Analyzing these would give us a nice library of the UAS in use, and of other characteristic patterns. With the centralized Snipr market, we could probably get access to the configurations quite easily using bitcoins. We would end up with a list of HTTP requests used by the tools and their configurations. Such a list would have to be updated and maintained. Not very convenient. And attackers can always change the preconfigured strings to something of their choosing, making the whole blacklisting ineffective.
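The signature-style matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pattern list and the function name `is_flagged_user_agent` are hypothetical, and the single tool pattern is a placeholder, not a real footprint of SentryMBA or any other tool.

```python
import re

# Hypothetical signature list. Real entries would come from analyzing tool
# defaults and marketed configurations; these two are placeholders only.
SUSPICIOUS_UA_PATTERNS = [
    re.compile(r"example-cracking-tool", re.IGNORECASE),
    re.compile(r"^$"),  # an empty User-Agent header
]


def is_flagged_user_agent(user_agent: str) -> bool:
    """Return True if the User-Agent matches any pattern in the signature list."""
    return any(pattern.search(user_agent) for pattern in SUSPICIOUS_UA_PATTERNS)
```

As the text points out, such a list is only as good as its maintenance, and a single changed string defeats it, so a match is better used as one signal among several than as a blocking rule.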
Geolocation, IP addresses
A credential stuffing attack is performed either from one IP address or through proxies, meaning from multiple IP addresses. Analyzing the origins of IP addresses and the traffic patterns per IP can give a strong indication that an attack is under way. When cracking is performed from one IP address, the number of login attempts from that IP will be easily visible. There are situations, though, where a lot of users sit behind one IP address. In that case, the analysis would have to be enhanced by tracking session ids or by understanding the behavior of the users/bots. I will elaborate on that in a moment. When cracking is performed from multiple IP addresses, that is, using proxy services, we can try to deduce the geolocation of the IPs. This information can be used to find anomalies. For example, if we know that 90% of our traffic is from France, and suddenly we are getting a huge number of login attempts from the USA and multiple countries in Asia, we can suspect a credential stuffing attack with good probability. There are also services that try to maintain up-to-date lists of proxies, though I don’t know how effective they are. Others try to detect the use of a proxy by checking ping/traceroute delays and other characteristics. It is important to mention, though, that a legitimate user can also use a proxy. Information about proxy usage by itself means nothing. In summary, geolocation adds a valuable dimension for detecting credential stuffing, especially for web applications operating locally within a certain region.
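For the single-IP case, counting login attempts per address in a sliding time window is enough to make the attack visible. The sketch below is a minimal Python illustration; the class name `LoginRateTracker`, the 60-second window and the threshold of 20 attempts are all assumptions chosen for the example, not recommended values.

```python
from collections import defaultdict, deque


class LoginRateTracker:
    """Counts login attempts per IP address inside a sliding time window."""

    def __init__(self, window_seconds=60.0, max_attempts=20):
        self.window_seconds = window_seconds
        self.max_attempts = max_attempts
        self._attempts = defaultdict(deque)  # ip -> timestamps of recent attempts

    def record(self, ip, timestamp):
        """Record one attempt; return True if the IP now exceeds the threshold."""
        recent = self._attempts[ip]
        recent.append(timestamp)
        # Drop attempts that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_attempts
```

Because many legitimate users can share one address, a hit from this counter is better treated as a reason to look closer (session ids, behavior) than as proof of an attack.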
Leaked credentials are the fuel for credential stuffing. Crackers are able to get them, so defenders should be able to get them as well. They are posted on forums and chats, sold, or openly published on some services. Of course, for a company it’s more difficult to obtain certain data, and the idea of paying hackers for leaked data is… wrong. Also, keeping leaked credentials is not something you would want to do, for security, moral and legal reasons. Instead, you can keep, for example, hashes of usernames and passwords. How can we use such information? Well, if we manage to gather a database of hashes of leaked credentials, we can check every login attempt against it. Obviously, finding a match in the database is not a direct indication that credential stuffing is happening; you would need to correlate such events with a broader context. A single match can just mean that a single, legitimate user logged in with credentials that happen to have been stolen. In real-world scenarios, you would get a lot of matches on popular passwords like “12345”, “qwerty123” etc. This idea is quite similar to tools footprinting, in the sense that we try to gather the same data that crackers have access to and use it to flag or blacklist certain traffic patterns. The approach is burdened with the same inconvenience: you would have to search for data and keep the database up to date. Crackers usually try to use the newest credential datasets. They also modify username and password strings by adding prefixes and suffixes and applying other transformations. Actually, there are a few quite popular programs they use for this purpose. Since you want to operate on hashes and not on full credentials, detecting such string variations is very difficult.
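A lookup against hashed leaked credentials could look roughly like the sketch below. The text only says "hashes of usernames and passwords"; using a keyed hash (HMAC-SHA256) over the normalized username and the password is my own assumption, as are all function names. The key is assumed to be a secret held by the defender.

```python
import hashlib
import hmac


def credential_fingerprint(username, password, key):
    """Keyed hash of a username/password pair; the plain values are not stored."""
    normalized = username.strip().lower()
    message = f"{normalized}\x00{password}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def build_leak_index(leaked_pairs, key):
    """Turn an iterable of (username, password) pairs into a set of fingerprints."""
    return {credential_fingerprint(u, p, key) for u, p in leaked_pairs}


def is_known_leaked(username, password, leak_index, key):
    """Check a login attempt against the fingerprint set."""
    return credential_fingerprint(username, password, key) in leak_index
```

The limitation mentioned above shows up directly: a combo altered with a suffix produces a completely different fingerprint, so only exact pairs match.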
HTTP headers contain a lot of data that can be used for finding anomalies. Headers can be inspected for compliance with the standard. Because requests are crafted by crackers by hand, they sometimes introduce inconsistencies. For example, they could set the HTTP protocol version to 1.0 and include an If-None-Match header, which does not exist in that version because it was added in HTTP 1.1. That would be a protocol inconsistency. Or they would skip some popular headers just because they are not needed. That kind of implementation would be time-consuming and not very innovative, so let’s look at what else we can do. The user-agent string is very interesting. As already mentioned in the tools footprinting section, it can be used for analysis. During an automated attack, an attacker would use a fixed set of user-agent strings, or would generate them from a template (and that assumes a sophisticated attack; most use only one user agent or a fixed list). What you can try is finding anomalies within the user-agent distribution. UAS can be tokenized and then clustered according to some metric. That kind of clustering can reveal UAS that are anomalous; see “Malware detection using HTTP user-agent discrepancy identification” and “Anomaly detection on User-Agent Strings”. Such analysis could then be further enhanced by time-based analysis. This technique is also used for malware detection within networks, though that area seems more difficult than credential stuffing (due to SSL usage and distribution across network nodes). Similarly to UAS, an analysis can be performed on URLs and other strings, though URL analysis would make more sense for data scraping and vulnerability scanning attacks.
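Both ideas from this paragraph, the header consistency check and the tokenization of user-agent strings, can be sketched in Python as below. Everything named here is an assumption for illustration: the set of "commonly expected" headers is my own guess at what a browser sends, only If-None-Match is taken from the text as an HTTP/1.1-only header, and Jaccard similarity is just one possible metric that a clustering step could use.

```python
import re

# Only If-None-Match is named in the text; extend this set with care.
HTTP11_ONLY_HEADERS = {"if-none-match"}

# Assumption: headers a typical browser sends with a login request.
COMMONLY_EXPECTED_HEADERS = {"user-agent", "accept", "accept-language", "accept-encoding"}


def header_anomalies(http_version, headers):
    """Return a list of human-readable findings for one request."""
    names = {name.lower() for name in headers}
    findings = []
    if http_version == "HTTP/1.0":
        for name in sorted(names & HTTP11_ONLY_HEADERS):
            findings.append(f"{name} sent with HTTP/1.0")
    for name in sorted(COMMONLY_EXPECTED_HEADERS - names):
        findings.append(f"missing {name}")
    return findings


def tokenize_user_agent(user_agent):
    """Split a user-agent string into a set of lowercase tokens."""
    return set(re.findall(r"[a-z0-9_.]+", user_agent.lower()))


def jaccard_similarity(a, b):
    """Similarity of two token sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

Pairwise similarities computed this way could be fed into a clustering algorithm to find user agents that sit far from every common cluster; that step is not shown here.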
Although I already touched on it in the bot-detection section, the timing of requests can be a valuable indicator. Most credential stuffing attacks, or at least the ones performed by script kiddies, flood the application with a constant, repetitive stream of requests. Especially when the generated traffic is significant, the signal is very strong. There are plenty of techniques that can be deployed: statistical ones like ARIMA, standard techniques like k-means, SVMs and STL, or neural nets such as LSTMs (see “Long Short Term Memory Networks for Anomaly Detection in Time Series”). As always with data science, you need good data to evaluate your model and verify its performance.
The login action is central to credential stuffing and distinguishes this attack from other automated attacks. For this reason, it is especially valuable to analyze login attempts. Tracking login actions is usually quite simple: an application usually exposes one or more URLs to which an appropriate request has to be sent. This makes such requests easy to track; it can be done at the firewall level with a simple configuration. Alternatively, the analysis can be done internally, on the application’s server. By keeping track of login attempts, you can deploy the same methods that are mentioned in the time-based analysis section. The data can be enriched with information on whether each login attempt was successful or not. You can also try combining login analysis with browser/bot footprinting to enhance the signal further.
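As a much simpler stand-in for the models named above (this is not ARIMA, STL or an LSTM), a trailing-window z-score already catches the flood pattern typical of script-kiddie attacks. The function name, the window of 10 intervals, the threshold of 3 and the `min_sigma` floor are assumptions made for this sketch.

```python
from statistics import mean, pstdev


def spike_indices(counts, window=10, threshold=3.0, min_sigma=1.0):
    """Return indices whose count is far above the trailing-window average.

    counts: login attempts per fixed interval (for example, per minute).
    min_sigma: floor for the standard deviation, so that a perfectly flat
    history does not turn every small increase into an alert.
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = max(pstdev(history), min_sigma)
        if (counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged
```

For a series that hovers around 10 attempts per interval and then jumps to 250, only the interval with the jump is flagged. A slow, under-the-radar attack would not trip this check, which is where the more elaborate models come in.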
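One way to use the success/failure information is to look, per source, at how many different usernames were tried and how often the attempts failed. The sketch below assumes login events are available as `(source, username, success)` tuples, where the source could be an IP address or a session id; the function name and both thresholds are hypothetical.

```python
from collections import defaultdict


def suspicious_sources(events, min_usernames=10, min_failure_ratio=0.9):
    """Flag sources that try many distinct usernames and mostly fail.

    events: iterable of (source, username, success) tuples.
    """
    usernames = defaultdict(set)
    attempts = defaultdict(int)
    failures = defaultdict(int)
    for source, username, success in events:
        usernames[source].add(username)
        attempts[source] += 1
        if not success:
            failures[source] += 1
    return sorted(
        source
        for source in attempts
        if len(usernames[source]) >= min_usernames
        and failures[source] / attempts[source] >= min_failure_ratio
    )
```

A user who mistypes a password three times stays below the username threshold, while a source cycling through a combo list stands out even if a few of its attempts succeed.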
User Behavior Analysis
UBA is considered to be very effective when implemented correctly. In the analogy I introduced earlier, with people waiting in a queue trying to open the door, this is what our minds are doing. We analyze these people’s behavior and classify certain behaviors as suspicious, relying on our knowledge of how people behave and our understanding of what a door and a key are. Though UBA can be done at a low level, what I particularly like about this approach is that it lets us rise above the HTTP layer and look at the traffic and the underlying actions from a higher perspective, especially from the application’s business context. This is certainly valuable for attacks like denial of inventory, scalping, sniping or expediting. UBA can also be very useful for marketing and UX departments, and often such analysis already exists within the company, but for reasons other than security (marketing automation, Hotjar, and similar software). For the analysis to be possible, you need to be able to deduce the actions performed by the user by inspecting logs. There are architectures that embrace the concept. A prime example would be the Event Sourcing pattern and other message-based architectures. Sometimes simple audit logs can serve this purpose perfectly. Alternatively, if the system is deployed internally, the application could generate appropriate events at a higher level of abstraction. Or, if we are considering a single application, a detection system could be tightly integrated into the web framework, enabling seamless integration with minimal configuration (that would require code changes and development work on the application side; nothing’s for free). With the ability to decode actions within the user session, the next step is to distinguish regular user behavior from anomalous behavior. Numerous assumptions can be made when choosing the algorithms; a commonly repeated one is that malicious behavior is rare. But that’s not necessarily the case for some applications, especially internet-facing web sites. Neural networks can be deployed here to capture complex patterns. But simpler approaches are also possible. For example, a set of suspicious and malicious actions can be defined, and user activity can be monitored for such patterns. Regarding implementation, pattern matching, decision trees, and rule-based or graph-based systems come to mind.
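The simplest of these options, matching predefined action patterns against a session, might look like the Python sketch below. The action names and the two patterns are invented for illustration; a real application would emit its own events and define patterns that make sense in its business context.

```python
# Hypothetical action names and patterns, for illustration only.
SUSPICIOUS_PATTERNS = {
    "login_check_balance_leave": ("login_success", "view_balance", "logout"),
    "repeated_login_failure": ("login_failure", "login_failure", "login_failure"),
}


def contains_sequence(actions, pattern):
    """True if pattern occurs in actions as a contiguous run."""
    n = len(pattern)
    return any(
        tuple(actions[i:i + n]) == tuple(pattern)
        for i in range(len(actions) - n + 1)
    )


def matched_patterns(actions):
    """Names of all suspicious patterns found in one session's action list."""
    return sorted(
        name
        for name, pattern in SUSPICIOUS_PATTERNS.items()
        if contains_sequence(actions, pattern)
    )
```

A session consisting only of a login, a balance check and a logout matches the first pattern, which mirrors the account-checking behavior seen in the tutorials, while a session with ordinary browsing in between does not.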
Knowing your system
This is not a specific technique, and it is not strongly related to credential stuffing, but I wanted to include the idea because it is powerful. For protection, use information that attackers do not have. You know your traffic, and you can derive interesting information from it. When it comes to HTTP, you have quite a lot of data that is not easily available to the attacker. For example, you know the distribution of browser types, operating systems, countries and languages in your traffic. Other interesting areas include SQL queries to your database, URL distributions and many others. You can study such features and build time-related trends. Try to understand and learn your traffic, and use this advantage over the attacker. Just be careful: make sure you’re learning when the system is healthy and “normal”. For some systems, that is extremely difficult to achieve (reports estimate that ~50% or even more of all website traffic is non-human).
I hope you found the perspectives presented here informative and that they give you multiple ideas for a detection system. Choosing the right one and then adapting it to existing systems and applications can be challenging. Also, as with all data science projects, such a system should be validated against data; the more data, the better the system can become.
If a detection system uses a machine learning approach, it is important to understand how an ML system can be poisoned. Techniques for fooling ML systems are already being developed (Explaining and Harnessing Adversarial Examples, Security Attacks: Analysis of Machine Learning Models). It’s good to keep that in mind when designing a system.
I would also like to include a diagram from Google’s paper representing the amount of work needed to run an ML system in production (Hidden technical debt in machine learning systems).
Detecting an attack is a much simpler task than defending against it. It’s common to think that detection is a prerequisite for defense, but that’s actually not always true. For example, you can defend against credential stuffing by adding a very difficult captcha without being able to detect an attack at all. Defending against an attack opens a whole new area of game theory. Every defensive action can trigger an evolution of the attack strategy. For example, if we decide to block certain user-agents, the cracker will observe that and modify the user-agents slightly, bypassing our security.
A big part of this article describes credential stuffing and possible approaches to detection based on the tools used by crackers. But it is reasonable to assume that a serious cracker with solid programming skills would write their own software tailored to a particular web application.
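One way to put such knowledge to work is to compare the current distribution of a feature (country, browser, language) with a baseline learned during healthy operation. The sketch below uses total variation distance as the comparison; that choice, the function names and the sample values in the usage note are my own assumptions.

```python
from collections import Counter


def distribution(values):
    """Relative frequency of each value in an iterable of categorical values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def total_variation_distance(p, q):
    """Distance between two distributions: 0.0 is identical, 1.0 is disjoint."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)
```

For instance, with a baseline of 90% FR and 10% DE, a window of login attempts that is half FR and half US gives a distance of about 0.5. What counts as "too far" from the baseline has to be learned from your own traffic.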
Credential stuffing is sometimes defined as a massive number of login attempts within a certain timeframe. But that’s not always the case. Attackers know that generating a huge number of login requests in a short period of time can draw attention. So they perform a slow, under-the-radar credential stuffing attack: a couple of requests an hour, maybe only a couple a day. They may also use fake accounts, logging in with these on the second attempt and performing unsuspicious actions, disguising themselves as regular users. These are all scenarios worth reflecting on. On the other hand, even a system capable of detecting only classic credential stuffing is already very valuable. For one, all the not-so-motivated script kiddies would go somewhere else: checking 20 credentials a day when you have 10000 credentials in your combo list would take more than a year, probably longer than a teenager is willing to wait. But even against professional crackers, slowing credential stuffing down significantly gives the defenders an advantage. With a year needed to work through the list, there’s a higher chance that users will change their passwords, making the leaked credentials obsolete.
I believe that credential stuffing, the kind performed by automated tools in a defined timeframe, should and could be stopped. As I mentioned at the beginning, it is quite crazy that these kinds of login attempts are observed and sometimes considered a normal part of traffic. Seeing a script kiddie perform credential stuffing from their own IP address and enumerate thousands of accounts is quite shocking. There seems to be a lack of tools and libraries within software frameworks that would let programmers easily deploy detection mechanisms and, eventually, protective measures.
The perspectives presented in the Detection chapter are just my own loose ideas that help me reason about specific solutions. And though some of them are very common and well known in the industry, please do not limit yourself to these; create your own, be creative.
Disclaimer: I am a software engineer who got into cybersecurity and data science; I don’t work as a full-time security researcher.
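The arithmetic behind the "more than a year" estimate is easy to check. The helper name below is hypothetical; the numbers are the ones used in the paragraph above.

```python
def days_to_enumerate(combo_count, attempts_per_day):
    """Days needed to try every combo at a fixed daily rate (rounded up)."""
    if attempts_per_day <= 0:
        raise ValueError("attempts_per_day must be positive")
    return (combo_count + attempts_per_day - 1) // attempts_per_day


print(days_to_enumerate(10_000, 20))  # 500 days, i.e. more than a year
```

The same function shows how quickly the picture changes without throttling: at 1000 attempts a day the same list is exhausted in 10 days.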
I hope you enjoyed this slightly chaotic overview of the landscape of credential stuffing.
Jan Broniowski, Architect at Crossword Cybersecurity