By Dr. Marco Balduzzi, Senior Researcher, Forward-Looking Threat Research Team
As a large cybersecurity vendor, Trend Micro processes millions of threat data points per day. Our Smart Protection Network (SPN), among other technologies, helps us conduct research and investigate new threats and cybercrimes to improve our ability to protect our customers.
In this blog post, the first of a three-part series, I would like to share some insights on trends that we have observed in the wild after analyzing 3 million software downloads, involving hundreds of thousands of internet-connected machines.
Specifically, we focus on web downloads originating from browsers or any other HTTP client application installed on a machine. Note that we limited the study to machines that execute software after download. Given the huge quantity of data, we also limited our research to unpopular software downloaded from URLs that were not whitelisted. This automatically excludes software from Windows Update and other well-known domains. All of this data has been anonymized to remove personally identifiable information (PII).
We classify these downloads as benign (legitimate software), malicious or unknown. Unknown means that the downloaded software is currently unknown to us or to other public data sources that we monitor.
So, how unpopular were these downloads? Very. The majority of these software files were downloaded and executed by no more than twenty machines, and 90% of them ran on a single machine. Although some could argue that excluding whitelisted domains adds bias to the results, I believe that the internet’s “offering” is still largely unexplored, with novel, uniquely crafted software constantly challenging internet-connected devices. Especially now, in the era of “The Internet of Things,” a population of customized code circulates both in its benign (like firmware and drivers) and malicious (0-days, polymorphism attacks, new malware) forms. Given these premises, I believe that in the coming years, vendors at every level of the security industry will find it increasingly challenging to investigate the threat landscape and provide suitable working solutions for all internet-connected devices (and there are a lot! See our Exposed Cities reports).
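To make the notion of "unpopular" concrete, here is a minimal sketch of how prevalence could be computed from download events. It is not the pipeline used in this research: the function names and the toy events are made up for illustration, and a real dataset would use anonymized machine identifiers and file hashes.

```python
from collections import defaultdict


def prevalence(events):
    """Map each file hash to the number of distinct machines that ran it."""
    machines = defaultdict(set)
    for machine_id, file_hash in events:
        machines[file_hash].add(machine_id)
    return {file_hash: len(ids) for file_hash, ids in machines.items()}


def share_at_most(prev, max_machines):
    """Fraction of files seen on at most `max_machines` distinct machines."""
    if not prev:
        return 0.0
    return sum(1 for n in prev.values() if n <= max_machines) / len(prev)


# Toy events, invented for illustration: (machine_id, file_hash)
events = [("m1", "a"), ("m2", "a"), ("m1", "b"), ("m3", "c"), ("m3", "c")]
prev = prevalence(events)
single_machine_share = share_at_most(prev, 1)
```

Counting distinct machines rather than raw events keeps repeated executions on one machine from inflating a file's popularity.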
Distribution model: popular websites house more malicious files
The first thing we wanted to know was the source of these downloads. While it would be easy to presume that unpopular software tends to come from unpopular domains, we observed the opposite: 40% of unpopular software originates from highly popular domains, such as websites that Alexa ranks among its TOP 1000. This number rises to 80% for the TOP 100,000 websites.
Figure 1. Percentage of benign and malicious software file downloads from popular domains listed in Alexa’s Top 1000 Ranking
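The shares by domain popularity can be expressed as a simple cumulative count. The sketch below assumes a hypothetical `rank_of` lookup from domain to popularity rank. The domains and ranks are placeholders, not data from the study.

```python
def share_within_rank(download_domains, rank_of, threshold):
    """Fraction of downloads whose source domain ranks at or above `threshold`.

    Domains missing from `rank_of` are treated as unranked (never counted).
    """
    if not download_domains:
        return 0.0
    hits = sum(
        1 for domain in download_domains
        if rank_of.get(domain, float("inf")) <= threshold
    )
    return hits / len(download_domains)


# Placeholder ranks and downloads, invented for illustration.
rank_of = {"popular.example": 500, "midsize.example": 50_000}
downloads = [
    "popular.example", "popular.example",
    "midsize.example", "midsize.example",
    "obscure.example",
]
top_1k_share = share_within_rank(downloads, rank_of, 1_000)
top_100k_share = share_within_rank(downloads, rank_of, 100_000)
```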
Malicious software (the red dotted line in the figure) also tends to leverage highly popular websites more than benign software does. In fact, we observed many file hosting services, such as softonic[.]com, cloudfront[.]com, and mediafire[.]com, providing questionable software. Figure 2 gives a breakdown of the major domains involved, showing how high cloud providers sit in the ranking. A number of factors can explain this behavior, like the high number of visitors (and potential victims) using these sites, the operators failing to validate the software, and malware operators embedding malicious code in legitimate applications, which they re-bundle and upload for free download.
Figure 2. A list of major websites that offer malicious content and the number of unique downloads for each
Of course, being able to leverage popular platforms is advantageous for cybercriminals. In the past, we’ve observed similar mechanisms with mobile malware made available on trusted marketplaces. While mobile vendors like Apple and Google do their best to sanitize the applications that their marketplaces host and distribute, this does not seem to be the case for popular file-hosting providers.
However, hosting malicious files on popular websites is not enough. To succeed, cybercriminals need to encourage their potential victims to download and install their software. We saw this in the ’90s with Kevin Mitnick, and we see it in modern advanced persistent threat (APT) attacks that use spear phishing campaigns. Their solution is just around the corner: social engineering.
Table 1 lists the major websites offering three types of malware, namely droppers, adware, and fake antivirus (AV). Based on the data, cybercriminals paid special attention to the choice of the domain names used for malware distribution. In particular, “media streaming”-looking domains are used to distribute adware, while domains resembling antivirus software companies are used to distribute fake AVs. One example is webantiviruspro-fr.pw, which was used to distribute malicious code in France. Similarly, wmicrodefender27.nl targeted users in the Netherlands with malware disguised as Windows Defender Antivirus. Meanwhile, we observed that droppers prefer generic file hosting providers that guarantee a high number of visitors and potential victims.
Table 1. A breakdown of the major domains per malware type
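As a rough illustration of the domain-naming trick, the following sketch flags domains whose names contain antivirus-sounding keywords. It is a deliberately naive heuristic and not the detection logic behind this research. The keyword list is an assumption drawn only from the two example domains above, and a real system would need far more than substring matching.

```python
# Assumed keyword list, taken from the two example domains in the text.
AV_KEYWORDS = ("antivirus", "defender")


def looks_like_av_brand(domain):
    """Return True if the domain name (without its TLD) contains an AV keyword."""
    name = domain.lower().rsplit(".", 1)[0]  # drop the top-level domain
    return any(keyword in name for keyword in AV_KEYWORDS)


flags = {
    domain: looks_like_av_brand(domain)
    for domain in ("webantiviruspro-fr.pw", "wmicrodefender27.nl", "mediafire.com")
}
```

A keyword match only says that a name imitates a security brand. It says nothing about whether the files served from that domain are actually malicious.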
Code signing is another important and interesting aspect that contributes to the success of malware operators. In fact, all major modern operating systems now provide capabilities that restrict execution to legitimate or signed software. We came across multiple cases of certificate trading and abuse during our analysis. We will explore this aspect in a dedicated blog post that we plan to release soon.
Infection model: Chrome beats other browsers at infecting endpoints
We also investigated which configuration of a client machine (like a consumer’s office endpoint) contributes the most to a potential compromise. We are interested in identifying the most prominent targets for malware operators, which users are more susceptible to infections, and why.
We started by looking at browsers as the primary platform for web downloads and considered Firefox, Chrome, Opera, Safari, and Internet Explorer. Table 2 reports the number and types of software downloaded by these popular browsers. Somewhat surprisingly, the results show that Internet Explorer (IE) could be considered the “safest” browser based on the percentage of malicious downloads initiated and the percentage of infected endpoints. In fact, of the 411,138 machines that used IE to download one or more software files, only 18% became infected. On the other hand, of the 344,994 machines that were observed using Chrome, 31.92% became infected, which represents the highest rate of infection across all popular browsers.
Of course, this could also mean that while Internet Explorer is automatically used as the default browser and automatically patched by corporate policies, some users tend to install a second personal browser (like Chrome) that they fail to keep updated. As a result, these unpatched browsers become critical attack vectors for their endpoints and the entire corporate network.
| Browser | Endpoints | Benign Downloads | Malicious Downloads | Infected Endpoints |
Table 2. An analysis of major web browsers featuring benign and malicious downloads and the number of infected endpoints for each
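The infection rate discussed here is simply infected endpoints divided by observed endpoints. The sketch below works backwards from the two figures quoted in the text to show the approximate number of infected machines they imply. The helper names are illustrative, and because the IE rate is quoted as a rounded 18%, its implied count is only an estimate.

```python
def infection_rate(endpoints, infected):
    """Infection rate as a percentage of observed endpoints."""
    if endpoints <= 0:
        raise ValueError("endpoints must be positive")
    return 100.0 * infected / endpoints


def implied_infected(endpoints, rate_percent):
    """Approximate infected-endpoint count implied by a quoted rate."""
    return round(endpoints * rate_percent / 100.0)


# Endpoint counts and rates as quoted in the text above.
ie_infected = implied_infected(411_138, 18.0)
chrome_infected = implied_infected(344_994, 31.92)

ie_rate = infection_rate(411_138, ie_infected)
chrome_rate = infection_rate(344_994, chrome_infected)
```

Although fewer machines were observed using Chrome than IE, the quoted rates imply that more of them ended up infected.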
While browsers contribute greatly to web downloads, other software commonly installed on endpoints, such as Windows applications, also plays an important role. Our analysis highlights some important aspects, namely:
- Acrobat Reader and Java are among the primary vectors of infection, with Acrobat users experiencing an infection rate of almost 80% upon download
- A considerable share of machines (27%) run unpatched or unsupported versions of Windows (e.g., Windows XP), representing a primary infection vector
- Droppers represent the main form of infection and a critical entry point for cybercriminals
Business model: Malware operators stick to threat campaign of choice
We also briefly investigated the business model of malware operators as seen through the eyes of their victims. To do so, we categorized the software downloads that appeared to be malicious based on their types (e.g., ransomware or banker), and then we observed, from real installations, whether this software downloaded additional content after execution and what that content was.
In all cases, we observed that malware operators tend to specialize in the businesses they run. For example, an operator of a ransomware campaign is 80% more likely to continue operating a ransomware campaign without changing business models. We observed the same behavior across operators of botnets, spyware, bankers, fake AVs, and adware.
One potential explanation is that today’s cybercriminal ecosystem offers a more diverse set of competencies compared to that of the early 2000s. In fact, it seems that both attackers and defenders have improved through the years, thanks to each party’s need to develop better solutions. As a result, miscreants specialized to challenge improved defenses. In addition, from an economic perspective, and considering that modern cybercrime is entirely money-driven, the business model of a ransomware campaign, for example, is very different from that of a banker campaign, both in terms of monetization and of operational costs. This makes it more complicated for a malware operator to change business models.
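One way to quantify this kind of specialization is to tabulate, for each initial malware type, how often the follow-up download is of the same type. The sketch below is an assumption about how such a tabulation could look, not the method from the paper, and the observations are made up for illustration.

```python
from collections import Counter, defaultdict


def transition_rates(observations):
    """Turn (initial_type, followup_type) pairs into per-type follow-up shares."""
    counts = defaultdict(Counter)
    for initial, followup in observations:
        counts[initial][followup] += 1
    return {
        initial: {
            followup: n / sum(followups.values())
            for followup, n in followups.items()
        }
        for initial, followups in counts.items()
    }


# Invented observations: (type of first malware, type of follow-up download)
observations = (
    [("ransomware", "ransomware")] * 3
    + [("ransomware", "banker")]
    + [("adware", "adware")] * 2
)
rates = transition_rates(observations)
same_type_ransomware = rates["ransomware"]["ransomware"]
```

A high share on the diagonal (same initial and follow-up type) would be consistent with operators sticking to one business model.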
The trick played by PUPs: A quick shift to more advanced threats
Adware and potentially unwanted programs (PUPs), also known as potentially unwanted applications (PUAs), warrant a different discussion.
A common misunderstanding is that potentially unwanted programs are merely an annoyance rather than a potential risk, hence the name. In fact, these programs tend to appear as unsophisticated software that displays ads to users without directly encrypting personal files or leaking sensitive information, as more aggressive forms of malicious software normally do.
PUPs are actually more damaging than they appear. Similar to the experiment we described in the previous section, we tracked what these types of software do when executing on a victim machine for 30 days. Figure 3 depicts the time (if any) PUPs take to download and execute a more aggressive form of malicious software, i.e., malware such as ransomware, a banker, or a trojan.
Figure 3. Time between a PUP and a malware infection
Our analysis shows potential risks that regular users may not be aware of. That’s why I believe our work is important for raising awareness, and if you are interested in the topic, I suggest reading our paper. Interestingly, in more than half of the cases, PUPs or adware transition to more aggressive malware on the same day they land on a compromised machine. Of course, this share increases over time. This means that if a PUP hits a user, it has a very high chance of resulting in a full compromise. In addition, following our previous discussion on the role of cloud providers hosting and distributing questionable software, a user who downloads free software (like an open-source video player) is more likely to run into bigger issues.
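The curve described here is a cumulative share: for each day in the 30-day window, the fraction of PUP infections that have been followed by malware so far. Here is a minimal sketch with made-up delays, where `None` marks a PUP with no observed follow-up malware. The function name and input format are assumptions for illustration.

```python
def cumulative_transition_share(delays_days, horizon=30):
    """Share of PUP infections followed by malware within d days, for d = 0..horizon.

    `delays_days` holds the delay in days between the PUP and the first
    malware download, or None if no malware was observed in the window.
    """
    total = len(delays_days)
    if total == 0:
        return [0.0] * (horizon + 1)
    return [
        sum(1 for delay in delays_days if delay is not None and delay <= day) / total
        for day in range(horizon + 1)
    ]


# Invented delays for eight PUP infections (0 means the same day).
delays = [0, 0, 3, None, 10, 0, None, 29]
curve = cumulative_transition_share(delays)
same_day_share = curve[0]
within_window_share = curve[30]
```

By construction the curve never decreases, which is why the share of compromised machines can only grow as the observation window lengthens.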
Impact on organizations: Important questions, challenges raised
Overall, I believe our work raises important questions and challenges for the security industry, which includes the providers of security solutions and their users. I would like to summarize them here:
- The internet’s “offerings” are still largely unexplored, with a massive amount of never-before-seen, unique, or customized code and software challenging internet-connected devices.
- File hosting services (cloud providers) play an important role in freely distributing software to millions of internet users. The question that we raise is: “What is their responsibility, if any, in promoting the distribution of questionable software, including PUPs bundled in normal applications?”
- Although security solutions have improved in performance and efficiency, social engineering still plagues internet users. I highlight the importance of staying aware of social engineering scams, which can show up, for example, in the name of the website users download software from.
- While modern operating systems receive automatic updates, our research indicates a large number of unpatched systems and systems running obsolete software. The problem extends to software known to be aggressively targeted by miscreants, such as Acrobat Reader and Java.
The internet is still mostly uncharted. A large portion of software files from unpopular websites is still unlabeled, and malware detection systems need labeled files to be able to defend internet-connected machines from infection. Hence, a system of classification that uses machine learning technology to analyze files is all the more important. We made use of such a human-readable machine learning system and explored other key findings on large-scale global download events in our research paper titled Exploring the Long Tail of (Malicious) Software Downloads.