Plenty of Phish in the Sea: How Artificial Intelligence is Transforming the Oldest Form of Cybercrime

Abstract

Artificial intelligence and machine learning (AI/ML) have seamlessly and fundamentally transformed the way we interact with digital technology [1]. Dual-use technologies such as AI/ML, however, can be quickly exploited for cybercriminal activity. One example is phishing, one of the oldest types of cybercrime. While phishing is still often perceived as an outdated scam, AI/ML advancements have paved the way for more convincing phishing attacks and the wider use of hyper-targeted spear-phishing. This article focuses on the AI/ML-enabled transformation of phishing and spear-phishing and the consequences this poses for the cybersecurity environment.


By Maria Patricia Bejarano


Evolution of AI/ML Technologies

Although present since the 1950s, AI/ML have made exponential leaps within the past decade and will assuredly continue to increase in capacity and performance [2]. For some definitional background, “AI refers to the use of digital technology to create systems that are capable of performing tasks commonly thought to require intelligence,” whilst ML concerns “digital systems that improve their performance on a given task over time through experience” [3]. AI/ML is becoming more accessible as technology becomes cheaper and open-source code proliferates. Within the cybersecurity environment, AI/ML can increase safety and anonymity, reduce costs, and improve threat detection [4]. Despite these benefits, AI/ML innovations can also provide opportunities for malicious exploitation, in what is known as a “dual-use” application [5]. It is therefore foreseen that AI/ML developments will result in “attacks [being] more effective, more finely targeted, [and] more difficult to attribute” in the cybersecurity environment [6].

Some dual-use examples of AI/ML are task automation and scalability, which not only reduce costs for businesses but also allow cybercriminals to mount large-scale attacks, such as denial-of-service attacks that mimic human behaviour [7]. In the same vein, language-translation advancements developed to give lay people real-time translation also let cybercriminals transcend the language barrier and enlarge their victim pools [8]. Furthermore, the increased anonymity and security afforded to Internet users also benefits cybercriminals by reducing the chances of attribution [9]. Lastly, AI/ML models are becoming highly accurate at impersonating humans through audio, conversational text, or images, opening the door to types of fraud that would have been far more constrained in the physical world [10].

These are only a few of many dual-use examples, illustrating the sheer magnitude of opportunities open to cybercriminals and malicious actors as AI/ML developments continue to advance.

 

Evolution of Phishing and Spear-Phishing

Phishing merits attention due to its high prevalence and proliferation in the digital domain. The propagation of phishing attacks for lucrative gain has become commonplace in business cybersecurity environments: in 2018, 10% of businesses reported experiencing at least one phishing attack every hour [11]. This is particularly salient given that phishing attacks have increased dramatically due to the societal effects of Covid-19. Entire offices and student populations worldwide relied on digital platforms for the majority of 2020-2021, an opportunity cybercriminals were quick to take advantage of [12].

Phishing is an umbrella term for ‘fooling’ someone into doing something, whether opening a malicious attachment or handing over sensitive company or personal data [13]. It has been one of the more pervasive kinds of cybercrime since email entered the mainstream in the 1990s, owing to its low cost and scalability: to launch a rudimentary phishing attack, all one really needs is a list of email addresses and a message text.

Cybercriminals have had to evolve and enhance their phishing attacks to become more convincing, as those born in the digital age have been taught Internet safety and can easily identify basic phishing attempts. One major example of the transformation of this security environment is DeepPhish, an AI/ML algorithm that utilises deep learning to assess which phishing URLs are most effective at tricking individuals into clicking on them [14]. The algorithm produces more effective phishing URLs and makes it harder for phishing detection systems, such as the Google Safe Browsing API, to detect and blacklist a URL in time [15].

It is important to note that there is a trade-off between the scale and efficiency of a phishing attack [16]. The more people targeted in a campaign, the less effective that generic message will be. Conversely, the more tailored an attack is to a particular individual, the more convincing and potentially successful it will be. This highly tailored and convincing attack is what is known as spear-phishing, which has been shown to be four times more effective than regular phishing [17]. These attacks represent a significant investment of time and resources for the cybercriminal. No longer a generic malicious link sent to thousands of email addresses, spear-phishing is a personalised attack on an individual victim that today draws on social media and open-source information to seem as legitimate as possible [18].

In their assessment of the AI/ML-enabled crime landscape, Caldwell et al. (2020) rated this type of cybercrime among the six crimes of “greatest concern” to the security environment. These evolved tactics can draw on pertinent company data, such as a department’s project names stolen through server incursions, to target and deceive even a tech-savvy individual into clicking on a link, opening an attachment, or revealing sensitive data [19]. Cybercriminals also use AI/ML to prioritise targets more effectively [20]. Although finance and IT departments have traditionally been the main targets of phishing attacks, advanced AI/ML models go beyond this basic targeting by estimating a target’s “willingness to pay based on online behaviour” [21]. Cybercriminals have turned to social media, such as Facebook, WhatsApp, and Twitter, where a trove of personal information is available to further personalise their phishing attacks and make them seem more authentic to the victim [22]. Others use social media as the attack channel itself: SNAP_R, for instance, uses Markov chains and natural language processing to personalise Twitter phishing posts based on a user’s activity, and can also score the targeted user’s probability of clicking on the link [23].
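The Markov-chain mechanism behind tools of this kind can be illustrated with a minimal sketch. This is a generic word-level Markov text generator, not SNAP_R’s actual code; the corpus, function names, and parameters below are hypothetical, chosen only to show how observed word sequences can be recombined into plausible new text:

```python
import random
from collections import defaultdict

def build_chain(corpus):
    """Map each word to the list of words observed to follow it."""
    words = corpus.split()
    chain = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return chain

def generate(chain, seed, length=8, rng=None):
    """Walk the chain from a seed word, picking a random successor each step."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    out = [seed]
    while len(out) < length:
        followers = chain.get(out[-1])
        if not followers:  # dead end: no observed successor
            break
        out.append(rng.choice(followers))
    return " ".join(out)

# In a SNAP_R-style attack the corpus would be scraped from the target's own
# posts, so the generated lure mimics topics and phrasing they engage with.
corpus = "check out this great deal check out our new offer on flights"
lure = generate(build_chain(corpus), "check", length=6)
```

Because the chain is built from the victim’s own activity, the output inherits their vocabulary, which is what makes such machine-generated posts convincing at scale.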

 

Combatting Phishing and Spear-phishing

As the tools of phishing attacks have become more accurate with AI/ML, so have the tools for mitigating them, in a “cat and mouse game” between cybercriminals, authorities, and corporate cybersecurity departments [24]. For regular phishing, proactive AI/ML methods are being developed. These are a step forward from traditionally reactive APIs that only blacklist malicious URLs once they have been detected, whether manually or through crawling or honeypots. These traditional methods were only able to block 20% of phishing URLs and are always one step behind the cybercriminal [25]. One advanced method is URL classification, which combines lexical analysis with expert input to detect which URLs may be malicious [26]. By employing AI/ML in this type of phishing detection, the model learns to recognise patterns and thus becomes more effective at distinguishing legitimate from fraudulent URLs [27]. While more proactive AI/ML methods exist against phishing campaigns, there have been fewer AI/ML developments to combat spear-phishing attacks. This may be because their highly personalised nature makes them harder to detect, and because they are not yet being mass-generated using AI/ML [28]. Furthermore, even with automation tools, spear-phishing remains more time- and resource-intensive and is thus less employed by cybercriminals for the moment.
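As a rough illustration of the lexical analysis step, the sketch below computes a few features commonly discussed in the phishing-detection literature (URL length, digit and hyphen counts, subdomain depth, suspicious keywords, hostname entropy) and combines them with hand-set weights. The feature set, keyword list, and weights are illustrative assumptions, not those of any cited system; a production classifier would learn its weights from labelled URL data:

```python
import math
import re

# Keywords often seen in phishing URLs; an illustrative, not authoritative, list.
SUSPICIOUS_TOKENS = ("login", "verify", "secure", "account", "update", "bank")

def lexical_features(url):
    """Extract simple lexical features of the kind used in URL classifiers."""
    host = re.sub(r"^https?://", "", url).split("/")[0]
    counts = {ch: host.count(ch) for ch in set(host)}
    entropy = -sum((n / len(host)) * math.log2(n / len(host))
                   for n in counts.values())
    return {
        "length": len(url),
        "digits": sum(ch.isdigit() for ch in url),
        "hyphens": host.count("-"),
        "subdomains": host.count("."),
        "suspicious_tokens": sum(tok in url.lower() for tok in SUSPICIOUS_TOKENS),
        "host_entropy": entropy,
    }

def phishiness(url):
    """Toy linear score over the features; real weights would be learned."""
    f = lexical_features(url)
    return (0.02 * f["length"] + 0.3 * f["hyphens"] + 0.25 * f["subdomains"]
            + 0.8 * f["suspicious_tokens"] + 0.15 * f["host_entropy"])
```

A higher score means more phishing-like. Replacing the hand-set weights with ones learned from labelled URLs is what allows such a detector to adapt as URL generators like DeepPhish evolve, rather than waiting for each new URL to be reported and blacklisted.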

Looking at broader institutional recommendations for mitigating the malicious exploitation of AI/ML developments, these include third-party auditing of models, red-teaming exercises, wider dissemination of information about incidents, and the piloting of bias and safety bounties [29]. In addition, it is recommended that AI/ML researchers work with policymakers and diversify the stakeholders involved, in order to keep the law up to speed with technological advancements and to protect those who may be harmed by dual-use applications of this technology [30].

 

Conclusion

AI/ML has the potential to severely increase threats to the cybersecurity environment. In particular, this article noted AI/ML’s ability to reduce the scale-efficiency trade-off of (spear-)phishing attacks. These attacks will only become more widespread and more innovative as research developments in AI/ML continue to blaze ahead. It is therefore of the utmost importance for researchers to continue investigating malicious abuses of AI/ML technology. Finally, this research matters not only in itself but also in its communication to policymakers, who will ultimately have the most profound impact on this matter.

 

Sources

[1] Jordan, MI & Mitchell, TM 2015, ‘Machine learning: Trends, perspectives, and prospects’, Science, vol. 349, no. 6245, pp. 255-260.

[2] Buchanan, BG 2006, ‘A (Very) Brief History of Artificial Intelligence’, AI Magazine, vol. 26, no. 4. 

[3] Brundage et al. 2018, ‘The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation’, arXiv, Preprint, p.9.

[4] Amodei, D, Olah, C, Steinhardt, J, Christiano, P, Schulman, J, Mané, D 2016, ‘Concrete Problems in AI Safety’, arXiv, Preprint.

[5] Brundage et al. 2018, p.16.

[6] Ibid., p.18.

[7] Caldwell, M, Andrews, JTA, Tanay, T, Griffin, LD 2020, ‘AI-enabled future crime’, Crime Science, vol. 9, no. 14.

[8] Brundage et al. 2018.

[9] Kaloudi, N & Li, J 2020, ‘The AI-Based Cyber Threat Landscape: A Survey’, ACM Computing Surveys, vol. 53, no. 1, article 20.

[10] Brundage et al. 2018.

[11] Boddy, M 2018, ‘Phishing 2.0: the new evolution in cybercrime’, Computer Fraud & Security, vol. 2018, no. 11, p.9.

[12] Basit, A, Zafar, M, Liu, X, Javed, AR, Jalil, Z, Kifayat, K 2020, ‘A comprehensive survey of AI-enabled phishing attacks detection techniques’, Telecommunication Systems.

[13] Boddy, 2018.

[14] Bahnsen, AC, Torroledo, I, Camacho, LD, Villegas, S 2018, ‘DeepPhish: Simulating malicious AI’, APWG Symposium on Electronic Crime Research, pp. 1–9.

[15] Kaloudi & Li 2020.

[16] Brundage et al. 2018, p.21.

[17] King, TC, Aggarwal, N, Taddeo, M, Floridi, L 2020, ‘Artificial Intelligence Crime: An Interdisciplinary Analysis of Foreseeable Threats and Solutions’, Science and Engineering Ethics, vol. 26, pp. 89–120.

[18] Brundage et al. 2018, p.18.

[19] Boddy 2018.

[20] Brundage et al. 2018.

[21] Ibid., p.26.

[22] Kaloudi & Li 2020.

[23] Bahnsen et al. 2018.

[24] Boddy 2018, p.10.

[25] Basit et al. 2020.

[26] Bahnsen et al. 2018.

[27] Basit et al. 2020.

[28] King et al. 2020.

[29] Wilner, AS 2018, ‘Cybersecurity and its discontents: Artificial intelligence, the Internet of Things, and digital misinformation’, International Journal, vol. 72, no. 2, pp. 306-316.

[30] Brundage et al. 2020, ‘Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims’, arXiv, Preprint, p.52.