Final Haws

Homograph Attack Warning System
Anupriya.T, Padmavathy.M, Vaishnavi.M, Vinithra.G
Alekhya Lakshmi.L, Assistant Professor, CSE
Department of Computer Science Engineering, Sri Sairam Institute of Technology, Chennai, India
Sri Sairam Institute of Technology, Chennai, India
vinithra.cse@sairamit.edu.in
priyaanutj@gmail.com
madhupadma2297@gmail.com
vaishnavidharan29@gmail.com
alekhyalakshmilakki@gmail.com
Abstract— We live in the era of social media apps, from Facebook to WhatsApp, which are used everywhere and by everyone. Although this may seem a good sign that we are moving into the new era of "THE DIGITAL WORLD", it has consequences such as the spread of fake news and the theft of personal information like credit card and debit card details, passwords or digital wallets. Users tend to believe that every message shared on social media is true. To protect Internet users, we propose a system that detects homograph attacks and malicious links and warns users before they can access the site. Social engineering attacks have moved far beyond this, for example phishing attacks in which we rely completely on the browser to give us a warning. This situation may worry some computer users, but we generally do not think much about it when we perform any action on our mobile phones. The usual recommended steps are not the right way to deal with these situations. The main contributions of this paper are a working definition of IDN spoofing attacks, a description of how such IDN domains are presented in the URL bar of some Internet browsers, and a working solution that reports IDN spoofing attacks by converting the URL into Unicode and Punycode.

Keywords— Internationalized Domain Name (IDN), Unicode, Punycode, Uniform Resource Locator (URL), Homograph

I. INTRODUCTION

The project is a mobile application for anyone who uses social media apps such as WhatsApp or Facebook. It helps users safely open links that are shared with them on their mobile phones. The system warns the user every time he or she visits a malicious link. It does not deal only with simple, well-known malicious links that anyone with basic computer knowledge can detect; it may also block links that could fool even an expert. The main benefit of this project is that it removes the dependence on the browser to detect such attacks and the need for any user interaction to verify the domain, and it detects attacks that may pass through the browser's protection.

Avoiding a malicious link is simple when the link is already known to be malicious: the user will not visit it. The problem occurs when a link received through a social app looks exactly like the usual safe link, so the user visits it. In this paper, we develop a warning system that removes most of the drawbacks of the current system. The link opens in the browser only when it is safe; otherwise the user is given an alert message so that he or she does not access that link and is not affected by cyber crimes such as homograph and phishing attacks.

II. EXISTING WORK

Existing systems use machine learning to detect malicious links. The user has to copy the link from the social media app and search for it in a web browser to check the URL. Because these systems rely on machine learning, detection is time consuming; meanwhile the user's device may easily be attacked by hackers, and the device may also hang within a fraction of a second.

Machine learning techniques such as deep learning and neural networks [1] are used to detect malicious links. The neural network approach splits the URL into single characters and detects a malicious link by classifying those characters. The deep learning approach uses blacklist detection. URL detection can be done by matching the IP address, host name and directory structure. Machine learning methods have achieved effective results in the detection of malicious URLs [2][3], but manually extracting features is time-consuming and requires constant adjustment of features to accommodate changes in URLs [4][5], in combination with human knowledge; this limits the accuracy of the classification model to some extent. These methods do not deliver the speed that users normally expect. As a result, malicious websites need to be detected quickly and with high accuracy.

III. RELATED WORK

In Fig 2.a, the user asks a friend to send the link to the Apple website, but the friend misleads him and shares a Unicode domain [6] that looks identical to the original Apple website. Had this link come from an attacker, the result could have been different: the user might have entered his credentials on the website, assuming all was well. To keep the web and its users safe, we have come up with this project, which gives users the ability to detect attack links and warns them before they can access the site.
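The look-alike domain of Fig 2.a can be made concrete with a short Python sketch. It assumes the spoofed label is the Cyrillic string behind xn--80ak6aa92e.com, the name discussed later in the paper; the helper name `describe` is hypothetical and not part of the proposed system. Only Python's standard `unicodedata` module and built-in `punycode` codec are used, to show that the two labels share no code points even though they render alike.

```python
import unicodedata


def describe(label: str) -> list:
    """Return (code point, Unicode name) pairs for each character in a label."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in label]


latin = "apple"
# Assumed spoofed label: Cyrillic letters that resemble "apple".
lookalike = "\u0430\u0440\u0440\u04cf\u0435"

# The two strings render alike but do not compare equal.
print(latin == lookalike)

# Every character of the spoofed label comes from the Cyrillic block.
for code_point, name in describe(lookalike):
    print(code_point, name)

# ASCII-compatible form of the spoofed label: "xn--" prefix plus Punycode.
ace_label = "xn--" + lookalike.encode("punycode").decode("ascii")
print(ace_label)
```

Comparing code points rather than rendered glyphs is what lets an application notice the substitution that the eye cannot.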
Fig 2.a

After clicking on the received link, we get the error displayed in Fig 2.b. The error is detected only after the user opens the link in the browser.

Fig 2.b

IV. PROPOSED WORK

As India moves towards digitalization, the increase in cyber crimes will affect the day-to-day life of the common man [7]. The main focus of this project is to improve the current system by eliminating these problems and providing an environment secure from cyber crimes. It is also a prospect for Digital India.

The primary design constraint is the mobile application. Since the application is designed for Android mobiles, an effective GUI [8] and user friendliness are the major design considerations. Creating a user interface that is both effective and easy to navigate is important. We use a database to store information about spoofed web links, so storage space needs to be considered for smooth functioning of the system. Other limitations such as memory and processing power are also worth considering. Efficiency needs to be considered since it is one of the major reasons for having an automated system. The inputs and outputs generated, their individual working efficiency and their contribution to the overall software application must also be considered. The software gives the desired results as output; the system workflow is shown below.

System Work flow:

This system works in four steps:

1) Module 1:
The system starts working when the user clicks a link received from social media. The system extracts the link from the app and searches for it in the database. If the link is found in the database, the system alerts the user; otherwise the link is forwarded to the next module of the system.

2) Converter:
In this module we fetch the link from the first module and convert the ASCII characters into Unicode format. If there is any error in the Unicode, the system alerts the user; otherwise the obtained Unicode is transferred to the next module of the system. To convert the fetched link we use the UTF-8 [3] encoding algorithm.

3) Module 2:
In this module the system generates the Punycode for the obtained Unicode. After generating the Punycode, the system checks for errors and for the conditions that the Punycode of the given URL [4] must satisfy. If the obtained Punycode has any errors, the system alerts the user; otherwise it is forwarded to the next module of the system.

4) Premonitory System:
This module alerts the user if the user visits a malicious link; if the user visits a safe link, it directs the user to that website so that the user can access the link safely.

V. SYSTEM DESIGN

Design: The design of the project is divided into three phases, as follows:

User Interface Design: In this phase the user interface of the project is developed, that is, the design of the application through which the user is taken to the warning issued for each case.

Database: The database is the pool of information for every application. In our application, the database stores spoofed links of the most popular websites and links of malicious sites that look the same as the top 10k domains of Alexa. The database also stores every website that the system detects as "Not Safe", which makes detection faster the next time the same link occurs. The diagrams below show the relationships between the different entities of our project.

Complete Design: In this phase a complete flow diagram of the working system is designed. With these three phases completed, we now begin the implementation of the project.

VI. IMPLEMENTATION

The diagram below shows the implementation and working of the project.

Warning System: This module comes into action as soon as it gets the Punycode. Security researchers have sounded the alarm and warned that Firefox, Chrome and Opera have a vulnerability that makes phishing attacks easier. The vulnerability lies in the ease with which an attacker can create a spoof website with a URL that looks exactly the same as the real thing. It relies on the way that many browsers interpret Punycode. If the Punycode check fails or the extracted link is found in the database, the module displays the warning "The link is not safe to visit". The user can then perform one of two operations:

1. Open in browser: This allows the user to visit the link in the browser.
2. Back to application: This redirects the user to the source application (i.e. WhatsApp or any other app).

PUNYCODE: Punycode is a way of representing Unicode, the standard by which computers encode text in non-Roman scripts such as Arabic or Mandarin and accented characters such as "ü". Using Punycode, URLs containing Unicode characters are represented as ASCII characters consisting of letters, digits and hyphens [8][9]. The problem arises from the fact that similar characters are hard to distinguish from each other. While the Cyrillic small letter "а" (Unicode character U+0430) is different from the Latin small letter "a" (U+0061), in a vulnerable browser they look the same when the Punycode is interpreted. Therefore, the owner of the name xn--80ak6aa92e.com, which is displayed as "apple.com", could create a convincing phishing site.

The vulnerability was highlighted by a researcher who set up a link starting with "xn" so that users can check how their browser interprets a Punycode site. If the URL reads "https://apple.com", the browser is vulnerable. "Visually, the two domains are indistinguishable because of the font used by Chrome and Firefox. As a result, it becomes impossible to identify the site as fraudulent without carefully inspecting the site's URL or SSL certificate." Exploiting this vulnerability is known as an internationalised domain name (IDN) [9] homograph attack, or more simply as a homograph spoofing attack.

With the globalisation of the Internet, standard frameworks such as the Internationalized Domain Name (IDN), which enable everyone to write a domain name in their native language or script, have emerged. While IDN enables domain names in different languages, it has also put users of web browsers that support IDNs at risk of homograph attacks. IDN-based spoofing attacks have recently become a significant threat in content-based attacks such as phishing and other fraudulent attacks against Internet users. An approach that automatically thwarts such attacks from within a mobile application, rather than in the web browser, is therefore important to Internet users. To this end, we propose a new approach to mitigate Internationalised Domain Name homograph attacks in this paper [9][10]. The proposed approach is easy to deploy and requires no change in the way the end user interacts with the mobile application, which is used instead of the web browser. We implemented the proposed approach as an add-on to a popular application and demonstrate its effectiveness against the phishing attack.
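The four modules of Section IV can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the application's actual code: the set `SPOOFED_HOSTS` is a hypothetical stand-in for the database, the function names are ours, and the final rule (alert on any label that is not plain ASCII) is an assumed interpretation of the "conditions" the Punycode must satisfy, which the paper does not spell out. The conversion uses Python's built-in `punycode` codec applied label by label.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the database of known spoofed links.
SPOOFED_HOSTS = {"xn--80ak6aa92e.com"}


def to_unicode(host: str) -> str:
    """Converter: decode any xn-- labels of a host into their Unicode form."""
    labels = []
    for label in host.split("."):
        if label.startswith("xn--"):
            label = label[4:].encode("ascii").decode("punycode")
        labels.append(label)
    return ".".join(labels)


def to_punycode(host: str) -> str:
    """Module 2: encode every non-ASCII label as xn-- plus its Punycode."""
    labels = []
    for label in host.split("."):
        if not label.isascii():
            label = "xn--" + label.encode("punycode").decode("ascii")
        labels.append(label)
    return ".".join(labels)


def check_link(url: str) -> str:
    """Premonitory system: return "ALERT" or "SAFE" for a received link."""
    host = urlsplit(url).hostname or ""
    if not host or host in SPOOFED_HOSTS:      # Module 1: database lookup
        return "ALERT"
    try:
        unicode_host = to_unicode(host)        # Converter
        ascii_host = to_punycode(unicode_host) # Module 2
    except UnicodeError:
        return "ALERT"                         # conversion error: warn the user
    if ascii_host in SPOOFED_HOSTS:
        return "ALERT"
    # Assumed rule: a host whose Punycode form differs from its displayed
    # form contains non-ASCII characters and is treated as suspicious.
    if ascii_host != unicode_host:
        return "ALERT"
    return "SAFE"


for link in ("https://www.apple.com/", "https://xn--80ak6aa92e.com/"):
    print(link, check_link(link))
```

Under the assumed rule, legitimate internationalized domains are also flagged, so a real deployment would need a finer condition, for example comparison against the stored list of popular domains.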
The above flow chart shows our proposed work in diagrammatic form. Our assessment of the proposed implementation shows that the proposed solution to the IDN-based homograph attack protects users' mobile phones with no noticeable overhead. Punycode is a simple and efficient transfer encoding syntax designed to be used with Internationalized Domain Names in Applications (IDNA). It uniquely and reversibly transforms a Unicode string into an ASCII string. ASCII characters in the Unicode string are represented literally, and non-ASCII characters are represented by ASCII characters that are allowed in host name labels (letters, digits, and hyphens). RFC 3492 [13] defines a general algorithm called Bootstring that allows a string of basic code points to uniquely represent any string of code points drawn from a larger set. Punycode is an instance of Bootstring that uses particular parameter values specified by that document, appropriate for IDNA. The table below is an example of IDNA homograph detection, comparing Unicode and Punycode status.

TABLE: Example for IDN Homograph detection.

Following the homograph attack this year [3, 6], many browsers have upgraded their IDN display policies. For example, in Firefox, if all characters within one IDN label belong to a single character set, the IDN is displayed in Unicode characters [2]; Chrome adopts a similar policy with more restrictions [9]. As such, many of the homographic domains we found (see Table) will be rendered in Punycode form, because each domain contains characters from at least two character sets. Alternatively, showing Punycode under all circumstances should mitigate the issue entirely, which is in fact the default option of some browsers. Nevertheless, this policy runs counter to the IETF requirements [6] and we do not recommend this solution. In the end, we want to understand how IDN policies are enforced by browsers and how far they are from solving the entire problem. As such, we carried out a survey of a set of browsers. Specifically, we manually tested ten widely-used browsers on three different platforms (PC, iOS and Android). We entered Unicode characters [10] of homographic SLDs and checked how they are displayed in regions such as the address bar, status bar and title bar. We also tested how IDNs under iTLDs are supported in the same experimental settings (e.g., testing xn--wss800gp5g.xn--fiqs8s).

We found that browsers treat IDNs differently. Our first observation is that all browsers except one could address certain homograph attacks. However, their security policies are not consistent. For example, soso.com (all characters are from Cyrillic, with Punycode being xn--n1aa1eb.com, mimicking soso.com which ranks 96 in Alexa) bypasses the policy of Firefox as all characters are in the same set. In the end, we found five browsers on PC and one on Android are vulnerable. Moreover, some mobile browsers (five on iOS and three on Android) choose to display webpage titles in the address bar when visiting IDNs. This setting is quite problematic, as adversaries can use a title which is identical to a brand domain's. Among all browsers, QQ browser is particularly interesting as it redirects the user to about:blank for some IDNs (and displays Punycode for others). The reason behind this design is unclear. Regarding iTLD IDNs, browser policies also differ. Firefox treats an iTLD IDN as a valid domain only if a protocol prefix (e.g., http://) is present. Though a browser should handle both Unicode and Punycode [12][13] TLDs according to the standard, we found that three browsers on iOS and two on Android only recognize Unicode iTLDs. We speculate the TLD lists used by these browsers only contain the Unicode [11] version of iTLDs. On the other hand, one Android browser only supports Punycode iTLDs. Surprisingly, Baidu browser on Android does not support iTLDs [12][13] at all, regardless of the format. We overcome all these problems by opening the URL [14] in the application instead of the browser, which protects the user from accessing unwanted links shared through social media.

VII. CONCLUSION

To make the Internet more accessible to people whose primary language is not English, the IETF initiated the IDN standard and many registrars have opened up registration for IDNs. Through quantitative analysis, our study shows the volume of IDNs has been growing steadily over the years, and now more than 1.4 million IDNs are registered. Despite the increase in volume, their value to Internet users is far below expectation. Through stratified sampling analysis, we found only 19.8% of IDNs deliver meaningful content, compared to 33.6% of ASCII domains. Moreover, visits to them are far less frequent than to non-IDNs under gTLDs like com. What makes IDNs more problematic is that new attack vectors have been enabled and abused for cyber-attacks like brand phishing. IDN is known to enable homograph attacks and we discovered 1,516 IDNs resembling known brands. At least 100 of them are confirmed malicious. Still, attackers have a large candidate pool of deceptive IDNs, given that 42,671 IDNs can be used for homograph attacks and most of them are unregistered. What remains less known is that IDNs can be designed to confuse users by padding keywords or translating English brand names (called semantic attack). We discovered 1,497 IDNs under the first case, and some brands (like 58.com) are targeted by over 100 IDNs. We believe the development of IDN needs rectification and efforts should be made by all entities in the Internet, including registries, registrars and Internet software. To keep users safe, our application checks for homograph attacks before any link is accessed and gives an alert notification stating whether the link is safe or not. If the link is safe, the user proceeds to it; otherwise the user is redirected to the application from which the link was clicked.

VIII. ACKNOWLEDGMENTS

We thank all anonymous reviewers for their helpful suggestions to improve the paper. We also thank our professors for accepting this project, guiding us and encouraging us to come up with great ideas.

IX. REFERENCES

[1] A. Solanki and S. Dogiwal, "Implementation of an anti-phishing technique for secure login using usb (iatslu)," in Computational Intelligence in Data Mining-Volume 1. Springer, 2015, pp. 221–231.

[2] H. Orman, "The compleat story of phish," IEEE Internet Computing, no. 1, pp. 87–91, 2013.

[3] G. Liu, B. Qiu, and L. Wenyin, "Automatic detection of phishing target from phishing webpage," in Pattern Recognition (ICPR), 2010 20th International Conference on. IEEE, 2010, pp. 4153–4156.

[4] R. S. Rao and S. T. Ali, "Phishshield: A desktop application to detect phishing webpages through heuristic approach," Procedia Computer Science, vol. 54, pp. 147–156, 2015.

[5] M. Cova, C. Kruegel, and G. Vigna, "There is no free phish: An analysis of 'free' and live phishing kits," WOOT, vol. 8, pp. 1–8, 2008.

[6] B. Braun, M. Johns, J. Koestler, and J. Posegga, "Phishsafe: leveraging modern javascript api's for transparent and robust protection," in Proceedings of the 4th ACM conference on Data and application security and privacy. ACM, 2014, pp. 61–72.

[7] P. Prakash, M. Kumar, R. R. Kompella, and M. Gupta, "Phishnet: Predictive blacklisting to detect phishing attacks," in 2010 Proceedings IEEE INFOCOM. San Diego, CA, USA: Citeseer, 14-19 Mar 2010, pp. 1–5.

[8] D. Sahoo, C. Liu, and S. C. H. Hoi, "Malicious URL detection using machine learning: A survey," arXiv:1701.07179 [cs.LG], 2017.

[9] P. Faltstrom, P. Hoffman, and A. Costello, "RFC 3490: Internationalizing domain names in applications (IDNA)," Network Working Group, IETF, 2003.

[10] Mozilla, "IDN display algorithm," https://wiki.mozilla.org/IDN_Display_Algorithm#Algorithm

[11] M. Felegyhazi, C. Kreibich, and V. Paxson, "On the potential of proactive domain blacklisting," LEET, 2010.

[12] P. Hannay and C. Bolan, "Assessment of internationalised domain name homograph attack mitigation," in Australian Information Security Management Conference, page 13, 2009.

[13] A. Costello, "RFC 3492: Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)," March 2003, http://www.ietf.org/rfc/rfc3492.txt.

[14] C. Weber, "The Lookout: Unicode security attacks and test cases: Visual Spoofing, IDN homograph attacks, and the Mixed Script Confusables," 2008, https://www.lookout.net/2008/12/unicode-attacks-and-test-cases-visual_11.html.
