{"id":117,"date":"2022-11-26T20:29:53","date_gmt":"2022-11-26T20:29:53","guid":{"rendered":"https:\/\/fintext.ai\/?page_id=117"},"modified":"2023-01-29T18:02:23","modified_gmt":"2023-01-29T18:02:23","slug":"examples","status":"publish","type":"page","link":"https:\/\/fintext.ai\/?page_id=117","title":{"rendered":"Examples"},"content":{"rendered":"\r\n<p style=\"text-align: justify;\">The figure below presents the 2D visualisation of the principal component analysis (PCA) of the word embedding vectors, which have 300 dimensions. Dimension 1 (x-axis) and Dimension 2 (y-axis) show the first and second principal components. The tokens are chosen from groups of technology companies (\u2018microsoft\u2019, \u2018ibm\u2019, \u2018google\u2019, and \u2018adobe\u2019), financial services and investment banks (\u2018barclays\u2019, \u2018citi\u2019, \u2018ubs\u2019, and \u2018hsbc\u2019), and retail businesses (\u2018tesco\u2019 and \u2018walmart\u2019). Word2Vec is shown in the top row, and FastText is shown in the bottom row. Google is a publicly available word embedding trained on a part of the Google News dataset, and WikiNews is another publicly available word embedding trained on Wikipedia 2017, the UMBC webbase corpus, and the statmt.org news dataset. The continuous bag of words (CBOW) and skip-gram are the proposed supervised learning models for learning distributed representations of tokens. A good word embedding is expected to place the tokens of each company group in their own cluster. 
This figure shows that only FinText clusters all sector groups correctly.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<figure class=\"wp-block-image size-large is-resized is-style-default\"><img fetchpriority=\"high\" decoding=\"async\" class=\"wp-image-123\" src=\"https:\/\/fintext.ai\/wp-content\/uploads\/2022\/11\/FinText_Representation_PCA-1024x688.jpg\" alt=\"\" width=\"1024\" height=\"688\" srcset=\"https:\/\/fintext.ai\/wp-content\/uploads\/2022\/11\/FinText_Representation_PCA-1024x688.jpg 1024w, https:\/\/fintext.ai\/wp-content\/uploads\/2022\/11\/FinText_Representation_PCA-300x202.jpg 300w, https:\/\/fintext.ai\/wp-content\/uploads\/2022\/11\/FinText_Representation_PCA-768x516.jpg 768w, https:\/\/fintext.ai\/wp-content\/uploads\/2022\/11\/FinText_Representation_PCA-1536x1032.jpg 1536w, https:\/\/fintext.ai\/wp-content\/uploads\/2022\/11\/FinText_Representation_PCA-2048x1376.jpg 2048w, https:\/\/fintext.ai\/wp-content\/uploads\/2022\/11\/FinText_Representation_PCA-1200x806.jpg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"text-align: justify;\">Word embeddings are expected to solve word analogies such as king:man :: queen:woman. The table below lists the answers these word embeddings produce for some financial analogies. 
It is clear that FinText is more sensitive to financial contexts and able to capture very subtle financial relationships.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<figure class=\"wp-block-table is-style-stripes has-small-font-size\">\r\n<table class=\"has-black-color has-white-background-color has-text-color has-background\">\r\n<thead>\r\n<tr>\r\n<th class=\"has-text-align-center\" data-align=\"center\">Analogy<\/th>\r\n<th class=\"has-text-align-center\" data-align=\"center\">Google<\/th>\r\n<th class=\"has-text-align-center\" data-align=\"center\">WikiNews<\/th>\r\n<th class=\"has-text-align-center\" data-align=\"center\">FinText<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>debit:credit::positive:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>positive<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>negative<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>negative<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>bullish:bearish::rise:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>rises<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>rises<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>fall<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>apple:iphone::microsoft:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>windows_xp<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>iphone<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>windows<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"has-text-align-center\" 
data-align=\"center\"><em><em>us:uk::djia:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>NONE<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>NONE<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>ftse_100<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>microsoft:msft::amazon:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>aapl<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>hmv<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>amzn<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>bid:ask::buy:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>tell<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>ask-<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>sell<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>creditor:lend::debtor:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>lends<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>lends<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>borrow<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>rent:short_term::lease:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>NONE<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>NONE<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>long_term<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td 
class=\"has-text-align-center\" data-align=\"center\"><em><em>growth_stock:overvalued::value_stock:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>NONE<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>NONE<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>undervalued<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>us:uk::nyse:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>nasdaq<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>hsbc<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>lse<\/em><\/em><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>call_option:put_option::buy:X<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>NONE<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>NONE<\/em><\/em><\/td>\r\n<td class=\"has-text-align-center\" data-align=\"center\"><em><em>sell<\/em><\/em><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/figure>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"text-align: justify;\">We also challenged all six word embeddings to return the three tokens closest to \u2018morningstar\u2019. For Google Word2Vec, this token is not among the training tokens. The answer from WikiNews is {\u2018daystar\u2019, \u2018blazingstar\u2019, and \u2018evenin\u2019}, which is wrong. The only logical answer is from FinText (Word2Vec\/skip-gram): {\u2018researcher_morningstar\u2019, \u2018tracker_morningstar\u2019, and \u2018lipper\u2019}. 
When asked to find the unmatched token in {\u2018usdgbp\u2019, \u2018euraud\u2019, \u2018usdcad\u2019}, a collection of exchange-rate mnemonics, Google Word2Vec and WikiNews could not find these tokens, while FinText (Word2Vec\/skip-gram) produced the sensible answer, \u2018euraud\u2019, the only pair that does not involve the US dollar.<\/p>\r\n<p><\/p>","protected":false},"excerpt":{"rendered":"<p>The figure below presents the 2D visualisation of the principal component analysis (PCA) of the word&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"class_list":["post-117","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/fintext.ai\/index.php?rest_route=\/wp\/v2\/pages\/117","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fintext.ai\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/fintext.ai\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/fintext.ai\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fintext.ai\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=117"}],"version-history":[{"count":20,"href":"https:\/\/fintext.ai\/index.php?rest_route=\/wp\/v2\/pages\/117\/revisions"}],"predecessor-version":[{"id":319,"href":"https:\/\/fintext.ai\/index.php?rest_route=\/wp\/v2\/pages\/117\/revisions\/319"}],"wp:attachment":[{"href":"https:\/\/fintext.ai\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=117"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}