{"componentChunkName":"component---src-templates-article-tsx","path":"/article/2019-03-09-string-transformations","result":{"data":{"article":{"id":"dc830e28-aa59-5177-9c4a-6b2aef15e8ef","body":"function _extends() { _extends = Object.assign || function (target) { for (var i = 1; i < arguments.length; i++) { var source = arguments[i]; for (var key in source) { if (Object.prototype.hasOwnProperty.call(source, key)) { target[key] = source[key]; } } } return target; }; return _extends.apply(this, arguments); }\n\nfunction _objectWithoutProperties(source, excluded) { if (source == null) return {}; var target = _objectWithoutPropertiesLoose(source, excluded); var key, i; if (Object.getOwnPropertySymbols) { var sourceSymbolKeys = Object.getOwnPropertySymbols(source); for (i = 0; i < sourceSymbolKeys.length; i++) { key = sourceSymbolKeys[i]; if (excluded.indexOf(key) >= 0) continue; if (!Object.prototype.propertyIsEnumerable.call(source, key)) continue; target[key] = source[key]; } } return target; }\n\nfunction _objectWithoutPropertiesLoose(source, excluded) { if (source == null) return {}; var target = {}; var sourceKeys = Object.keys(source); var key, i; for (i = 0; i < sourceKeys.length; i++) { key = sourceKeys[i]; if (excluded.indexOf(key) >= 0) continue; target[key] = source[key]; } return target; }\n\n/* @jsxRuntime classic */\n\n/* @jsx mdx */\nvar _frontmatter = {\n  \"category\": \"Coding\",\n  \"title\": \"Normalizing text for search with string transformations {あいうえお <=> アイウエオ <=> AIUEO}\",\n  \"date\": \"2019-03-09T11:38:45.000Z\",\n  \"tags\": [\"Swift\", \"Foundation\"],\n  \"type\": \"Article\",\n  \"summary\": \"The String API has a lot of powerful functionality. In this post, we were able to take advantage of one of them to create a search functionality for a simple electronic Japanese dictionary.\"\n};\nvar layoutProps = {\n  _frontmatter: _frontmatter\n};\nvar MDXLayout = \"wrapper\";\nreturn function MDXContent(_ref) {\n  var components = _ref.components,\n      props = _objectWithoutProperties(_ref, [\"components\"]);\n\n  return mdx(MDXLayout, _extends({}, layoutProps, props, {\n    components: components,\n    mdxType: \"MDXLayout\"\n  }), mdx(\"p\", null, \"Recently I was working on a personal dictionary for Japanese. The Japanese\\nlanguage comprises three different syllabaries that are used in different contexts. I needed\\na robust search that could find a words represented in any of these syllabaries. In this post, we will see how the\\n\", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"String\"), \" API provides us with enough functionality to do this.\"), mdx(\"h2\", {\n    \"id\": \"japanese-syllabaries\",\n    \"style\": {\n      \"position\": \"relative\"\n    }\n  }, \"Japanese Syllabaries\", mdx(\"a\", {\n    parentName: \"h2\",\n    \"href\": \"#japanese-syllabaries\",\n    \"aria-label\": \"japanese syllabaries permalink\",\n    \"className\": \"mdx-header-link after\"\n  }, mdx(\"svg\", {\n    parentName: \"a\",\n    \"aria-hidden\": \"true\",\n    \"focusable\": \"false\",\n    \"height\": \"16\",\n    \"version\": \"1.1\",\n    \"viewBox\": \"0 0 16 16\",\n    \"width\": \"16\"\n  }, mdx(\"path\", {\n    parentName: \"svg\",\n    \"fillRule\": \"evenodd\",\n    \"d\": \"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"\n  })))), mdx(\"p\", null, \"This is not a blog about Japanese, but in order to understand the problem, it would be helpful to know a little bit about how\\nthe language is structured. As previously mentioned, the language has three different syllabaries\"), mdx(\"ul\", null, mdx(\"li\", {\n    parentName: \"ul\"\n  }, mdx(\"a\", {\n    parentName: \"li\",\n    \"href\": \"https://en.m.wikipedia.org/wiki/Hiragana\"\n  }, \"Hiragana\"), \": Ordinary/simple phonetic lettering system used for Japanese words not covered by Kanji e.g \\u3044\\u307E{Now} \"), mdx(\"li\", {\n    parentName: \"ul\"\n  }, mdx(\"a\", {\n    parentName: \"li\",\n    \"href\": \"https://en.m.wikipedia.org/wiki/Katakana\"\n  }, \"Katakana\"), \": Derived lettering system used for transcription of loan words in Japanese: e.g \\u30A2\\u30A4\\u30B9\\u30AF\\u30EA\\u30FC\\u30E0{Ice cream} \"), mdx(\"li\", {\n    parentName: \"ul\"\n  }, mdx(\"a\", {\n    parentName: \"li\",\n    \"href\": \"https://en.m.wikipedia.org/wiki/Kanji\"\n  }, \"Kanji\"), \": Chinese characters used Japanese. e.g \\u4ECA{Now}\")), mdx(\"p\", null, \"Notice how the example for the Hiragana and the Kanji have the same meanings. This is because the word \\\"Now\\\" is usually written in Kanji (\\u4ECA), but phonetically represented in Hiragana as (\\u3044\\u307E), which can be read as \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"i-ma\"), \". The english representation of the way a Japanese word is pronounced is known \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"romaji\"), \".\"), mdx(\"p\", null, \"The goal was hence to ensure that any of the following search queries: \\\"ima\\\", \\\"\\u3044\\u307E\\\", \\\"\\u4ECA\\\" or \\\"now\\\"; bring up the Japanese dictionary term for \\\"now\\\".\"), mdx(\"h2\", {\n    \"id\": \"normalization\",\n    \"style\": {\n      \"position\": \"relative\"\n    }\n  }, \"Normalization\", mdx(\"a\", {\n    parentName: \"h2\",\n    \"href\": \"#normalization\",\n    \"aria-label\": \"normalization permalink\",\n    \"className\": \"mdx-header-link after\"\n  }, mdx(\"svg\", {\n    parentName: \"a\",\n    \"aria-hidden\": \"true\",\n    \"focusable\": \"false\",\n    \"height\": \"16\",\n    \"version\": \"1.1\",\n    \"viewBox\": \"0 0 16 16\",\n    \"width\": \"16\"\n  }, mdx(\"path\", {\n    parentName: \"svg\",\n    \"fillRule\": \"evenodd\",\n    \"d\": \"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"\n  })))), mdx(\"p\", null, \"A common pre-processing operation to perform on text that is intended to be searched on is to normalize the text. This means modifying the text into a uniform representation that other operations (e.g search) can be performed on. This entails removing diacritics from languages like German or in our case, converting all Japanese syllabaries into one single system that search can be performed on. I had to create a reverse-index of all the ways a Japanese word can be represented, and then normalize this index into \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"romaji\"), \". This is where the power of \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"String\"), \" comes in.\"), mdx(\"p\", null, \"The String API provides us with\"), mdx(\"ul\", null, mdx(\"li\", {\n    parentName: \"ul\"\n  }, mdx(\"code\", {\n    parentName: \"li\",\n    \"className\": \"language-text\"\n  }, \"applyingTransform(StringTransform, reverse: Bool)\"))), mdx(\"p\", null, \"Some examples of \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"StringTransform\"), \" are \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"toLatin\"), \", \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"latinToArabic\"), \" \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"latinToHiragana\"), \", e.t.c (See the \", mdx(\"a\", {\n    parentName: \"p\",\n    \"href\": \"https://developer.apple.com/documentation/foundation/stringtransform\"\n  }, \"official documentation\"), \" for more). Given a \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"StringTransform\"), \", the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"applyingTransform\"), \" function transforms the string from it's current representation into the resulting representation given by the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"StringTransform\"), \" parameter. This made my work a lot easier as I could just create an extension on \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"String\"), \" for my purpose:\"), mdx(\"div\", {\n    \"className\": \"gatsby-highlight\",\n    \"data-language\": \"swift\"\n  }, mdx(\"pre\", {\n    parentName: \"div\",\n    \"style\": {\n      \"counterReset\": \"linenumber NaN\"\n    },\n    \"className\": \"language-swift line-numbers\"\n  }, mdx(\"code\", {\n    parentName: \"pre\",\n    \"className\": \"language-swift\"\n  }, mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"extension\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"String\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"{\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token comment\"\n  }, \"// converts string from hiragana or katakana to\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token comment\"\n  }, \"// their latin representation (romaji)\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"var\"), \" toRomaji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"String\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"?\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"{\"), \"\\n        \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"self\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"applyingTransform\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), \"hiraganaToKatakana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"reverse\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token boolean\"\n  }, \"false\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"?\"), \"\\n            \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"applyingTransform\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), \"latinToKatakana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"reverse\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token boolean\"\n  }, \"true\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"}\"), \"\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"}\")), mdx(\"span\", {\n    parentName: \"pre\",\n    \"aria-hidden\": \"true\",\n    \"className\": \"line-numbers-rows\",\n    \"style\": {\n      \"whiteSpace\": \"normal\",\n      \"width\": \"auto\",\n      \"left\": \"0\"\n    }\n  }, mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  })))), mdx(\"p\", null, \"The function above converts a Japanese string represented in either (Katakana or Hiragana) into its latin phonetic equivalent (romaji). Words represented in Kanji are left untransformed. This is actually okay for our purpose because Kanji to Hiragana conversions can sometimes be ambiguous. The function first converts all hiragana to katakana, and then converts from katakana to romaji/latin. By doing it this way, we prevent \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"String\"), \" from making assumptions about the language string, which most times causes Japanese Kanji characters to be treated as Chinese.\"), mdx(\"h2\", {\n    \"id\": \"searching-japanese-words\",\n    \"style\": {\n      \"position\": \"relative\"\n    }\n  }, \"Searching Japanese words\", mdx(\"a\", {\n    parentName: \"h2\",\n    \"href\": \"#searching-japanese-words\",\n    \"aria-label\": \"searching japanese words permalink\",\n    \"className\": \"mdx-header-link after\"\n  }, mdx(\"svg\", {\n    parentName: \"a\",\n    \"aria-hidden\": \"true\",\n    \"focusable\": \"false\",\n    \"height\": \"16\",\n    \"version\": \"1.1\",\n    \"viewBox\": \"0 0 16 16\",\n    \"width\": \"16\"\n  }, mdx(\"path\", {\n    parentName: \"svg\",\n    \"fillRule\": \"evenodd\",\n    \"d\": \"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"\n  })))), mdx(\"p\", null, \"Let's now use this to solve the search problem we talked about above. Japanese dictionaries like \", mdx(\"a\", {\n    parentName: \"p\",\n    \"href\": \"https://www.edrdg.org/jmdict/j_jmdict.html\"\n  }, \"JMDict\"), \", usually give unambiguous Kanji to Hiragana representations of words. So we create a Japanese word struct:\"), mdx(\"div\", {\n    \"className\": \"gatsby-highlight\",\n    \"data-language\": \"swift\"\n  }, mdx(\"pre\", {\n    parentName: \"div\",\n    \"style\": {\n      \"counterReset\": \"linenumber NaN\"\n    },\n    \"className\": \"language-swift line-numbers\"\n  }, mdx(\"code\", {\n    parentName: \"pre\",\n    \"className\": \"language-swift\"\n  }, mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"struct\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"JapaneseWord\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"{\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"let\"), \" kanji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"String\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"?\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token comment\"\n  }, \"// not all words have a kanji representation\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"let\"), \" kana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"String\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token comment\"\n  }, \"// this is either katakana or hiragana\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"let\"), \" meaning\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"String\"), \"\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"}\")), mdx(\"span\", {\n    parentName: \"pre\",\n    \"aria-hidden\": \"true\",\n    \"className\": \"line-numbers-rows\",\n    \"style\": {\n      \"whiteSpace\": \"normal\",\n      \"width\": \"auto\",\n      \"left\": \"0\"\n    }\n  }, mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  })))), mdx(\"p\", null, \"The above struct is a slight oversimplification, as a particular Japanese word could have multiple \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"kana\"), \" representations, and `meanings, but we will stick to them having only one value for the scope of this blog.\"), mdx(\"p\", null, \"We can then create a reverse index that maps from possible search indexes of a given word, back to the word. This will usually be persisted in a database, but we will use an in-memory map.\"), mdx(\"div\", {\n    \"className\": \"gatsby-highlight\",\n    \"data-language\": \"swift\"\n  }, mdx(\"pre\", {\n    parentName: \"div\",\n    \"style\": {\n      \"counterReset\": \"linenumber NaN\"\n    },\n    \"className\": \"language-swift line-numbers\"\n  }, mdx(\"code\", {\n    parentName: \"pre\",\n    \"className\": \"language-swift\"\n  }, mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"let\"), \" words \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"=\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"JapaneseWord\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"kanji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u4ECA\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" kana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u3044\\u307E\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" meaning\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"now\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"JapaneseWord\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"kanji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u4ECA\\u65E5\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" kana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u304D\\u3087\\u3046\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" meaning\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"today\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"JapaneseWord\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"kanji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u6765\\u308B\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" kana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u304F\\u308B\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" meaning\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"to come\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"JapaneseWord\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"kanji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u884C\\u304F\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" kana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u3044\\u304F\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" meaning\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"to go\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"JapaneseWord\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"kanji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u5E30\\u308B\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" kana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u304B\\u3048\\u308B\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" meaning\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"to return\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \"\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), \"\\n\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"var\"), \" indexMap \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"=\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"String\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"Int\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \"\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"for\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"idx\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" word\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"in\"), \" words\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"enumerated\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"{\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"if\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"let\"), \" kanji \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"=\"), \" word\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), \"kanji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"let\"), \" romanizedKanji \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"=\"), \" kanji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), \"toRomaji \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"{\"), \"\\n        indexMap\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), \"romanizedKanji\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"default\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"append\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"idx\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"}\"), \"\\n    \\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"if\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"let\"), \" romanizedKana \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"=\"), \" word\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), \"kana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), \"toRomaji \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"{\"), \"\\n        indexMap\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), \"romanizedKana\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"default\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"append\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"idx\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"}\"), \"\\n    \\n    indexMap\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), \"word\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), \"meaning\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"lowerCased\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \",\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"default\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"append\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"idx\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \"\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"}\")), mdx(\"span\", {\n    parentName: \"pre\",\n    \"aria-hidden\": \"true\",\n    \"className\": \"line-numbers-rows\",\n    \"style\": {\n      \"whiteSpace\": \"normal\",\n      \"width\": \"auto\",\n      \"left\": \"0\"\n    }\n  }, mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  })))), mdx(\"p\", null, \"In our code above, we are assuming a dictionary containing only five words. We create the reverse-index map by normalizing the kanji property using that as a key in the dictionary that maps back to the index of corresponding \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"JapaneseWord\"), \" for the kanji. We do the same for the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"kana\"), \", and finally, we use the lowercased meaning to create another reference to the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"JapaneseWord\"), \". Referencing the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"index(idx)\"), \" of the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"JapaneseWord\"), \" in the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"indexMap\"), \" instead of the struct itself ensures that the map does not bloated by the size of the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"JapaneseWord\"), \" struct therefore saving us some memory.\"), mdx(\"p\", null, \"Let's now imagine we had a search function that was to search on the dictionary. We define the search logic below:\"), mdx(\"div\", {\n    \"className\": \"gatsby-highlight\",\n    \"data-language\": \"swift\"\n  }, mdx(\"pre\", {\n    parentName: \"div\",\n    \"style\": {\n      \"counterReset\": \"linenumber NaN\"\n    },\n    \"className\": \"language-swift line-numbers\"\n  }, mdx(\"code\", {\n    parentName: \"pre\",\n    \"className\": \"language-swift\"\n  }, mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"func\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"search\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"query\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"String\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"-\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \">\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"JapaneseWord\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"{\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"guard\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"let\"), \" normalizedQuery \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"=\"), \" query\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"lowercased\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), \"toRomaji \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"else\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"{\"), \"\\n        \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"return\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"}\"), \"\\n    \\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"let\"), \" wordIndices \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"=\"), \" indexMap\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), \"normalizedQuery\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"?\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token operator\"\n  }, \"?\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), \"\\n    \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token keyword\"\n  }, \"return\"), \" wordIndices\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \".\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token builtin\"\n  }, \"map\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"{\"), \" words\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"[\"), \"$\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token number\"\n  }, \"0\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"]\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"}\"), \"\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"}\"), \"\\n\\n    \\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"print\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"search\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"query\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"now\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token comment\"\n  }, \"// => [JapaneseWord(kanji: \\\"\\u4ECA\\\", kana: \\\"\\u3044\\u307E\\\", meaning: \\\"now\\\")]\"), \"\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"print\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"search\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"query\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u5E30\\u308B\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token comment\"\n  }, \"// => [JapaneseWord(kanji: \\\"\\u5E30\\u308B\\\", kana: \\\"\\u304B\\u3048\\u308B\\\", meaning: \\\"to return\\\")]\"), \"\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"print\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"search\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"query\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"\\u304D\\u3087\\u3046\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token comment\"\n  }, \"// => [JapaneseWord(kanji: \\\"\\u4ECA\\u65E5\\\", kana: \\\"\\u304D\\u3087\\u3046\\\", meaning: \\\"today\\\")]\"), \"\\n\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"print\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token function\"\n  }, \"search\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \"(\"), \"query\", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \":\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token string\"\n  }, \"\\\"kyou\\\"\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token punctuation\"\n  }, \")\"), \" \", mdx(\"span\", {\n    parentName: \"code\",\n    \"className\": \"token comment\"\n  }, \"// => [JapaneseWord(kanji: \\\"\\u4ECA\\u65E5\\\", kana: \\\"\\u304D\\u3087\\u3046\\\", meaning: \\\"today\\\")]\")), mdx(\"span\", {\n    parentName: \"pre\",\n    \"aria-hidden\": \"true\",\n    \"className\": \"line-numbers-rows\",\n    \"style\": {\n      \"whiteSpace\": \"normal\",\n      \"width\": \"auto\",\n      \"left\": \"0\"\n    }\n  }, mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  }), mdx(\"span\", {\n    parentName: \"span\"\n  })))), mdx(\"p\", null, \"First, we normalize the search query so that it is in the same form as the keys in our index. Next we search for the normalized query in our reverse-index which gives us the indices (in the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"words\"), \" array) of the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"JapaneseWord\"), \"s that match our search query. Finally, we return the \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"JapaneseWord\"), \"s corresponding to the indices. \\uD83C\\uDF89\\uD83D\\uDE80\"), mdx(\"h2\", {\n    \"id\": \"conclusion\",\n    \"style\": {\n      \"position\": \"relative\"\n    }\n  }, \"Conclusion\", mdx(\"a\", {\n    parentName: \"h2\",\n    \"href\": \"#conclusion\",\n    \"aria-label\": \"conclusion permalink\",\n    \"className\": \"mdx-header-link after\"\n  }, mdx(\"svg\", {\n    parentName: \"a\",\n    \"aria-hidden\": \"true\",\n    \"focusable\": \"false\",\n    \"height\": \"16\",\n    \"version\": \"1.1\",\n    \"viewBox\": \"0 0 16 16\",\n    \"width\": \"16\"\n  }, mdx(\"path\", {\n    parentName: \"svg\",\n    \"fillRule\": \"evenodd\",\n    \"d\": \"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"\n  })))), mdx(\"p\", null, \"The \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"String\"), \" API has a lot of powerful functionality. In this post, we were able to take advantage of one of them to create a search functionality for a simple electronic Japanese dictionary. What other \", mdx(\"code\", {\n    parentName: \"p\",\n    \"className\": \"language-text\"\n  }, \"String\"), \" APIs are you using and how? I'll be very interested to know more.\"), mdx(\"p\", null, \"Thanks for taking the time \\uD83D\\uDE4F\\uD83C\\uDFFF\"), mdx(\"p\", null, \"Find me on \", mdx(\"a\", {\n    parentName: \"p\",\n    \"href\": \"https://twitter.com/dabbyndubisi\"\n  }, \"twitter\"), \" or \", mdx(\"a\", {\n    parentName: \"p\",\n    \"href\": \"mailto:dabby@dabby.dev\"\n  }, \"contact me\"), \" if you have any questions or suggestions.\"));\n}\n;\nMDXContent.isMDXComponent = true;","excerpt":"Recently I was working on a personal dictionary for Japanese. The Japanese\nlanguage comprises three different syllabaries that are used in …","frontmatter":{"category":"Coding","date":"2019-03-09T11:38:45.000Z","tags":["Swift","Foundation"],"title":"Normalizing text for search with string transformations {あいうえお <=> アイウエオ <=> AIUEO}","summary":"The String API has a lot of powerful functionality. In this post, we were able to take advantage of one of them to create a search functionality for a simple electronic Japanese dictionary."},"tableOfContents":{"items":[{"url":"#japanese-syllabaries","title":"Japanese Syllabaries"},{"url":"#normalization","title":"Normalization"},{"url":"#searching-japanese-words","title":"Searching Japanese words"},{"url":"#conclusion","title":"Conclusion"}]}},"headerImage":null},"pageContext":{"slug":"/article/2019-03-09-string-transformations","imageName":null}},"staticQueryHashes":["2652798324","63159454"]}