Plough => Ruby

Journey through Ruby

An Unusual Thing About Kanji (Japanese)

Recently, I was reading an article on natural language processing of Japanese text and came across something unusual about the way Japanese is written: there are no delimiters between words. For example, if ‘Ruby’ and ‘Blog’ were two kanji characters, they would be written as ‘RubyBlog’, with no delimiter (a space, in English) between them. This makes segmenting Japanese text much harder, since the same run of characters can be split in different ways that mean entirely different things.
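To make the ambiguity concrete, here is a minimal Ruby sketch that lists every way to split an undelimited string into known words. It is not from the article I was reading. The `DICTIONARY` word list and the `segmentations` method are hypothetical names of my own, and lowercase English words stand in for kanji. Real Japanese segmenters use large lexicons and statistical models rather than this brute-force enumeration.

```ruby
# Toy word list; a hypothetical stand-in for a real Japanese lexicon.
DICTIONARY = %w[ruby blog no now where here nowhere].freeze

# Return every way to split `text` into words from `dictionary`.
# Each result is an array of words; an empty result means no valid split.
def segmentations(text, dictionary = DICTIONARY)
  # An empty string has exactly one segmentation: no words at all.
  return [[]] if text.empty?

  results = []
  (1..text.length).each do |len|
    prefix = text[0, len]
    next unless dictionary.include?(prefix)

    # Segment whatever follows the prefix, then prepend the prefix.
    segmentations(text[len..-1], dictionary).each do |rest|
      results << [prefix] + rest
    end
  end
  results
end

p segmentations("rubyblog") # only one reading
p segmentations("nowhere")  # three competing readings
```

With this toy dictionary, "rubyblog" splits only one way, but "nowhere" can be read as "no where", "now here", or "nowhere". A segmenter has to decide which reading was intended, and that is the hard part.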

I just found it very fascinating and challenging at the same time.