This month marks one year since OpenAI launched ChatGPT. People are hugely excited about ChatGPT and other artificial intelligence tools. OpenAI's technology is impressive in its ability to understand and create content. It can produce blog posts, write school essays, and even tackle advanced exams.
We saw users testing OpenAI on WordPress posts in Portuguese, Spanish, Turkish and many other languages. So we wanted to know: how good is OpenAI at understanding different languages?
Using the TaxoPress AI feature, we tested OpenAI on text from Wikipedia articles. While Wikipedia articles vary in quality, they gave us a consistent way to compare how well OpenAI understands different languages. To give the analysis a point of comparison, we also ran the same text through IBM Watson.
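For readers curious how a test like this works under the hood, here is a minimal sketch of asking a language model for tag suggestions. This is not TaxoPress's actual implementation: the prompt wording, function names, and parsing logic are all illustrative assumptions.

```python
# Hypothetical sketch of tag extraction via a language model API.
# The prompt text and helper names below are assumptions for
# illustration, not TaxoPress's real internals.

def build_tag_prompt(article_text: str, max_tags: int = 10) -> str:
    """Ask the model for comma-separated tags in the article's own language."""
    return (
        f"Suggest up to {max_tags} short tags for the following text. "
        "Reply with the tags only, separated by commas, "
        "in the same language as the text.\n\n" + article_text
    )

def parse_tags(model_reply: str) -> list[str]:
    """Split a comma-separated model reply into clean, non-empty tags."""
    return [tag.strip() for tag in model_reply.split(",") if tag.strip()]

# A real integration would send build_tag_prompt(...) to a completion
# endpoint (for example, OpenAI's API) and run parse_tags() on the reply:
#
#   from openai import OpenAI
#   client = OpenAI()  # requires an API key
#   reply = client.chat.completions.create(
#       model="gpt-3.5-turbo",
#       messages=[{"role": "user", "content": build_tag_prompt(text)}],
#   ).choices[0].message.content
#   tags = parse_tags(reply)

print(parse_tags("língua portuguesa, gramática, fonologia, hab"))
```

Note how truncated fragments like "hab" (discussed below) survive this kind of naive parsing, since the parser has no way to know a tag is an incomplete word.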
Anderson from our team reviewed OpenAI’s analysis of Wikipedia’s article on the Portuguese language. Anderson called the suggested terms very good and gave OpenAI an 8/10. In contrast, IBM Watson tended to produce too many generic words that could apply to many other articles, such as altura (height), cidades (cities), resultado (results), contatos (contacts), and mundo (world).
Valentin from our team reviewed OpenAI’s attempt to analyze the Wikipedia article on the Spanish language. He thought that the results were good, except for the last tag “hab”. That seemed to be an incomplete word based on “hablan” or “hablado”. We also reviewed IBM Watson’s analysis of the same article and the results were at least as good.
Alex from our team reviewed OpenAI’s analysis of Wikipedia’s article on Russian. He said OpenAI produced an excellent set of terms. As with Spanish, the only strange tag was the final one, which had a random extra “r” on the end. With IBM Watson, the terms were awful. There were dozens of terms, but only two or three made sense.
Cristian from WPChill reviewed OpenAI’s attempt to analyze several posts, including a Wikipedia article on the Romanian language. Cristian wasn’t that impressed and didn’t think the tags captured the essence of the articles. He preferred the results from the IBM Watson integration. OpenAI seemed to be guessing, whereas IBM appeared to “understand” the content. Romanian is one of IBM’s officially supported languages. In Romanian, as in Spanish, OpenAI created a fake term in our testing. For one article, OpenAI produced a made-up term, “financ”, truncated from the word “financiar”, which means “financial”.
Sinan Isler was kind enough to review the AI analysis of Wikipedia’s article on the Turkish language. His feedback was, “OpenAI has good results. There is only one unrelated tag in the tag list. The rest looks good. IBM not so much. It has lots of mistakes. Some of the tags are not even related or usable. There are like 50% bad tags. Clear win for OpenAI.”
Tassos from FirePlugins reviewed OpenAI’s analysis of Wikipedia’s article on Greek. OpenAI seemed to really struggle with Greek, producing few results. Tassos commented, “Only the 1st tag is valid. The 2nd does not make sense while the 3rd one is misspelled.”
Ola from our team checked OpenAI’s analysis of Wikipedia’s article on Yoruba. He said the results were perfect, and was delighted that the suggested terms used the correct form of the letters. One thing to note is that OpenAI produced substantially fewer suggested terms for Yoruba than for other languages, even though the main article was a similar length. Yoruba also isn’t supported by other tagging services, so those are probably the best results available for Yoruba at the moment.
Riza from our team reviewed OpenAI’s analysis of Wikipedia’s article on Bahasa Indonesia and gave the results 8/10. There were no irrelevant words, but there was still room for improvement. As with Yoruba, the Bahasa language isn’t supported by other tagging services.
Rochelle from our team reviewed OpenAI’s analysis of Wikipedia’s article on Tagalog. She said the suggested terms were pretty good and gave them a score of 7/10. Tagalog is another language that isn’t supported by any other service.
We’ve seen OpenAI succeed with some languages and fail with others. What amazed me most about OpenAI was that it produced at least some results for every single language we tested. I dug into Wikipedia to find some less common languages.
This first example uses the Farefare language, and OpenAI is able to produce results using that language’s special characters.
The screenshot below shows an article about the Wayuu language in Venezuela. OpenAI did struggle a little here, turning phrases such as “kot’tusu sulu’u supüshua’a Mmakay” into a term called “Kot$1tusu sulu$1u supüshua$1a M”.
I did finally manage to break OpenAI using languages such as Dzongkha. With languages like that, OpenAI couldn’t do more than suggest an occasional string of characters.
Summary of the OpenAI results
OpenAI is a wonderful tool, but its understanding of language is inconsistent.
TaxoPress currently has integrations with four different AI services: OpenAI and IBM Watson, along with Dandelion and LSEG/Refinitiv. Each service has different pros and cons. We recommend testing them to see which is the best choice for your site.
These four services support different languages and have different pricing structures. Click here to see a comparison table. That table has details for over 20 popular languages.
If your language is in the table, IBM Watson may produce the best results. IBM has formally tested and approved support for those popular languages.
If your language is not in that table, we recommend testing OpenAI as that is the most likely service to support other languages. OpenAI’s strongest advantage is with languages that are not amongst the most popular 15 or 20 languages worldwide.
In case you’re wondering, let me finish by showing you OpenAI’s analysis of this article you’ve finished reading: