Sessions
GOTO Chicago 2017

Monday May 1
16:45 – 17:35
St. Gallen 3

Localization with NLP: Global Empire-Building for Fun & Profit

To build a user base across the globe, a product needs to support a variety of locales. The main challenge is generating and maintaining localized strings, which are deeply integrated into many facets of a product. At Qordoba, we address this challenge by using highly scalable technologies and natural language processing (NLP) to automate the process. Specifically, we need to generate high-quality translations in many different languages and make them available in real time across platforms such as mobile, print, and web. A combination of open source tools provides the structure for a scalable localization platform with machine learning at its core.

In this talk, we describe the techniques we’re using to provide:

  • Continuous deployment of localized strings
  • Live syncing across platforms (mobile, web, Photoshop, Sketch, help desk, etc.)
  • Content generation for any locale
  • Emotional response

We will also share our architecture for handling billions of localized strings in many different languages, including our use of:

  • Scala and Akka as an orchestration layer
  • Apache Cassandra and MariaDB as a storage layer
  • Apache Spark, Apache PredictionIO (incubating), Apache HBase, and Elasticsearch for natural language processing
  • Apache Kafka as a message bus for reporting, billing, & notifications
  • Docker, Marathon, & Apache Mesos for containerized deployment

We present our natural language processing (NLP) techniques in the context of a platform that makes it feasible to build products that feel native to every user, regardless of language.