Machine learning models inherit cultural biases from the data used to create them. This post investigates cultural biases inherent in the popular Google News Word2Vec model, the implications of deploying models like Google News Word2Vec, and the current literature and developments on mitigating cultural bias in natural language processing models. Because the data in the Google News dataset is inherently biased, the models created from that data are similarly biased. I investigate racial and gender biases present in Google News Word2Vec from several vantage points. First, I show that biases exist in the model between races and ethnicities in relation to various topics. Next, I investigate biases between race-related stereotypical names and the same set of topics. Lastly, I show how these biases affect downstream classification and sentiment analysis models.


Machine Learning has become increasingly prevalent in our interactions with the web. A sub-field of Machine Learning, Natural Language Processing, aims to organize and analyze text through the development of machine learning models. These models are used in production applications to categorize and classify user input data, as well as to ingest externally sourced data, such as users' social media posts. Common tasks on these data sources include text categorization, sentiment analysis, and machine comprehension.


Word2vec was created by a team of researchers led by Tomas Mikolov at Google. The Google News Word2Vec model was released on July 29, 2013, and open sourced to the general public under an Apache License 2.0. At the time, this word2vec model was considered state-of-the-art and was used widely in both academic research and production applications[1]. Google News Word2Vec provides an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words, and was trained on part of the Google News dataset (about 100 billion words)[2].
Word2Vec models are developed using either the Skip-Gram or Continuous Bag-of-Words (CBOW) architecture, which arranges the words in the documents into a high-dimensional vector space, where the cosine similarity, or "distance", between one word and another represents how closely the two words are related. The skip-gram architecture uses the current word to predict the surrounding words in a context window, while the CBOW architecture predicts the current word from the window of surrounding context words[3][4].
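As a rough illustration of the two architectures, the sketch below trains tiny skip-gram and CBOW models on a toy corpus using the gensim library; the gensim 4.x API, the toy corpus, and the parameters are assumptions for illustration only, not part of the original Google model.

```python
# Minimal sketch: training tiny Word2Vec models with gensim (4.x API assumed).
from gensim.models import Word2Vec

# A toy, pre-tokenized corpus purely for illustration.
corpus = [
    ["the", "news", "article", "describes", "the", "event"],
    ["the", "reporter", "writes", "the", "news", "article"],
    ["readers", "share", "the", "article", "online"],
]

# sg=1 selects the skip-gram architecture; sg=0 selects CBOW.
skipgram = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)
cbow = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=0)

# Cosine similarity between two words in the learned vector space.
print(skipgram.wv.similarity("news", "article"))
print(cbow.wv.similarity("news", "article"))
```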


Gender and racial biases in machine learning models have been documented in numerous publications, including "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings" by Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. Their research suggests that word embeddings run the risk of amplifying biases, and the paper proposes techniques that can be used to "de-bias" word embeddings. The authors consider gendered words ("he", "she", "man", "woman") in relation to occupation, and suggest "hard" or "soft" de-biasing techniques that first identify a gender subspace and then either 1) completely remove gender-neutral words' components in that subspace, or 2) softly reduce them, respectively. While this technique has shown promising potential for gendered words, the prospect of de-biasing all words based on race, gender, and ethnicity is less clear-cut. In a similar vein, models have been produced using similar de-biasing techniques to attempt to mitigate cultural biases inherent in the models. Of note, ConceptNet models have been shown to reduce cultural biases. The latest ConceptNet release, ConceptNet Numberbatch 17.04, reduces human-like stereotypes and prejudices, leading to word vectors that are less prejudiced than competitors such as word2vec and GloVe. Further, Google itself has developed models and methods to reduce cultural biases in machine learning models. "Equality of Opportunity in Supervised Learning" by Moritz Hardt, Eric Price, and Nathan Srebro proposes a criterion for discrimination against a specified sensitive attribute in supervised learning, along with ways of measuring and ensuring fairness in machine learning. Further, "Mitigating Unwanted Bias with Adversarial Learning" proposes using adversarial networks to reduce bias by constructing a predictor and a discriminator: a network produces a prediction Y from input X, while an adversary tries to model a protected variable Z from that prediction; the predictor's objective is to predict Y well while minimizing the adversary's ability to predict Z.
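The adversarial setup can be sketched in a few lines. The code below is only an illustration of the general idea on synthetic data, written in PyTorch with made-up network sizes and hyperparameters; it is not the exact architecture or training procedure from the paper.

```python
# Minimal sketch of adversarial debiasing: a predictor learns to predict y from x
# while an adversary tries to recover a protected attribute z from the predictor's
# output; the predictor is penalized whenever the adversary succeeds.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: 256 examples, 10 features, binary label y, binary protected attribute z.
x = torch.randn(256, 10)
y = torch.randint(0, 2, (256, 1)).float()
z = torch.randint(0, 2, (256, 1)).float()

predictor = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
adversary = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))

bce = nn.BCEWithLogitsLoss()
opt_pred = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
lam = 1.0  # weight on the adversarial penalty

for step in range(200):
    # 1) Update the adversary to predict z from the (detached) prediction.
    y_hat = predictor(x).detach()
    adv_loss = bce(adversary(y_hat), z)
    opt_adv.zero_grad()
    adv_loss.backward()
    opt_adv.step()

    # 2) Update the predictor to predict y while making the adversary's job harder.
    y_hat = predictor(x)
    pred_loss = bce(y_hat, y) - lam * bce(adversary(y_hat), z)
    opt_pred.zero_grad()
    pred_loss.backward()
    opt_pred.step()
```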

ConceptNet Bias Graph Comparisons


I measure cultural biases by finding the cosine similarity between select race and ethnicity terms (Black, White, Asian, Latino) and the word "criminal"; a sketch of this measurement follows the table below.

Table 1. Cosine similarity between race/ethnicity terms and the word "criminal"

Race      Cosine similarity
Black      0.083808
Latino     0.059527
Asian     -0.051157
White      0.041078
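Similarities of this kind can be computed in a few lines with gensim, assuming the pre-trained GoogleNews-vectors-negative300.bin file is available locally. The exact numbers depend on which tokens (and which casing) are queried, so this is a sketch of the measurement rather than a reproduction of the table above.

```python
# Sketch: cosine similarity between race/ethnicity terms and "criminal"
# using the pre-trained Google News vectors (file path assumed).
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

for term in ["black", "latino", "asian", "white"]:
    # similarity() returns the cosine similarity between the two word vectors.
    print(term, vectors.similarity(term, "criminal"))
```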


As shown above, the difference in similarity between race terms and the word "criminal" is apparent. The similarity between "black" and "criminal" is roughly double the similarity between "white" and "criminal".


Consider using a similar subspace technique for "de-biasing" this model. We could potentially create a subspace for "Latino" and "Asian" in which those terms are de-biased from surrounding words. However, if one were to create a subspace for "Black" or "White", many common analogies tied to those terms' polysemy also break down: "day is to blue as night is to black". This tension runs to the core of the English language. Because language itself is biased, Word2Vec models and other natural language processing models will also be inherently biased.
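To make the subspace idea concrete, here is a minimal sketch of "hard" de-biasing along a single direction, reusing the vectors loaded in the earlier sketch. A full implementation in the spirit of Bolukbasi et al. would build the subspace from many definitional pairs and also equalize pairs of anchor words; this only illustrates the projection step.

```python
# Minimal sketch: remove a word vector's component along one bias direction.
import numpy as np

def neutralize(word_vec, bias_direction):
    """Remove the vector's component along a (unit-normalized) bias direction."""
    b = bias_direction / np.linalg.norm(bias_direction)
    return word_vec - np.dot(word_vec, b) * b

# Illustrative bias direction built from a single pair of anchor words.
bias_direction = vectors["black"] - vectors["white"]
debiased = neutralize(vectors["criminal"], bias_direction)

# The two dot products below are equal (up to floating point), because the
# component of "criminal" along the black-white direction has been removed.
print(np.dot(debiased, vectors["black"]), np.dot(debiased, vectors["white"]))
```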

Similarly, consider the distance between stereotypical names by race and the word "criminal". The relation of "stereotypically Black names" and "stereotypically white names" to the word "criminal" shows similar biases: stereotypically Black names such as Darnell, Trevon, and DeShawn have a higher cosine similarity to "criminal" than stereotypically white names. These biases "leak" into downstream models. For instance, suppose a downstream model is developed from the Google News Word2Vec model to sort resumes into "good candidates" and "bad candidates". This model either explicitly or implicitly accounts for the word "criminal": training examples of resumes for "good candidates" do not contain the word "criminal", while one or more training examples for "bad candidates" do. Because the word "criminal" appears only in the "bad candidate" examples, and because the names above (Table 2) have a comparatively high cosine similarity to "criminal", resumes containing those names have a higher chance of being classified as "bad candidates". Further, if the newly categorized resumes are fed back into the model, as is often the case in production machine learning environments, these biases will be continually reinforced.
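As a purely hypothetical sketch of this kind of leakage (reusing the vectors loaded earlier, scikit-learn, and a handful of made-up toy "resumes"), the example below embeds each document as the average of its word vectors and trains a small classifier; whatever associations the individual word vectors carry are inherited by the document representation.

```python
# Hypothetical illustration of how word-level bias can leak into a downstream classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def embed(document, vectors):
    """Average the pre-trained vectors of the document's in-vocabulary words."""
    words = [w for w in document.split() if w in vectors]
    return np.mean([vectors[w] for w in words], axis=0)

# Toy labeled resumes (0 = "good candidate", 1 = "bad candidate"); entirely made up.
resumes = [
    ("experienced software developer with strong references", 0),
    ("senior engineer led multiple successful projects", 0),
    ("criminal record listed prior conviction", 1),
    ("history of criminal charges and terminations", 1),
]

X = np.stack([embed(text, vectors) for text, _ in resumes])
y = [label for _, label in resumes]
clf = LogisticRegression(max_iter=1000).fit(X, y)

# A resume containing a stereotypically Black name and otherwise neutral text can
# still be nudged toward "bad candidate" if the name's vector sits close to words
# like "criminal" in the embedding space.
print(clf.predict_proba([embed("Darnell experienced software developer", vectors)]))
```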
The example above is a concrete example of explicit bias based on training data. Bias in classification, however, need not be explicitly introduced; it can exist within the latent space of the word2vec model. Consider the word "black". Black is more heavily associated with negative words than with positive words. With that in mind, consider the same scenario, where a company is classifying resumes into "good candidates" and "bad candidates". They have gone out of their way to control for names, locations, dates of birth, and gender, and to focus their classification variables exclusively on skills, work history, and professional groups. The classification model produced may still contain biases inherent in the underlying model. Because "black" is more heavily associated with negative words than positive words, any candidate whose resume contains the word "black" in their work history or professional groups (for instance, "Black Software Developers Group") has a higher chance of being categorized as a "bad candidate". Each word, then, has the ability to nudge a document towards one category or another.
A Word2Vec model created from data from one source may then be used for a completely different purpose. For instance, the Google News Word2Vec model was not built strictly to model the language of other news sources, but because of the immense size of its vocabulary, Google released it as a general model for use in general natural language processing tasks. Due to time, access to compute cycles, and other constraints, it is typical for companies to adopt a pre-made model rather than go through the trouble of developing a word2vec model from scratch.

Table 2. Cosine similarity between stereotypical first names and the word "criminal"

Name      Cosine similarity
Darnell    0.073202
Tanner     0.034250


All models are built on historical data. As such, models do not reflect the current state of affairs, but rather mirror histories, without awareness of historical context.
Beyond a lack of temporal historical context, models developed through natural language processing reflect the sentiments, vernacular, and constructs of their source material. For instance, the Google News Word2Vec model was developed from millions of Google News articles, aggregated from traditional news sources across the web. The news, as a source material, is not a true reflection of the state of the world, but an approximation of events derived from what media outlets find the most culturally important and most effective for retaining readership. As a result, a large amount of the content produced by media outlets relates to crime. Further, representation in the news is skewed, and many biases are present in news sources as a result of political leanings and cultural biases.


Biases in Word2Vec models have real-world consequences. Word2Vec models typically exist as one part of a larger pipeline and are used as the base for downstream models that conduct sentiment analysis, document classification, machine comprehension, and other machine learning tasks. These systems do not exist in a bubble. Rather, they sit at the center of some of our most crucial web infrastructure and conduct pivotal decision making, including who is accepted or rejected for home loans, auto loans, and rounds of job applications. Word2Vec models are also used in sentiment analysis, in some cases to determine the ranking of comments (for example on Twitter) and to analyze activity on social media sites. Social media analysis has in turn been used as a social health indicator for health insurance pricing.
Beyond traditional classification and sentiment analysis using Long Short-Term Memory (LSTM) or Recurrent Neural Network (RNN) models, these models can also be used in more comprehensive predictive and analysis systems, where aggregated data might be fed through a number of models. In these systems, a biased feedback loop might be present where, given data already present in the model, a classification system builds a strong preference towards a certain race, gender, or ethnicity for a given task.

This post investigates cultural biases inherent in the popular Google News Word2Vec model and the implications and effects of deploying models like Google News Word2Vec. It surveys the current literature on techniques and methods for mitigating cultural bias in natural language processing models. Lastly, it suggests a framework to help researchers take into consideration the historical implications of the data used to produce a model.