Machine learning and AI

Engineering personality analysis from machine learning

November 19, 2021 by Dimitar Kostadinov

Machine learning (ML) is used to predict many things in our lives, such as bitcoin prices, sports results, weather forecasts and movie ticket prices. Numerous researchers have shown that machine learning can also be used to uncover an individual’s personality traits.

Knowing a target’s personality type can be very useful for social engineers. For example, if attackers know the communication style of a potential victim, they can tailor their approach to influence the target through spearphishing or other avenues. All in all, it is a good tool to have in your toolbox if you intend to run serious social engineering campaigns.

Learn Cybersecurity Data Science

Build your skills using machine learning and other cutting-edge tools to perform various cybersecurity tasks.

Common ways to measure personality types

One of the most well-known ways to assess personality traits is the Myers-Briggs Type Indicator (MBTI). The MBTI framework provides a useful starting point for how to think about a potential target’s personality. As The Myers-Briggs Company notes, its framework is most useful for understanding a person's general personality and communication habits.

It does this by measuring preferences across four key areas:

  1. How a person gets their energy (extraversion vs. introversion)
  2. How a person takes in information and learns (sensing vs. intuition)
  3. How a person makes decisions (thinking vs. feeling)
  4. How a person likes to organize their time and environment (judging vs. perceiving)
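
The four areas above are conventionally combined into a four-letter type code such as INTJ or ESFP. The snippet below is a minimal sketch of that mapping. The `scores` dictionary, its 0-to-1 scale and the 0.5 cutoff are illustrative assumptions, not part of the official MBTI instrument or its scoring.

```python
# Each MBTI dichotomy, in the conventional letter order of the type code.
DICHOTOMIES = [
    ("E", "I"),  # extraversion vs. introversion
    ("S", "N"),  # sensing vs. intuition
    ("T", "F"),  # thinking vs. feeling
    ("J", "P"),  # judging vs. perceiving
]


def type_code(scores):
    """Build a four-letter type code from hypothetical preference scores.

    `scores` maps the first letter of each pair ("E", "S", "T", "J") to a
    value between 0 and 1. A value of 0.5 or above selects that pole and a
    lower value selects the opposite pole. Scale and cutoff are assumptions.
    """
    letters = []
    for first, second in DICHOTOMIES:
        letters.append(first if scores[first] >= 0.5 else second)
    return "".join(letters)


print(type_code({"E": 0.2, "S": 0.4, "T": 0.9, "J": 0.7}))  # INTJ
```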

Photo by Jake Beech / (CC BY-SA 3.0)

Another well-established way to organize personality types is via five basic personality traits, a theory that numerous researchers have contributed to over recent decades. It also attempts to group human personalities into several broad dimensions. These include:

  1. Extraversion (outgoing/energetic vs. solitary/reserved)
  2. Agreeableness (friendly/compassionate vs. critical/rational)
  3. Openness to experience (inventive/curious vs. consistent/cautious)
  4. Conscientiousness (efficient/organized vs. extravagant/careless)
  5. Neuroticism (sensitive/nervous vs. resilient/confident)

Photo by MissLunaRose12 / (CC BY-SA 4.0)

Finding personality data to analyze

So now we understand how we can classify different personality types, but what about the data sources required for these methods to work? Where can those be found?

Well, personal data is everywhere around us.

People share a lot of personal information on social media in the form of likes, dislikes, feelings, opinions, thoughts, citations — all of which can be used as a tentative summary of their own personality. In essence, the personality prediction system turns the unstructured data from abundant sources into a structured format.

Once enough data is aggregated and analyzed, ML algorithms (e.g., Naïve Bayes and Support Vector Machines) can be trained to identify users’ personality traits. These results can be useful for self-monitoring, parental monitoring or for businesses that wish to hire employees based on personality criteria, as well as for social engineering.
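
To make the idea concrete, here is a minimal sketch of a Naïve Bayes text classifier written with only the Python standard library. The four training posts and their "extravert"/"introvert" labels are invented for illustration; a real project would need a large labeled dataset, and its results would depend heavily on that data. The class name `NaiveBayesText` is likewise hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesText:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, invented training data (assumption: labels are purely illustrative).
posts = [
    ("had a great party with friends last night", "extravert"),
    ("love meeting new people at every event", "extravert"),
    ("stayed home reading a quiet book alone", "introvert"),
    ("enjoyed a calm evening alone with tea", "introvert"),
]
model = NaiveBayesText().fit([p for p, _ in posts], [l for _, l in posts])
print(model.predict("party with new friends"))  # extravert
print(model.predict("quiet evening alone"))     # introvert
```

Naïve Bayes is used here because it is simple enough to show in full; a Support Vector Machine would consume the same kind of word-count features but learns a decision boundary instead of per-class word probabilities.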

In addition, a good source of information to feed your ML-based personality analysis project is the myPersonality database, which collected data from over six million volunteers via a popular Facebook application from 2007 to 2012.

Data types to analyze with machine learning

Researchers have found that various sources can be used to gain and measure personality information — from eye movements to musical tastes to the text of social media profiles.

Eye movement

A 2018 study focused on how eye movements recorded during everyday tasks can predict an individual’s level of neuroticism, extraversion, agreeableness, conscientiousness and perceptual curiosity. Because the set of eye movement characteristics and patterns is large and diverse, machine learning is needed to automatically identify which of them are most informative about personality.

The research team concluded that individuals’ visual behavior during everyday tasks could predict four of the Big Five personality traits, plus perceptual curiosity, which points to a clear link between eye movement and personality. In addition, the analysis revealed new eye movement characteristics that can predict personality traits.


Handwriting

A 2015 study found that “among all the unique characteristics of a human being, handwriting carries the richest information to gain insights into the physical, mental and emotional state of the writer.” Graphology is the practice of analyzing handwriting to infer an individual’s personality. It considers handwriting features such as page margins, the slant of the letters, the baseline and other characteristics.

Musical preferences

Music Business Worldwide reported that Spotify was granted a patent in October 2020 for leveraging machine learning to personalize its content to match personality traits associated with a given user.

Prior to Spotify assigning a personality characteristic to the user, a personality model is often created. This model learns to identify users’ personal traits “based on a questionnaire, such as the Big Five Inventory (BFI-44) or the Meyers-Briggs personality survey.”

According to the patent, “It is possible to identify a personality trait of a user based on the content (e.g., music) the user consumes (e.g., listens to) and the context in which they consume the content.”

Spotify’s machine learning algorithm was fed 17.6 million songs and over 662,000 hours of music listened to by 5,808 Spotify users. Interestingly, musical preferences and habitual listening behaviors predicted the Big Five personality traits with moderate to high accuracy.

Video preferences might be the new first impression

Another 2020 study discovered that ML could infer noticeable personality traits from videos of subjects speaking in front of the camera. The training set consisted of 6,000 videos whose average duration was 15 seconds.

Visual features include the posture of the person, movements of body parts and facial cues. In addition, audio features were also extracted. The objectivity of such a personality analysis requires that the algorithmic bias does not affect results.

Profile photos

In 2016, researchers conducted the first large-scale study of profile photos on social media that analyzed personality. “The way in which users present themselves is a type of behavior usually determined by differences in demographic or psychologic traits,” the researchers wrote.

They discovered that users high in extraversion and agreeableness prefer colorful profile pictures that convey positive emotions through their facial expressions — even if such pictures may not be aesthetically pleasing.

CV-based text analysis

A person’s LinkedIn profile summary can also be leveraged for personality information, according to a 2020 study. The following data can be used to design personality classifiers:

  • Word unigrams
  • Fraction of URLs shared
  • Fraction of emoticons used
  • Average formality score
  • Type-to-token ratio
  • Average words per sentence
  • Average word length
  • NRC emotion, valence, arousal, dominance and sentiment scores
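
Several of the features listed above are simple text statistics. The following is a rough sketch, using only the standard library, of how a few of them could be computed from a profile summary. The function name `profile_features`, the regular expressions and the exact definitions (for example, URL fraction as URLs per whitespace-separated token) are assumptions made for illustration and are not taken from the study.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
WORD_PATTERN = re.compile(r"[A-Za-z']+")
SENTENCE_SPLIT = re.compile(r"[.!?]+")


def profile_features(summary):
    """Compute a few stylometric features from a profile summary string."""
    raw_tokens = summary.split()
    url_count = sum(1 for tok in raw_tokens if URL_PATTERN.match(tok))

    # Remove URLs so they do not distort the word statistics.
    text = URL_PATTERN.sub(" ", summary)
    words = [w.lower() for w in WORD_PATTERN.findall(text)]
    # Keep only sentence fragments that contain at least one word.
    sentences = [s for s in SENTENCE_SPLIT.split(text) if WORD_PATTERN.search(s)]

    n_words = len(words)
    return {
        # Distinct words divided by total words.
        "type_token_ratio": len(set(words)) / n_words if n_words else 0.0,
        "avg_words_per_sentence": n_words / len(sentences) if sentences else 0.0,
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        # Assumed definition: URLs per whitespace-separated token.
        "url_fraction": url_count / len(raw_tokens) if raw_tokens else 0.0,
    }


example = "I build secure systems. I love secure code! See https://example.com"
for name, value in profile_features(example).items():
    print(f"{name}: {value:.3f}")
```

Feature vectors like this one, combined with word unigram counts and lexicon-based emotion scores, are the kind of input a personality classifier could be trained on.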

LinkedIn profile mapping with the help of ML algorithms displayed a clear tendency in terms of what kind of personalities are wanted in a particular field. In consultancy firms, where almost ceaseless communication back and forth with clients takes place, extroverts are preferred. On the other hand, IT companies seek employees who are above all diligent, quietly performing their maintenance tasks, among other things.

A personality prediction system based on CV analysis can be very helpful for human resource departments when they select the most appropriate candidate for the company. One ML-driven project, in which candidates create and submit their CV by filling in a CV form, estimated that it can predict an individual’s personality with an accuracy of 85.81%. This information can be used both for identifying the right candidates for specific positions and for matching marital profiles.

Personality traits in the C-suite

Machine learning algorithms have also been used to measure CEOs’ and CFOs’ Big Five personality traits to calculate audit fees. In short, a higher level of risk tolerance means significantly higher audit fees. Perhaps such an assessment can also yield some insights into the realm of social engineering; for instance, risk-taking leaders would be more likely to circumvent strict security procedures in the name of good business prospects.

According to another study on CEOs’ impact on corporate activities, some personality traits bear greater importance than others and even interplay with the characters of their fellow C-suite members.

“Based on historical M&A data of S&P 1500 firms, our econometric analysis reveals that the ‘openness’ personality trait of CEOs is positively associated with corporate M&A intensity, while CEOs' ‘consciousness’ and ‘neuroticism’ personality traits are negatively associated with corporate M&A intensity,” the researchers wrote. “Moreover, the impacts of CEOs' ‘openness’ and ‘consciousness’ personality traits on corporate M&A intensity are more pronounced when CFOs have similar personality traits to those of CEOs.”

In the light of this statement, it is no wonder that ML can be used to predict the outcome and quality of a relationship based on questionnaires concerning both partners’ personality traits.

Machine learning and engineering personality analysis

How does this all connect to social engineering and cybercrime? For one, employees and organizations need to be aware of the huge amount of data now available online. This can be used not just for basic reconnaissance and phishing attempts, but also to identify those employees that are most susceptible to social engineering attempts.

For example, a 2018 study examined what personality traits make a user susceptible to social engineering attacks. The data suggested users who have high scores in agreeableness and extraversion are more prone to such attacks. These results merit further research into how different social engineering attacks impact different personality types, and into what organizations can do to help protect those types of users.


Dimitar Kostadinov

Dimitar Kostadinov applied for a 6-year Master’s program in Bulgarian and European Law at the University of Ruse, and was enrolled in 2002 following high school. He obtained a Master’s degree in 2009. From 2008 to 2012, Dimitar worked in data entry and research for the American company Law Seminars International and its Bulgarian-Slovenian business partner DATA LAB. In 2011, he was admitted to the Law and Politics of International Security program at Vrije Universiteit Amsterdam, the Netherlands, graduating in August 2012. Dimitar also holds an LL.M. diploma in Intellectual Property Rights & ICT Law from KU Leuven (Brussels, Belgium). Besides legal studies, he is particularly interested in the Internet of Things, Big Data, privacy & data protection, electronic contracts, electronic business, electronic media, telecoms, and cybercrime. Dimitar attended the 6th Annual Internet of Things European summit organized by Forum Europe in Brussels.