Building a Digital Person: Design Best Practices

BY: Michael Musser and Maria Mancheno | August 10, 2023
Digital humans or digital people are emerging as a scalable touchpoint for customers across various markets. After creating FCAT’s own digital brand ambassador, the FCAT AIX team is sharing best practices for designing digital people and their behaviors.

When

Thursday, August 3, 2023

2:00 – 3:00 p.m. ET

Where

Zoom

Meeting ID: 994 3158 6099
Passcode: 253444

If you’ve ever had the joy of selecting a baby name, you know that it’s no easy task. Even if you haven’t had to name an eagerly expected child, you may have been part of a team or group trying to name itself. It’s hard to determine this single piece of a person or group’s identity.

Now imagine trying to determine what a new child will look like, how they’ll act, what they’ll be good at, how they’ll speak, and what they’ll laugh at. The task of defining an entire identity becomes exponentially more difficult. And such is the challenge of designing a digital person.

In early 2023 the FCAT AIX team created an FCAT digital brand ambassador that leveraged the latest in digital avatar design, generative AI-driven chat capabilities, and a blend of proprietary FCAT data and large language models. Collaborating with Soul Machines to bring the avatar to life, we had to define all the key attributes of our digital person – their ‘physical’ appearance, voice, personality, traits, and preferences. As we dove into the design process, we wanted to understand digital person design best practices and principles.

In this article we’ll walk through primary design decisions and share our key learnings for each:

  • Style of avatar
  • ‘Physical’ design decisions
  • Voice design decisions
  • Personality design decisions
  • Naming the digital person
  • The digital person ‘backstory’
  • The impact of specific tasks in making these choices
  • The impact of brand in making these decisions

But before we begin, a somewhat obvious question arises – given the ability to create many avatars, why not create multiple avatars that users could choose from? Our long-term vision is to have avatars that can be selected by the user or chosen for them based on a profile. But for our proof of concept, we started with a single digital person that best represented FCAT and could connect with a broad spectrum of users.

Style of Avatar: Realism Along Two Dimensions

Before we began the design process for our avatar, we had to decide which technical solution to leverage. Our technical architecture could support any API-driven avatar platform, allowing us to select from a number of avatar vendors. As we explored the possible avatar options, we quickly saw two dimensions of realism being expressed: Form Realism and Behavioral Realism. On the form realism dimension, avatars’ visual representation can range from simplistic cartoon-like caricatures to highly rendered anthropomorphic instances. On the behavioral realism dimension, avatars’ capabilities can range from static images with no animation to complex, subtle, human-like movements that accompany speech and emphasize non-verbal communication. See Figure 1 for a complete spectrum of possible avatar solutions.

Given our goal of creating an emotionally engaging avatar that encouraged relational and intellectual connectedness with FCAT, we pursued an avatar solution that was both realistic in form and behavior.

The greatest risk of this approach was for our avatar to end up in the “uncanny valley” – the point when an almost human-like avatar unsettles users because of its incongruent humanistic presentation coupled with qualities that are clearly robotic or non-human. But the reward for this could be great, with a highly realistic avatar that could embody the best of FCAT. This option aligned with research findings that more than two-thirds of users prefer realistic virtual assistants (VA).1

With the approach determined, we then had to narrow in on a provider that could meet those requirements. We turned to Matt Ehlers, FCAT Research Analyst, for help understanding the capabilities of the major providers. Through that research, we decided to use Soul Machines.

Soul Machines delivers an avatar solution that is highly realistic in form and also leverages emotional analytics. Using the user’s video camera and computer vision, the tool can interpret a user’s sentiments or emotions. With this input, an avatar can respond emotionally in a way that matches and reflects the user’s own emotions. This emotional mirroring reduces the “uncanny valley” sensation by delivering the subtle interpersonal cues that we expect of another human, and it helps to create a more relatable, comfortable experience for users. Enabled by a large language model integration, the avatar is capable of highly interactive dialog with users.

It’s important to note, the avatar solutions marketplace is constantly changing and evolving. We plan to continually evaluate the ecosystem to stay abreast of the latest and greatest advances in avatar realism and behavior.

Determining Physical Appearance

Next, we dove into one of the key attributes of the digital person – their visual presentation. We wanted to understand how our design choices would impact users’ perception of and reaction to the avatar. What gender is preferred? What ethnicity is preferred? Does this change by location? How do attire and dress impact perception and acceptance?

There are two primary theories at play as we considered what gender the avatar should be – Similarity Attraction Theory and Social Role Theory.

With Similarity Attraction Theory, users seek similar others (both real and virtual), whom they then deem more credible and likable, reducing their discomfort with the interaction. This leads to a higher level of trust in the interaction. We see this validated in research showing that female users prefer to interact with female VAs, suggesting that engagement with an avatar of the same gender is more comfortable.1

With Social Role Theory, users seek those that have the stereotypical social standing, attributes, and power that is needed for a task or goal. With the larger domestic role that females play in society, a female avatar can be perceived as more communal and friendly. So, for users who don’t have a strong preference, a female avatar can carry a stronger, friendlier connection as shown in the research.1

             Female Participant   Male Participant   Total
Female VA    296 (82.7%)          105 (47.7%)        401 (69.4%)
Male VA      62 (17.3%)           115 (52.3%)        177 (30.6%)
Total        358                  220                578

Figure 2: Gender preferences for avatars using data from Gendering the Machine: Preferred Virtual Assistant Gender and Realism in Self-Service
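The percentages in Figure 2 are each cell's share of its participant-gender column. As a minimal sketch, the arithmetic can be reproduced from the raw counts; the function and variable names below are our own illustration and are not taken from the cited study.

```python
# Counts copied from Figure 2 (rows: preferred VA gender, columns: participant gender).
counts = {
    "Female VA": {"Female participant": 296, "Male participant": 105},
    "Male VA": {"Female participant": 62, "Male participant": 115},
}


def column_shares(counts):
    """Return each cell as a percentage of its column, plus a 'Total' share per row."""
    columns = ["Female participant", "Male participant"]
    col_totals = {c: sum(row[c] for row in counts.values()) for c in columns}
    grand_total = sum(col_totals.values())
    shares = {}
    for va_gender, row in counts.items():
        shares[va_gender] = {
            c: round(100 * row[c] / col_totals[c], 1) for c in columns
        }
        # Row total as a share of all participants.
        shares[va_gender]["Total"] = round(100 * sum(row.values()) / grand_total, 1)
    return shares


shares = column_shares(counts)
for va_gender, row in shares.items():
    print(va_gender, row)
```

Read this way, the table shows the asymmetry the text describes: a large majority of female participants chose a female VA, while male participants were split almost evenly.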

Additionally, recent video game research revealed similar trends: only 48% of males prefer a male avatar and almost one third prefer a female avatar, while 76% of females prefer a female avatar.2

The second ‘physical’ attribute we explored was ethnicity. Given the Similarity Attraction Theory, we felt there was value in creating an avatar that could be perceived as coming from an ethnic group within the densest bands of the world’s population - across northern Africa, through the Middle East, across Southern Asia and the southern portion of North America.

Within Soul Machines Digital DNA Studio, there are multiple facial parameters to explore to create a unique digital person with the physical features we desired. Selecting a rounder facial structure, a warm, rich complexion, and darker hair, we created an avatar that has pan-ethnic attributes with the goal of connecting with more users. Research has found that women preferred avatars of their own ethnic group more so than other ethnic groups.3

The final physical attribute to determine was attire and dress. Soul Machines has a host of default clothing options and hair styles that change the perception of an avatar – t-shirt with short spiky hair for casual and funky, jacket with a high bun for formal and capable, and everything in between (see Figure 3).

We strove to balance Fidelity’s overall professional environment with FCAT’s innovative, funky culture to create approachable business casual attire that aims to resonate with a wide spectrum of users – from a newly hired college graduate to a senior vice president. Additionally, the attire was another chance to fold in FCAT brand elements through colors and patterns. A button down with a custom designed pattern and the FCAT logo brought together the desired identity and professionalism (see Figure 4).

Voice Design Process

A key aspect of the avatar’s identity is its voice. Since the main audience of our avatar would be an English-speaking US market, the design elements we had control over were accent, intonation, and pace of speech.

We first wanted to determine how accents could change the perception of our avatar. In our research we found that the following attributes are most commonly linked to particular accents.4 5 6

  • Spanish – friendly
  • Russian – unfriendly
  • German – assertive
  • American – uneducated
  • Swedish – intelligent
  • Italian – passionate
  • French – sophisticated
  • English – intelligent
  • Irish – attractive

Based on that research, it seemed that a Spanish, Irish, Swedish or English accent would best express our brand attributes. Given that FCAT has a presence in Ireland, an Irish accent seemed like it could be a good choice. However, researchers have found that speaking with an accent lowers listeners’ innate trust in what the speaker is communicating, unless the speaker is especially confident.7 To avoid needing a particularly forceful personality for our avatar and to make the accent easily understood across a number of English-speaking countries, we selected an American accent to create implicit trust in what our avatar was communicating.8

Next, we needed to find a synthetic voice that sounded as natural as possible. We had a number of options to choose from due to recent text-to-speech advances. We wanted a natural-sounding voice, which creates a stronger emotional connection than a robotic, synthetic one. Additionally, we wanted a more free-flowing, natural sentence pattern and intonation.

After sampling from 45 available voices within the Soul Machines Digital DNA Studio, we settled on Microsoft’s “Sara” for her simpler American accent, smoother pace, and more natural, flowing intonation. Hear for yourself:

A voice sample of our avatar: “Hi, this is Microsoft’s Sara female voice. The design team chose this voice as the most natural and most readily trusted voice for their avatar.”

The final voice parameter we researched was pace of speech and its impact on comprehension. Currently, we do not have control of pace within Soul Machines. However, research shows that typical conversations have an ideal pace of 120 to 150 words per minute.9 If the content is more complex, a slower pace may be needed.
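To make the 120 to 150 words-per-minute range concrete, here is a minimal sketch that estimates how long a response would take to speak at each end of that range. The function names are hypothetical, and counting whitespace-separated words is a simplifying assumption; this is not a Soul Machines or text-to-speech API.

```python
def speaking_time_seconds(text, words_per_minute):
    """Estimate how long it takes to speak `text` at the given pace."""
    word_count = len(text.split())  # assumption: words are whitespace-separated
    return 60 * word_count / words_per_minute


def pace_range_seconds(text, slow_wpm=120, fast_wpm=150):
    """Return (shortest, longest) estimated duration across the conversational range."""
    return (
        speaking_time_seconds(text, fast_wpm),
        speaking_time_seconds(text, slow_wpm),
    )


# A 30-word response as a stand-in for an avatar reply.
reply = " ".join(["word"] * 30)
shortest, longest = pace_range_seconds(reply)
print(f"{shortest:.0f} to {longest:.0f} seconds")
```

An estimate like this could help keep scripted responses short enough to hold attention, even when the pace itself cannot be adjusted.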

Personality Design

Now that we had the visual presentation and voice designed, we turned our attention to the personality of the avatar. Ideally, our avatar’s personality would bring the FCAT brand to life: Curious, Innovative, Thoughtful, Confident, Collaborative.

One of the strengths of the Soul Machines offering is the ability to apply preset personalities to an avatar. Soul Machines created nine personality options with varying intensity of four key personality attributes: Happiness, Energy, Formality, and Sensitivity. These attributes impact the avatar’s responses and reactions.

Within those nine personalities, the “Conscientious” personality, with its thoughtful, intelligent and resourceful attributes best captured our FCAT brand attributes (see Figure 5).

A key aspect of this personality was the higher sensitivity or emotional responsiveness that the avatar could have. Soul Machines’ digital people are powered by their proprietary Autonomous Animation and Human Operating System to enable personalized and empathetic interactions with users in real time. This means that when the user is interacting with the digital person with their camera and microphone enabled, the digital person can recognize data points on the user’s face that signal positivity, negativity and confusion, as well as understand the sentiment of the words the user says. The element of social mirroring – when one person nonverbally mimics another or displays similar posture and gestures – increases social influence and boosts trust.10

The final decision for developing the avatar’s personality was choosing its education level. An avatar with too high a level of education could alienate users but too low an education level could cause users to not trust the information being provided.

Our initial starting point mirrors the content focus of Fidelity.com – targeting a 6th or 7th grade level of vocabulary and understanding. We defined this in our prompt settings as well as in any content we created for the various corpora of information that the avatar would be referencing. Additionally, a user can ask the avatar to explain a concept or term at any level of education – a key advantage of using generative AI to enable our avatar (see Figure 6).
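One way to check whether content lands near a target grade level is a readability formula. The sketch below uses the Flesch-Kincaid grade level as an illustration; it is our assumption, not a description of how the FCAT team measured reading level, and the syllable counter is a rough vowel-group heuristic rather than a dictionary lookup.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade: 0.39 * words/sentences + 11.8 * syllables/words - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


plain = "We save money. We pay the bills."
dense = "Diversification mitigates idiosyncratic volatility across heterogeneous portfolios."

print(round(flesch_kincaid_grade(plain), 1))
print(round(flesch_kincaid_grade(dense), 1))
```

Because the syllable heuristic is approximate, scores like these are better used to compare drafts against each other than as exact grade levels.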

Naming our Digital Person

The penultimate step in the design process was to select a name for our digital person. Just as with a human, names can have significant meaning and impact users’ reaction to the avatar.

We began by leveraging generative AI to create a list of ideas that had some correlation to the FCAT organization, mainly names that began with a phonetic “ef” or “cat” sound. From that list, we ran the names through a filtering process to see which ones met key requirements. These attributes, listed below, would be key if we want the name to be useful for a more interactive assistant in the future, allowing it to be used as a “wake word” (e.g., Alexa, Siri, Cortana).

  • A name that is unique, so it doesn’t come up in typical conversation.
  • A name that has multiple syllables so that it has a distinct sound that makes it easier for a computer to recognize.
  • If possible, a name that has meaning or connection to FCAT – it phonetically sounds like “FCAT” or is associated with someone from our history.
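The three attributes above can be sketched as a simple scoring pass over candidate names. Everything in this example is illustrative: the syllable counts are hand-annotated, the list of common words is a tiny stand-in, and the scoring function is hypothetical rather than the filtering process the team actually ran.

```python
# Hand-annotated candidates: (name, syllable count). Counts are our own annotation.
CANDIDATES = [("Fiona", 3), ("Faye", 1), ("Katalina", 4), ("Nedine", 2), ("Edina", 3)]

# Illustrative stand-in for words that come up in typical conversation.
COMMON_WORDS = {"cat", "fee", "kate", "eddie"}

# Links to FCAT named in the article: "ef"/"cat" sounds and names derived from Ned.
FCAT_PREFIXES = ("f", "cat", "kat")
FCAT_HISTORY_NAMES = {"nedine", "edina"}


def score_name(name, syllables):
    """Score a name from 0 to 3, one point per attribute it satisfies."""
    lowered = name.lower()
    is_unique = lowered not in COMMON_WORDS
    is_multisyllable = syllables >= 2
    has_fcat_link = lowered.startswith(FCAT_PREFIXES) or lowered in FCAT_HISTORY_NAMES
    return sum([is_unique, is_multisyllable, has_fcat_link])


# sorted() is stable, so names with equal scores keep their original order.
ranked = sorted(CANDIDATES, key=lambda c: score_name(*c), reverse=True)
for name, syllables in ranked:
    print(name, score_name(name, syllables))
```

Scoring instead of hard filtering keeps a name that misses one attribute in view, which suits the "if possible" wording of the third criterion.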

Once filtered we selected our favorites as finalists…

Begins with “f”    Begins with “cat”    Feminine versions of Edward or Ned*
Fiona              Catalina/Katalina    Nedine
Faye                                    Edina

*Edward (Ned) Johnson III founded FCAT

Through a review process that included key stakeholders in FCAT Design and FCAT Marketing, we chose Katalina (meaning “pure” in ancient Greek), which met the attributes above and has Mediterranean roots that reflect the pan-ethnic identity of her physical appearance.

The Digital Person ‘Backstory’

The final facet to define for Katalina, our digital person, was her backstory. In literature, a backstory provides insight into the history and experience of a character that enables us to understand their motives, choices, and passions. Creating a backstory gives users an opportunity to build empathetic connections with Katalina. This in turn can help build a stronger emotional connection to the institution through Katalina, as users see a more human side of FCAT.

Additionally, the backstory is a way to continue promoting the FCAT brand through Katalina’s preferences and likes. By incorporating the interests of our own associates – the books we read, the pop culture we consume, the podcasts we listen to, and beyond – we can provide a glimpse into the ‘collective mind’ of FCAT through this one digital representation. We can even highlight some of the unique skills and abilities within FCAT, promoting the work of FCAT associates.

Finally, while not entirely necessary, we often see users asking avatars questions about themselves. Since Generative AI-powered avatars are so new, there is little research on how to answer that question. Whether it’s our human desire to get to ‘know’ them more fully (even if it is only a digital person) or part of an instinctive psychological test we need to do to a machine to get a sense of how ‘real’ it truly is, there’s more research to do to understand this more fully.

Why It Matters

If designed thoughtfully, realistic digital people, coupled with generative AI, can be a unique and powerful way to engage customers and employees. As we’ve shown in our proof of concept, with careful consideration and application of our brand these digital avatars can deepen the trust and emotive connection to Fidelity.

In the future, FCAT hopes to release a beta version of Katalina on the FCAT intranet site as well as explore other internal use cases where a digital person can be the primary engagement point for Fidelity associates.

For more on the technical aspects of launching a digital avatar driven by FCAT content and large language models, click here.

Mike Musser is an Interactive/UX Design leader who has a passion for creating new, useful things. He currently focuses on designing and launching new AI driven solutions and has experience in digital product creation, web and app design, as well as tradeshow, store, and museum display production for a wide range of industries and clients.

Maria studied Digital Media Studies and Mathematics at the University of Rochester and received an Interactive and Intelligent Systems Master at Universitat Pompeu Fabra. She is currently pursuing a Master's in Interactive Design at Northeastern University where she continues to grow her passion for new technologies and bringing solutions to real-life problems.
1 Payne, J., Szymkowiak, A., Robertson, P., & Johnson, G. (2011). Gendering the Machine: Preferred Virtual Assistant Gender and Realism in Self-service.
2 Nick Yee (2021). Rethinking the importance of female protagonists in video games.
3 Galina Ya. Menshikova, Olga A. Tikhomandritskaya, Olga Saveleva, Tatyana V. Popova (2014). Gender Differences in Interactions with Avatars of Diverse Ethnic Appearances
4 admin34. (2020, January 24). Americans Worry Most about their Accents. Language Magazine.
5 Why Do British Accents Sound Intelligent to Americans? (2016). Psychology Today.
6 YouGov Survey Results Sample Size: 2018 GB Adults. (2014).
7 Finds, S. (2018, September 23). We’re less likely to trust people with accents, unless they speak with confidence. Study Finds.
8 Do we trust people who speak with an accent? We tend to believe speakers who sound the same as us, though much depends on their tone of voice. (n.d.). ScienceDaily.
9 Dom Barnard. (2018, January 20). Average Speaking Rate and Words per Minute. Virtualspeech.com; VirtualSpeech.
10 Handel, S. (2013, February 17). The Unconscious Influence of Mirroring: The Power of Mimicking Other People’s Body Language. The Emotion Machine.

This website is operated by Fidelity Center for Applied Technology (FCAT)® which is part of Fidelity Labs, LLC (“Fidelity Labs”), a Fidelity Investments company. FCAT experiments with and provides innovative products, services, content and tools, as a service to its affiliates and as a subsidiary of FMR LLC. Based on user reaction and input, FCAT is better able to engage in technology research and planning for the Fidelity family of companies. FCATalyst.com is independent of fidelity.com. Unless otherwise indicated, the information and items published on this web site are provided by FCAT and are not intended to provide tax, legal, insurance or investment advice and should not be construed as an offer to sell, a solicitation of an offer to buy, or a recommendation for any security by any Fidelity entity or any third-party. In circumstances where FCAT is making available either a product or service of an affiliate through this site, the affiliated company will be identified. Third party trademarks appearing herein are the property of their respective owners. All other trademarks are the property of FMR LLC.


This is for persons in the U.S. only.


245 Summer St, Boston MA

© 2008-2024 FMR LLC All rights reserved | FCATalyst.com