Strategies for Tackling Bias in Mobility Data

In the mobility industry, business leaders and policymakers are increasingly looking to big data analytics and artificial intelligence (AI) algorithms to make informed decisions. These decisions range from investments and funding, to reduction of congestion and pollution, to improving safety. However, the data and AI models that underpin these decisions are not perfect. Given that mobility is a human right, it is crucial for leaders to acknowledge, understand, and mitigate data bias to ensure mobility solutions are equitable.

Traditionally, collecting and accessing mobility data has been expensive and time-consuming, often relying on surveys and research. Over time, however, organizations have built reservoirs of big data from sources like crash data, mobile phones, social media, connected vehicles, navigation, ride-hailing, and micro-mobility. Bias can enter these datasets in many ways and at many stages, producing an inaccurate picture of society’s realities and needs. Data bias influences decisions that affect people’s ability to move, which in turn affects equity, inclusion, economic growth, health, and safety.

During the Ann Arbor Mobility Summit, Ann Arbor SPARK invited seven experts in mobility, data, and AI to engage in a leadership roundtable. The speakers who participated were:

Michelle Avary, Head of Automotive and Autonomous Mobility, World Economic Forum
Regina Clewlow, PhD, CEO & Co-founder, Populus
Cal Coplai, AICP, Product Owner – Safety Insights, Ford
Greg Griffin, PhD, AICP, Assistant Professor, The University of Texas at San Antonio
Genevieve Smith, Associate Director – Center for Equity, Gender & Leadership, University of California Berkeley, Haas School of Business
Ram Vasudevan, Assistant Professor, University of Michigan

During the roundtable, they identified several strategies that leaders can adopt to mitigate data bias and build AI in a way that is more responsible, inclusive, and equitable.

DEFINE HOW DATA SUPPORTS YOUR MISSION

It is essential for organizations to include technical experts in the early philosophical discussions around setting organizational goals. This often requires asking different questions and challenging the assumptions to articulate the organization’s desired impact. For example, organizations working to impact equitable access and mobility for all could discuss the following questions:

What do access and mobility mean to different demographics?
Whose safety and needs are prioritized in the data that is collected and analyzed to inform mobility-related decisions?

Cities such as Los Angeles, London, and Vienna have found that women’s travel patterns and needs differ from men’s. Women tend to commute at off-peak times because they are overrepresented in professions such as nursing, cleaning, and food service, which do not follow a nine-to-five schedule. Women also disproportionately care for children and consequently have more errands to run. They are also more likely to prioritize safety as they plan their travel. For all these reasons, women generally pay more for transportation. These examples make clear that women’s transportation needs have not been thoroughly considered.

Based on its newly framed goals, an organization can develop more effective strategies to collect and use data. To achieve better outcomes, it is essential to document the provenance of machine learning datasets and the assumptions behind their creation. For example, when analyzing data collected from mobile phones, it is easy to overlook that each record represents a device, not necessarily an individual. Mobile devices are often treated as proxies for people; however, lower-income families might share phones across their household and community. Once the assumptions and gaps have been identified, additional data or research techniques can be used to give a more holistic picture.
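To make the device-versus-person assumption concrete, here is a minimal Python sketch. It assumes an organization has some outside estimate of how many people share each device in an area, for instance from a household survey. The area names, counts, and sharing ratios are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AreaCounts:
    area: str
    devices_seen: int         # distinct devices observed in the dataset
    people_per_device: float  # ASSUMED sharing ratio, e.g. from a survey


def estimate_people(counts):
    """Scale device counts by an assumed sharing ratio to approximate people."""
    return {c.area: c.devices_seen * c.people_per_device for c in counts}


# Hypothetical values: both areas show the same number of devices,
# but phones in area_b are assumed to be shared more often.
observed = [
    AreaCounts("area_a", devices_seen=1000, people_per_device=1.0),
    AreaCounts("area_b", devices_seen=1000, people_per_device=1.5),
]
estimates = estimate_people(observed)
print(estimates)
```

Writing the ratio down as an explicit field, rather than silently treating one device as one person, is a simple way to document the assumption so it can be questioned and updated later.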

DESIGN FOR ALL

Underrepresentation of certain populations in datasets, particularly marginalized communities, is a reflection of biases and stereotypes in society, and is linked to the priorities of the creators, leaders, and decision makers. Therefore, mitigating data bias cannot be achieved through technical fixes alone. For example, a 2019 study from Georgia Tech warns that some technology used in autonomous vehicles (AVs) was five percent less accurate when detecting pedestrians with darker skin tones. This could have life-or-death consequences for people with darker skin. Similarly, automotive safety for males is improving at a faster rate than automotive safety for females. A 2019 paper in Traffic Injury Prevention found that the odds of a female sustaining a fatal or serious injury during a collision are 73 percent higher than they are for a male. These two examples demonstrate how data bias can lead to devastating outcomes for marginalized populations.
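Gaps like these only become visible when performance is broken out by group instead of being reported as a single average. The following Python sketch shows one simple way to do that. It is not the method used in either study cited above; the group labels and evaluation records are hypothetical.

```python
from collections import defaultdict


def detection_rate_by_group(records):
    """records: iterable of (group, detected) pairs -> share detected per group."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, detected in records:
        totals[group] += 1
        hits[group] += int(detected)
    return {group: hits[group] / totals[group] for group in totals}


def largest_gap(rates):
    """Difference between the best-served and worst-served group."""
    return max(rates.values()) - min(rates.values())


# Hypothetical evaluation results: 10 test cases per group.
records = (
    [("group_a", True)] * 9 + [("group_a", False)] * 1
    + [("group_b", True)] * 8 + [("group_b", False)] * 2
)
rates = detection_rate_by_group(records)
gap = largest_gap(rates)
print(rates, round(gap, 2))
```

A team could track a gap like this alongside overall accuracy and agree in advance on how large a difference between groups is acceptable before a system ships.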

Building solutions that are inclusive, equitable, and accessible is not only ethical, but also expands the organization’s market opportunity and increases product value. Organizations have the opportunity to create value by turning supply and demand on its head. There is a gray market for transportation that exists among marginalized populations, including rural communities, women, persons of color, low-income households, the elderly, and people with different abilities. There is an attractive economic opportunity to pull this demand out of the gray market and make it more accessible, visible, and scalable, ultimately increasing an organization’s reachable consumer base.

INTEGRATE DIVERSE RESEARCH METHODS AND DATASETS

Because collecting 100 percent accurate and comprehensive data is so complex, ending data bias entirely is not a realistic goal. Instead, experts aim to minimize data bias and provide a more holistic representation of reality. For example, crowdsourced data from bike ride tracking apps often overrepresents wealthier riders who tend to make recreational rides. This crowdsourced data should be recognized as unrepresentative of all bike riders and complemented by data collected using other research methods, such as surveys and community engagement.
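One common way to combine an unrepresentative sample with survey information is post-stratification: each group in the sample is reweighted so that its share matches the share reported by a more representative source. The Python sketch below illustrates the arithmetic. The rider groups, counts, population shares, and trip lengths are all hypothetical, and the approach assumes every group appears at least once in the sample.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()  # count must be > 0
    }


def weighted_mean(value_by_group, sample_counts, weights):
    """Mean of a per-group value after reweighting the sample."""
    numerator = sum(
        value_by_group[g] * sample_counts[g] * weights[g] for g in sample_counts
    )
    denominator = sum(sample_counts[g] * weights[g] for g in sample_counts)
    return numerator / denominator


# Hypothetical app sample: recreational riders dominate the tracked rides.
sample_counts = {"recreational": 80, "commuter": 20}
# Hypothetical shares from a survey of all riders.
population_shares = {"recreational": 0.5, "commuter": 0.5}
# Hypothetical average trip length in kilometers for each group.
avg_trip_km = {"recreational": 20.0, "commuter": 5.0}

weights = poststratification_weights(sample_counts, population_shares)
raw_mean = weighted_mean(avg_trip_km, sample_counts, {g: 1.0 for g in sample_counts})
adjusted_mean = weighted_mean(avg_trip_km, sample_counts, weights)
print(weights, raw_mean, adjusted_mean)
```

Reweighting only corrects for the groups that are measured. Riders who never appear in the app at all still need to be reached through surveys and community engagement.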

Ford’s Safety Insights platform recognized the deficiencies in two datasets and combined them to develop a more holistic picture of safety. The team identified that traditional crash data provides multi-modal information, but is neither real-time nor comprehensive, and is subject to human error. Conversely, Ford’s connected vehicle data is limited to newer Ford cars (and thus overrepresents higher-income population segments) but is real-time and captures near-misses as well as accidents. By merging this information, Ford’s Safety Insights team is able to conduct analyses that are more accurate and build solutions that are more equitable.
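As a rough illustration of what merging two imperfect sources can look like, the Python sketch below joins per-location counts from a crash dataset and a near-miss dataset, and flags locations that only one source covers. This is not a description of how the Safety Insights platform works; the function names, location identifiers, and counts are hypothetical.

```python
def combine_safety_signals(crash_counts, near_miss_counts):
    """Join two per-location count tables, keeping locations seen in either one."""
    combined = {}
    for location in sorted(set(crash_counts) | set(near_miss_counts)):
        combined[location] = {
            # None marks "this source has no record here", which is not zero.
            "crashes": crash_counts.get(location),
            "near_misses": near_miss_counts.get(location),
        }
    return combined


def single_source_locations(combined):
    """Locations covered by only one dataset, where gaps deserve a closer look."""
    return [loc for loc, row in combined.items() if None in row.values()]


# Hypothetical inputs.
crash_counts = {"intersection_1": 4, "intersection_2": 1}
near_miss_counts = {"intersection_2": 12, "intersection_3": 7}

combined = combine_safety_signals(crash_counts, near_miss_counts)
gaps = single_source_locations(combined)
print(combined)
print(gaps)
```

Keeping missing values distinct from zeros matters here: a location with no connected vehicle records may simply be one where few newer cars drive, not one that is safe.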

BUILD DIVERSE AND MULTIDISCIPLINARY TEAMS

Having diverse, inclusive, and multidisciplinary teams can encourage a variety of perspectives and help address the inherent bias that lies within each individual. Automakers have admitted that their in-vehicle voice activation technology has not worked as well for women. This is also the case for people with accents and is likely true for people with speech impediments. The underlying issue is that homogeneous teams can see themselves as the representative user base and fail to understand the experiences of others.

A team that prioritizes diversity considers how leadership and hiring managers can value different aspects of a resume, like skills and life experiences. Looking beyond elite education and impressive previous employers is an important step in hiring people with diverse backgrounds. A plan to increase diversity should also be matched with a plan to ensure inclusivity. It takes time to shift the culture of an organization to be welcoming and nurturing of different genders, sexualities, races, ethnicities, interests, etc. The first step to inclusivity is recognizing the homogeneity of a team and questioning how to make that team welcoming to people who are different.

EMPOWER AND TRAIN YOUR TEAM TO QUESTION

Intentional organizations initiate conversations and develop priorities around mitigating data bias by integrating ethics-based learning and discussions within technical curricula. Ethics in AI should be a prerequisite, not an afterthought, to coding and development work. By investing in employee training and creating a culture of dialogue, organizations can empower their employees to question their assumptions, raise ethical concerns, and bring different perspectives to the table.

If you are interested in learning more about mitigating data bias, check out these publications, which were developed by some of the organizations represented in SPARK’s leadership roundtable: