Visualizing profiling classifications as a means to protect against manipulation

Classification, manipulation, psychological profiles. Do you know what companies know about you?

Abstract

The following text investigates the application of profiling techniques whose objective is to depict the most precise image possible of an individual. It discusses how businesses and institutions exploit user knowledge to manipulate their audiences with personalized content. Ethical issues related to profiling are addressed, as well as ways for users to become aware of the examination their data makes possible. The core of the thesis elaborates on a proposal for a fictitious tool designed to inform users about classification and to empower them against it. This tool is embedded in a near-future scenario in which, similarly to the General Data Protection Regulation (GDPR), online data is made accessible to its users, is erasable, and is regulated to a greater extent. Finally, the constructed prospect is evaluated, and a personal reflection and outlook are formulated.

1. Foreword

In the digital as well as in the analog world, everybody has a personal perspective, fed by themselves and their environment and filled with bias. This reality applies equally to us – two Interface Design students at the University of Applied Sciences in Potsdam, living in highly technology-versed surroundings. Our study focus on (big) data and visualization raised our awareness of the discriminatory impacts of technology. In addition, we worked for a data-driven media intelligence company, which collects and visualizes public data such as digital news, social media content, and newspaper articles in order to enable third-party companies to understand their target groups and predict trends.

Data’s computability enables efficient and profitable user investigation, leveraged to analyze people and to predict their intentions, fears, and desires. Nonetheless, extracting value is not trivial: raw information is not humanly readable and needs to be pictured in order to be understood and contextualized. Geo-coordinates, for example, are of little interest to individuals, but businesses can extract very valuable information from them. Knowing where their customers have been, what they like, and which websites they have visited helps them decide where to place advertisements, understand which products customers potentially need, or even adapt the way they communicate with their target group.

Data-processing techniques such as face recognition, sentiment analysis, and psychological examination form the basis for classification. Used by Facebook, collected information primarily enables targeted advertising; used by other parties, it can enable inconspicuous microtargeted manipulation. In the offline world, providing personal information like name, age, geolocation, or bank account number to strangers, or granting them permission to create a psychological profile of oneself, appears unreasonable. In the online world, by contrast, giving away data that is redistributed to unknown companies is very common. The reason for this ambivalent behavior is a lack of awareness about which data is available, what methods are used to interpret it, and how the results of analyses can be used for manipulation.

2. Comprehension of consumers is the new oil

The internet has become a source of great interest to businesses, institutions, and governments. As The Economist states in its 2017 article, paraphrasing the British mathematician Clive Humby, “the world’s most valuable resource is no longer oil, but data” (1). The leading tech companies’ profits have drastically surpassed those of the major oil businesses (2). As opposed to oil, data is produced at an ever-increasing pace (3) and is far from being a scarce and finite resource (4). Nonetheless, similar to oil, data needs to be refined in order to be exploited for a specific purpose and thereby acquire its real value (5).

Companies, governmental agencies, and universities work closely together (6) and share know-how about processing techniques in order to transform ‘raw’ (not yet processed) data into valuable intelligence. These methods mostly consist of software algorithms that statistically analyze large sets of data and produce interpretations of the information. Such procedures comprise, for example, face recognition, computer vision, supervised machine learning, deep learning, text analysis, and sentiment analysis. Some of these techniques have a direct impact on the user experience (for example, when face recognition is used to suggest tagging a friend in an uploaded picture). Most of the resulting information, however, is invisible to the users, since it is stored in hidden server farms and used with unclear intentions.

In the case of businesses, the most common justification for the use of data is the improvement and personalization of services. Facebook’s data policy page states under the section “How do we use this information?” that data is used to “[…] personalize and improve [their] Products” (7). While this is conceivable, it is hardly the main goal. The major purpose of harvesting and processing data is to gain valuable and exploitable insights about users, which are eventually used by external parties willing to pay the price for this information (8). A more discreet paragraph of Facebook’s data policy declares that the company uses data to “[…] help advertisers and other partners measure the effectiveness and distribution of their ads and services, and understand the types of people who use their services and how people interact with their websites, apps, and services“ (9). This better illustrates Facebook’s strategy and constitutes the reason for its cost-free nature.

The magnitude of information, which in the cases of Facebook or Google is willingly produced by the users themselves, forms a knowledge base that can be of high value to advertisers, researchers, or even law enforcement (10). As Apple’s CEO Tim Cook affirmed in 2014, “when an online service is free, you’re not the customer – you’re the product” (11). Following Clive Humby’s analogy, one could proclaim that the comprehension of consumers is the new oil, not the data itself.

‣ 2.1 Criticism

Big tech companies are criticized for overstepping the limits of privacy by aggressively intruding into users’ intimate lives and for trading this information to third parties (12). They have been suspected of provoking intellectual isolation (filter bubbles) (13) by “[…] narrowing fields of vision and potentially creating echo chambers of reinforced belief”, contributing to an increase in extremism and a segmentation of society (14).

Moreover, data leaks are frequent, and information can fall into the wrong hands. Cybercrimes such as identity theft or ransomware attacks are common, even more so with the increase of smart wearables (15) and interconnected smart-home devices (16). Millions of account credentials of major tech companies have been stolen in the past (17) and, with them, the valuable information they contain.

Documents leaked by Edward Snowden in 2013 revealed several global surveillance programs that raised great privacy concerns among the public (18). PRISM, one of the most controversial programs revealed by the whistleblower, collects targeted communications by requiring companies such as Google or Yahoo to deliver data under the FISA Amendments Act of 2008 (19).

Since 2013, a public debate has been taking place (20). The EU General Data Protection Regulation (GDPR), adopted in 2016, has been the most important privacy regulation since the Data Protection Directive of 1995 (21). The GDPR aims to give users control over their personal data and to unify regulations across European countries. It enables users to request insight into, as well as the deletion of, any data previously produced by the user and gathered by a service.

Although such regulations improve the protection of users, they do not necessarily prevent cybercrimes from being committed. More importantly, they cannot guarantee that misuse is avoided. As the next chapter shows, marketers are devoted to gathering information and influencing audiences. Regulations partially protect citizens from profiling abuses but cannot be the only resource for that aim.

‣ 2.2 Protection and alternatives

Protecting personal information in the analog world might seem easier than online: critical documents can be kept in a vault. Admittedly, individuals can be tracked and stalked offline as well. Nevertheless, analog surveillance can never attain the coverage that online monitoring supports. Methods such as scroll- or mouse-tracking are performed without anyone noticing, making the concealment of intimate personal information extremely difficult. For example, moving the cursor over an image can be interpreted as interest. Correlated with other information, such data can be leveraged to create a detailed image of the user. While stalking and collecting data is much easier in online contexts, interpreting this data is complex. However, processing techniques are evolving, and the huge collections of data now available allow much better analyses.
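
How little effort such tracking requires can be illustrated with a minimal TypeScript sketch that logs how long a visitor’s cursor dwells over an image and reports long hovers as an ‘interest’ signal. The endpoint, the one-second threshold, and all names are our own inventions for illustration, not any real service’s tracking code.

```typescript
// Hypothetical sketch of mouse-tracking as an interest signal.
// The "/track" endpoint, threshold, and payload are invented for illustration.
const hoverStart = new Map<Element, number>();

document.querySelectorAll("img").forEach((img) => {
  img.addEventListener("mouseenter", () => hoverStart.set(img, Date.now()));
  img.addEventListener("mouseleave", () => {
    const start = hoverStart.get(img);
    if (start === undefined) return;
    const dwellMs = Date.now() - start;
    // Dwelling for more than a second is interpreted as interest in the motif.
    if (dwellMs > 1000) {
      navigator.sendBeacon("/track", JSON.stringify({
        signal: "interest",
        item: img.src,
        dwellMs,
      }));
    }
  });
});
```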

Data is the user’s property but not in their ownership. Whereas property implies belonging, ownership relies on possession. Companies like Facebook own user data and are able to delete or study it. Consequently, one option to avoid data abuse could be to prevent companies from owning user data.

Technologies that grant users both property and ownership already exist, but they are complex and currently require advanced technical knowledge. Peer-to-peer or mesh networks, for example, allow anonymous and safe browsing. In a peer-to-peer network, users do not communicate with a centralized server but via multiple devices within the network simultaneously. Data is stored on all devices that require it. Any device can serve as a sender, a recipient, or just as a transmitter that forwards information to another device. No centralized server farm or company of any sort is involved. Peer-to-peer networks like Hypha (22), created by Aral Balkan, have a very complicated initialization process and require decentralized servers and many individual users to host the data. Also, because data is duplicated, they rely on very large storage space.
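
The forwarding-and-replication principle can be sketched with a toy TypeScript model, built on our own simplifying assumptions rather than Hypha’s actual protocol: each peer answers a request from its local store or floods it to its neighbors, and successful lookups are cached along the way, which is why storage demand grows.

```typescript
// Toy model of a peer-to-peer network: no central server, every node can
// send, receive, or forward, and fetched data is replicated locally.
class Peer {
  private store = new Map<string, string>();
  private neighbors: Peer[] = [];
  constructor(public id: string) {}

  connect(other: Peer) {
    this.neighbors.push(other);
    other.neighbors.push(this);
  }

  put(key: string, value: string) {
    this.store.set(key, value);
  }

  // Flood the request through the network; `seen` prevents forwarding loops.
  get(key: string, seen = new Set<string>()): string | undefined {
    if (this.store.has(key)) return this.store.get(key);
    seen.add(this.id);
    for (const n of this.neighbors) {
      if (seen.has(n.id)) continue;
      const hit = n.get(key, seen);
      if (hit !== undefined) {
        this.store.set(key, hit); // replication: storage demand grows
        return hit;
      }
    }
    return undefined;
  }
}

const [a, b, c] = [new Peer("a"), new Peer("b"), new Peer("c")];
a.connect(b);
b.connect(c); // topology: a – b – c
c.put("profile", "some user data");
console.log(a.get("profile")); // found via b and c, then cached on b and a
```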

Mesh networks have a completely independent infrastructure. Users communicate and request data via radio waves. The German Freifunk (23) relies on a network connecting multiple individuals, each required to buy specialized hardware. Because transmitting with overly powerful radio transmitters is forbidden in Germany, the range of communication is limited.

In both of the latter secure browsing mechanisms, users are not browsing the regular World Wide Web, with all its services and websites, but are rather limited to their own closed web. Social networks, news pages, search engines, and online shopping platforms need to be created exclusively for the closed network. Because those alternatives do not enable user monitoring, business models relying on microtargeting are not conceivable.

To analyze the potential of such systems, one can imagine software that combines various technologies to ensure the users’ sovereignty over their data while providing an enjoyable experience. For this proposal to be viable at scale, businesses would still rely on user data. Nevertheless, they would be obligated to communicate transparently which data is requested and for which purpose. Users would be empowered to view, modify, and delete their data. Finally, an overview of the exploited data, as well as its applications, would be provided.

‣ 2.3 Anonymity is not the answer

The aforementioned proposal cannot solve the core of the privacy issue, as it relies on anonymity to protect user integrity. Anonymity does protect the user from being identified; however, it also has various disadvantages. “It allows some users to break the law or violate the rights of others, such as defamation, cybercrimes, bullying, propagation of racist or anti-Semitic ideas, involving damage to other individuals or to society as a whole” (24).

Furthermore, anonymity cannot be guaranteed in every domain. Agreements and contracts concluded with a bank or an insurance company – most of them now concluded online – require identity verification.

A transition back to a fully anonymous cyberspace is unrealistic and not intended. Due to fears of exclusion or a lack of alternatives, removing an account is not an option for many.

As Rainer Rehak declares in his talk at the 2018 Chaos Communication Congress (35C3) (25), the core of the issue is finding ways to protect the ‘weak’ actors (i.e. citizens or consumers) by regulating the ‘strong’ actors (i.e. businesses and governments). Yet regulation alone is not enough. In the physical world, people reveal their true identity and take responsibility for their actions. When interacting in public contexts, social conventions help people avoid negative judgment and unwanted attention. Similarly, conventions of appropriate online behavior should be developed to diminish the damage of manipulation.

‣ 2.4 Manipulation

There are various kinds of manipulation, such as crowd, data, internet, market, media, psychological, or social manipulation. While incomplete, this list indicates the omnipresence of manipulation and its various applications.

An experiment that Facebook conducted on November 2nd, 2010 tested whether it was possible to increase participation in the U.S. midterm elections by manipulating the timeline (26). A feature that allowed millions of users to indicate whether they had already voted was presented on the interface. One group was able to see which friends had already voted; another was not. The results showed that the first group displayed a 0.39% higher probability of voting (27). This appears insignificant, but at scale it means that about 60,000 people might have voted solely because of Facebook’s experiment (28). Despite its relatively minor effect, the experiment indicates that such manipulations work.

[Image: voting.jpg]

In March 2018, the whistleblower Christopher Wylie released a cache of documents prompting the Facebook–Cambridge Analytica (CA) data scandal (29). Wylie had worked for CA as a data consultant and was able to give detailed information on how CA collected and analyzed data to enable psychological manipulation by political parties.

In 2014, Aleksandr Kogan, a psychologist at the University of Cambridge, published an app on Facebook called “This Is Your Digital Life”. Each user of the app was paid $3 to $4 to complete a survey (30). The app did not only analyze the user’s profile with all of its likes, posts, comments, and more, but also those of their whole network. That is how Kogan was able to collect data from approximately 87 million Facebook users (31). He used techniques such as the OCEAN model, which is explained later, to create psychological profiles.

Data and analyses enabled Kogan to relate factors such as place of residence, sex, or comments to estimates of political views, life satisfaction, or fears. CA advised on how to influence and manipulate the targeted audience. Donald Trump used CA’s services (32) to shape people’s thinking and to manipulate votes; his campaign ran ads designed to make potential Clinton supporters question their choice (33). Ted Cruz spent around 5.8 million US$ (34) to influence attitudes, stoke fears, and communicate with each user in a personalized way. According to election forecasts, Ted Cruz’s share grew from 5% to more than 35% (35). Selecting the target group and inspecting specific demographics, psychographics, and personality traits facilitate communication with that group (36).

The practice of targeting single individuals by analyzing their personality is called microtargeting (37). Specifically prepared and personalized ads, statements, or fake news are delivered to the interface of each user.

CA is not the only company that performs and sells microtargeting and profile analyses. Companies like TargetPoint (38) or Grassroots Targeting (39) use similar techniques.

As Antonio García Martínez said, “none of this is even novel: It’s best practice for any smart Facebook advertiser. Custom Audiences was launched almost six (!) [sic!] years ago, marketed publicly at the time, and only now is becoming a mainstream talking point. The ads auction has been studied by marketers and academics for even longer. The only surprise is how surprising it can still seem to many” (40).

The scandal is particularly disturbing not only because CA ‘stole’ and analyzed information, but also because it relied on a psychological approach to manipulate.

‣ 2.5 Psychological profiles

Methods for psychological profiling attempt to classify individuals into holistic categories on which cultural, social, and economic backgrounds have no influence. As the American psychologist Gerard Saucier declares, “an optimal model will be replicable across methods, cross-culturally generalizable, comprehensive, and high in utility” (41).

The ambition to develop systems that categorize individuals arose long ago: as early as around 300 BC, Aristotle defined different kinds of human beings (42). The psychological models described below were established around 100 years ago.

William Moulton Marston’s DISG theory is based on the assumption that every ‘normal’ person (without mental illness) can be categorized using four criteria whose relative weights define a person’s traits: dominance, influence, steadiness, and conscientiousness (DISG) (43).

The Enneagram of Personality (44) relies on nine terms to categorize basic personality types in a subjective way. Each of these types is combined with ten different perspectives such as ‘Basic fear’, ‘Ego fixation’, ‘Passion’, or ‘Stress/Disintegration’ (45).

As opposed to the abovementioned models, Carl Gustav Jung’s method divides human beings into rational and irrational people. Each can be split into four subcategories (‘thinking’, ‘feeling’, ‘sensing’, and ‘intuition’ (46)), producing eight different kinds of personalities. Additionally, each person is classified as either extraverted or introverted.

The Big Five personality traits model, also called the OCEAN model, was used by Cambridge Analytica for microtargeting. One of its advantages is its ability to express an infinite number of personalities, as it is not limited to binary classifications and produces individual, unique results. Another difference is the sort of user data that constitutes the source of the analysis: while in most models users are interviewed or observed, the OCEAN model takes advantage of existing textual data (47).
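
As a deliberately naive illustration of text-based trait estimation – our own stand-in, not Cambridge Analytica’s or any vendor’s actual method – the following TypeScript sketch scores the five OCEAN traits by counting trait-associated words in a user’s posts. The word lists are invented; real systems rely on models trained on large corpora.

```typescript
// Naive OCEAN estimation from text: count trait-associated words.
// The lexicon is invented for illustration; real systems use trained models.
type Trait = "O" | "C" | "E" | "A" | "N";

const lexicon: Record<Trait, string[]> = {
  O: ["art", "imagine", "novel", "curious"],   // openness
  C: ["plan", "schedule", "duty", "order"],    // conscientiousness
  E: ["party", "friends", "talk", "excited"],  // extraversion
  A: ["thanks", "help", "kind", "together"],   // agreeableness
  N: ["worried", "afraid", "stress", "upset"], // neuroticism
};

function scoreOcean(posts: string[]): Record<Trait, number> {
  const words = posts.join(" ").toLowerCase().split(/\W+/);
  const scores: Record<Trait, number> = { O: 0, C: 0, E: 0, A: 0, N: 0 };
  for (const word of words) {
    for (const trait of Object.keys(lexicon) as Trait[]) {
      if (lexicon[trait].includes(word)) scores[trait] += 1;
    }
  }
  // Normalize by text length so longer histories do not inflate the scores.
  const total = Math.max(words.length, 1);
  for (const trait of Object.keys(scores) as Trait[]) scores[trait] /= total;
  return scores;
}

console.log(scoreOcean(["So excited to see my friends at the party!"]));
// → E (extraversion) receives the highest relative score for this post
```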

Martin Gerlach, a physicist at Northwestern University, states that we “[…] don't have enough empirical evidence [to show] that something like this [the concept of personality types] really exists” (48). Yet first results and evaluations also showed that factors for these types “are at least partially genetically predetermined” (49) and thereby enable the estimation of personalities. “It is not yet clear that this is the ‘optimal’ model“ (50), says Saucier, but Richard Robins, a personality researcher at the University of California, argues that “this is by far the most valid estimate we have of how people cluster into types” (51).

The OCEAN model’s ability to express the uniqueness of people reinforces the effectiveness of classification and its manipulative character.

[Image: zz-psychological-profile.png]

‣ 2.6 The limits of classification

In the case of content personalization, classifying users is inevitable, because marketing experts need understandable terminology to define their target groups. When creating an advertising campaign on Facebook, marketers are invited to select from a list of classifications that characterize the target audience (52). Thus, a user with a ‘complicated’ or ‘engaged’ relationship status can be picked out of the mass and addressed with tailored advertisements.

As a result of this semantic selection, the user is reduced to a description that is limited by the boundaries of the vocabulary and interpreted subjectively. The selection of words for categorization constitutes a source of marginalization. The word ‘negro’, for instance, was first used around 1440 by the Spanish and Portuguese to describe people of dark-colored skin, as the word literally means ‘black’ (53). However, the term is associated with colonial history and is considered offensive (54). In 2013, the word was dropped from the U.S. census, limiting the options to ‘black’ or ‘African American’ (55).

The classification of gender is another example of intense social debate in which vocabulary plays a central role. ‘Woman’ and ‘man’ as sole gender descriptors are considered a source of discrimination (56). In the 1970s, the LGBT community questioned the immutable perception of sex, both in its social/cultural and its physical definition (57). From that time forward, a series of terms defining diverse gender forms and sexual orientations have emerged. Sam Killermann, a social justice advocate, provides on his website itspronouncedmetrosexual.com a list of more than 50 LGBTQ+ vocabulary definitions, such as androsexual, bigender, or skoliosexual (58). These recent terminologies are attempts to establish norms that include people who do not feel fairly represented by heteronormative gender definitions.

Although the latter might be more appropriate to describe the variety of gender perceptions and sexual orientations, these definitions are still limited and remain open to interpretation. Likewise, they are charged with cultural values.

All of the GAFAM companies (Google, Apple, Facebook, Amazon, and Microsoft) are US-based, carry Western values, and use English to describe their data, despite the global and multicultural nature of their audience.

Both the vocabulary used for classification and the manipulation it enables represent complex issues. The selection of labels will certainly evolve and become less discriminatory, yet labeling will remain crucial in online profiling.

‣ 2.7 Avoiding the classifications

We assume that by making labeling unreliable, we can minimize the risks of manipulation: personalized content that relies on target groups should become inaccurate and inefficient. However, we do not believe that this inaccuracy should be provoked by chaos, because that could have negative effects on the user experience. Rather, the risks should be minimized by unsharpening the precision of the classifications and by reducing the amount of available data.

Unsharpening and limiting classification require internet users to realize that all data is a source of labeling. To gain this awareness, users must be able to relate to their individual classifications.

Provided the necessary consciousness is reached, change becomes possible. Yet knowing how to act requires knowledge about how individual classification mechanisms work. Unfortunately, because they mostly rely on machine learning, there is no generic answer. Reverse engineering the services’ mechanisms might help to foresee how they function. Accordingly, online conventions that suggest actions could be developed. They could consist of the following principles, illustrated in a sketch after the list:

  • Reducing the amount of data limits the source material for classification
  • Ensuring that classifications become as vague or as uncertain as possible reduces their effectiveness
  • Diminishing the match-level of classifications can avoid the assignment into target groups
  • Countering a classification by strengthening a contradictory one could help to dull the match-level of both

In the proposal that follows, attention is focused on the first three principles, as the last one deteriorates the user experience.
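
The sketch below illustrates, under our own simplified assumptions, how these principles interact: each data node contributes an estimated match, the aggregate match-level is their average, and the min–max range widens as the amount of evidence shrinks, so deleting data both weakens and blurs a classification.

```typescript
// Simplified model of a classification: data nodes contribute match estimates;
// less evidence means a lower and vaguer (wider min–max) aggregate estimate.
interface Evidence {
  node: string;
  match: number; // estimated match in [0, 1]
}

function aggregate(evidence: Evidence[]) {
  if (evidence.length === 0) return { avg: 0, min: 0, max: 0, n: 0 };
  const matches = evidence.map((e) => e.match);
  const avg = matches.reduce((sum, m) => sum + m, 0) / matches.length;
  // Assumed uncertainty model: the margin grows as the evidence shrinks.
  const margin = 0.5 / Math.sqrt(matches.length);
  return {
    avg,
    min: Math.max(0, avg - margin),
    max: Math.min(1, avg + margin),
    n: matches.length,
  };
}

const before = aggregate([
  { node: "like:hiking-page", match: 0.9 },
  { node: "post:trail-photo", match: 0.8 },
  { node: "visit:outdoor-shop", match: 0.7 },
]);
const after = aggregate([{ node: "visit:outdoor-shop", match: 0.7 }]);
console.log(before, after); // deleting nodes lowers avg and widens min–max
```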

3. The App - Labelless

In the scenario of the following proposal, further regulation in the spirit of the General Data Protection Regulation (GDPR) forces companies to give users access to the terms they are labeled with. In fact, Facebook already presents a similar, yet very minimal, list of statements used for targeted advertising on the users’ ad-preferences page (59). In a comparable but enforced way, each service that practices profiling is required to provide an API that, once access is granted by the user, gives other tools access to the classifications for further analysis.

Information collected from the user base enables the training of an algorithm that estimates how content is linked to classifications. Performable actions that limit the risks of manipulation can thereby be suggested.

We designed a tool that informs visitors about their classifications. In this chapter, we first explain how users are introduced to the application in order to progressively familiarize themselves with the tool. Secondly, we illustrate how each classification is represented and codified on the interface. Subsequently, each of the views available in our tool is presented, along with the analyses it facilitates. We then demonstrate how users can set goals in order to restrict the degree of their allocation to a category (match-level). Finally, we reflect on the strengths and improvement potential of our proposal and provide an outlook on possible next steps.

‣ 3.1 Initiation and access grant

The purpose of the tool is delicate and its approach complex. Users cannot be expected to instantly comprehend all the intricacies of the issue, nor the visual codification of the tool. Therefore, they are initiated to the application and its purpose in a step-by-step introduction. Once familiar with the concepts, visitors are invited to connect the services for which an analysis should be created. Using a ‘single sign-on’ mechanism, such as the typical ‘login with Facebook’ procedure, the tool is linked with the services’ APIs. Once the connection is established, the results of the processed data are displayed.

‣ 3.2 Visualizing the classifications

The proposal uses a visual language that codifies various aspects of the user’s classification.

Each individual categorization is represented by a circle or, in other words, a bubble. Each bubble communicates two vital metrics:

The size describes the match-level

[Image: 01 Size.png]

The size of a circle describes how strongly the subject is associated with a category: the bubble with the highest match-level appears the largest, and vice versa. Additionally, each bubble is constituted of three circles that reflect the uncertainty of the estimation. The outer circle marks the maximum value, the middle circle the average, and the smallest the minimum estimation. The distances between these three circles suggest the precision of the classification. A bubble with tightly nested circles appears sharper and should alert visitors that their data makes precise classification possible. Bubbles with circles of dissimilar radii, on the other hand, are less precise and therefore less reliable.

The color describes the amount of underlying data

[Image: 02 Color.png]

We assume that large quantities of data constitute a greater source for classification. Therefore, we consider it crucial to convey the amount of information on which the estimations are based. We use a color scale going from blue to pink to color-code each bubble: less saturated blue circles are based on little data, highly saturated pink circles on more information.

These two representative dimensions are key to understanding our interface. The use of both size and color is replicated in the different visualizations of the tool.
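
A hedged TypeScript sketch of this encoding follows: each classification becomes three concentric SVG circles whose radii map the minimum, average, and maximum match-level, filled with a blue-to-pink color driven by the amount of underlying data. The scales, the data ceiling, and the RGB endpoints are our own assumptions, not the final design values.

```typescript
// Sketch of the bubble encoding: three concentric circles (max, avg, min)
// sized by match-level and colored blue→pink by the amount of underlying data.
interface Classification {
  label: string;
  min: number;        // match-level estimates in [0, 1]
  avg: number;
  max: number;
  dataPoints: number; // amount of underlying data
}

const MAX_RADIUS = 60; // assumed radius for a 100% match
const MAX_DATA = 500;  // assumed ceiling for the color scale

// Linear interpolation from blue (little data) to pink (much data).
function dataColor(points: number): string {
  const t = Math.min(points / MAX_DATA, 1);
  const r = Math.round(70 + t * (230 - 70));
  const g = Math.round(130 - t * (130 - 60));
  const b = Math.round(220 - t * (220 - 160));
  return `rgb(${r},${g},${b})`;
}

function bubbleSvg(c: Classification): string {
  const color = dataColor(c.dataPoints);
  // Outer circle = maximum, middle = average, inner = minimum estimation;
  // inner circles are drawn more opaque so all three stay distinguishable.
  const circles = [c.max, c.avg, c.min]
    .map((v, i) => `<circle r="${(v * MAX_RADIUS).toFixed(1)}" ` +
                   `fill="${color}" fill-opacity="${0.25 + i * 0.25}"/>`)
    .join("");
  return `<g data-label="${c.label}">${circles}</g>`;
}

console.log(bubbleSvg({ label: "hiking", min: 0.5, avg: 0.7, max: 0.9, dataPoints: 120 }));
```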

‣ 3.3 The views

The tool offers two views: the so-called cluster-view gives an overview of the average match-level for each category, while the so-called grid-view explains in greater detail the data-based origins of the estimations.

The cluster-view

[Image: 04b Bubble View hover 2.png]
[Image: 03 Bubble View.png]
[Image: 04 Bubble View hover.png]

As its name suggests, the cluster-view visualizes the classifications in organized and understandable groups that form clusters, placed in cloud-like formations on the canvas. Within each cluster, the bubbles attract each other, as if they were magnets or charged with gravitational force. As a consequence, the biggest bubbles occupy the center of the groups and the smallest reside on the margins.

Hovering over a bubble displays a tooltip with numeric metrics about the match-level and the amount of underlying data.
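
A toy sketch of the layout logic, under our own assumptions (a production version would more likely use a physics library such as d3-force): every bubble is pulled toward its cluster’s center, and when two bubbles overlap, the smaller one is pushed outward, so the largest bubbles settle in the middle.

```typescript
// Toy cluster layout: gravity toward the cluster center plus pairwise
// collision resolution that displaces the smaller of two overlapping bubbles.
interface Bubble { x: number; y: number; r: number }

function layoutStep(bubbles: Bubble[], cx: number, cy: number) {
  for (const bub of bubbles) {
    bub.x += (cx - bub.x) * 0.05; // gentle pull toward the center
    bub.y += (cy - bub.y) * 0.05;
  }
  for (const a of bubbles) {
    for (const b of bubbles) {
      if (a === b) continue;
      const dx = b.x - a.x;
      const dy = b.y - a.y;
      const dist = Math.hypot(dx, dy) || 1e-6;
      const overlap = a.r + b.r - dist;
      if (overlap > 0) {
        // Move the smaller bubble away, so big bubbles claim the center.
        const smaller = a.r < b.r ? a : b;
        const sign = smaller === a ? -1 : 1;
        smaller.x += (dx / dist) * overlap * sign;
        smaller.y += (dy / dist) * overlap * sign;
      }
    }
  }
}
// Calling layoutStep repeatedly (e.g., once per animation frame) lets the
// formation settle into the described cloud-like arrangement.
```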

The grid-view

[Image: 06 Grid View children.png]
[Image: 05 Grid View.png]

The grid-view offers details about the estimations’ sources, accessible through two perspectives.

Data perspective:

This representation allows the most precise examination. It displays which estimations are made for each data node and with which match-level.

In the left sidebar, the same groups as in the cluster-view are listed. When a group is expanded, its child classifications are revealed. In the main area, to the right of the sidebar, a grid of thin grey lines represents the data nodes: posts, likes, website visits, uploaded pictures, or any other piece of data used for classification.

At some of the intersections between horizontal lines (classifications) and vertical lines (data elements), a blue bar indicates the degree to which the data element is associated with the corresponding category. As in the bubbles of the cluster-view, the bar is composed of three shades that illustrate the uncertainty of the match-level. Hovering over a category, a data node, or a bar displays a tooltip with detailed metrics.

[Image: 07 Bars.png]
[Image: 08 Bar Tooltip.png]

On a group’s row, the blue bars are radially oriented and summarized in a star-like shape. The lopsided distribution of the star’s spikes expresses whether a data element is the source of an intensive classification and reveals the sharpness of the estimation.

[Image: 09 Star Tooltip.png]

Time perspective:

An additional inspection angle is provided by the time-based summary of the categories. This view is organized in the same way as the data perspective; however, instead of showing individual data nodes, the x-axis displays the classification over time. Each term is illustrated by a line chart. Like the bubbles, the chart is colored to show the amount of underlying data; the gradient colorization stems from the progressive transition from one month’s color to the next’s. As in the data perspective, group rows summarize child classifications with a star-like form, yet in this mode each branch of the star is colorized to illustrate the amount of underlying data.

[Image: 10 Wave Charts.png]
[Image: 10b Wave Charts.png]
[Image: 11 Star.png]

Wave chart:
The choice of this particular form, as opposed to a simple line chart, is justified as an analogy to the bubble representation: a bubble with its three circles can also be read as a mirrored bar chart (as in the data perspective).

[Image: 12 Single Wave.png]

‣ 3.4 Filtering

Familiarized users can apply filters to concentrate on classifications that match particular criteria. The two filters currently proposed are:

  • Filtering by size:
    Filtering by size can be very useful to concentrate on bubbles of higher match-levels by excluding the smaller ones.

[Image: 13b Filter Size.png]
[Image: 13 Filter Size.png]

  • Filtering by color:
    Filtering by color can be useful to identify the bubbles that rely on vast amounts of data and are therefore potentially reliable enough to be exploited for manipulation.

[Image: 14 Filter Color.png]
[Image: 14b Filter Color.png]

  • Combining filters:
    Both filters can be combined to identify which estimations can be minimized easily. Those appear big, blue, and sharp, as they are based on little data yet estimated with confidence.

[Image: Bubble View -- Overview Drawer Groups - b.png]
[Image: Identity misuse - Correlation btw. drawers and data - Collapsed Categories - Time viewmode - b.png]

‣ 3.5 Setting goals

For each category, users are invited to set a match-level goal. By dragging an elastic slider, a bordered circle representing the targeted objective is displayed on the bubble. While the interaction is ongoing, tips progressively slide into a sidebar on the right. Each tip is accompanied by an indication of the potential effect it has on the bubble.

[Image: Bubble View -- Settings -- drag.png]
[Image: Bubble View -- Settings -- define goal.png]
[Image: Bubble View -- Settings -- drag released.png]
[Image: Bubble View -- Settings -- expanded tips.png]

The more challenging the goal, the more impactful the tips. These can consist of either the deletion or the creation of data. Modifications that are executable via the services’ APIs can be performed without leaving the tool; the creation of alternative content, however, must be done by the users themselves. Once set, the objective circles are displayed on the bubbles and remain visible in the cluster-view. This way, the evolution of the classification can be observed on a later visit.
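
How such tips could be derived is sketched below, reusing the `Evidence` and `aggregate` definitions from the sketch in section 2.7. Under our assumptions, the tool simulates the deletion of each data node and ranks the candidate deletions by how close the resulting match-level comes to the user’s goal; this is an illustration, not the final recommendation logic.

```typescript
// Hypothetical tip generation: simulate deleting each data node and rank
// deletions by how near the resulting match-level lands to the goal.
// `Evidence` and `aggregate` come from the sketch in section 2.7.
interface Tip {
  action: string;
  resultingAvg: number;
}

function suggestDeletions(evidence: Evidence[], goal: number): Tip[] {
  return evidence
    .map((e) => {
      const remaining = evidence.filter((other) => other !== e);
      return {
        action: `delete ${e.node}`,
        resultingAvg: aggregate(remaining).avg,
      };
    })
    // Most impactful tips first: smallest distance between result and goal.
    .sort((a, b) =>
      Math.abs(a.resultingAvg - goal) - Math.abs(b.resultingAvg - goal));
}

// Example: aiming for a low match-level ranks the strongest node's deletion first.
// suggestDeletions(evidenceForCategory, 0.2);
```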

‣ 3.6 Design Evaluation

Strengths

  • The visual language of the tool allows a detailed analysis of the user’s classification.
  • The representation of each category with circles and the color coding, once comprehended, facilitates their investigation.
  • The match-level, as well as its uncertainty, is visualized consistently throughout the whole design.
  • The form selection is adapted to the summarization of individual classifications, both over time and in total.
  • The interactions, because of their visual appropriateness, feel natural and contribute to the familiarization with the complex concepts of the tool.
  • The two views enable different depths of investigation.

Improvement potential

  • The choice of size and color scales is not well suited for comparison and only permits a rough approximation.
  • The list of provided classifications is far from exhaustive.
  • The vocabulary used in the groups’ titles, as well as the selected terms for the categories inside those groups, are inspired by Facebook’s ad settings. Deeper research could help to justify this selection more appropriately.
  • The color scale is certainly not accessible for color-blind people and is potentially hard to differentiate.
  • The use of light shades results in low contrast that can disadvantage people with limited eyesight.
  • The relatedness between various categories is not extensively visualized.
  • The indication of which service is associated with an estimation is not emphasized.
  • The tool could alert the users more precisely to high-priority risks.
  • Currently, the designs do not support mobile investigation.

Next steps

  • A prototype that implements an explorable introduction and outlines the underlying purpose of the tool can be developed.
  • Extended filters (data type, sharpness, …) can enable a more precise inspection.
  • To improve the understanding of personal classifications, a comparison with others can be supplied.

4. Conclusion

Profiling is complex, and finding solutions that fully protect users seems impossible. Our thesis relies on the rather pessimistic assumption that manipulative and dishonest profiling will continue to happen. Yet we did not want to do Critical Design in a way that depicts either a utopian or a dystopian future, merely pointing a finger at an issue without offering any kind of solution.

We identified labeling as an effective influencing method. As a result, the app focuses on visualizing and modifying the extent of those classification estimations. There are certainly other ways to protect users that are not covered by our tool. There are also other forms of manipulation, not limited to the classification of users, for which our tool provides no solution and which might have greater manipulative impact. However, we decided to focus on user labeling because it is understandable by and accessible to most, and because we assume that all practitioners of profiling rely on terminology to classify their audiences. In spite of profiling, we do not think that all forms of manipulation are harmful to users. Still, in our opinion, even a manipulation that serves an honest goal should not take place if it is not communicated transparently and comprehensibly.

We are aware that by suggesting an approach to a hypothetical and only relatively effective solution, we open ourselves to critique. For instance, we do not clarify who would develop the tool we propose and which interests they could have. Would the tool be commercial, and if so, would it not serve the ‘rich’ instead of the ‘weak’? Moreover, it is not clear whether the reverse-engineering mechanisms behind the metrics of our proposal would work, and whether the produced classifications would be reliable enough to yield actionable tips. By observing the app, the services could find alternative, non-identifiable strategies to counteract its algorithms.

The evolution of technology has made the world smarter, smaller, and more connected. Because of the effectiveness of this evolution, it is our responsibility to take control of our individual data. Technology has evolved and will continue to do so; keeping abreast of this development is important in order to avoid manipulation.

Our reason for studying interface design originates in our fascination with technology. Yet the pace at which it has evolved is so high that we doubt our capacity to follow it, to understand it, and to educate the following generations in a way that prevents great imbalances of power between those who know and those who do not. We suspect that this problem is as old as humanity, universal, and not limited to technology. Yet our thesis relies on the necessity we see to address this topic with different solution-based approaches. We hope to stimulate reflection and to nourish the debate with a visual rather than a technical approach.

5. Personal reflection

The realization of this thesis has enabled us to dive deep into the very present topics of privacy, big-data scandals, profiling, targeted advertising, microtargeting, and technologies that can offer alternatives to the World Wide Web. Despite our initially very limited knowledge and naive grasp of the topic, we were able to learn a great deal and to form the beginnings of our own opinion. We have only scratched the surface and are frankly overwhelmed by its immensity. However, we are neither discouraged nor weary. On the contrary, we have developed a great interest and are confident that we will keep cultivating related knowledge.

As mentioned in the foreword, we live in technology-versed surroundings that doubtlessly influence our perspective. The degree to which we consider profiling an issue might also originate from our social bubble and professional situation. Probably, many people neither consider profiling an issue nor have the luxury to question it. Nevertheless, given the increasing dependence of humanity on technology, and because as interface designers we participate in its development, we believe that it is our role to question such topics. As a matter of fact, during our studies at the University of Applied Sciences, we have been encouraged to do so. Consequently, we wish to remain aware and critical as we pursue our lives, as professional designers but also as users.

6. References and Sources

  1. The Economist Group Limited (2017, May 6). The world’s most valuable resource is no longer oil, but data. Retrieved May 6, 2019, from https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data
  2. Statista (2018, May 11). The 100 largest companies in the world by market value in 2018 (in billion U.S. dollars). Retrieved May 6, 2019, from https://www.statista.com/statistics/263264/top-companies-in-the-world-by-market-value/
  3. SINTEF. (2013, May 22). Big Data, for better or worse: 90% of world's data generated over last two years. ScienceDaily. Retrieved May 6, 2019, from www.sciencedaily.com/releases/2013/05/130522085217.htm
  4. Bernard Marr for Forbes (2018, Mar 5). Here’s Why Data Is Not The New Oil. Retrieved May 16, 2019, from https://www.forbes.com/sites/bernardmarr/2018/03/05/heres-why-data-is-not-the-new-oil/
  5. Michael Palmer quoting Clive Humby (2006, Nov 3). Data is the New Oil. Retrieved May 6, 2019, from https://ana.blogs.com/maestros/2006/11/data_is_the_new.html
  6. F. Stalder “Politische Überwachung”. In Kultur der Digitalität, by Suhrkamp Verlag Berlin, pp. 234--235. 2016
  7. Facebook. Data Policy. Retrieved May 16, 2019, from https://www.facebook.com/full_data_use_policy#measurement-analytics-services
  8. K. K. Roberts. “In Privacy and Perceptions: How Facebook Advertising Affects its Users”. In The Elon Journal of Undergraduate Research in Communications, by Elon University, pp. 1--2. 2010
  9. Facebook. Data Policy. Retrieved May 16, 2019, from https://www.facebook.com/full_data_use_policy
  10. Facebook. Data Policy. Retrieved May 16, 2019, from https://www.facebook.com/full_data_use_policy
  11. Sam Colt for Business Insider (2014, Sep 17). Tim Cook Has An Open Letter To All Customers That Explains How Apple's Privacy Features Work. Retrieved May 16, 2019, from https://www.businessinsider.com/tim-cook-published-a-letter-on-apple-privacy-policies-2014-9?IR=T
  12. Alex Webb for Bloomberg (2017, Dec 5). Here’s How Everyone Is Attacking Big Tech. Retrieved May 17, 2019, from https://www.bloomberg.com/news/features/2017-12-05/here-s-how-everyone-is-attacking-big-tech
  13. E. Pariser. The Filter Bubble: What The Internet Is Hiding From You, by Penguin Books, May 12, 2011
  14. S. Vaidhyanathan. Antisocial Media: How Facebook Disconnects Us and Undermines Democracy, by Oxford University Press, p. 6. May 15, 2018
  15. V.A. Goodyear, C. Kerner, M. Quennerstedt (2019). Young people’s uses of wearable healthy lifestyle technologies; surveillance, self-surveillance and resistance. Sport, Education and Society, pp. 212--225
  16. Anonymous hacker (2012). Internet Census 2012 – Port scanning /0 using insecure embedded devices – Carna Botnet. Retrieved May 16, 2019, from http://census2012.sourceforge.net/paper.html
  17. Nicolas Rivero for QUARTZ (2018, Nov 30). The biggest data breaches of all time, ranked. Retrieved May 24, 2019 from https://qz.com/1480809/the-biggest-data-breaches-of-all-time-ranked/
  18. Rem Rieder for USA Today (2013, Jun 12). Rieder: Snowden's NSA bombshell sparks debate. Retrieved May 16, 2019, from https://eu.usatoday.com/story/money/columnist/rieder/2013/06/12/rem-rieder-surveillance/2415753/
  19. Barton Gellman and Ashkan Soltani for The Washington Post (2013, October 30). „NSA infiltrates links to Yahoo, Google data centers worldwide, Snowden documents say“. Retrieved October 31, 2013 from https://www.washingtonpost.com/world/national-security/nsa-infiltrates-links-to-yahoo-google-data-centers-worldwide-snowden-documents-say/2013/10/30/e51d661e-4166-11e3-8b74-d89d714ca4dd_story.html
  20. Stewart Baker, John Yoo, Michael Chertoff and Catherine Crump on Intelligence Squared U.S. (2017, June 6). “Debating the Constitution: Technology and Privacy”. Retrieved May 26, 2019, from https://www.intelligencesquaredus.org/debates/debating-constitution-technology-and-privacy
  21. EU GDPR.ORG Homepage. Retrieved May 16, 2019, from https://eugdpr.org/
  22. Aral Balkan (2019, Feb 13). On the General Architecture of the Peer Web (and the placement of the PC 2.0 era within the timeline of general computing and the greater socioeconomic context). Retrieved May 24, 2019 from https://ar.al/2019/02/13/on-the-general-architecture-of-the-peer-web/
  23. Freifunk Homepage. Retrieved May 24, 2019 from https://freifunk.net/
  24. Sami Saadaoui for iGmena. Online anonymity: Sometimes necessary, sometimes dangerous. Retrieved May 24, 2019, from igmena.org/Online-anonymity-Sometimes-necessary-sometimes-dangerous
  25. Rainer Rehak (2018, December 28). “Was schützt eigentlich der Datenschutz?”. Retrieved May 26, 2019, from https://media.ccc.de/v/35c3-9733-was_schutzt_eigentlich_der_datenschutz
  26. F. Stalder “Verkaufen, Vorhersagen, Verändern”. In Kultur der Digitalität, by Suhrkamp Verlag Berlin, pp. 224. 2016
  27. F. Stalder “Verkaufen, Vorhersagen, Verändern”. In Kultur der Digitalität, by Suhrkamp Verlag Berlin, pp. 224. 2016
  28. F. Stalder “Verkaufen, Vorhersagen, Verändern”. In Kultur der Digitalität, by Suhrkamp Verlag Berlin, pp. 224. 2016
  29. Carole Cadwalladr and Emma Graham-Harrison for The Guardian (2018, Mar 17). Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach. Retrieved May 24, 2019 from https://www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election
  30. Matthew Weaver for The Guardian (2018, Mar 21). Facebook scandal: I am being used as scapegoat – academic who mined data. Retrieved May 24, 2019 from https://www.theguardian.com/uk-news/2018/mar/21/facebook-row-i-am-being-used-as-scapegoat-says-academic-aleksandr-kogan-cambridge-analytica
  31. Cecilia Kang and Sheera Frenkel for The New York Times (2018, Apr 04). Facebook Says Cambridge Analytica Harvested Data of Up to 87 Million Users. Retrieved May 24, 2019 from https://www.nytimes.com/2018/04/04/technology/mark-zuckerberg-testify-congress.html
  32. Kate Kaye for AdAge (2016, Aug 18). IN D.C., CAMBRIDGE ANALYTICA NOT EXACTLY TOAST OF THE TOWN. Retrieved May 24, 2019 from https://adage.com/article/campaign-trail/cambridge-analytica-toast/305439
  33. Antonio García Martínez for Wired (2018, Feb 23). How Trump Conquered Facebook—Without Russian Ads. Retrieved May 24, 2019 from https://www.wired.com/story/how-trump-conquered-facebookwithout-russian-ads/
  34. Harry Davies for The Guardian (2016, Feb 01). Ted Cruz erased Trump's Iowa lead by spending millions on voter targeting. Retrieved May 24, 2019 from https://www.theguardian.com/us-news/2016/feb/01/ted-cruz-trump-iowa-caucus-voter-targeting
  35. Concordia for Youtube (2016, Sep 27). Cambridge Analytica - The Power of Big Data and Psychographics. Retrieved May 24, 2019 from https://www.youtube.com/watch?v=n8Dd5aVXLCc
  36. Concordia for Youtube (2016, Sep 27). Cambridge Analytica - The Power of Big Data and Psychographics. Retrieved May 24, 2019 from https://www.youtube.com/watch?v=n8Dd5aVXLCc
  37. Dipayan Ghosh for Mozilla (2018, Oct 04). What is microtargeting and what is it doing in our politics?. Retrieved May 24, 2019 from https://blog.mozilla.org/internetcitizen/2018/10/04/microtargeting-dipayan-ghosh/
  38. TargetPoint Consulting Homepage. Retrieved May 24, 2019 from https://www.targetpointconsulting.com/products-services/
  39. Grassroots Targeting Homepage. Retrieved May 24, 2019 from https://www.grassrootstargeting.com/
  40. Antonio García Martínez for Wired (2018, Feb 23). How Trump Conquered Facebook—Without Russian Ads. Retrieved May 24, 2019 from https://www.wired.com/story/how-trump-conquered-facebookwithout-russian-ads/
  41. G. Saucier (2002). Orthogonal markers for orthogonal factors: The case of the Big Five. Journal of Research in Personality, 36, 1-31. + https://www.researchgate.net/publication/27826750_Critique_of_the_five-factor_model_of_personality
  42. Stanford Encyclopedia of Philosophy Homepage. Retrieved May 24, 2019 from https://plato.stanford.edu/entries/moral-character/#Ari384BCE
  43. Wikipedia. Retrieved May 24, 2019 from https://de.wikipedia.org/wiki/DISG
  44. Wikipedia. Retrieved May 24, 2019 from https://de.wikipedia.org/wiki/Enneagramm#Grundlagen_des_Pers%C3%B6nlichkeitsenneagramms
  45. Wikipedia. Retrieved May 24, 2019 from https://en.wikipedia.org/wiki/Enneagram_of_Personality
  46. C.G. Jung (1921). Psychologische Typen. https://de.wikipedia.org/wiki/Carl_Gustav_Jung#Psychologische_Typen
  47. Personality Insights Homepage. Retrieved May 24, 2019 from https://personality-insights-demo.ng.bluemix.net/
  48. Dana G. Smith for Scientific American (2018, Sep 18). Big Data Gives the “Big 5” Personality Traits a Makeover. Retrieved May 24, 2019 from https://www.scientificamerican.com/article/big-data-gives-the-big-5-personality-traits-a-makeover/
  49. Jang, K. L., Livesley, W. J., Angleitner, A., Riemann, R., & Vernon, P. A. (2002). Genetic and environmental influences on the covariance of facets defining the domains of the five factor model of personality. Personality and Individual Differences, 33, 83-101. + https://www.researchgate.net/publication/27826750_Critique_of_the_five-factor_model_of_personality
  50. Saucier, G. (2002). Orthogonal markers for orthogonal factors: The case of the Big Five. Journal of Research in Personality, 36, 1-31. + https://www.researchgate.net/publication/27826750_Critique_of_the_five-factor_model_of_personality
  51. Dana G. Smith for Scientific American (2018, Sep 18). Big Data Gives the “Big 5” Personality Traits a Makeover. Retrieved May 24, 2019 from https://www.scientificamerican.com/article/big-data-gives-the-big-5-personality-traits-a-makeover/
  52. Facebook Ads Manager. Retrieved May 18, 2019, from https://www.facebook.com/adsmanager/
  53. L. Bennett Jr. What's In a Name? Negro vs. Afro-American vs. Black. In Ebony Magazine Issue 23. By Ebony Magazine. Nov 1967. pp 46--48, 50--52, 54
  54. L. Bennett Jr. What's In a Name? Negro vs. Afro-American vs. Black. In Ebony Magazine Issue 23. By Ebony Magazine. Nov 1967. pp 46--48, 50--52, 54
  55. Amanda Holpuch for The Guardian (2013, Feb 25). US Census Bureau drops 'negro' as option for respondents on race. Retrieved May 18, 2019, from https://www.theguardian.com/world/2013/feb/25/us-census-bureau-negro-respondents-race
  56. LGBTQIA Resource Center staff (2018, August 10). “Words that Hurt”. Retrieved May 26, 2019, from https://lgbtqia.ucdavis.edu/educated/words
  57. J. Butler. Das Unbehagen der Geschlechter. By Suhrkamp. 1991.
  58. Sam Killermann (2013, Jan 7). Comprehensive* List of LGBTQ+ Vocabulary Definitions. Retrieved May 18, 2019, from https://www.itspronouncedmetrosexual.com/2013/01/a-comprehensive-list-of-lgbtq-term-definitions/
  59. Facebook Ad Preferences. Retrieved May 20, 2019, from https://www.facebook.com/ads/preferences/