The trainer, the verifier, the imitator: Three ways in which human platform workers support artificial intelligence – Paola Tubaro, Antonio A Casilli, Marion Coville, 2020 – SAGE Journal – 24/04/2020

This paper sheds light on the role of digital platform labour in the development of today’s artificial intelligence, predicated on data-intensive machine learning algorithms. Focus is on the specific ways in which outsourcing of data tasks to myriad ‘micro-workers’, recruited and managed through specialized platforms, powers virtual assistants, self-driving vehicles and connected objects. Using qualitative data from multiple sources, we show that micro-work performs a variety of functions, between three poles that we label, respectively, ‘artificial intelligence preparation’, ‘artificial intelligence verification’ and ‘artificial intelligence impersonation’. Because of the wide scope of application of micro-work, it is a structural component of contemporary artificial intelligence production processes – not an ephemeral form of support that may vanish once the technology reaches maturity stage. Through the lens of micro-work, we prefigure the policy implications of a future in which data technologies do not replace human workforce but imply its marginalization and precariousness.

Recent spectacular progress in research on artificial intelligence (AI) has revamped concerns that date back to the early nineteenth century, when the idea that machines may supersede human labour first spread among scholars, policy-makers and workers. A well-publicized prospective literature emphasizes potential job losses in a world that thrives on data and automation (Brynjolfsson and McAfee, 2014; Chui et al., 2016; Frey and Osborne, 2017). While some scientific research corroborates these predictions (Acemoglu and Restrepo, 2018), other studies highlight historical dynamics of complementarity rather than substitution between human labour and machinery (Autor, 2015; Bessen, 2017), with more complex outcomes such as polarization between high- and low-skilled workers (Autor and Dorn, 2013).

However divergent these analyses may be, they share a quasi-exclusive focus on the expected spillovers of AI to other economic sectors, inferred from long-run industry trends and from known effects of previous waves of labour-saving mechanization. Less commonly discussed is the place of labour in the production of AI itself, and its embeddedness in emerging ‘big data’ practices that define its distinctive nature, with implications that are specific to the present moment.

This paper builds on the assumption that today’s ‘datafied’ economy (Mayer-Schönberger and Cukier, 2013) shapes AI production processes and their unique effects on labour. Contemporary AI solutions are predicated on machine learning algorithms with a voracious appetite for data, despite a history of diverse approaches and visions (Domingos, 2017). As part of a broader drive to accumulate and leverage data resources (Kitchin, 2014), AI production needs human help not only to design cutting-edge algorithms (highly qualified engineers and computer scientists) but also at a much more basic level, to produce, enrich and curate data. This is the role of ‘micro-workers’ (Irani, 2015a), barely visible and poorly compensated contributors, who operate remotely online from their computer or smartphone to execute fragments of large-scale data projects. For example, they flag inappropriate web content, label images, transcribe or translate bits of text. These are activities that humans can do quickly and easily (whence the ‘micro’ adjective), yet more efficiently than computers.

Against popular discourses, the very existence of micro-work suggests that today’s AI industry is in fact labour-intensive, although under less-than-ideal working conditions – poorly paid and lacking job security (Berg et al., 2018). Even whilst automation is still in the making and has not yet been deployed at large scale, its demand for micro-tasks is already transforming the daily practices, experiences and career trajectories of thousands of workers worldwide (Gray and Suri, 2019). In the language of Ekbia and Nardi (2017), this is an instance of ‘heteromation’, a neologism that stresses how, against a myth of automation capable of liberating people from the need to toil, the demand for labour is still high but humans operate on the margin of machines and computerized systems. Gray and Suri (2017) call this mixed configuration of machinery and human activity the ‘paradox of automation’s last mile’, the incessant creation of residual human tasks as a result of technological progress.

This (still scant) literature provides only the global picture, though, without looking deeper into the specific functions of micro-work in today’s AI industry. At what stage(s) of the production chain are humans needed? That is, what micro-tasks support what processes? Which of these tasks, if any, are temporary solutions to fill workflow gaps that will probably be resolved in the future? Which ones, instead, fulfil structural needs and are therefore likely to be permanently needed? Are there any tasks or processes where human intervention is more likely to be swept under the carpet for any reason? Answering these questions is important to unpack the nature of the linkages between data and labour, and the re-organizations that digital technologies induce. It is also important to inform policy action: if demand for micro-work is a transitory phenomenon, short-term measures will suffice, but if not, a more profound re-thinking of labour conditions will be needed.

In this paper, we review micro-working activities and show that they perform not just one, but a continuum of crucial functions, between three poles that we label, respectively, ‘AI preparation’, ‘AI verification’ and ‘AI impersonation’. We conclude that because of its wide and diverse scope of application, micro-work is a structural component of data-intensive AI production processes – not an ephemeral form of support that may vanish once the technology reaches maturity stage.

Our findings invite platforms and regulators alike to take concrete steps in regard to the working conditions, remunerations and career prospects of the people who toil behind the successes and promises of present-day AI. This requires a major effort to raise awareness and change mindsets, insofar as micro-work has remained largely out-of-sight so far – not explicitly considered even in otherwise laudable attempts to develop ethical principles for AI (Jobin et al., 2019).

Like other forms of digital labour (Casilli, 2019), micro-work is an outcome of the emergence of platforms as devices to coordinate economic activity between service providers and clients – both construed as independent businesses that make a one-off deal, rather than parts of a long-term employer–employee relationship. Platforms enable client companies to access workforce on demand, at a fraction of the cost of salaried staff, and usually with quicker turnaround times. They advertise themselves to clients as AI-service vendors, and to workers as providers of online earnings opportunities.

The most famous micro-work platform is Mechanical Turk, originally an internal service that Amazon developed to remove duplicates from its catalogue. It then opened its service to external users, positioning itself as an intermediary between providers and ‘requesters’. Many more platforms, such as Microworkers, adopt variants of this model today. Fewer micro-work platforms serve a single monopsonist, for example UHRS (Universal Human Relevance System) for Microsoft. There are also mixed models, such as the German Clickworker which offers both a marketplace like Amazon and a managed service for larger clients (including UHRS).

Micro-work platforms differ in size and scope. Some are tiny start-ups, others have grown to multi-nationals, such as Appen, a publicly traded company head-quartered in Sydney which has acquired former major players such as Leapforce and Figure Eight (formerly called CrowdFlower). Some platforms such as Mechanical Turk cater to a diverse range of corporate needs, while others specialize in AI services (Schmidt, 2019). The latter case often involves alliances between a company that manages micro-workers and another that sells to AI producers, for example Spare5 and Mighty AI (now part of Uber).

In the typology proposed by Schmidt (2017), and re-elaborated by Berg et al. (2018), platform micro-work is an instance of ‘cloud-work’, performed remotely online. It differs from the other main variety of cloud-work, web-based freelancing, which concerns creative work such as graphic design and software development, involves qualified professionals and entrusts them with whole, relatively long projects rather than single short tasks. Another difference is that micro-work is dispersed to an undefined set of anonymous, replaceable contributors via the platform, rather than assigned in full to a selected, identified contractor. Hence, micro-work is sometimes referred to as ‘crowd-work’ or ‘crowdsourcing’.

Both forms of cloud-work differ from ‘gig’ labour where services such as ride-hailing and goods delivery are performed offline, even though coordination occurs online through a platform. Yet one form of gig-work, which Schmidt (2017, p. 7) calls ‘local micro-tasking’, is similar to micro-work and consists of small tasks (such as taking pictures of products in shops) given to an unspecific set of providers. The platform Clickworker offers both online and local micro-tasking.

The nascent literature on platform labour often conflates micro-work with other forms of platform-based digital labour, especially freelancing. Indeed, some micro-work platforms occasionally make available more qualified tasks, such as translations or text-writing, and conversely freelancing platforms happen to publish simpler, little-compensated tasks. While useful to assess the size and growth rate of the whole online (or cloud-based) global labour market (Kässi and Lehdonvirta, 2018), this approach obscures the linkages with data and AI production. There is now a need to decouple micro-work from other forms of platform labour, which serve a range of economic and societal needs which do not necessarily tie in to the data economy behind automation. How do paid micro-tasks precisely affect this particular production chain, and distinguish themselves from platform work more generally?

AI producers are companies, start-ups and research labs that use machine learning to develop applications ranging from chatbots and hands-free vocal assistants, to automated medical image analysis, self-driving vehicles and drones. Let us first review the basic functioning of machine learning, and derive preliminary conjectures on where and when it may need human intervention.

How data fuel machine-learning algorithms

At the crossroads of informatics and statistics, machine learning ‘teaches’ computers to find solutions from data, without each step having to be explicitly programmed (Alpaydin, 2014, 2016). Its quality can get progressively better over time, depending not only on the algorithm but also on the data given to it. For example, development of a vocal assistant requires huge audio datasets with examples of potential user requests (like ‘turn on the kitchen lights’, ‘call mum’, etc.).

So-called ‘supervised’ machine learning algorithms, the most widely used in both research and industry to date, need not only high-quantity, but also high-quality data, that is, complete with annotations. Supervised machine learning aims to infer a function that maps an input to an output based on exemplary input–output pairs (‘training’ dataset). The learned function must be able to assess new cases in ‘test’ datasets. For example, to teach a computer to distinguish between images of dogs and other animals, one would need a training dataset that associates each image (input) to an annotation, such as a tag that says whether the image shows a ‘dog’ or ‘other’ (output); after having been exposed to many tagged images, the algorithm will be able to classify new, untagged images and determine whether they represent dogs. The more accurate the tags in the training dataset, the more the solution can be fine-tuned and generalized to a wide range of real-world cases.
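To make the idea of learning from input–output pairs concrete, the following is a minimal sketch of a supervised classifier. It uses a nearest-centroid rule, chosen here only for brevity and not discussed in the text; the two numeric features and all values are hypothetical toy data, standing in for whatever would be extracted from real tagged images.

```python
from collections import defaultdict
from math import dist


def train(pairs):
    """Learn one centroid (mean feature vector) per label from (features, label) pairs."""
    sums = {}
    counts = defaultdict(int)
    for features, label in pairs:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, features):
    """Assign a new, untagged case to the label whose centroid is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training dataset: each input is tagged 'dog' or 'other' by a human annotator.
training_set = [
    ((4.0, 9.0), "dog"), ((5.0, 10.0), "dog"), ((4.5, 8.5), "dog"),
    ((2.0, 3.0), "other"), ((1.5, 2.5), "other"), ((2.5, 3.5), "other"),
]
# Hypothetical test dataset: the labels are held back and used only to score predictions.
test_set = [((4.2, 9.1), "dog"), ((1.8, 3.2), "other")]

model = train(training_set)
predictions = [predict(model, x) for x, _ in test_set]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test_set)) / len(test_set)
print(predictions, accuracy)
```

The sketch also shows why tag accuracy matters: a mislabelled training pair shifts the centroid of the wrong class, and every later prediction inherits that error.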

Despite its apparent simplicity, the image recognition classification algorithm just outlined has far-reaching implications (Bechmann and Bowker, 2019), for example in the development of self-driving cars, which need to recognize objects such as a dog crossing the street, before they can make decisions (Schmidt, 2019; Tubaro and Casilli, 2019). A state-of-the-art example of the supervised family is ‘deep’ learning, which analyses data through a layered structure of algorithms inspired by the neural network of the human brain, leading to more effective learning.

Less demanding in terms of data quality is ‘unsupervised’ machine learning, where data have no labels, and the algorithm is left to find the underlying structure based on common patterns. Two main types of algorithms can be distinguished: dimensionality reduction, which consists in mapping a multidimensional dataset into more interpretable two-dimensional structures, and clustering, which groups observations into coherent classes. Today’s unsupervised learning brings to the next level some quantitative techniques traditionally used in social science, namely factorial analysis and hierarchical clustering (Boelaert and Ollion, 2018). Like these older tools, it is often used when the objective is unclear, or for exploratory analysis. It is unlikely to wipe out the supervised variant, because of the many tasks that it cannot do. Besides, interpretation of results can be problematic due to lack of objective standards to judge algorithmic performance.
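As an illustration of clustering on unlabelled data, here is a minimal sketch of k-means, one common clustering algorithm. It is chosen only for its simplicity and is not the hierarchical variant mentioned above. The points and the choice of initial centroids are hypothetical.

```python
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and updating centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it in place if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, [nearest(p, centroids) for p in points]


# Hypothetical unlabelled observations: no tags are supplied, only coordinates.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, assignments = kmeans(points, centroids=[points[0], points[3]])
print(assignments)
```

The algorithm returns groups but not their meaning. Deciding what cluster 0 and cluster 1 represent is left to the analyst, which is the interpretation problem noted above.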

A third family is reinforcement learning, a formalized version of human trial-and-error which uses mapping between input and output like supervised learning, but unlabelled data like unsupervised learning. It includes a feedback loop that gives the algorithm positive and negative signals, so that it adjusts accordingly. Because of its massive data needs, long computation times and limited generalizability, reinforcement learning can only be applied to specific domains such as games, where data can be sourced by simulation.

A structural demand of micro-work for AI data preparation?

The above summary suggests that AI companies depend heavily on data resources, including not only raw data but also annotations that add extra meaning by associating each data point, such as an image, with relevant attribute tags. To account for these dual aspects, we propose to distinguish data generation and data annotation¹ as two separate sub-processes in the production chain of AI. They are part of the preliminary phase, the first stage of the AI production chain – which we label ‘AI preparation’. They are challenges for AI producers, despite a widespread rhetoric of ‘data deluge’ (Anderson, 2008): the right data are not always available or accessible, and when they are, they often lack suitable annotations, and need intervention before they can be used. On this basis, we can formulate a preliminary expectation to guide our empirical analysis. It is that micro-work caters precisely to these unmet data needs: it contributes to AI preparation, in terms of data generation and data annotation. Micro-work is an input to AI in the current data economy.

Our other expectation is that micro-work is a structural rather than a temporary input to AI production. While some technology enthusiasts believe that data generation and annotation tasks will ultimately be fully automated, the ‘heteromation’ paradigm (Ekbia and Nardi, 2017) implies that some essential tasks will always be directed to humans as indispensable though hidden providers. There will always be a division of labour between the two. Our review of machine learning techniques corroborates this line of thought: their huge and growing data needs will keep demand for micro-work high in the foreseeable future.

Our expectations are not hypotheses to be tested stricto sensu. We take them simply as the starting points with which we enter the research setting, the prior assumptions that derive from the still limited social research on platform labour, and from contextual knowledge of the basic linkages between AI, machine learning and data. Being reflexively aware of these initial hunches is a guide toward comparing them to common discourses and to stakeholders’ actual experience, but does not engage us to stick to them. Rather, we use empirical evidence to enrich and substantiate these preliminary expectations, to complexify them and if necessary to revise them, in an iterative, emergent process.

The data we use are from a mixed-methods study of technology companies and of the day-to-day routines of platform workers that took place in 2017–2018 in France (Casilli et al., 2019). A country with traditionally high technological and scientific development, France is currently the second European country by number of AI start-ups (MMC Ventures, 2019), after significant public and private investments (French Government (Ministry for the Economy, Ministry of Education and Ministry of Digital Technologies), 2017; Villani et al., 2018). In addition to its inherent interest, focus on France enables us to extend our gaze beyond the high-profile platforms, particularly Amazon Mechanical Turk, which have been overrepresented in the literature to date, despite being little used outside the United States. With its numerous, competing players, some of which operate only within national boundaries, France is exemplary of a trend toward diversification and specialization of micro-work platforms, attracting an ever-wider range of users.

We combine insight from AI producers and micro-work platforms, and the views of people who perform micro-tasks online. We triangulate information obtained from these different stakeholders in order to reach greater consistency and completeness, to cross-check findings and to corroborate them. We do so because each of these stakeholders has a different perspective and, taken in isolation, would provide only a partial and incomplete view. Leaders and staff of AI companies have the best understanding of technology and its needs, but as we will show, they are often unwilling or unable to disclose their use of human workers. In turn, micro-work platforms that mediate between AI companies and workers are best positioned to know the structure of the market, but their need to attract clients and investors may inflect their communication strategies. Finally, workers can share the unique, concrete experience of doing micro-tasks, but they are not always aware of their purposes and final uses.

To uncover the viewpoint of AI producers and of the platforms/vendors that supply data labourers to them, we use primarily an inventory of micro-work platforms and related AI data service vendors. Although focus is on France (local platforms such as Foule Factory² and IsAHit), the inventory also includes information on international platforms whose scope of activity is global and includes France (like Appen, Clickworker, Lionbridge, Mechanical Turk, Microworkers), and for comparison purposes, it adds a few AI start-ups with more limited penetration in France (like the former Mighty AI). We compiled this inventory using desk research. We explored industry reports, newspaper articles, and most importantly the websites, press kits, and other communication tools of the platforms and companies concerned. To validate our findings from this material, and to gain more insight into less publicized aspects, we use in-depth interviews with three French clients and platform owners.

To account for micro-workers’ perspective, we rely on a questionnaire that we distributed in 2018 as a paid task on Foule Factory, collecting 908 unique, complete responses. It took about 25 minutes to fill in this questionnaire, which covered a variety of topics, from basic socio-demographic information to education and skills; family composition, parenting responsibilities and household activities; income, professional activity and work experience; social capital; and micro-working practices including, among other things, frequency of activity, number and types of platforms used, earnings derived. While a full description of these data is beyond the scope of this paper, the interested reader may refer to Casilli et al. (2019) for more details. Here, we use only one open-ended question from this survey: Please tell us about the last task you did on Foule Factory. We coded responses independently and cross-checked our categories for greater reliability. For the purposes of the present paper, we analyse results qualitatively, to identify common patterns; a quantitative description can be found in Casilli et al. (2019). We also use in-depth interviews with micro-workers. We invited a sub-sample of 72 questionnaire respondents to a follow-up interview of 30–60 minutes. The interviews allowed them to expand on the responses they had given to the questionnaire, sharing not only more factual details, but also their own views and the meaning they gave to their micro-working activities. We did another set of interviews (60–120 minutes) with 14 French micro-workers active on varied international platforms such as Appen, Clickworker and Lionbridge, and with 3 African micro-workers who do tasks for French requesters through IsAHit. All interviews were audio-recorded and followed by a written report by the interviewer(s).

Let us start by looking at the role of micro-work platforms in the provision of data generation and annotation services for AI companies. Appen says openly that effectively harnessing the power of machine learning requires human skills.³ Lionbridge AI sells ‘Machine intelligence, powered by humans’.⁴ We first look at these value propositions through platforms’ communication materials, before turning our attention to the views of the underlying workforce.

Platforms’ offer of AI preparation services

An example of data generation that most platforms advertise to their clients is audio utterance collection, important to train voice-controlled devices. Platforms can leverage their contributor base to gather this data with a variety of vocal timbres, regional accents, uses of slang and contexts (such as background noise). Platforms that operate at global level can replicate the data collection in different languages – Appen boasts over 180, Lionbridge 300. Platforms that operate at national level also have advantages: a vocal assistant to be sold in France, for example, must be trained in the country to learn French accents, the names of French cities and personalities, and local acronyms. The process can scale: a producer of vocal assistant software that we interviewed has built an application allowing users to customize the assistant to their needs. It integrates a ‘data generation’ functionality through which users can request bespoke datasets: the company manages the order by passing it to a standard platform such as Mechanical Turk or Foule Factory, monitoring execution and ensuring delivery.

Platforms present data annotation as their core offer to clients. With sound or text data, they propose services such as categorization of topics in a conversation, determination of emotions behind a statement, classification of intents and identification of parts of speech. With images and videos, the offer includes assignment of images to categories, detection of objects within images with dedicated tools such as bounding boxes (rectangles around the objects of interest), cuboids (3D bounding boxes) or polygons (precise drawings around objects of interest, possibly of irregular shape), addition of in-image tags to each object and labelling of anatomical or structural points of interest (like eyes in faces) with so-called ‘landmark annotation’.

Technology moves fast, and computers are now pretty good at tasks that seemed insurmountable even just a few years ago, such as (to use the above example) telling apart a dog from another animal. Human capacity is now in demand to recognize details and nuances, indispensable to increase the precision of computer vision software for sensitive applications such as autonomous vehicles and medical image analysis. A state-of-the-art technique is semantic segmentation, much more precise than those mentioned above because it involves separating every pixel of an image into the parts that an algorithm will have to recognize. On Lionbridge’s blog, a machine learning specialist speculates that pixel-accurate annotation is becoming the new norm, while rougher tools such as bounding boxes may eventually disappear.⁵

Such accuracy would be impossible if workers had to draw shapes with the functionalities of standard software. Micro-work platforms such as Appen and Lionbridge compete fiercely to develop cutting-edge tools, themselves based on machine learning, that increase the precision, scale and speed of human data annotators. Some AI start-ups have also joined the race, usually focusing on technological development and using one or more standard micro-work platforms to access human contributors. One solution consists in having workers manually label a sub-set of data, and then letting an algorithm learn those annotations and applying them to the rest of the dataset. Another relies on an automated tool that roughly pre-annotates objects (for example, by forming lines around cars in a traffic image), so that the worker only needs to adjust the details. Figure Eight’s ‘active learning’ distributes labour between humans and machines:

Computers can automate a portion but not all of the data, thus requiring a human-in-the-loop workflow. In this environment, computers can complete the high confidence rows and humans the lower confidence.⁶
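The division of labour described in this quote can be sketched as a simple routing rule. This is an illustrative sketch only, not Figure Eight's actual implementation: the row format, the field names and the 0.9 threshold are all assumptions made for the example.

```python
def route_rows(rows, threshold=0.9):
    """Split rows into a machine-completed queue and a human-review queue.

    The threshold is a hypothetical cut-off; a real workflow would tune it
    against the cost of errors and the cost of human annotation.
    """
    auto, human = [], []
    for row in rows:
        (auto if row["confidence"] >= threshold else human).append(row)
    return auto, human


# Hypothetical model output: a predicted label and a confidence score per data row.
rows = [
    {"id": 1, "predicted_label": "car", "confidence": 0.98},
    {"id": 2, "predicted_label": "pedestrian", "confidence": 0.55},
    {"id": 3, "predicted_label": "car", "confidence": 0.91},
    {"id": 4, "predicted_label": "bicycle", "confidence": 0.40},
]

auto, human = route_rows(rows)
print([r["id"] for r in auto], [r["id"] for r in human])
```

Under this rule the human queue never empties as long as some rows fall below the threshold, which is one way to read the claim that micro-tasking is transformed rather than eliminated.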

To summarize, technological progress has not eliminated the need for micro-tasking, but transformed it, integrating humans and computers more tightly. These evolutions accompany the growth of the business of AI preparation: the industry think-tank Cognilytica estimates the worldwide market for what we call data generation and annotation at over $500M in 2018, expecting it to rise to $1.2B by 2023. As part of this trend, the market for third-party data solutions attained $150M in 2018 and will exceed $1B by 2023 (Cognilytica, 2019). According to Lukas Biewald, founder of Figure Eight, the recent rise of deep learning has boosted demand, because its complex algorithmic structures require much larger (labelled) datasets than other machine learning techniques:

Deep learning has been fantastic […] We began noticing deep learning when we started having customers who would ask for tens of millions of data rows right off the bat.⁷
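For readers who wish to gauge the pace implied by the market estimates cited above, the figures can be converted into a compound annual growth rate. This is a rough sketch: the source gives bounds (‘over $500M’, ‘exceed $1B’) rather than point values, so the results are indicative only, and the function name is ours.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Values in millions of dollars, 2018 to 2023, treated as point estimates (an assumption).
overall = cagr(500, 1200, 5)      # data generation and annotation market
third_party = cagr(150, 1000, 5)  # third-party data solutions

print(f"{overall:.1%}", f"{third_party:.1%}")  # roughly 19% and 46% a year
```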

Micro-workers’ experience of AI preparation

The concrete experience of micro-workers broadly confirms what platforms’ communication suggests. Our online survey provides evidence of data generation in the form of voice recordings: many participants reported having read aloud a few short sentences in French and audio-recorded them. Variants of this task include requests to record, say, five ways to ask a virtual assistant about the weather. Some micro-workers understood that this was ‘to help design intelligent virtual assistants controlling connected objects’ (L.⁸). This task requires large numbers of participants to ensure sufficient variety, and it is unlikely to change much over time, in that linguistic skills and local knowledge cannot be easily replaced or outsourced to offshore providers.

Data annotation tasks are also common. One consists in classifying objects such as DVD titles, photos and ‘virtual avatars’ (R.). Sometimes, workers had to associate images and names of commercial products as found in multiple online marketplaces – clearly to teach computers to recognize essential similarities (same product) despite dissimilar contexts (different websites). CV anonymization, reported by almost one fifth of respondents, was understood to be about ‘removing all distinctive marks that could be discriminatory’ (C.). Of note, workers also had to tag the spaces in the document where names, birth dates and addresses were placed originally – arguably to help some recruitment algorithm to understand the structure of a CV. In passing, this is a task that requires local knowledge insofar as job application standards vary across countries.

Regarding image annotation, some respondents mentioned a task they called ‘motocross’ where they had to identify roads and tracks in photographs and to indicate the nature of the ground (pebbles, road, sand, etc.). Some thought it was for a video game, others for a census of racetracks. This is because, as we soon realized, requesters vary widely in the extent to which they provide detailed information on their tasks, and on the purposes they serve, leaving workers often confused. A more dramatic example of the consequences of erratic information from clients is a task that asked micro-workers to tag vegetables (tomatoes, carrots, etc.) in pictures of salads. M. (30 years old, married, resident of a mid-sized city, full-time teacher and micro-worker in her spare time) found this task ‘silly’ but adequately paid for the limited effort it required. She grasped that it served to develop some software application for nutrition. But D. (25 years old, single, living in a rural area, unemployed) could not make sense of it:

They tell you: draw a circle around a tomato. We don’t know why. I think everyone knows what a tomato is, I hope […]. Then I think to myself: if it’s there, it must be useful to someone, for something, but … Why, I don’t know.

A type of task that did not surface in the questionnaire, but was mentioned in interviews, consists in flagging violent, pornographic or otherwise inappropriate online content. After the attacks of 2015–2016, A. checked ‘monstrous’ terrorist videos for several weeks, 30 hours a week, as clients ‘were panicking’. Exposure to this content can be distressing (Roberts, 2019), although A. assures us that she has found ways not to be personally affected. Only a small part of content moderation can be automated: any new types of data first require micro-workers to train future automated solutions.

In sum, micro-workers’ experience confirms their important contribution to data generation and annotation for AI, suggesting that this role is neither temporally nor spatially concentrated, although they are not always aware of it.

Our prior expectations about the linkages between micro-work and AI did not factor in scandals, yet examples abound. In 2019, investment firm MMC Ventures reviewed over 2,800 purported AI start-ups across Europe, and found evidence of AI consistent with their value proposition in about 60% of them. The newspapers that covered the story were eager to stress that, well, a whopping 40% of these start-ups do not do AI (Ram, 2019).

The year before, we heard similarly outraged voices – not from micro-workers, who often lack awareness of the ultimate goals that their activity is serving as discussed above, and are therefore ill-positioned to judge whether an alleged AI is genuine. We interviewed K., a Parisian entrepreneur and start-up founder who blamed his competitors for their claim to do AI while, instead, they outsource all work to humans recruited through platforms overseas. He went as far as to claim that ‘Madagascar is the leader in French artificial intelligence’. Even more upset was S., a student who did an internship in an AI start-up that offered personalized luxury travel recommendations to the better-off. His company’s communication strategy emphasized automation, with a recommender system allegedly based on users’ preferences extracted from social media. But behind the scenes, it outsourced all its processes to micro-providers in Madagascar. It did no machine learning, and the intern could not gain the high-tech skills he dreamt of.

Why do start-ups cheat? Machine learning is expensive, as it requires powerful hardware, the brainpower of highly qualified computer scientists, and top-quality data: semantic segmentation costs a few dollars per image, compared to bounding boxes that are priced less than a dime, and simple categorizations that are available for one or two cents. Costs go further up if accuracy of results is sought, for example by having each data point annotated by multiple platform workers. Under pressure to perform, companies may find it cheaper to just leave aside cutting-edge technology, fragment the work into micro-tasks and sub-contract them to low-paid workers through platforms.
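The arithmetic behind this cost pressure can be sketched in a few lines. The per-image prices below are assumed values, chosen only to match the orders of magnitude quoted above (one or two cents, less than a dime, a few dollars); they are not actual market rates, and the function name is hypothetical.

```python
# Assumed per-image prices in US cents, chosen to match the orders of
# magnitude quoted in the text. They are illustrative, not market rates.
ASSUMED_PRICE_CENTS = {
    "categorization": 2,
    "bounding_box": 8,
    "semantic_segmentation": 300,
}


def annotation_cost(task, n_images, annotators_per_image=1):
    """Return the total cost in dollars of annotating n_images.

    Having each image annotated by several workers (to improve accuracy)
    multiplies the cost by the number of annotators.
    """
    if n_images < 0 or annotators_per_image < 1:
        raise ValueError("need n_images >= 0 and at least one annotator")
    cents = ASSUMED_PRICE_CENTS[task] * n_images * annotators_per_image
    return cents / 100


if __name__ == "__main__":
    for task in ASSUMED_PRICE_CENTS:
        single = annotation_cost(task, 10_000)
        triple = annotation_cost(task, 10_000, annotators_per_image=3)
        print(f"{task}: ${single:,.2f} single pass, ${triple:,.2f} with 3 annotators")
```

Under these assumptions, the gap between task types spans two orders of magnitude, and redundancy for accuracy multiplies whatever the base price is.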

If so, there is another role for micro-work in addition to AI preparation, and we label it ‘AI impersonation’. It happens when humans, so to speak, steal computers’ jobs. This is the very idea behind Amazon Mechanical Turk, the platform that first popularized micro-work. Its name is that of a fake chess-playing machine built in the late eighteenth century and dressed in seemingly Ottoman clothes, but in fact operated by a human player hidden inside. That Amazon dubbed its creation ‘artificial artificial intelligence’ is also indicative of its intent of filling the gap of what artificial intelligence is expected but unable to do (Irani, 2015b).

Seen in this way, impersonation is not just about fraud, and indeed Amazon has always been upfront about it. It is the ‘human-in-the-loop’ principle that makes workers hardly distinguishable from algorithms. Amazon’s goal was to allow programmers to seamlessly integrate the two into their processes, whereby managing a task for ‘Turkers’ would be similar to sending a remote request for an algorithm to execute. More generally, the idea is that whenever an algorithm cannot autonomously bring an activity to completion, it hands control over to a human operator. This is the approach followed, among others, by Google Duplex, a conversational assistant that makes restaurant reservations, where up to 25% of calls were made by humans as of May 2019 (Chen and Metz, 2019). An apparent deception, it is nevertheless a way to gradually train the assistant.

Of note, impersonation sometimes involves qualified employees rather than micro-workers. The creators of Julie Desk, a French start-up producing an email-based scheduling assistant, initially did the job by hand, in place of an algorithm that had yet to be coded:

We worked as assistants ourselves for a period of 8 months, and manually answered all the requests we received! It allowed us to understand what the recurrent patterns in the meeting scheduling process were and then, with the help of data scientists, we coded them to give birth to Julie. (Hobeika, 2016)

The ‘birth of Julie’ did not end human intervention, but started a human-computer loop in which:

80% is done correctly by the machine and 20% is corrected by humans. However, all the meeting requests received by Julie are sent to our operators for a final human validation before Julie replies to our clients. The AI pre-processes everything and the human operators give the ‘go-ahead’ for a reply. (Hobeika, 2016)

AI ‘birth’ is in fact a continuous process which will always be supported by micro-work. According to the founders of Julie Desk, this is even ‘positive for the future because it will create new jobs as “AI Trainers” or “AI Supervisors”, like our operators at Julie Desk!’ (Hobeika, 2016).
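The ‘human-in-the-loop’ principle described above can be illustrated with a minimal sketch. It is not the implementation of any of the companies mentioned: the confidence threshold, the `always_validate` flag and all function names are assumptions introduced for illustration. The flag mimics a setting in which every draft goes through a final human validation, while the threshold mimics a hand-over only when the algorithm is unsure.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    text: str
    handled_by: str  # "machine" or "human"


def answer_request(request, model, human_operator,
                   threshold=0.8, always_validate=False):
    """Let the model draft a reply; hand over to a human when needed.

    model(request) must return (draft, confidence in [0, 1]).
    human_operator(request, draft) must return the final text.
    Both callables, and the threshold value, are hypothetical.
    """
    draft, confidence = model(request)
    if always_validate or confidence < threshold:
        return Reply(human_operator(request, draft), "human")
    return Reply(draft, "machine")


def toy_model(request):
    # Stand-in for a trained model: confident only on one familiar pattern.
    confidence = 0.9 if "lunch" in request else 0.4
    return f"Proposed slot for: {request}", confidence


def toy_operator(request, draft):
    # Stand-in for a human worker who reviews and amends the draft.
    return draft + " (checked by operator)"


if __name__ == "__main__":
    for req in ["lunch on Friday", "reschedule board meeting"]:
        reply = answer_request(req, toy_model, toy_operator)
        print(reply.handled_by, "->", reply.text)
```

From the caller's point of view, the reply has the same shape whoever produced it, which is what makes the human contribution hard to see from outside.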

Micro-work platforms do provide human labour force to meet these needs, but tend not to explicitly advertise these roles: they arguably negotiate impersonating tasks individually with clients as part of their managed service packages. Humans are always in the loop, but they are even less visible here than when they do ‘preparation’ tasks. To take these aspects into account, it is important to include impersonation in our framework: it is not just a temporary strategy to keep afloat an insufficiently funded start-up, but part of the ‘heteromated’ system that increases demand for human workers with every new problem to be solved, while keeping them in marginal and unrecognised roles.

Even after inclusion of AI impersonation together with AI preparation, our interviews hint that there are more services that micro-workers provide to AI. In spring 2019, public outcry followed revelation in the news that human workers listen to users’ conversations with smart assistants (Hern, 2019). The year before, we had interviewed J., a transcriber who worked for six months to improve the quality of the French version of one of these virtual assistants, sold by a major technology multinational. Her job was to check that the virtual assistant correctly understood what its users said. She listened to audio recordings (usually short tracks, averaging between 3 and 15 seconds), then compared her understanding to the automated transcription produced by the virtual assistant. If the transcription was inaccurate, she had to correct it: any misunderstanding, conjugation or spelling mistakes had to be highlighted. Another part of her work consisted in adding tags to the transcribed text indicating any sounds or events that could explain the virtual assistant’s performance – why some sentences were well understood, some not. J. knew that fellow transcribers were doing the same tasks in other European countries and languages, all following the same guidelines.
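The comparison step in this kind of task, setting a human transcription against the automated one and recording the differences, can be sketched as follows. This is only an illustration using Python's standard `difflib` module; the function name and the example sentences are hypothetical and do not reproduce the tool J. actually used.

```python
from difflib import SequenceMatcher


def transcription_corrections(automated, human):
    """List word-level differences between two transcriptions.

    Returns tuples (operation, automated_words, human_words), where
    operation is 'replace', 'delete' or 'insert' as reported by difflib.
    Matching stretches are left out, so an empty list means agreement.
    """
    auto_words = automated.split()
    human_words = human.split()
    matcher = SequenceMatcher(None, auto_words, human_words, autojunk=False)
    corrections = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            corrections.append((tag, auto_words[i1:i2], human_words[j1:j2]))
    return corrections


if __name__ == "__main__":
    # Hypothetical example: the assistant misheard one word.
    print(transcription_corrections(
        "call my mother tomorrow",
        "call my brother tomorrow",
    ))
```

Each recorded correction pairs what the machine heard with what the human heard, which is the kind of amended output that can later serve as training data.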

Because the role of J. was undisclosed to users, her case is reminiscent of impersonation, but the difference is that she was not replacing a failing algorithm: the one she checked for quality was up and running. She realized that the results of her work would help engineers and computer scientists to ensure the virtual assistant would not make the same mistakes in future. In this sense, the case of J. has something in common with AI preparation, with the difference that she was intervening ‘post-robot’: she produced training data from the amended outputs of an already-trained algorithm. Therefore, we propose to call this case, similar yet not identical to the other two, ‘AI verification’.

We found other examples of AI verification in our fieldwork. A., the micro-worker who moderated violent content (see above), also did relevance scoring. This type of task consists in assessing the extent to which the outputs of search engines or conversational agents are relevant to a user’s request. The final validation of AI outputs done by Julie Desk’s ‘operators’ (see above) is also an example of what we call verification. Another post-robot task involves checking the results of optical character recognition (OCR) software. For example, a firm that aims to digitize its invoices may scan the original documents, use OCR to convert the resulting images into character codes, and get human help to look at the outcomes, fill the gaps, and make corrections if necessary. We interviewed three African micro-workers who did such tasks for French clients through the Paris-based platform IsAHit. In all these cases, humans intervene well after the preparatory phase, when the AI solution has already been trained, tested and brought to market.

Micro-work platforms do advertise services such as relevance scoring and transcription checking to their AI-producing clients, but do not group them together in a separate category corresponding to our AI verification. Partly, this is due to the same reasons that keep them quiet about impersonation: any discovery that a supposedly automated solution is at least partly hand-made may be seen as deceptive. Additionally, output checks performed by humans sometimes involve privacy leaks (which our interviewee J. loudly deplored) that may damage the reputation of the company or platform. Even more than in the other cases, micro-workers’ contribution is surrounded by silence.

We make space for AI verification in our analysis, because of its wide scope of application. Checks of the accuracy and quality of algorithmic solutions will always be needed, regardless of whether supervised, unsupervised or reinforcement learning is used. Hence, verification is not a temporary need but a recurrent one. As the sales of AI-based tools increase and affect a more diverse range of users, there will be a growing need to ensure that outputs meet expectations.

To map the linkages between AI and micro-work in our datafied economies, we started this paper by stating the expectations that micro-work contributes to the preliminary, input phase of the AI production process, and that its contribution is structural rather than temporary. The former is in line with the communication strategies of platforms, and their insistence on the value of data produced and annotated with a ‘human touch’; the latter is at odds with the opinions of technology enthusiasts who anticipate full automation of data generation and annotation, but resonates with industry reports that the global market for human-powered data services for AI is growing.

We reflectively thought through our expectations in light of empirical evidence from desk research, responses to an online questionnaire and in-depth interviews. We compared and contrasted the voices of all stakeholders to probe and refine our ideas. This material corroborated our initial assumption of an important role of micro-work in AI preparation. But we also noted some anomalies that led us to broaden the set of roles that micro-work may play in AI production. Industry actors brought us to identify AI impersonation, which occurs whenever humans outperform computers, so that it is advantageous to use them instead of (parts of) algorithms. In turn, micro-workers’ accounts of interventions to check the outputs of an automated system – that is, at the end of the production chain – revealed AI verification.

The result of this analysis is a typology, summarized in Figure 1. The process of AI production starts with preparation (left panel), which includes both data generation and annotation. This may concern image, text, sound, video or other types of data, and it is largely outsourced to online micro-workers. The data that they produce or enrich feed an algorithm that learns a model (central panel) which in turn, returns an output with some degree of certainty. For example, the output of an image classification algorithm can be ‘it is 90% likely to be a dog, 10% likely to be another animal’. If impersonation occurs at all, it is at this stage. Humans replace part of the algorithm (when they step in to complete a task that, say, Google Duplex struggles to achieve) or all of it (when they entirely simulate an algorithm that has not yet been coded, as in the early days of Julie Desk). AI verification (right panel) is the process through which outputs are sent to micro-workers to be checked for accuracy and if necessary, corrected.


Figure 1. The three main functions of micro-work in the development of data-intensive, machine-learning based AI solutions. Source: authors’ elaboration based on Casilli et al. (2019).

To be sure, there are possible overlaps between our three cases of AI preparation, AI verification and AI impersonation. On the one hand, both impersonation and verification may be first steps toward developing datasets that can be subsequently used for preparation purposes. On the other hand, the boundaries between verification and impersonation become fuzzy when humans intervene to correct errors in real time. There is in fact a continuum of functions for micro-work in AI, many real-world cases being positioned in-between the three main types that we have singled out.

There might even be cases that do not fit with any of these types. In ‘click-farms’, workers have to ‘like’ (or dislike, share, etc.) the webpages of brands, products, celebrities, sports teams or politicians (Kuek et al., 2015). These tasks are often outsourced through the same platforms that feed the micro-work value chain, although they are even less paid than standard micro-work, and are frequently offered to providers who reside in low-income countries. In this way, apparently spontaneous user-generated web contents turn out to be the output of paid work by myriad providers. Their activity artificially inflates the indicators of quality and popularity used by (among others) search engines and rating systems, thereby lowering their information value (Lehdonvirta and Ernkvist, 2011). Likewise, social media bots, at the heart of widespread reports of digital influence operations during major elections, are in fact mostly human-assisted and rely on similar systems to recruit and remunerate online workers (Gorwa and Guilbeault, 2018). To extend our typology, all these forms may be said to perform some kind of ‘AI disruption’. They are not always illegal, but go against the intentions of the designers of the systems, providing no added value to any of the other users (Lehdonvirta and Ernkvist, 2011).

Refining our initial idea to add more roles of micro-work, at different stages of the AI production process, helps us find a comprehensive answer to the question of the extent to which micro-work is a temporary or structural component of AI. If we had focused exclusively on AI preparation, as in the discourse of most micro-work platforms, we might have thought that the need to accumulate labour-powered data is specific to the current times, in which AI is growing fast but has not reached maturity, and the less data-demanding unsupervised learning has not made enough progress. But in our data economy, this need is unlikely to fade. Data availability will never reach a steady state: most use cases for machine learning require ongoing acquisition of new sources to continuously adjust to changing conditions, resulting in a steadily growing need for humans to produce data for more accurate, more precise, and more profitable results. The discovery of AI verification strengthens this idea, in that some of the data used to re-train an existing algorithm and adapt it to new circumstances come from the quality checks routinely done by humans. Taking into account AI verification, not just preparation, also contributes to dismissing the related idea that progress in unsupervised learning might eliminate the need for humans: even with fewer data preparation tasks, verification will always have to be performed.

Similarly, impersonation should be understood in light of the other two types. A single, perhaps high-profile case may suggest that micro-work is a transitory phenomenon, to disappear as companies accumulate the necessary data, skills and computational capacity. But our typology supports the idea that impersonation is systemic and will always be present to some degree, because it ensures the necessary connection between AI preparation and AI verification, supplementing algorithms when they fail. Impersonation also demonstrates that the durability of demand for micro-work depends not only on technological, but also on economic factors. As long as there are humans who can perform tasks more cheaply than AI, perhaps (but not necessarily) because they reside in countries where the cost of labour is low, it will be advantageous to substitute them for machines. Overall, our typology hints that full automation is not to be expected any time soon, and that human work will continue to play an important role in keeping the industry going.

Of course, these conclusions only hold to the extent that the dominant paradigm of AI remains based on statistical learning, as described above; a return to ‘symbolic’ AI or the emergence of some new approach, though not in sight at the moment, might change the balance between humans and machines. These conclusions are also contingent on the state of the AI industry and assume that the current hype continues: if for any reason, enthusiasms faded away and investors withdrew their money from the sector, then demand for workers to do preparation, verification and impersonation might plummet. Human contribution is a structural need of AI only under these conditions, and does not even need to be always organized as platform-mediated micro-work, although this is the most common form it takes today. Indeed, tasks change over time, and may sometimes not even be ‘micro’ strictly speaking, for example state-of-the-art semantic segmentation which requires more time and higher skills than older bounding boxes. Interestingly, this is a case in which technology has managed to automate parts of the image annotation process, without eliminating humans but changing their roles as they now have to handle nuances and details instead of just identifying gross traits (Schmidt, 2019). Likewise, not all human contributions to AI preparation, verification and impersonation are managed by platforms such as Amazon Mechanical Turk: some of these tasks are performed by vendor companies that hire employees (mostly in emerging countries) (Gray and Suri, 2019).

The need for humans and their multiple roles in the AI production process may make platforms pride themselves on countering the gloomy predictions of AI-induced job losses, by creating earnings opportunities that would not exist without the technology. But difficult questions must be asked about the conditions under which micro-work is performed. Although a detailed analysis of working conditions is outside the scope of this paper, the interviews we used hint that executing un-challenging tasks such as labelling images for an unknown purpose can be destabilizing; that involuntarily accessing personal data of other people, or witnessing grossly deceptive forms of impersonation, brings ethical dilemmas; and that exposure to violent web content may generate distress (Casilli et al., 2019). Addressing these issues is all the more difficult as platform labour challenges the boundaries of employment regulation and social protection systems (Prassl, 2018), and profound differences between the various types of platform work hinder common solutions (De Stefano, 2016). Possible ways forward range from forms of workers’ organization (Silberman and Irani, 2016) to the creation of intermediate categories of ‘independent workers’ (extensively discussed, but also criticized, in Cherry and Aloisi (2018)) and the reinforcement of extant parts of labour law (Aloisi and De Stefano, 2020).

We repeatedly noted the silence surrounding micro-work. Platforms tell clients that human contribution has value, but not who these humans are and in what conditions they work. As a result, clients know little about micro-workers – just as the latter are often unaware of the purposes of their tasks – and may find it difficult to interact with them as mentioned above. Ironically, the full extent of human intervention is unclear even to key industry actors. The incentive to obscure the role of human contributors is highest when the credibility of full automation promises is at stake. As a general tendency and beyond one-off revelations, this contributes to keeping micro-work far from the gaze of the general public and from the agenda of policy-makers. Out of the reach of institutional regulations, subject only to the forces of a market characterized by an excess supply of workers (Graham and Anwar, 2019), it remains structurally unprotected and insufficiently paid.

In sum, AI is not the end of human labour, but is depriving it of the quality, meaning and social status that it acquired over time. There is a need for ambitious, long-term policies that frame the further development of AI by taking into account the concrete conditions of its production, in light of ongoing debates on digital platform labour and its shortcomings – from low remuneration and precariousness to lack of social security (Graham and Shaw, 2017). Put differently, credible commitment to socially responsible AI requires the definition of labour standards in the processes that underpin it. More transparency is needed, toward workers as well as the general public, to ensure the full extent of human participation is understood and recognized for what it is worth.

We would like to thank current and former DiPLab team members, notably Maxime Besenval, Odile Chagny, Clément Le Ludec, Touhfat Mouhtare, Lise Mounier, Manisha Venkat and Elinor Wahal. Preliminary versions of this paper were presented at the Reshaping Work conference, Amsterdam, October 2018, and at the Research Day of the Society and Organizations Centre of HEC Paris, May 2019. Many thanks to discussants and participants for constructive feedback. We also thank the editors and three anonymous reviewers of Big Data & Society for their helpful comments.

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This paper presents results of a larger study called Digital Platform Labour (DiPLab), co-funded by Maison des Sciences de l’Homme Paris-Saclay; Force Ouvrière, a workers’ union, as part of a grant from Institut de recherches économiques et sociales (IRES); and France Stratégie, a service of the French Prime Minister’s office. The platform Foule Factory offered logistical support, and Inria provided complementary funding.

Paola Tubaro https://orcid.org/0000-0002-1215-9145

Acemoglu, D, Restrepo, P (2018) The race between man and machine: Implications of technology for growth, factor shares, and employment. American Economic Review 108(6): 1488–1542.
Aloisi, A, De Stefano, V (2020) Regulation and the future of work: the employment relationship as an innovation facilitator. International Labour Review. Available at: https://doi.org/10.1111/ilr.12160.
Alpaydin, E (2014) Introduction to Machine Learning. 3rd ed. Cambridge, MA: MIT Press.
Alpaydin, E (2016) Machine Learning: The New AI. Cambridge, MA: MIT Press.
Anderson, C (2008) The end of theory: The data deluge makes the scientific method obsolete. Wired. Available at: www.wired.com/2008/06/pb-theory/ (accessed 7 April 2020).
Autor, DH (2015) Why are there still so many jobs? The history and future of workplace automation. Journal of Economic Perspectives 29(3): 3–30.
Autor, DH, Dorn, D (2013) The growth of low-skill service jobs and the polarization of the US labor market. American Economic Review 103(5): 1553–1597.
Bechmann, A, Bowker, G (2019) Unsupervised by any other name: Hidden layers of knowledge production in artificial intelligence on social media. Big Data & Society 6(1). Available at: https://doi.org/10.1177/2053951718819569
Berg, J, Furrer, M, Harmon, E, et al. (2018) Digital labour platforms and the future of work: Towards decent work in the online world. ILO Report.
Bessen, J (2017) Automation and jobs: When technology boosts employment. Boston University School of Law, Law and Economics Research Paper No. 17-09. Boston: Boston University School of Law.
Boelaert, J, Ollion, E (2018) The great regression: Machine learning, econometrics, and the future of quantitative social sciences. Revue Française de Sociologie 59(3): 475–506.
Brynjolfsson, E, McAfee, A (2014) The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies. New York: W. W. Norton.
Casilli, A (2019) En attendant les robots: Enquête sur le travail du clic. France: Seuil.
Casilli, A, Tubaro, P, Le Ludec, C, et al. (2019) Le micro-travail en France. Derrière l’automatisation, de nouvelles précarités au travail? Paris, France: Digital Platform Labor (DiPLab) project.
Chen, B, Metz, C (2019) Google’s Duplex uses A.I. to mimic humans (sometimes). The New York Times. Available at: https://www.nytimes.com/2019/05/22/technology/personaltech/ai-google-duplex.html.
Cherry, M, Aloisi, A (2018) A critical examination of a third employment category for on-demand work (in comparative perspective). In: Davidson, N, Finck, M, Infranca, J (eds.) The Cambridge Handbook of the Law of the Sharing Economy. Cambridge: Cambridge University Press, pp. 316–327.
Chui, M, Manyika, J, Miremadi, M (2016) Four fundamentals of workplace automation. McKinsey Q 2016;1. Available at: https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/four-fundamentals-of-workplace-automation.
Cognilytica (2019) Data engineering, preparation, and labeling for AI 2019. Report n. cgr-de100. Washington DC: Cognilytica.
De Stefano, V (2016) Introduction to the special issue: crowdsourcing, the gig-economy and the law. Comparative Labor Law & Policy Journal 37(3): 461–470.
Domingos, P (2017) The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. London: Penguin.
Ekbia, H, Nardi, B (2017) Heteromation, and Other Stories of Computing and Capitalism. Cambridge, MA: MIT Press.
French Government (Ministry for the Economy, Ministry of Education and Ministry of Digital Technologies) (2017) France intelligence artificielle – Rapport de synthèse. Report. Ministry for the Economy, Ministry of Education and Ministry of Digital Technologies, France.
Frey, C, Osborne, M (2017) The future of employment: How susceptible are jobs to computerisation? Technological Forecasting and Social Change 114(C): 254–280.
Gorwa, R, Guilbeault, D (2018) Unpacking the social media bot: a typology to guide research and policy. Policy & Internet. Available at: https://doi.org/10.1002/poi3.184.
Graham, M, Anwar, M (2019) The global gig economy: towards a planetary labour market? First Monday 24(4). Available at: https://doi.org/10.5210/fm.v24i4.9913.
Graham, M, Shaw, J (2017) Towards a Fairer Gig Economy. London: Meatspace Press.
Gray, M, Suri, S (2017) The humans working behind the AI curtain. Harvard Business Review, 9 January, 2–5.
Gray, M, Suri, S (2019) Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Boston, MA: Houghton Mifflin Harcourt.
Hern, A (2019) Apple contractors ‘regularly hear confidential details’ on Siri recordings. The Guardian. Available at: https://www.theguardian.com/technology/2019/jul/26/apple-contractors-regularly-hear-confidential-details-on-siri-recordings
Hobeika, J (2016) How to build empathy in AI? Julie Desk blog. Available at: www.juliedesk.com/blog/artificial-intelligence-empathy/ (accessed 14 February 2020).
Irani, L (2015a) The cultural work of microwork. New Media & Society 17(5): 720–739.
Irani, L (2015b) Difference and dependence among digital workers: the case of Amazon Mechanical Turk. South Atlantic Quarterly 114: 225–234.
Jobin, A, Ienca, M, Vayena, E (2019) The global landscape of AI ethics guidelines. Nature Machine Intelligence 1: 389–399.
Kässi, O, Lehdonvirta, V (2018) Online labour index: Measuring the online gig economy for policy and research. Technological Forecasting and Social Change 137: 241–248.
Kitchin, R (2014) The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. Thousand Oaks, CA: Sage.
Kuek, S, Paradi-Guilford, C, Fayomi, T, et al. (2015) The global opportunity in online outsourcing. Report, World Bank, Washington, DC.
Lehdonvirta, V, Ernkvist, M (2011) Knowledge map of the virtual economy. Converting the virtual economy into development potential. Report, World Bank, Washington, DC.
Mayer-Schönberger, V, Cukier, K (2013) Big Data: A Revolution That Will Transform How We Live, Work, and Think. Boston, MA: Houghton Mifflin Harcourt.
MMC Ventures (2019) The state of AI 2019: Divergence. Report. London, UK: MMC Ventures.
Prassl, J (2018) Humans as a Service: The Promise and Perils of Work in the Gig Economy. Oxford: Oxford University Press.
Ram, A (2019) Europe’s AI start-ups often do not use AI, study finds. Financial Times. Available at: https://www.ft.com/content/21b19010-3e9f-11e9-b896-fe36ec32aece.
Roberts, S (2019) Behind the Screen: Content Moderation in the Shadows of Social Media. New Haven, CT: Yale University Press.
Schmidt, C (2017) Digital labour markets in the platform economy: Mapping the political challenges of crowd work and gig work. Report, Friedrich-Ebert-Stiftung, Bonn, Germany.
Schmidt, F (2019) Crowdproduktion von Trainingsdaten: Zur Rolle von Online-Arbeit beim Trainieren autonomer Fahrzeuge. Report, Hans-Böckler-Stiftung, Düsseldorf, Germany.
Silberman, S, Irani, L (2016) Operating an employer reputation system: Lessons from Turkopticon, 2008–2015. Comparative Labor Law & Policy Journal 37(3): 505–542.
Tubaro, P, Casilli, A (2019) Micro-work, artificial intelligence and the automotive industry. Journal of Industrial and Business Economics 46(3): 333–345.
Villani, C, Schoenauer, M, Bonnet, Y, et al. (2018) Donner un sens à l’intelligence artificielle: Pour une stratégie nationale et européenne. Report, France Premier Ministre.

Source : The trainer, the verifier, the imitator: Three ways in which human platform workers support artificial intelligence – Paola Tubaro, Antonio A Casilli, Marion Coville, 2020
