
Getting Started with Mechanical Turk

Emily Tucker Prud’hommeaux


June 15, 2010
Outline
1. Overview of Mechanical Turk concept.
2. Creating and funding your account.
3. Using the GUI.
• Designing your tasks.
• Submitting your tasks.
• Reviewing and approving your results.
4. Getting fancy with the GUI: audio and video.
5. Using the command line tools.
6. Getting fancy with the command line: external pages.
Mechanical Turk, a.k.a. MTurk
What is Mechanical Turk?
• Then: A chess-playing “robot” -- actually a guy in a box.

• Now: A service run by Amazon.com that allows people
worldwide to do work or answer questions for you.
Mechanical Turk Terminology
• Requester: You, the person asking the
questions.
• Workers (or Turkers): The people answering
your questions.
• Human Intelligence Task (HIT): The question or
set of questions you want them to answer.
• Reward: How much you pay a Worker for a HIT.
MTurk vs. Traditional Methods

Mechanical Turk                         Traditional Methods

Many workers answer a few questions     Few subjects answer lots of questions
in a short period.                      over a long period.

Not a lot of interaction -- may be      Tons of interaction -- lots of
hard to explain task.                   opportunity to explain things.

Who are these people?!?                 You know your subjects.

Very cheap, and you don’t have to       Not so cheap, and you have to
pay if they do a bad job.               pay the people anyway.

Quality control is tricky.              Quality control is not so hard.

Less opportunity for bias on the        More opportunity for bias.
part of the experimenter.
Creating Your Accounts
1. Create an Amazon Mechanical Turk Requester
account. You need this to use Mechanical Turk.
https://requester.mturk.com/mturk/beginsignin

2. (Optional) Create an Amazon Web Services (AWS)
account. You need this to be able to use the command line
tools and possibly for some other things:
https://aws-portal.amazon.com/gp/aws/developer/registration/index.html
Funding Your Account
Creating a HIT
1. Click the Design tab
Select a Template
2. Select a HIT template.

Letʼs try Data Collection


Enter Properties

Be brief but descriptive.

Don’t give people too much time

Other criteria can be helpful (e.g., must live in US). Amazon
displays your HIT only to the people who meet the criteria.

Reward: usually just a few cents, unless it’s really long.
Design Layout

Click here to edit the HTML.

Ah, much better!


Design Layout

Input data variables. You’ll upload a CSV file containing their
values. Format them this way and MTurk will interpret them for you.

This is how worker responses get stored, just like a regular old
HTML form, which you already know all about.

Hint: If you want some specific type of HTML form input (e.g., radio
buttons, drop down menu, checkbox), look at the Blank Template template.
Preview and Finish

Recall: we will upload a CSV file to fill in these blanks for each HIT.
Publishing Your HIT
Create and Upload CSV File

You create the CSV file on your computer and upload it here. It will look
something like this for this example.

name,address,phone
Bread and Ink,3600 SE Hawthorne,503-555-1212
Three Doors Down,1415 SE 38th,503-555-1213
Cha cha cha!,3375 SE Hawthorne,503-555-1214
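
If you would rather generate the CSV file than type it by hand, a short script can do it. This is a minimal sketch in Python, not part of MTurk itself; the rows are the example values from this slide, and the `build_csv` helper is a made-up name for illustration. The header row must match the variable names used in your template.

```python
import csv
import io

# Example rows copied from the slide; replace with your own data.
rows = [
    {"name": "Bread and Ink", "address": "3600 SE Hawthorne", "phone": "503-555-1212"},
    {"name": "Three Doors Down", "address": "1415 SE 38th", "phone": "503-555-1213"},
    {"name": "Cha cha cha!", "address": "3375 SE Hawthorne", "phone": "503-555-1214"},
]


def build_csv(rows, fieldnames):
    """Return CSV text whose header row names the template variables."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_csv(rows, ["name", "address", "phone"])
print(csv_text)
```

Using the csv module rather than joining strings with commas means that a value containing a comma (a full street address, say) gets quoted correctly.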
Preview your HIT

The ${name}, ${phone}, and ${address} variables got filled
with the values from your CSV file.
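
To see what the substitution does, you can mimic it locally before uploading anything. The sketch below uses Python's `string.Template`, which happens to use the same `${name}` placeholder syntax; it only illustrates the idea and is not how MTurk implements it. The HTML snippet is a made-up example.

```python
from string import Template

# Hypothetical fragment of a HIT template using the three variables.
template_html = Template(
    "<p>Please find the website for ${name}, ${address}, ${phone}.</p>"
)

# One row of the CSV file, as a dictionary keyed by the header names.
row = {
    "name": "Bread and Ink",
    "address": "3600 SE Hawthorne",
    "phone": "503-555-1212",
}

# substitute() raises KeyError if the row lacks a variable the template
# uses, which is a handy local check before you upload the CSV.
filled = template_html.substitute(row)
print(filled)
```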
Confirm and Publish
Manage HITs and Results
Review and Download Results
Download results to your computer.

You get to process your results file however you like --
open it in Excel or write a program to make it look nice.
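
As one example of "write a program", here is a minimal Python sketch that groups the workers' answers by input item. The column names (`Input.name`, `Answer.website`) and the sample rows are assumptions for illustration; check the header row of your own downloaded file and adjust them.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-in for a downloaded results file.
sample_results = '''"HITId","WorkerId","Input.name","Answer.website"
"H1","W1","Bread and Ink","http://example.com/a"
"H1","W2","Bread and Ink","http://example.com/a"
"H2","W1","Three Doors Down","http://example.com/b"
'''


def answers_by_item(results_text, item_column, answer_column):
    """Collect every worker's answer under the input item it belongs to."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(results_text)):
        grouped[row[item_column]].append(row[answer_column])
    return dict(grouped)


grouped = answers_by_item(sample_results, "Input.name", "Answer.website")
for item, answers in grouped.items():
    print(item, answers)
```

With a real file, replace the `io.StringIO(...)` wrapper with an opened file handle.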

Approve or reject that worker’s work.


Including Audio without Flash
• For audio, you can convert your wavs to mp3, put them on
the web, have the links to the mp3s be your variables in the
CSV file, then force the links to open in a new window.
CSV file
audiofile1,audiofile2
http://etucker.com/a1.mp3,http://etucker.com/a2.mp3

Template HTML
<a target="_blank" href="${audiofile1}">Audio1</a>

• If you want something more reliable, embed the audio in a
Flash player, which I am about to describe.
• If you need more control (e.g., you want to prevent the
worker from listening to the audio more than once), you
might need to use something fancier like Javascript.
Including Audio with Flash
• If you donʼt want the audio to open in a new window,
embed the audio in a Flash player.
• I use the Google audio Flash player, which works well and
has nice controls.
• The html will look something like this:
<embed src="http://www.google.com/reader/ui/3523697345-audio-player.swf"
flashvars="audioUrl=${audiofile}" width="400" height="27" quality="best"
type="application/x-shockwave-flash"></embed>

• The input file will look something like this:


audiofile
http://www.csee.ogi.edu/mechturk/audio1.mp3
http://www.csee.ogi.edu/mechturk/audio2.mp3
http://www.csee.ogi.edu/mechturk/audio3.mp3
http://www.csee.ogi.edu/mechturk/audio4.mp3
Including Video
• For videos, I have been using Flash.
• Flash works reliably in all browsers (when it doesnʼt crash
them or take up the whole CPU) and everyone has it.
• If a lot of Workers start using iPads, this might not be a
good solution.
• Itʼs all super easy, so why am I presenting this?
• Because it took me so long to find the best tools and figure
out the best way to do the HTML so that it would work in
MTurk and in all browsers.
Video with Flash: Preparation
1. Convert your videos to .flv format. I have used FLVCrunch:
http://download.cnet.com/FLV-Crunch/3000-2194_4-10909295.html

2. Get a Flash player. I have used the free JW Player:
http://www.longtailvideo.com/players/jw-flv-player

3. Put both the player components (as described in the JW
Player instructions) and your .flv videos on the internet
somewhere. Sean created this directory for me on the
csee.ogi.edu servers:
/vol0/projects/www/CSE/public_html_noredirect/mech

which you can access on the web with this URL:
http://www.csee.ogi.edu/mech
Video with Flash: MTurk Part
4. Include your videos as variables in your CSV file like this:
video1,video2
http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo1.flv,http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo2.flv

5. In the template for your HIT, include a line like this for each
video you want to include in that HIT:
<embed height="300" width="300" src="${video1}" name="player1"
id="player1"></embed>
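
The CSV values in step 4 are long and easy to mistype, so it can help to generate them. A minimal Python sketch, assuming the player and the videos live at the URLs shown on these slides; the helper names `player_link` and `video_csv` are made up for illustration.

```python
import csv
import io

# Locations taken from the slides; change them to wherever your files live.
PLAYER_URL = "http://www.csee.ogi.edu/mech/player.swf"
VIDEO_BASE = "http://www.csee.ogi.edu/mech/video/"


def player_link(video_filename):
    """Build the player URL that loads one .flv file."""
    return f"{PLAYER_URL}?file={VIDEO_BASE}{video_filename}"


def video_csv(pairs):
    """Return CSV text with one row per HIT and two videos per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["video1", "video2"])
    for first, second in pairs:
        writer.writerow([player_link(first), player_link(second)])
    return buffer.getvalue()


csv_text = video_csv([("myawesomevideo1.flv", "myawesomevideo2.flv")])
print(csv_text)
```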
Command Line Tools: Why?
Instead of using the GUI to set up your MTurk experiment,
you can use command line tools.
Advantages:
1. Approval/rejection process is easier when you have lots
of data from lots of workers.
2. More power to manage workers: block workers, set
qualifications for workers.
3. Possible to change properties for HIT already in progress.
4. Can use the sandbox to try out your experiments.
5. With external pages, much more flexibility in what
kind of web stuff you can do, like Javascript.
Command Line Tools: Basics
1. Download and install command line tools from here:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=694

2. Sign up for an AWS account, if you didnʼt before:
https://aws-portal.amazon.com/gp/aws/developer/registration/index.html

3. Associate your installation with your AWS identifiers

a) Find your identifiers:
http://s3.amazonaws.com/mturk/tools/pages/aws-access-identifiers/aws-identifier.html

b) Enter those identifiers in bin/mturk.properties file:
access_key=[Your AWS Access Key]
secret_key=[Your Secret Key]
Command Line Tools: Documentation
There is some good documentation for the Mechanical Turk
command line tools:
1. The UserGuide.html that comes with the tools: definitely
use it to get started with everything.
2. The samples directory:
• Anything youʼd like to do with the command line tools is
pretty easy to figure out just by copying the samples...
• ...except setting up an external page, which is poorly
documented, which is why that is our next topic.
External Pages
• Get started using the samples/external_page directory
in your command line tools installation.

-rw-r--r-- 1 emtucker emtucker  119 Apr 24  2008 external_hit.input
-rw-r--r-- 1 emtucker emtucker  619 Apr 24  2008 external_hit.properties
-rw-r--r-- 1 emtucker emtucker  621 Feb  8 22:59 external_hit.question
-rw-r--r-- 1 emtucker emtucker 2218 Apr 24  2008 externalpage.htm
-rwxr-xr-x 1 emtucker emtucker  667 Apr 24  2008 approveAndDeleteResults.sh
-rwxr-xr-x 1 emtucker emtucker  705 Apr 24  2008 getResults.sh
-rwxr-xr-x 1 emtucker emtucker  671 Apr 24  2008 reviewResults.sh
-rwxr-xr-x 1 emtucker emtucker  799 Apr 24  2008 run.sh

external_hit.input
This is like the input file you used with the GUI, but tab
separated instead of comma separated.

external_hit.properties
Title, description, reward, qualifications, time allotted, what
your input variables are called.

external_hit.question
Link to external page plus how to get your input variables into
your page. More on this shortly.

externalpage.htm
The external web page itself. More on this shortly.

*.sh
All of the pre-written scripts for submitting your HITs,
downloading the results, and approving/rejecting the work.
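
If you already have a comma-separated file from the GUI workflow, you can convert it to the tab-separated layout that external_hit.input uses. A minimal Python sketch; the `csv_to_tsv` name and the sample data are made up, and whether the tools expect any quoting beyond plain tabs is something to verify against the sample external_hit.input file.

```python
import csv
import io


def csv_to_tsv(csv_text):
    """Re-write comma-separated text as tab-separated text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Hypothetical input with the two variables used later in these slides.
gui_csv = 'id1,sent1\n1,"Hello, world"\n2,Goodbye\n'
tsv_text = csv_to_tsv(gui_csv)
print(tsv_text)
```

Reading with the csv module first matters: the quoted value "Hello, world" contains a comma, so a plain text replace of commas with tabs would split it in two.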
Data Files

external_hit.input

external_hit.properties

external_hit.question
external_hit.question
http://www.csee.ogi.edu/page.html?id1=${helper.urlencode($id1)}&amp;sent1=${helper.urlencode($sent1)}

• http://www.csee.ogi.edu/page.html: The URL to your external
page, wherever you decide to put it.
• ${helper.urlencode(...)}: The helper.urlencode bit is how MTurk
puts the values of your input variables (which it gets from the
.input file) into the URL for the page for each HIT.
• Then, in your external web page, you’ll use Javascript (or
something else of your choice) to read these items out of the
URL in order to use them in your page where you need them.

MTurk also automatically inserts the AssignmentID variable into
the URL. That is, if a worker accepts the HIT, a unique
Assignment ID will be created and included in the URL. You will
have to use that information when you post the results to MTurk
in your external page.
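
To make the round trip concrete, here is a minimal Python sketch of what happens to the values: they are URL-encoded into the query string, and the page later decodes them again. In the real external page you would do the decoding in Javascript; Python is used here only to illustrate. The parameter name `assignmentId` is how I understand MTurk to spell it in the URL, so confirm it against the sample externalpage.htm. The example values are made up.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Page location taken from the slide.
PAGE_URL = "http://www.csee.ogi.edu/page.html"


def build_hit_url(id1, sent1, assignment_id):
    """Mimic the URL a worker receives: encoded inputs plus assignment ID."""
    query = urlencode({"id1": id1, "sent1": sent1, "assignmentId": assignment_id})
    return f"{PAGE_URL}?{query}"


def read_params(url):
    """Decode the query string back into a plain dictionary."""
    parsed = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in parsed.items()}


url = build_hit_url("42", "The cat sat & purred.", "EXAMPLE_ASSIGNMENT_ID")
params = read_params(url)
print(url)
print(params)
```

The ampersand in the example sentence shows why the encoding step is needed: without it, the sentence would be cut off at the `&`.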
The External Page
Needs to have a few important things:
• Javascript (or other) code for extracting the values of your
input variables out of the URL.
• Javascript (or other) code for accessing the Assignment ID
and for posting the workerʼs responses to MTurk.
This is all included in the externalpage.htm file in the
samples/external_page directory of the command line tools
installation.
The example external page is very helpful, but poorly
commented.
External Web Page:
Javascript code for extracting
URL parameters.
External Web Page:
Javascript code for using
extracted URL parameters.

This part is very important! The worker must accept the HIT before being able to
complete it. Be sure to include this (or something like it) in your external page.
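
The check itself is small. The sketch below shows the logic in Python for illustration only; the real page does this in Javascript, and the sample externalpage.htm is the reference to copy from. It assumes that while a worker is only previewing a HIT, MTurk passes the placeholder value `ASSIGNMENT_ID_NOT_AVAILABLE` as the assignment ID; the function name and example URLs are made up.

```python
from urllib.parse import parse_qs, urlparse

# Assumed placeholder that MTurk sends while the HIT is only being previewed.
PREVIEW_PLACEHOLDER = "ASSIGNMENT_ID_NOT_AVAILABLE"


def hit_is_accepted(url):
    """Return True only if the URL carries a real assignment ID."""
    values = parse_qs(urlparse(url).query).get("assignmentId", [])
    return bool(values) and values[0] != PREVIEW_PLACEHOLDER


preview_url = "http://www.csee.ogi.edu/page.html?assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE"
accepted_url = "http://www.csee.ogi.edu/page.html?assignmentId=EXAMPLE_ASSIGNMENT_ID"

print(hit_is_accepted(preview_url))
print(hit_is_accepted(accepted_url))
```

In the page, the equivalent check would typically disable the submit button and tell the worker to accept the HIT first.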
Command Line Tools: Sandbox
• Good idea to try out your experiments in the sandbox.
• Sandbox lets you see exactly how your HIT will look to
potential workers.

1. In your bin/mturk.properties file, comment out this line:
#service_url=http://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester

and uncomment this line:
service_url=http://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester

2. In your external html page, replace references to
http://www.mturk.com/mturk/externalSubmit

with
http://workersandbox.mturk.com/mturk/externalSubmit
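
Since you will switch back and forth between the sandbox and the real site, step 2 is worth scripting so you do not forget it. A minimal Python sketch using the two URLs from this slide; the function names and the sample form line are made up for illustration.

```python
# The two submit URLs given on this slide.
PRODUCTION_SUBMIT = "http://www.mturk.com/mturk/externalSubmit"
SANDBOX_SUBMIT = "http://workersandbox.mturk.com/mturk/externalSubmit"


def use_sandbox(page_html):
    """Point every submit reference in the page at the sandbox."""
    return page_html.replace(PRODUCTION_SUBMIT, SANDBOX_SUBMIT)


def use_production(page_html):
    """Point every submit reference in the page back at the real site."""
    return page_html.replace(SANDBOX_SUBMIT, PRODUCTION_SUBMIT)


# Hypothetical line from an external page.
page = '<form method="POST" action="http://www.mturk.com/mturk/externalSubmit">'
sandbox_page = use_sandbox(page)
print(sandbox_page)
```

To apply it to a real page, read the file's text, pass it through one of the two functions, and write the result back.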
Lots of Other Topics
• Using command line tools to interact more closely with
workers, design ways of determining who is a good worker
and recruiting those workers, banning specific workers.
• Using the Amazon Mechanical Turk SDK.

• Practical concerns: What kinds of projects can you do with
Mechanical Turk? Are some projects better carried out with
traditional methods?
• How much money do we save using Mechanical Turk?
Sometimes it might be cheaper and easier to use a few
carefully chosen local workers, or even people currently
employed at OGI.
