Microsoft shortening task

The following is the task description from our Microsoft colleague.

Task 1. Ranking shortened alternatives

The input files are hosted on Slack. Included are several files --- quotes below from our contact at Microsoft:

  • gudelinesranking.txt: "a very brief description of the new sentence ranking task I would like to do"

"I would like ranking by 2 judges for each input. Some of the input texts are paragraphs consisting of two sentences, and others consist of one sentence only. The payment for each HIT could be based on the number of words as some are much shorter than others. (The two-sentence paragraphs in text_torank.tsv also appear in the source text for the paragraphShortenTask I have sent, so we could compare the quality later on. Or, if new shortened versions are first competed in the paragraphShortenTask, we could add them to the candidates for the second task)."

The other files are from an old version of the task that didn't work very well. We include them for context. They are:

  • ratingguidelines.3way.pdf: "my guidelines for the workers" (for the old task). "There is a screenshot of the interface in the pdf."
  • RateShortening.html: "my old interface for the related rating task" (for the old task).
  • shortened1.tsv: ignore this file, it's old input

Again, we're going to be turning her old ordinal rating task into a ranking task.
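
The quote above suggests that payment for each ranking HIT could depend on the number of words in the input. A minimal Python sketch of one way to do that follows. The rates (BASE_PAY, PER_WORD_PAY), the function names, and the base-plus-per-word formula are all assumptions for illustration; our contact did not specify amounts or a formula.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rates: the task description does not give any amounts.
BASE_PAY = Decimal("0.05")       # assumed flat amount per HIT, in dollars
PER_WORD_PAY = Decimal("0.002")  # assumed amount per input word, in dollars


def word_count(text: str) -> int:
    """Count whitespace-separated tokens in the input text."""
    return len(text.split())


def hit_payment(text: str) -> Decimal:
    """Return an illustrative payment for one HIT, rounded to whole cents."""
    raw = BASE_PAY + PER_WORD_PAY * word_count(text)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

With this scheme a one-sentence input pays less than a two-sentence paragraph, while the flat base keeps very short inputs from paying almost nothing. The input texts would be read from text_torank.tsv; its column layout is not described here, so the file-reading step is left out.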

Task 2. Shortening paragraphs

Here, she is looking for a different set of workers to do the task. She was unhappy with how the task turned out previously. We are welcome to redesign the task interface to encourage better results. She said "We could keep things very similar to this task definition for the new annotation." The files are hosted on Slack.

She would like one high-quality shortened version of each paragraph. She already has five shortened versions from Microsoft's internal crowdsourcing platform but is unhappy with their quality, so she would like to compare our workers to theirs.

Included are several files --- quotes below from our contact at Microsoft:

  • shorteninterface.html: "my old interface in html with javascript (using some Microsoft-platform-specific templates). Note that I am also collecting all keystrokes and other actions of the user as we wanted to collect data on the writing process of the workers."
  • masc.2sent.tsv: "input paragraphs for shortening. They are 2 sentence paragraphs but I could get longer ones for the real data."
  • guidelinesshorten2.pdf: "my guidelines to the crowd workers. There is a screenshot of the interface in the pdf."