RPubs

by RStudio

christian_adriano

Christian Medeiros Adriano

Recently Published

K-Nearest Neighbor - Predicting Bug Covering by Ranking

Train a KNN algorithm to identify a bug-covering question by looking at the ranking of the question. The ranking is based on the number of YES answers received by the question: the question(s) with the highest number of YES answers are ranked one, the questions with the second-highest number of YES answers are ranked two, and so forth.

about 8 years ago
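
The ranking described in this entry can be sketched as a dense ranking over YES counts. This is only an illustration, not the author's code: the function name `rank_by_yes` and the sample counts are hypothetical, and it assumes tied questions share a rank with no gaps between ranks.

```python
def rank_by_yes(yes_counts):
    """Dense-rank questions: the highest YES count gets rank 1, ties share a rank."""
    # Distinct YES counts, from highest to lowest.
    distinct = sorted(set(yes_counts.values()), reverse=True)
    # Map each distinct count to its 1-based position.
    rank_of = {count: i + 1 for i, count in enumerate(distinct)}
    return {question: rank_of[count] for question, count in yes_counts.items()}


# Hypothetical YES counts per question (illustrative only, not study data).
example = {"q1": 7, "q2": 7, "q3": 4, "q4": 1}
ranks = rank_by_yes(example)
```

Under this assumption, `q1` and `q2` share rank one, and the resulting rank would be the single feature passed to the classifier.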

K-Nearest Neighbor - Predicting Bug Covering by Threshold Voting

Estimate the level of YES votes necessary to predict the code fragments that are related to failure. This minimal level is called the threshold voting metric. I employed k-nearest neighbor with leave-one-out cross-validation. Results showed that the best classification came when the number of YES votes was at least 6.

over 8 years ago
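
A minimal sketch of k-nearest neighbor with leave-one-out cross-validation on a single numeric feature (such as a YES-vote count) might look like the following. The function `loo_knn_accuracy`, the choice of `k=3`, and the toy samples are assumptions made for illustration; they do not reproduce the published analysis or its data.

```python
def loo_knn_accuracy(samples, k=3):
    """Leave-one-out accuracy of a 1-D k-nearest-neighbor classifier.

    samples: list of (feature, label) pairs, where label is a bool.
    """
    correct = 0
    for i, (x, label) in enumerate(samples):
        # Hold out sample i and use every other sample as training data.
        rest = samples[:i] + samples[i + 1:]
        # The k training samples closest to x on the single feature.
        nearest = sorted(rest, key=lambda s: abs(s[0] - x))[:k]
        # Predict True when a strict majority of neighbors are True.
        votes = sum(1 for _, neighbor_label in nearest if neighbor_label)
        predicted = votes * 2 > len(nearest)
        correct += predicted == label
    return correct / len(samples)


# Hypothetical (YES votes, covers bug) pairs, illustrative only.
toy = [(1, False), (2, False), (3, False), (7, True), (8, True), (9, True)]
accuracy = loo_knn_accuracy(toy, k=3)
```

Each sample is classified by a model that never saw it, which is why leave-one-out suits small data sets where a separate test split would be too costly.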

Bug prediction based on Majority Voting

Estimate the level of majority vote necessary to predict the code fragments that are related to failure. The majority vote is computed as the difference between the number of YES votes and the number of NO votes. I employed k-nearest neighbor with leave-one-out cross-validation. Results showed that the best classification came when the difference between YES and NO votes was greater than or equal to -2.

over 8 years ago
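
The majority-vote metric and the reported cut-off can be written out as a short sketch. The function names `majority_vote` and `predicts_bug` are hypothetical, and the default threshold of -2 is taken from the result stated in the description above.

```python
def majority_vote(yes_votes, no_votes):
    """Majority-vote metric: number of YES votes minus number of NO votes."""
    return yes_votes - no_votes


def predicts_bug(yes_votes, no_votes, threshold=-2):
    """Flag a question as bug covering when the vote difference meets the threshold."""
    return majority_vote(yes_votes, no_votes) >= threshold


# Illustrative vote counts only, not study data.
at_threshold = predicts_bug(3, 5)     # difference is -2
below_threshold = predicts_bug(2, 5)  # difference is -3
```

Note that a negative threshold means a question can be flagged even when NO votes outnumber YES votes by up to two.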

How are workers distributed by bug covering questions

Since I did not control for workers across different types of questions, I would like to know whether there is a statistically significant concentration of highly skilled workers on the questions that cover bugs.

over 8 years ago

Are answers more accurate for easier questions?

Analysis of answer accuracy versus difficulty of questions.

over 8 years ago

Were some questions answered by more experienced workers?

Analysis of how workers were distributed across questions in terms of their years of programming experience.

over 8 years ago

How are professions distributed across questions?

Workers self-declared different profession types, e.g., professional developer, graduate student, hobbyist, undergraduate student, and other. Since workers were randomly allocated to questions, I would like to know if some questions were answered by a disproportionate number of workers from certain professions. I am particularly interested in the professions that show lower answer accuracy, because this can cause some questions to be overlooked.

over 8 years ago

Worker score distribution across questions

How is answer accuracy distributed by question?

Are there meaningful clusters of Mechanical Turk workers grouped by age and years of programming experience?
