Identifying Terms & Conditions Important to Consumers Using Crowdsourcing and Pairwise Comparisons
Unpublished · Xingyu Liu, Annabel Sun, Jonathan Dinu, Alex Sciuto, Sarah Shy, Jason Hong

Terms and conditions (T&Cs) are pervasive on the web and often contain important information for consumers, but are rarely read. In this paper, we aim to help internet users better understand the policies they are implicitly agreeing to by surfacing important information for them. We use a combination of crowdsourcing, pairwise comparisons of statements in T&Cs, and the Bradley-Terry model to build a ranking of important statements. We present an analysis suggesting that consumers have high agreement on what they consider “important”, as well as an analysis of accuracy tradeoffs with respect to the amount of data needed. We also built a machine learning model reaching 86.8% accuracy, 93.2% recall, and 82.0% precision in detecting user-labelled important clauses. Lastly, we examine which words and which statements crowd workers considered important.
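To illustrate how pairwise comparisons can be turned into a ranking, here is a minimal sketch of fitting a Bradley-Terry model in Python. The abstract does not say how the model was fitted, so the standard minorization-maximization (MM) iteration used here is an assumption, as are the function name `bradley_terry`, the statement labels, and the toy comparison data, all of which are invented for illustration and are not taken from the study.

```python
from collections import defaultdict


def bradley_terry(comparisons, n_iter=200, tol=1e-9):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Uses the standard MM update
        p_i <- wins_i / sum_j n_ij / (p_i + p_j)
    and renormalizes so the strengths sum to 1.
    Assumes every item wins at least one comparison; otherwise its
    estimated strength collapses to zero.
    """
    wins = defaultdict(float)          # wins per item
    pair_counts = defaultdict(float)   # comparisons per unordered pair
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))

    items = sorted(items)
    strength = {i: 1.0 / len(items) for i in items}

    for _ in range(n_iter):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            updated[i] = wins[i] / denom if denom else strength[i]
        total = sum(updated.values())
        updated = {i: v / total for i, v in updated.items()}
        delta = max(abs(updated[i] - strength[i]) for i in items)
        strength = updated
        if delta < tol:
            break
    return strength


# Hypothetical crowd judgments: each tuple is (more important, less important).
comparisons = (
    [("data_sharing", "auto_renewal")] * 3 + [("auto_renewal", "data_sharing")]
    + [("data_sharing", "font_license")] * 3 + [("font_license", "data_sharing")]
    + [("auto_renewal", "font_license")] * 3 + [("font_license", "auto_renewal")]
)

strengths = bradley_terry(comparisons)
ranking = sorted(strengths, key=strengths.get, reverse=True)
print(ranking)
```

Under the model, the probability that statement i is judged more important than statement j is p_i / (p_i + p_j), so sorting statements by estimated strength gives the importance ranking. In practice a regularized or library-based fit may be preferable when some statements never win a comparison.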
