What books are hardest for a reader who starts them to finish, and most likely to be abandoned? I scrape a crowdsourced tag, abandoned, from the GoodReads book social network to estimate the conditional probability of a book being abandoned.

The default GoodReads tag interface presents only raw counts of tags, not counts divided by total ratings (= reads). This conflates popularity with probability of being abandoned: a popular but rarely-abandoned book may have more abandoned tags than a less popular but often-abandoned book. There is also residual error from the winner's curse, where books with fewer ratings are more mis-estimated than popular books. I fix both problems to see what more correct rankings look like.

Correcting for both changes the top-5 ranking completely relative to the raw counts; one of the ranked titles is The Witches: Salem, 1692, by Stacy Schiff.

I also consider a model adjusting for covariates (author, average rating, year) to see which books are most surprisingly often-abandoned given their pedigree and rating. Abandon rates increase the newer a book is and the lower its average rating.
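To make the two corrections concrete, here is a minimal Python sketch. It is not the model used in this analysis, whose details are not given here. It assumes a simple beta-binomial shrinkage estimator as one common way to damp the winner's curse. The `Book` class, the `PRIOR_STRENGTH` value, and all the counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    abandoned: int  # number of 'abandoned' tags (hypothetical data)
    ratings: int    # total ratings, used as a proxy for reads


def raw_rate(book: Book) -> float:
    """Abandoned tags divided by total ratings: corrects for popularity."""
    return book.abandoned / book.ratings


def shrunken_rate(book: Book, prior_mean: float, prior_strength: float) -> float:
    """Posterior mean under a Beta prior (an assumed estimator, see lead-in).

    The prior acts like `prior_strength` extra pseudo-ratings abandoned at
    rate `prior_mean`, so books with few ratings are pulled toward the
    overall rate while heavily rated books barely move.
    """
    pseudo_abandoned = prior_mean * prior_strength
    return (book.abandoned + pseudo_abandoned) / (book.ratings + prior_strength)


def rank(books, key):
    """Titles sorted from highest to lowest value of `key`."""
    return [b.title for b in sorted(books, key=key, reverse=True)]


# Made-up numbers, for illustration only.
books = [
    Book("Popular, rarely abandoned", abandoned=500, ratings=1_000_000),
    Book("Niche, often abandoned", abandoned=100, ratings=2_000),
    Book("Tiny sample", abandoned=2, ratings=10),
]

# Pooled abandon rate across all books serves as the prior mean.
pooled = sum(b.abandoned for b in books) / sum(b.ratings for b in books)
PRIOR_STRENGTH = 1000  # hypothetical: weight of the prior in pseudo-ratings

by_count = rank(books, key=lambda b: b.abandoned)
by_raw = rank(books, key=raw_rate)
by_shrunk = rank(books, key=lambda b: shrunken_rate(b, pooled, PRIOR_STRENGTH))

print("raw counts:    ", by_count)
print("raw proportion:", by_raw)
print("shrunken:      ", by_shrunk)
```

With these made-up numbers, ranking by raw count favors the popular book, ranking by raw proportion favors the book with only ten ratings, and the shrunken estimate puts the well-sampled, often-abandoned book first.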