Abstract: |
Online reviews significantly affect consumers' decision-making and
firms' economic outcomes and are widely seen as crucial to the success of
online markets. Firms, therefore, have a strong incentive to manipulate
ratings using fake reviews. Academic researchers have tried to solve this
problem for over two decades, and platforms expend substantial resources on
it. Nevertheless, the prevalence of fake reviews is arguably
higher than ever. To combat this, we collect a dataset of reviews for
thousands of Amazon products and develop a general and highly accurate method
for detecting fake reviews. Unlike previous datasets, ours lets us
directly observe which sellers buy fake reviews. Thus, while
prior research has trained models using lab-generated reviews or proxies for
fake reviews, we are able to train a model using actual fake reviews. We show
that products that buy fake reviews are highly clustered in the
product-reviewer network. Therefore, features constructed from this network
are highly predictive of which products buy fake reviews. We show that our
network-based approach also succeeds at detecting fake reviews even
without ground-truth data, as unsupervised clustering methods can accurately
identify fake review buyers by finding clusters of products that are
closely connected in the network. While text or metadata can be manipulated to
evade detection, network-based features are more costly to manipulate because
these features result directly from the inherent limitations of buying reviews
from online review marketplaces, making our detection approach more robust to
manipulation. |