| Category | Task | Definition | Type | Metrics | Data |
|---|---|---|---|---|---|
| Product Understanding | AVE | Given the titles, descriptions, features, and brands of the products, extract values for the specific target attributes. | Information extraction | precision*, recall*, F1* | MAVE based on Amazon Review 2018; OOD: 7 held-out attributes |
| Product Understanding | PRP | Given the titles of two products, predict their relation from “also buy”, “also view”, and “similar”. | Multi-class classification | accuracy, macro precision, macro recall, macro F1 | Amazon Review 2018; OOD: Tools category |
| Product Understanding | PM | Given the titles, descriptions, manufacturers, and prices of the products from two different platforms, predict if they are the same product. | Binary classification | accuracy, precision, recall, F1, specificity, negative prediction rate | Amazon-Google Product |
| User Understanding | SA | Given a product review by a user, identify the sentiment that the user expressed on the product. | Multi-class classification | accuracy, macro precision, macro recall, macro F1 | Amazon Review 2018; OOD: Tools category |
| User Understanding | SR | Given the interactions of a user over the products, predict the next product that the user would be interested in. | Ranking | HR@1 | Amazon Review 2018 and Amazon Review 2014; OOD: Tools category |
| Query Product Matching | MPC | Given a query and a product title, predict the relevance between the query and the product. | Multi-class classification | accuracy, macro precision, macro recall, macro F1 | Shopping Queries Dataset |
| Query Product Matching | PSI | Given a user query and a potentially relevant product, predict if the product can serve as a substitute for the user’s query. | Binary classification | accuracy, precision, recall, F1, specificity, negative prediction rate | Shopping Queries Dataset |
| Query Product Matching | QPR | Given a user query and a list of potentially relevant products to the query, rank the products according to their relevance to the query. | Ranking | NDCG | Shopping Queries Dataset |
| Product QA | AP | Given a product-related question and reviews of this product, predict if the question is answerable. | Binary classification | accuracy, precision, recall, F1, specificity, negative prediction rate | AmazonQA; OOD: Cells category |
| Product QA | AG | Given a product-related question and reviews as supporting documents, generate the answer to the question. | Generation | P_BERT, R_BERT, F_BERT, BLEURT | AmazonQA; OOD: Cells category |
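
Several of the metrics listed for the classification and ranking tasks (specificity, negative prediction rate, macro F1, HR@1, NDCG) are less familiar than plain accuracy, so a minimal sketch of how they could be computed may help. This is an illustration under stated assumptions, not the evaluation code behind the table: the function names are hypothetical, zero denominators are mapped to 0.0, macro F1 is taken as the unweighted mean of per-class F1, and NDCG uses linear gain with a log2 discount, which is one common variant that the table does not specify.

```python
import math
from typing import Hashable, Sequence


def _ratio(num: float, den: float) -> float:
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    return num / den if den else 0.0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict:
    """Metrics for the binary tasks (PM, PSI, AP); labels are 0 or 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        # Specificity: share of true negatives among all actual negatives.
        "specificity": _ratio(tn, tn + fp),
        # Negative prediction rate: share of correct predictions among all
        # negative predictions (often called negative predictive value).
        "npr": _ratio(tn, tn + fn),
    }


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Macro F1 for the multi-class tasks (PRP, SA, MPC), one-vs-rest."""
    labels = set(y_true) | set(y_pred)
    scores = [
        binary_metrics(
            [int(t == label) for t in y_true],
            [int(p == label) for p in y_pred],
        )["f1"]
        for label in labels
    ]
    return _ratio(sum(scores), len(scores))


def hit_rate_at_1(ranked_lists: Sequence[Sequence[Hashable]],
                  targets: Sequence[Hashable]) -> float:
    """HR@1 for SR: fraction of users whose top-ranked item is the target."""
    hits = sum(
        1 for ranked, target in zip(ranked_lists, targets)
        if ranked and ranked[0] == target
    )
    return _ratio(hits, len(targets))


def ndcg(relevances: Sequence[float]) -> float:
    """NDCG for QPR: relevance labels listed in the predicted ranking order."""
    def dcg(rels: Sequence[float]) -> float:
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))

    return _ratio(dcg(relevances), dcg(sorted(relevances, reverse=True)))
```

For example, `binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])` gives an accuracy of 0.75 and a specificity of 0.8, showing how the two can diverge when negatives outnumber positives.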