Model                        AVE     PRP     SA      SR      AP      AG
                             F1*     M-F1    M-F1    HR@1    F1      F_BERT

GPT-4 Turbo                  0.397   0.392   0.510   0.198   0.680   0.860
Gemini Pro                   0.275   0.123   0.454   0.116   0.552   0.856
Claude 2.1                   0.410   0.277   0.369   0.036   0.245   0.842
Llama-2 13B-chat             0.000   0.324   0.178   0.050   0.644   0.808
Mistral-7B Instruct-v0.2     0.264   0.327   0.438   0.108   0.608   0.851

EcomGPT                      0.001   0.096   0.178   0.023   0.140   0.722

SoTA task-specific model     0.269   0.507   0.567   0.081   0.853   0.860
eCeLLM-L                     0.335   0.558   0.629   0.273   0.867   0.841
eCeLLM-M                     0.367   0.502   0.640   0.280   0.878   0.840
eCeLLM-S                     0.302   0.520   0.565   0.241   0.879   0.840
improvement (%, avg: 9.3)    -10.5   10.1    14.1    41.4    3.0     -2.2
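
The arithmetic behind the last row can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes that each improvement is the relative gain of the best eCeLLM score over the best baseline score in the same column, which the table itself does not state. The helper `relative_improvement` is a hypothetical name introduced here. The SR column is used as the worked case, and the average is taken over the six reported per-task values.

```python
def relative_improvement(ours: float, baseline: float) -> float:
    """Percentage gain of `ours` over `baseline` (assumed definition)."""
    return 100.0 * (ours - baseline) / baseline


# SR column (HR@1), values copied from the table:
# best eCeLLM score (eCeLLM-M) vs. best baseline score (GPT-4 Turbo).
sr_gain = relative_improvement(0.280, 0.198)

# Per-task improvements exactly as reported in the last row of the table.
reported = [-10.5, 10.1, 14.1, 41.4, 3.0, -2.2]
average = sum(reported) / len(reported)

print(f"SR improvement: {sr_gain:.1f}%")
print(f"average improvement: {average:.1f}%")
```

Under this assumed definition the SR value matches the reported 41.4, and the mean of the six reported values matches the stated average of 9.3.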