Recommendations by Concise User Profiles from Review Text: Conclusion

3 May 2024

This paper is available on arXiv under the CC 4.0 license.

Authors:

(1) Ghazaleh H. Torbati, Max Planck Institute for Informatics, Saarbrücken, Germany & [email protected];

(2) Andrew Yates, University of Amsterdam, Amsterdam, Netherlands & [email protected];

(3) Anna Tigunova, Max Planck Institute for Informatics, Saarbrücken, Germany & [email protected];

(4) Gerhard Weikum, Max Planck Institute for Informatics, Saarbrücken, Germany & [email protected].

VI. CONCLUSION

This work addressed text-centric recommender systems in data-poor situations, specifically the demanding case of book reviewing communities, where many users have only a few items and diverse tastes but post text-rich comments. We presented a Transformer-based framework, CUP, with novel techniques for constructing concise user profiles by selecting informative pieces of user-side text. To mitigate the absence of explicit negative training points, we picked weighted negative samples from unlabeled data. Our experiments, in both a standard evaluation and a search-based mode, show that leveraging user text is beneficial in this data-poor regime, and that our CUP methods clearly outperform state-of-the-art baselines such as DeepCoNN, BENEFICT, P5, LLM-Rec, and a method based on frozen BERT.
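To make the negative-sampling idea concrete, here is a minimal sketch in Python, assuming pseudo-negatives are drawn from items the user never interacted with and weighted by global item popularity (a common heuristic); the exact weighting scheme used in CUP may differ.

```python
import random
from collections import Counter

def sample_weighted_negatives(user_items, all_interactions, k=4, rng=random):
    """Draw k pseudo-negative items for one user from unlabeled data.

    Illustrative sketch: items the user never interacted with are treated
    as negatives and sampled proportionally to their global popularity.
    """
    # Count how often each item appears across all users.
    popularity = Counter(
        item for items in all_interactions.values() for item in items
    )
    # Candidate negatives: every item this user has NOT interacted with.
    candidates = [i for i in popularity if i not in user_items]
    weights = [popularity[i] for i in candidates]
    return rng.choices(candidates, weights=weights, k=k)

# Toy usage with three users and three books.
interactions = {"u1": {"b1", "b2"}, "u2": {"b2", "b3"}, "u3": {"b3"}}
print(sample_weighted_negatives({"b1"}, interactions, k=2))
```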

Our main goal has been to shed light on the research questions posed in Section I-B.

For RQ1, we devised various ways of leveraging LMs, and we compared with recent work on using LLM prompts for item ranking. We found that such an LLM-only approach is not viable, as it cannot cope with long-tail items. In contrast, using LLMs to generate concise user profiles within our CUP framework is a competitive solution. However, these LLM-generated profiles do not outperform simpler CUP variants, such as idf-selected sentences, and the simpler methods are low-cost.
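As an illustration of the profile-generation route, the following sketch assembles a prompt asking an LLM to distill a user's reviews into a short profile. The prompt wording and the call_llm() helper are hypothetical assumptions for illustration, not the paper's exact setup.

```python
def build_profile_prompt(reviews, max_words=100):
    """Assemble a prompt asking an LLM to summarize one user's reviews
    into a concise preference profile (illustrative wording only)."""
    joined = "\n".join(f"- {r}" for r in reviews)
    return (
        f"Based on the following book reviews written by one user, describe "
        f"this user's reading preferences in at most {max_words} words:\n{joined}"
    )

reviews = [
    "Loved the slow-burn mystery and the vivid 1920s setting.",
    "Too much action, not enough character development for my taste.",
]
prompt = build_profile_prompt(reviews)
# profile = call_llm(prompt)  # call_llm() is a hypothetical LLM client wrapper
print(prompt)
```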

For RQ2, we limited the number of input tokens for the Transformer to 128, creating a stress test while also minimizing the computational and energy cost at training time. Our experimental findings confirmed that this tight length for user profiles is indeed viable, yielding good NDCG@5 and P@1 results (relative to the difficulty of this data).
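The token budget can be enforced directly at tokenization time. The following minimal sketch uses the Hugging Face tokenizer API, assuming a BERT-style encoder and that the user profile and item text are fed as a sentence pair; both choices are illustrative, not necessarily the paper's exact input layout.

```python
from transformers import AutoTokenizer

# Model name is illustrative; any BERT-style checkpoint works the same way.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

profile_text = "enjoys character-driven historical fiction and slow-burn mysteries"
item_text = "A sweeping novel set in 1920s Vienna, following three generations."

encoded = tokenizer(
    profile_text,
    item_text,
    max_length=128,       # hard cap on the combined input length
    truncation=True,      # drop any tokens beyond the 128-token budget
    padding="max_length", # pad shorter inputs up to the fixed length
    return_tensors="pt",
)
print(encoded["input_ids"].shape)  # torch.Size([1, 128])
```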

For RQ3, we devised various techniques for selecting informative cues from user reviews. A key observation from the experiments is that a moderate degree of sophistication, like idf-selected sentences, is largely sufficient. Our default configuration CUPidf performs on par with more elaborated variants, such as leveraging SentenceBERT or having T5- generated keywords for concise profiles. Moreover, CUPidf mostly outperforms even the profiles that were generated by ChatGPT with the full review texts as input.