Comparative analysis of unsupervised keyword extraction methods
The paper examines the task of keyword extraction under «cold start» conditions, i.e., without annotated data for preliminary training. Three categories of existing methods are compared: classical algorithms (YAKE, TextRank, SingleRank, TopicRank, PositionRank, FRAKE), BERT-based methods (KeyBERT, KBIR-Inspec, and an ensemble of KBIR-Inspec and WikiNEuRaL), and open-weight large language models (LLMs: llama3.1, qwen2.5, t-pro). Additionally, a methodology for the automated preparation of keyword extraction benchmarks is proposed, making use of the proprietary LLM Claude 3.5 Haiku. The methods are evaluated using «hard» and «soft» F1-score metrics for different numbers of keywords. On two custom benchmarks, the open LLM t-pro with a 3-shot prompt showed the best results (F1 = 0.40 and F1 = 0.35), even without domain-specific fine-tuning; however, it is also the most resource-intensive method (~22 GB of VRAM). «Lighter» methods show inferior results.
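The abstract does not spell out how the «hard» and «soft» F1 variants are computed. A common interpretation, sketched below in Python, is that «hard» F1 counts only exact matches between normalized predicted and gold keywords, while «soft» F1 also credits partial matches. The substring-containment matching rule and the lowercasing normalization are illustrative assumptions, not the paper's definition.

```python
# Sketch of «hard» vs. «soft» F1 for keyword extraction.
# Assumption: «hard» = exact string match after normalization;
# «soft» = exact match OR one phrase contained in the other.
# The paper's actual matching rules may differ.

def f1_score(predicted: list[str], gold: list[str], soft: bool = False) -> float:
    predicted = [p.lower().strip() for p in predicted]
    gold = [g.lower().strip() for g in gold]

    def matches(p: str, g: str) -> bool:
        if p == g:
            return True
        # Soft match: substring containment in either direction (assumption).
        return soft and (p in g or g in p)

    if not predicted or not gold:
        return 0.0
    # Precision: fraction of predictions that match some gold keyword.
    tp = sum(1 for p in predicted if any(matches(p, g) for g in gold))
    # Recall: fraction of gold keywords matched by some prediction.
    matched_gold = sum(1 for g in gold if any(matches(p, g) for p in predicted))
    precision = tp / len(predicted)
    recall = matched_gold / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

pred = ["keyword extraction", "bert", "topic modeling"]
gold = ["unsupervised keyword extraction", "bert", "llm"]
print(f1_score(pred, gold, soft=False))  # hard: only "bert" matches -> 0.33
print(f1_score(pred, gold, soft=True))   # soft: "keyword extraction" also counts -> 0.67
```

Evaluating at different numbers of keywords then amounts to truncating the prediction list to the top-k candidates before scoring.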
Author: P. V. Korytov
Direction: Informatics, Computer Technologies and Control
Keywords: keywords, BERT, benchmark preparation, prompt engineering, large language models