Authors: Hattar Hattar (Zarqa University); Ghanbari Ali (University of Science and Technology of Mazandaran, Behshahr, Iran); Hafez Mohamed (INTI-IU-University; Shinawatra University); Karimi Ali (University of Science and Technology of Mazandaran); Pirgazi Jamshid (University of Science and Technology of Mazandaran)
Abstract: Sentence embeddings are fundamental to natural language processing, as they enable models to capture semantic meaning beyond surface-level word similarity. Representing sentences and paragraphs in dense vector spaces facilitates tasks such as semantic search, paraphrase detection, and textual inference. In this work, we present a comparative analysis of eight representative models (paraphrase-distilroberta, msmarco-roberta, paraphrase-mpnet, paraphrase-xlm-r, LaBSE, e5-base, gte-base, and bge-base) evaluated across four benchmark datasets (MRPC, QQP, PAWS, and VISLA). Our experiments highlight the strengths and limitations of each model, providing insights into their effectiveness across diverse semantic similarity tasks.
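As a rough illustration of the kind of comparison described in the abstract (not the authors' exact pipeline), the sketch below encodes MRPC/QQP-style sentence pairs with one of the listed checkpoints via the sentence-transformers library and scores them by cosine similarity; the specific model name, example pairs, and decision threshold are assumptions chosen for illustration.

```python
# Minimal sketch: scoring sentence pairs with one of the compared embedding models.
# The model choice, example pairs, and threshold below are illustrative assumptions,
# not the configuration reported in the paper.
from sentence_transformers import SentenceTransformer, util

# Any of the compared checkpoints could be substituted here.
model = SentenceTransformer("sentence-transformers/paraphrase-mpnet-base-v2")

# Toy MRPC/QQP-style sentence pairs with a paraphrase label (1 = paraphrase, 0 = not).
pairs = [
    ("The company posted strong quarterly results.",
     "Quarterly earnings for the firm were strong.", 1),
    ("The cat sat on the mat.",
     "Stock prices fell sharply on Monday.", 0),
]

sents_a = [a for a, _, _ in pairs]
sents_b = [b for _, b, _ in pairs]

# Encode both sides and take the diagonal of the pairwise cosine-similarity matrix.
emb_a = model.encode(sents_a, convert_to_tensor=True, normalize_embeddings=True)
emb_b = model.encode(sents_b, convert_to_tensor=True, normalize_embeddings=True)
scores = util.cos_sim(emb_a, emb_b).diagonal()

# A fixed similarity threshold turns scores into paraphrase predictions.
threshold = 0.7  # assumed value for illustration
for (a, b, label), score in zip(pairs, scores):
    pred = int(score.item() >= threshold)
    print(f"sim={score.item():.3f} pred={pred} gold={label} | {a} / {b}")
```

Swapping the model string for another of the eight checkpoints and sweeping the threshold per dataset is one simple way to reproduce a comparison of this kind.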
Keywords: sentence embeddings, retrieval, evaluation
Published in: 2024 Asian Conference on Communication and Networks (ASIANComNet)
Date of Publication: --
DOI: -
Publisher: IEEE