Carla Abreu, Jorge Teixeira, Eugénio Oliveira (2015). “ENCADEAr: ENCADEAmento automático de notícias”, in Simões, Barreiro, Santos, Sousa-Silva & Tagnin (eds.) Linguística, Informática e Tradução: Mundos que se Cruzam, Oslo Studies in Language 7(1), 2015. 153–181. (ISSN 1890-9639 / ISBN 978-82-91398-12-9)
Abstract: This work defines and evaluates different techniques for automatically building temporal news sequences. The proposed approach consists of three steps: (i) near-duplicate document detection; (ii) keyword extraction; (iii) news sequence creation. It draws on natural language processing, information extraction, named entity recognition, and supervised learning algorithms. The proposed methodology achieved a precision of 93.1% in creating news chains.
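The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the concrete techniques, so this sketch swaps in simple stand-ins (Jaccard similarity over word sets for near-duplicate detection, term frequency for keyword extraction, and keyword overlap for chaining) in place of the named entity recognition and supervised learning the paper relies on. The `News` class, the function names, the stopword list, and the thresholds are all hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass
from datetime import date

# Hypothetical, deliberately tiny stopword list (assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is", "as", "at", "by", "with"}


@dataclass
class News:
    title: str
    text: str
    published: date


def tokens(text):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def jaccard(a, b):
    """Jaccard similarity between two sets of tokens."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(items, threshold=0.8):
    """Step (i): keep the earliest item of each group of near-duplicates."""
    kept = []
    for item in sorted(items, key=lambda n: n.published):
        words = set(tokens(item.text))
        if all(jaccard(words, set(tokens(k.text))) < threshold for k in kept):
            kept.append(item)
    return kept


def keywords(item, k=5):
    """Step (ii): the k most frequent tokens (a stand-in for NER-based keywords)."""
    counts = Counter(tokens(item.title + " " + item.text))
    return {w for w, _ in counts.most_common(k)}


def build_chains(items, min_shared=2):
    """Step (iii): append each item, in date order, to the first chain whose
    latest item shares at least min_shared keywords; otherwise start a new chain."""
    chains = []
    for item in sorted(items, key=lambda n: n.published):
        kw = keywords(item)
        for chain in chains:
            if len(kw & keywords(chain[-1])) >= min_shared:
                chain.append(item)
                break
        else:
            chains.append([item])
    return chains
```

Calling `build_chains(drop_near_duplicates(items))` on a list of `News` objects returns a list of chains, each ordered by publication date. A supervised classifier, as mentioned in the abstract, would replace the fixed `min_shared` rule with a learned decision about whether two items belong to the same chain.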