HAHA 2019 Dataset: A Corpus for Humor Analysis in Spanish

Type

Conference paper

Year

2020

Place of publication

Marseille, France

Publisher

European Language Resources Association

ISBN

979-10-95546-34-4

Pages

5106

Abstract

This paper presents the development of a corpus of 30,000 Spanish tweets that were crowd-annotated with humor value and funniness score. The corpus contains approximately 38.6% of humorous tweets with an average score of 2.04 in a scale from 1 to 5 for the humorous tweets. The corpus has been used in an automatic humor recognition and analysis competition, obtaining encouraging results from the participants.

Authors

Aiala Rosá

Santiago Castro

Luis Chiruzzo

Citekey

chiruzzo-etal-2020-haha

Publication URL

https://www.aclweb.org/anthology/2020.lrec-1.628

Keywords