Timely identification of event start dates from Twitter

Authors

  • Florian Kunneman, Centre for Language Studies, Radboud University Nijmegen
  • Ali Hürriyetoğlu, Centre for Language Studies, Radboud University Nijmegen
  • Nelleke Oostdijk, Centre for Language Studies, Radboud University Nijmegen
  • Antal van den Bosch, Centre for Language Studies, Radboud University Nijmegen

Abstract

We present a method for identifying the start dates of future events from Twitter streams. Taking hashtags or event name expressions as query terms, the method gathers a certain number of tweets about an event and uses clues in these tweets to estimate the date on which the event will start. Clues include temporal expressions, whose estimations are either knowledge-based or automatically generated, and other predictive words. The estimation is performed either with a machine-learning classifier or by taking a majority vote over the temporal expressions found in the set of tweets. Results show that temporal expressions are indeed strong predictors. The majority-voting and machine-learning approaches perform equally well when trained and tested on a single event type, soccer matches, with an average estimation error of 0.05 days. When tested on a range of different events, however, the majority-voting approach proves more robust than machine learning for this task, yielding high performance on all events. Still, per-event differences hint at a context in which machine learning might be beneficial.
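The majority-vote idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule table `OFFSETS`, the pattern `IN_N_DAYS`, and the function names `candidate_dates` and `majority_start_date` are hypothetical, the rules are a tiny hand-written English set chosen for illustration, and tweets are assumed to arrive as (text, posting date) pairs.

```python
import re
from collections import Counter
from datetime import date, timedelta

# Hypothetical hand-written rules: temporal expression -> day offset from posting date.
OFFSETS = {"today": 0, "tonight": 0, "tomorrow": 1}
# Hypothetical pattern for expressions such as "in 3 days".
IN_N_DAYS = re.compile(r"\bin (\d+) days?\b")


def candidate_dates(text, posted):
    """Return every start date implied by temporal expressions in one tweet."""
    text = text.lower()
    dates = []
    for word, offset in OFFSETS.items():
        if re.search(rf"\b{word}\b", text):
            dates.append(posted + timedelta(days=offset))
    for match in IN_N_DAYS.finditer(text):
        dates.append(posted + timedelta(days=int(match.group(1))))
    return dates


def majority_start_date(tweets):
    """Vote over all implied dates; return the most frequent one, or None."""
    votes = Counter()
    for text, posted in tweets:
        votes.update(candidate_dates(text, posted))
    if not votes:
        return None
    return votes.most_common(1)[0][0]


tweets = [
    ("#match tomorrow!", date(2014, 5, 1)),
    ("can't wait, #match in 2 days", date(2014, 4, 30)),
    ("#match today?", date(2014, 5, 1)),
]
estimate = majority_start_date(tweets)
```

In this toy input, two tweets point to 2 May 2014 and one to 1 May 2014, so the vote settles on 2 May. A set of tweets without any recognised temporal expression yields no estimate.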

Published

2014-12-01

How to Cite

Kunneman, F., Hürriyetoğlu, A., Oostdijk, N., & van den Bosch, A. (2014). Timely identification of event start dates from Twitter. Computational Linguistics in the Netherlands Journal, 4, 39–52. Retrieved from https://clinjournal.org/clinj/article/view/39

Section

Articles
