10.5061/DRYAD.70F4T
Saito, Shota
University of Tokyo
Hirata, Yoshito
University of Tokyo
Sasahara, Kazutoshi
Nagoya University
Suzuki, Hideyuki
University of Tokyo
Data from: Tracking time evolution of collective attention clusters in
Twitter: time evolving nonnegative matrix factorisation
Dryad
dataset
2016
2016-09-23T00:00:00Z
2016-09-23T00:00:00Z
en
https://doi.org/10.1371/journal.pone.0139085
103567893 bytes
1
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Micro-blogging services, such as Twitter, offer opportunities to analyse
user behaviour. Discovering and distinguishing behavioural patterns in
micro-blogging services is valuable. However, it is difficult to
distinguish users and to track the temporal development of
collective attention within distinct user groups on Twitter. In this
paper, we formulate this problem as tracking matrices decomposed by
Nonnegative Matrix Factorisation for time-sequential matrix data, and
propose a novel extension of Nonnegative Matrix Factorisation, which we
refer to as Time Evolving Nonnegative Matrix Factorisation (TENMF). In our
method, we describe users and words posted in some time interval by a
matrix, and use several matrices as time-sequential data. Subsequently, we
apply Time Evolving Nonnegative Matrix Factorisation to these
time-sequential matrices. TENMF can decompose time-sequential matrices
and track the connections among the decomposed matrices, whereas conventional
NMF decomposes a matrix into two lower-dimensional matrices arbitrarily,
which can lose the time-sequential connection. Our proposed method
performs adequately on artificial data. Moreover, we present
several results and insights from experiments using real data from
Twitter.
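The matrix construction described in the abstract (users and words posted in a time interval form one matrix, and several such matrices form the time-sequential data) might be sketched as follows. This is only an illustrative sketch, not the authors' code and not TENMF itself: the function name `build_user_word_matrices`, the `(user, timestamp, text)` input format, the whitespace tokenisation, the fixed interval length, and the toy tweets are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_user_word_matrices(tweets, interval_seconds):
    """Return (users, words, matrices) with one user-by-word count matrix
    per time interval.

    tweets: non-empty iterable of (user, unix_timestamp, text) tuples.
    This input format is hypothetical; the dataset itself lists tweet IDs.
    """
    tweets = list(tweets)
    # Fixed row (user) and column (word) orderings shared by all intervals.
    users = sorted({user for user, _, _ in tweets})
    words = sorted({w for _, _, text in tweets for w in text.lower().split()})
    user_index = {u: i for i, u in enumerate(users)}
    word_index = {w: j for j, w in enumerate(words)}

    start = min(ts for _, ts, _ in tweets)
    counts = defaultdict(Counter)  # interval index -> {(row, col): count}
    for user, ts, text in tweets:
        k = int((ts - start) // interval_seconds)
        for w in text.lower().split():
            counts[k][(user_index[user], word_index[w])] += 1

    # Materialise dense matrices; intervals without tweets stay all-zero.
    matrices = []
    for k in range(max(counts) + 1):
        m = [[0] * len(words) for _ in users]
        for (i, j), c in counts[k].items():
            m[i][j] = c
        matrices.append(m)
    return users, words, matrices


# Toy input (invented for illustration), bucketed into 60-second intervals.
toy_tweets = [
    ("alice", 0, "earthquake tokyo"),
    ("bob", 10, "earthquake"),
    ("alice", 70, "iphone iphone"),
    ("bob", 130, "tokyo"),
]
users, words, matrices = build_user_word_matrices(toy_tweets, 60)
```

Each entry of `matrices` is a nonnegative matrix with the same row and column ordering, so the sequence could be passed to any factorisation routine that expects time-sequential matrices.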
Dataset from: Tracking Time Evolution of Collective Attention Clusters in
Twitter: Time Evolving Nonnegative Matrix Factorisation
This dataset contains the IDs of tweets posted one week before and after
the Tohoku Earthquake and the iPhone 4 announcement. We collected
11,418,600 tweets posted in the interval of 301 from 4th March 2011 to
16th March 2011, and 2,319,874 tweets posted in the interval from 1st June
2010 to 17th June 2010 by 438,464 users. The tweets are mainly in
Japanese, and were collected to study the dynamics on Twitter when Japan
had huge earthquakes on 11th March 2011 and when the iPhone was announced
on 7th June 2010.
datatenmf.zip