----------------------------------------------------------------------------
# KDD step 4: interpretation
gold standard
10-fold cross-validation
- taking 1000 tweets, split into 10 folds of 100
- training: 800 tweets
- test: 100 tweets
- val (validation): 100 tweets
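A minimal sketch of how the 800/100/100 split could rotate over 10 folds. The helper name `k_fold_splits`, the fixed seed, and the choice to use the fold after the test fold for validation are assumptions for illustration, not something stated in the workshop.

```python
import random


def k_fold_splits(items, k=10, seed=0):
    """Yield (train, val, test) lists, one triple per fold."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    # Striding deals the items into k folds of (almost) equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        # Assumption: the fold after the test fold serves as validation set.
        val_index = (i + 1) % k
        test = folds[i]
        val = folds[val_index]
        train = [item
                 for j, fold in enumerate(folds)
                 if j not in (i, val_index)
                 for item in fold]
        yield train, val, test


tweets = ["tweet %d" % n for n in range(1000)]  # placeholder data
for train, val, test in k_fold_splits(tweets):
    print(len(train), len(val), len(test))
```

Each tweet ends up in the test set exactly once, so the scores of the 10 runs can be averaged into one overall score.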
compare to baseline scores
- frequency baseline
    - always predict the most frequent class: if you expect 80% of the tweets to be neutral(?), predicting "neutral" every time already scores 80%
- informative baseline
    - e.g. "there is a 60% chance that it will rain tomorrow"
- --> your results need to be higher than the baseline
- otherwise --> why do all the work?
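A small sketch of the frequency baseline, assuming it means "always predict the most frequent label". The label counts below are made up to match the 80% neutral figure from the notes, and `model_accuracy` is a placeholder value, not a measured result.

```python
from collections import Counter


def majority_label(labels):
    """Return the most frequent label in the gold standard."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(gold, predicted):
    """Share of predictions that match the gold standard."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Made-up gold labels: 80% neutral, the rest split arbitrarily.
gold = ["neutral"] * 80 + ["positive"] * 12 + ["negative"] * 8

baseline_label = majority_label(gold)
baseline_accuracy = accuracy(gold, [baseline_label] * len(gold))

model_accuracy = 0.85  # placeholder: fill in the score of your own classifier
print("baseline:", baseline_label, baseline_accuracy)
print("worth the work:", model_accuracy > baseline_accuracy)
```

With such a skewed label distribution, a classifier scoring below 0.8 does worse than always answering "neutral".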
from: Guy de Pauw (CLiPS), Pattern workshop, Cqrrelations, January 2015
----------------------------------------------------------------------------