----------------------------------------------------------------------------
# KDD step 4: interpretation

gold standard
    * --> annotated test set
* 10-fold cross validation
    * taking 1000 tweets
        * training 800 tweets
        * test 100 tweets
        * val 100 tweets
* compare to baseline scores
    * - frequency baseline
        * you expect 80% of the tweets to be neutral(?)
    * - informative baseline
        * i have a 60% chance that it will rain tomorrow
    * --> your result needs to be higher
        * otherwise --> why do all the work?

from: CLiPS - Guy de Pauw, Pattern workshop, Cqrrelations, January 2015
----------------------------------------------------------------------------
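
A minimal sketch of the frequency-baseline comparison from the notes above, not the workshop's own code. It assumes 1000 labelled tweets split 800/100/100 and uses made-up placeholder labels (80% "neutral", matching the guess in the notes). The function names `split_800_100_100`, `frequency_baseline`, `accuracy` and `beats_baseline` are hypothetical, chosen for illustration.

```python
import random
from collections import Counter


def split_800_100_100(items, seed=0):
    """Shuffle 1000 items and split them into train/test/val (800/100/100)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:800], shuffled[800:900], shuffled[900:1000]


def frequency_baseline(train_labels):
    """Return the most frequent label in the training set."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predicted, gold):
    """Fraction of predictions that match the gold-standard labels."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def beats_baseline(model_accuracy, baseline_accuracy):
    """The result only counts if it is strictly higher than the baseline."""
    return model_accuracy > baseline_accuracy


# Placeholder labels (an assumption, not real data): 80% neutral.
labels = ["neutral"] * 800 + ["positive"] * 100 + ["negative"] * 100
train, test, val = split_800_100_100(labels)

# Frequency baseline: always predict the majority training label.
majority = frequency_baseline(train)
baseline_acc = accuracy([majority] * len(test), test)
print("majority label:", majority)
print("frequency-baseline accuracy on test:", baseline_acc)
```

A classifier's test accuracy would then be passed to `beats_baseline(model_acc, baseline_acc)`; if it returns `False`, the model adds nothing over always guessing the majority label.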