Wednesday, November 25, 2015

Vehicle DTC text mining with the R tm package

Text mining in R: how to process n-grams.

Comparing term frequencies from a document-term matrix (DTM) across factor levels (for example, DTC against platform):

Create the DTM.
Create a data frame from the DTM: df <-
Add the platform as a column: df$platform <- substr(data$vin_last_8,2,2)
Build a nice compact summary table: dtc_by_platform <- with(melt(df),tapply(value,list(variable,platform),sum))

This table can be used as input to a barplot or a chi-square test.
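The steps above can be sketched end to end roughly as follows. This is a minimal sketch, not the original code: the toy `data` frame and its `dtc_text` column are hypothetical, `melt` is assumed to come from the reshape2 package, and converting the DTM with `as.matrix` is only one plausible way to build `df`. It works on single terms; n-gram tokenization is not shown.

```r
library(tm)        # VCorpus, VectorSource, DocumentTermMatrix
library(reshape2)  # melt (assumed source of melt in the note)

# Hypothetical toy data: DTC text per vehicle plus the last 8 VIN characters.
data <- data.frame(
  dtc_text   = c("P0300 P0171", "P0300", "P0420 P0171", "P0300 P0420"),
  vin_last_8 = c("FA123456", "FB123456", "FA654321", "FB654321"),
  stringsAsFactors = FALSE
)

# 1. Create the DTM (tm lower-cases terms by default).
corpus <- VCorpus(VectorSource(data$dtc_text))
dtm    <- DocumentTermMatrix(corpus)

# 2. Create a data frame from the DTM: one row per document, one column per term.
#    Assumption: a dense conversion is acceptable for the data size at hand.
df <- as.data.frame(as.matrix(dtm))

# 3. Attach the factor to compare against (2nd character of vin_last_8).
df$platform <- substr(data$vin_last_8, 2, 2)

# 4. Sum term counts per platform: rows are terms, columns are platforms.
dtc_by_platform <- with(
  melt(df, id.vars = "platform"),
  tapply(value, list(variable, platform), sum)
)

print(dtc_by_platform)
```

The resulting matrix can be passed directly to barplot(dtc_by_platform, beside = TRUE, legend.text = TRUE) or chisq.test(dtc_by_platform). With counts as small as in this toy data, chisq.test will warn that the approximation may be unreliable.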