NLP and data mining resources (based on recommendations by David Langer):
CRAN NLP Task View - self-recommending.
Information Retrieval book - an older book, but very relevant for text analytics.
Natural Language Processing with Python book - based on Python’s NLTK package. Very versatile.
Taming Text book - contains good theory and covers the OpenNLP Java library, which can also be accessed from R via the openNLP package.
Tidyverse approach:
R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
You can embed an R code chunk like this:
library("datasets")
summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2
##  1st Qu.:12.0   1st Qu.: 26
##  Median :15.0   Median : 36
##  Mean   :15.4   Mean   : 43
##  3rd Qu.