Introduction to Text Analytics in R

Course Dates: Course Not Offered This Semester

Who This Course Is For

Text data is naturally messy and challenging to analyze. Techniques from Natural Language Processing (NLP) can help extract patterns and signal from it. This short course introduces NLP and shows how to draw meaningful conclusions from text data. It includes hands-on exercises in R, an open-source software environment.



Duration:  2 hours

Course Dates: January 11th & 15th, 2021

Venue: Virtual 

Required Software:  R and RStudio (both free)

Cost: Free to VT Participating Colleges and Administrative Units

Prerequisites: Familiarity with R is strongly recommended.  Knowledge of basic descriptive statistics is assumed.

Prework includes downloading the course materials and installing R and RStudio.

By the end of this course you will be able to:


  • Read in and clean text data for analysis
  • Handle Term Frequency (TF) and Term Frequency-Inverse Document Frequency (TF-IDF) weighting schemes
  • Use R packages to manipulate text data and find associations between words
  • Create customized text data visualizations
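
For a sense of what the TF and TF-IDF objective involves, here is a minimal sketch in base R. The three sample documents and all variable names are made up for illustration and are not taken from the course materials. The weighting shown (term count divided by document length, times the natural log of the inverse document frequency) is one common variant; the course may use different packages or a different variant.

```r
# Three tiny made-up documents (illustrative only, not course data).
docs <- c(
  doc1 = "Text data is messy",
  doc2 = "Text analytics finds patterns in text",
  doc3 = "R is free software"
)

# Clean: lower-case, split on anything that is not a letter, drop empty tokens.
tokens <- lapply(strsplit(tolower(docs), "[^a-z]+"), function(x) x[x != ""])

# Vocabulary, then raw term counts as a documents-by-terms matrix.
vocab <- sort(unique(unlist(tokens)))
counts <- t(vapply(
  tokens,
  function(x) tabulate(match(x, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
dimnames(counts) <- list(names(docs), vocab)

# TF: each count divided by the length of its document.
tf <- counts / rowSums(counts)

# IDF: log(number of documents / number of documents containing the term).
idf <- log(nrow(counts) / colSums(counts > 0))

# TF-IDF: scale each term column by its IDF.
tfidf <- sweep(tf, 2, idf, `*`)

print(round(tfidf, 3))
```

Terms that appear in only one document (such as "messy") receive a higher weight than terms shared across documents (such as "text" or "is"), which is the point of the IDF factor.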


This is a hands-on, interactive course: you will first run pre-written code, then complete exercises in which you write your own code based on the examples.  You will leave with code that you can apply to your own datasets.