Text as Data – April 2023

Event Phone: 1-610-715-0115

We're sorry, but all ticket sales have ended because the event has expired.

There are no upcoming dates for this event.


Cancellation Policy: If you cancel your registration at least two weeks before the course is scheduled to begin, you are entitled to a full refund (minus a processing fee of $50).
In the unlikely event that Statistical Horizons LLC must cancel a seminar, we will do our best to inform you as soon as possible of the cancellation. You would then have the option of receiving a full refund of the seminar fee or a credit towards another seminar. In no event shall Statistical Horizons LLC be liable for any incidental or consequential damages that you may incur because of the cancellation.
A 3-Day Livestream Seminar Taught by Amber Boydstun, Ph.D., and Cory Struthers, Ph.D.

Text is all around us: from archived court documents to this morning’s social media posts, from transcripts of political ads to terrorist manifestos. Text-as-data methods allow us to use this text to measure and discover phenomena that may be otherwise hard or impossible to represent quantitatively, such as ideological positions of court documents and emotional sentiment in manifestos.

There has never been a more exciting time to learn text-as-data methods. Digital advances have made available text content that even a few years ago would have been difficult to collect, and computational text-as-data methods have advanced just as fast. However, because there is now a vast amount of text data to explore and a dizzying array of accessible text-as-data tools to apply, understanding which methods are appropriate for which contexts is critically important.

This course will provide an introduction to text-as-data methods, including how they work, how they can be applied, and common pitfalls to avoid. We will focus on linking concepts to measurement through textual data. Topics covered include: manual content analysis; text collection and pre-processing; advanced keyword queries and frequencies; dictionary analysis (including sentiment analysis); text similarity and reuse; topic modeling; and supervised machine learning.

This seminar provides an intensive introduction to text-as-data methods, drawing on social science research and perspectives.

Here are some of the things you will be able to do by the end of this course:

  • Develop a content analysis codebook.
  • Acquire and organize text in R.
  • Pre-process text for analysis.
  • Calculate frequencies of key words or phrases in a corpus.
  • Evaluate the sentiment of a corpus.
  • Apply dictionary methods to a corpus.
  • Identify topics in a corpus.
  • Have the foundational knowledge to learn more about advanced text analysis methods.

Venue: