Teaching Social Media Analytics in PR Classes: Focusing on the Python Program

Editorial Record: Submitted June 4, 2022. Revised October 21, 2022. Revised January 8, 2023. Accepted January 26, 2023. Published May 2023.

Authors

Kim, Seon-Woo
Ph.D. Candidate
Manship School of Mass Communication
Louisiana State University
USA
Email: kr.seonwoo@gmail.com

Chon, Myoung-Gi, Ph.D.
Associate Professor
School of Communication and Journalism
Auburn University
USA
Email: mzc0113@auburn.edu

Abstract
This teaching brief introduces what to teach and how to teach social media analytics for PR educators at universities. It suggests a semester-long curriculum for a standalone research methods class for both graduate and undergraduate students. First, we discuss why students benefit more from learning a programming language than from relying on industrial platforms. In addition, we compare three methods of data collection (crawling, API, and download) and discuss their pros and cons. We then present (1) data collection through an API, (2) text mining, and (3) network analysis, using Python code shared on GitHub and a step-by-step tutorial for PR educators who are unfamiliar with programming languages. This brief is expected to help bridge the gap between the growing demand for programming-based analytics in PR practice and PR education.

Keywords: Social media analytics, Pedagogy, Python, API, Text mining, Network analysis

Social media has become integral to digital public relations (Ewing et al., 2018). PR companies perceive social media analytics (SMA) as a useful tool to identify target publics, understand the environment around an organization, measure PR campaign outcomes, and build relations with stakeholders and influencers, among other uses (Kim, 2021). In response to the growing demand for social media analytics in the PR industry, analytics curricula in PR programs need to be developed to educate PR students (Commission on Public Relations Education, 2018).

However, public relations educators have faced challenges in learning and teaching social media analytics. Most PR instructors have not had an opportunity during their academic careers to learn a programming language for analytics, such as Python or R. Moreover, teaching analytics requires an understanding of new methodologies and data types, such as natural language processing, network theory, and deep learning. Given this background, current analytics classes in PR programs mostly focus on conceptual knowledge and the use of commercial tools.

For example, students have learned how to use proprietary platforms, such as Brandwatch and Sprinklr, and how to interpret the results on those platforms. It is also common for instructors to ask students to earn certificates in Google Analytics and Hootsuite as evidence of their analytics competence (Ewing et al., 2018). Despite these efforts, PR professionals recommend that PR graduates have programming knowledge so that they can automate PR work and tailor PR services to clients (Szalacsi, 2019; Trafalgar Strategy, 2022). Heavy reliance on industrial analytics platforms would confine students’ SMA competency to what those platforms offer, thereby preventing them from developing advanced analytical abilities.

To fill the gap, this teaching brief aims to provide a pedagogical foundation for using Python as an SMA tool. In particular, it presents SMA class material covering data collection, text mining, and network analysis. We provide Python code that PR educators can use in classrooms to teach the Python programming language. Python is the most frequently used programming language in data science (TIOBE, 2022; Woodie, 2021). The code is kept as simple as possible for an introductory PR analytics class. We provide step-by-step instructions for the code to help readers understand and follow what each part does. This teaching brief is expected to encourage programming-based SMA teaching in public relations classes.

Teaching Objectives

Table 1 summarizes the learning objectives of this teaching brief and the required Python packages. The brief consists of three parts: data collection, text mining, and network analysis. First, students are expected to learn how to collect tweet data through an API. Because APIs tend to offer free versions and work in similar ways across social media platforms (e.g., Facebook, Instagram), practicing tweet collection via an API helps equip students with data collection skills for various social networking sites at no cost. In addition, we introduce ways to find content published by influencers and popular tweets. Lastly, students learn to save the collected data in a spreadsheet format (i.e., xlsx).

Table 1: The Overview of Learning Objectives

Python download: https://github.com/formulated/PR_education_Python

Data collection
Learning outcomes:
– Learn how to apply for Twitter Academic Research Access
– Apply a Twitter access code to Python
– Create search query, including keyword and date
– Collect Tweets through the shared Python code
– Sort out tweets by the number of likes, retweets, and followers
– Save the collected data as the excel format on your local computer
Python packages:
– pandas
– twarc
– os
– requests
– time

Text mining
Learning outcomes:
– Load the collected Twitter data
– Text data cleaning
– Create Word cloud
– Calculate word frequency and visualize text mining results
Python packages:
– pandas
– nltk.corpus
– re

Network analysis
Learning outcomes:
– Create data for network analysis from the collected Twitter data (i.e., mention relation and retweet relations)
– Network object generator
– Generate a network graph
– Export network data for visualization on Gephi
– Calculate various centrality scores
Python packages:
– pandas
– networkx

Next, students can apply text mining and network analysis to the data they collected through the API. In the text mining section, students learn to load and clean data, create a word cloud, and calculate and visualize word frequencies. The network analysis section introduces a basic conceptual understanding of network data, how to construct network data, how to calculate centrality scores, and how to prepare data for visualization in Gephi, a popular network visualization tool used in academia and industry.

Teaching Preparation

To use the shared Python code for teaching, educators need some basic Python skills, and both educators and students must install Python and several packages. The Python code for this teaching brief is available on GitHub (https://github.com/formulated/PR_education_Python). Python is a programming language, and Jupyter is a web-based interactive computational environment for Python programming. We assume that readers have already installed Python 3 (https://www.python.org/downloads/) and Jupyter Notebook (https://jupyter.org/install). If not, we recommend installing Anaconda (Anaconda, n.d.), which includes both. After installation, launch Jupyter and open the shared code in a notebook.

Then, readers need to install the required Python packages, such as pandas and networkx, for each Lesson (see Table 1). 

When the shared Python code is run without the required packages installed, Python shows the message “ModuleNotFoundError: No module named ‘XXX’.” In this context, “module” is used synonymously with “package.”

To install a Python package on Mac, open the Terminal application and type “pip3 install [package name]”. For example, the command for installing pandas is “pip3 install pandas” (see Figure 1).

Figure 1

On Windows, open the Anaconda Prompt program and type “pip3 install [package name]” (see Figure 2).

Figure 2
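Before running the shared notebooks, it can help to check which required packages are still missing. The sketch below uses only the Python standard library. The package list follows Table 1, and the helper name missing_packages is hypothetical and not part of the shared code.

```python
import importlib.util

# Top-level import names from Table 1 (os, time, and re ship with Python itself).
REQUIRED = ["pandas", "twarc", "requests", "nltk", "networkx"]


def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_packages(REQUIRED):
        # Each missing name can then be installed with: pip3 install <name>
        print(f"Missing package: {name}")
```

If nothing is printed, all listed packages are already importable.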

This short brief cannot walk through every line of the Python code. Instead, we focus on which parts of the code should be edited so that it runs properly and serves different learning objectives in classrooms. For example, some classes focus more on organizational PR while others focus on political PR. In such cases, the code may require different search keywords depending on the subject.

PR educators should have some basic knowledge of Python (e.g., installation, running code, basic built-in functions) before demonstrating the shared Python code to students and modifying it for class projects and activities. To develop this basic proficiency, we recommend Python books for beginners (e.g., Codeone Publishing, 2022; Matthes, 2019) and freely available online resources on YouTube, such as Learn Python in 1 Hour (Programming with Mosh, 2020, September 16). Online Python bootcamp courses, such as DataCamp (https://app.datacamp.com), are also valuable resources for PR educators and students because they provide interactive web environments for Python beginners. PR educators may incorporate an online bootcamp course into their SMA course curriculum as assignments or pre-class activities.

Lesson 1. Data collection from Twitter through API

There is nothing to analyze without data. Data collection is the first step in extracting valuable insight through analytics, before the data can be organized into information. Thus, among the many required skills, data collection is the foundation of SMA (Kent et al., 2011). A growing number of PR jobs require skills in collecting data from the web and social media (Meganck et al., 2020). Before public relations was digitized, PR practitioners had to scan the environment around an organization and gather information manually, for example through news monitoring and clippings. Today’s digital society, however, produces a massive amount of user-generated content about organizations on the web and social media, which makes it nearly impossible for PR practitioners to collect it manually.

There are three main ways to collect web data: crawling, API, and downloading from industrial platforms. Table 2 compares the data collection methods. Web crawling, or scraping, refers to a mechanical collection of web data (e.g., text, image, sound, and video). A web crawler automatically extracts data from a website based on programming. Technically, it is possible to crawl data using freely-available packages in Python for most web pages, such as social media, news media, and web communities. These packages can be implemented for news clipping and issue/crisis monitoring as a daily PR practice.

Table 2: Comparison of Data Collection Methods

Criterion | Crawling | API | Download
Level of difficulty | Difficult | Moderate – difficult | Easy – moderate
Price | Free | Free or paid | Pricy
Legality | Risky | Safe | Safe
Algorithm | Transparent | Transparent | Blackbox
Flexibility | High | High | Low-High
Data accessibility | Partial | Full or partial | Full or partial
Variables | Limited – Some | Some | Many but blackbox

Writing a crawler requires advanced knowledge of programming and of web technologies such as HTML, HTTP, and CSS. A crawler must also be updated whenever a website changes its layout and structure. In addition, social media companies present limited, personalized feeds and content to each account based on their algorithms and other variables (e.g., follower network, search history, location). Thus, a web crawler often cannot access the full archive because it can only collect data visible on the website, which may raise concerns about how representative the content is. A crawler also cannot obtain the hidden metadata and variables that a social media company provides through its API and to industrial platforms, such as user profile details (e.g., when an account was created) and metadata (e.g., the name of the app the user posted from). If such variables are needed, you have to construct them from the crawled data. Crawling may also raise legal issues if you do not obtain the social media company’s agreement before collecting the data.

Another way to collect data from social media is to use an API (application programming interface). Many software companies provide APIs so that third-party services and programmers can use their services conveniently. For example, Apple and Google use weather APIs to provide weather services to customers without collecting weather data themselves. Major social media platforms (e.g., Twitter, Meta) also provide APIs that let users collect data within the companies’ policies and authentication requirements. Collecting social media data through an API is therefore easier and safer than crawling because it avoids violating a website’s Policies and Terms of Service.

Free API versions usually offer a basic level of data access, similar to a trial version, with a limited number of requests per day and a shorter window of historical data. Paid API services offer broader or full-archive data access and more functions. Major social media companies have opened their premium APIs for research and education purposes. For example, Twitter allows researchers to access the full tweet archive through Twitter’s Academic Research Access (Twitter Developer Platform, n.d.). Once an application has been submitted and accepted, Twitter provides an access code. Currently, ten million tweets can be collected per month with this API access. Meta also runs CrowdTangle, where PR educators can access Facebook, Instagram, and Reddit data. APIs provide some variables, such as message type (e.g., retweet, original), the number of engagements (e.g., likes, shares, comments), and geographic location.

Lastly, industrial platforms, such as Brandwatch (https://www.brandwatch.com/) and Sprinklr (https://www.sprinklr.com/), allow paid subscribers to download social media data from their platforms. Their click-based user interfaces do not require programming. However, those platforms are pricey because their business model is B2B with governments, companies, and universities. Because of the high prices, quite a few universities do not subscribe to those services for teaching and research purposes. If a department already subscribes to such a service, it is a good resource for teaching PR analytics. As with APIs, there are no legal issues in collecting and using data within the companies’ policies and authentication requirements, and many industrial platforms provide access to the full historical archive. Industrial platforms also provide rich metadata, such as users’ gender, sentiment, and users’ profession or organization (e.g., journalists, politicians). However, it is not clearly known how the data resellers construct those variables (i.e., they are a black box). Although some companies explain how they construct their variables, researchers typically cannot replicate them because of limited information.

Given the pros and cons of the three data collection methods above, this teaching brief introduces how to collect Twitter data by using the Twitter Academic Research Access API. Because most major companies maintain Twitter accounts and their content is publicly available, quite a few researchers and PR practitioners choose Twitter for real-time issue monitoring and reputation management (e.g., Chon & Kim, 2022; Rust et al., 2021). In addition, data collection with an API and Python is similar across social platforms. If educators and students understand the code for Twitter data collection, they can adjust it to obtain data from other platforms’ APIs.

Tutorial. Data collection

This section shows how to collect tweets by using the Twitter API and Python. Twitter allows researchers to access the full tweet archive through Twitter Academic Research Access (Twitter Developer Platform, n.d.). After an applicant submits an application that includes research interests and affiliation, Twitter issues access codes that allow up to ten million tweets to be collected per month.

To run the Python code shared on GitHub (Kim, 2022), you need to change the OAuth 2.0 Bearer Token (i.e., the credential key or password for Twitter) and the query parameters (e.g., search keyword, date). The Bearer Token is issued after Twitter Academic API permission is granted. Insert your own Bearer Token in the code line shown in Figure 3. The Bearer Token is a long combination of letters and numbers (see Figure 3).

Figure 3

In the query parameters, query holds the search keywords. Hashtags (i.e., #) and mentions (i.e., @) can be used in a search query (e.g., @PR, #PR). The tweet.fields parameter specifies which variables are collected. The shared code collects user numeric IDs (i.e., author_id), timestamps (i.e., created_at), and public metrics (i.e., retweet, reply, like, and quote counts). The data period should be set in start_time and end_time. When the code is run, the tweets are collected in an Excel-style tabular format (see Figure 4).

Figure 4
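The query parameters described above can be sketched as follows. This is a minimal illustration rather than the shared code itself: the helper names build_search_request and fetch_page, the placeholder token, and the example query and dates are assumptions, and the full-archive search endpoint is only available to accounts with the corresponding access level.

```python
# Full-archive search endpoint of the Twitter API v2 (needs Academic Research access).
SEARCH_URL = "https://api.twitter.com/2/tweets/search/all"


def build_search_request(bearer_token, query, start_time, end_time, max_results=100):
    """Return the headers and query parameters for one search request."""
    headers = {"Authorization": f"Bearer {bearer_token}"}
    params = {
        "query": query,  # keyword, #hashtag, or @mention
        "tweet.fields": "author_id,created_at,public_metrics",  # variables to collect
        "start_time": start_time,  # ISO 8601, e.g. 2022-01-01T00:00:00Z
        "end_time": end_time,
        "max_results": max_results,
    }
    return headers, params


def fetch_page(headers, params):
    """Send one request and return the parsed JSON response."""
    import requests  # third-party; imported here so the builder above runs without it

    response = requests.get(SEARCH_URL, headers=headers, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    # "YOUR_BEARER_TOKEN" is a placeholder; the query and dates are illustrative.
    headers, params = build_search_request(
        "YOUR_BEARER_TOKEN", "#PR", "2022-01-01T00:00:00Z", "2022-01-08T00:00:00Z"
    )
    print(params)
```

Keeping the request builder separate from the network call lets students inspect and edit the parameters before any request is sent.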

We use this data for basic text mining and network analysis. The shared code also exports the collected data as an Excel file (see Figure 5).

Figure 5
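As a rough sketch of the export step, the code below flattens an API response into rows and writes them to an Excel file. The sample response is made up, the helper names and the file name tweets.xlsx are hypothetical, and writing .xlsx files with pandas assumes that the openpyxl package is installed.

```python
def tweets_to_rows(response_json):
    """Flatten each tweet so the public metrics become ordinary columns."""
    rows = []
    for tweet in response_json.get("data", []):
        row = {
            "id": tweet["id"],
            "author_id": tweet.get("author_id"),
            "created_at": tweet.get("created_at"),
            "text": tweet["text"],
        }
        # public_metrics holds retweet_count, reply_count, like_count, quote_count
        row.update(tweet.get("public_metrics", {}))
        rows.append(row)
    return rows


def save_rows_to_excel(rows, path="tweets.xlsx"):
    """Write the rows to an .xlsx file (pandas needs openpyxl for this)."""
    import pandas as pd  # imported lazily so tweets_to_rows works without pandas

    pd.DataFrame(rows).to_excel(path, index=False)


# Made-up response with the shape returned by the search endpoint.
sample = {
    "data": [
        {
            "id": "1",
            "author_id": "42",
            "created_at": "2022-01-01T12:00:00.000Z",
            "text": "Happy new year from the PR team!",
            "public_metrics": {
                "retweet_count": 3,
                "reply_count": 0,
                "like_count": 7,
                "quote_count": 1,
            },
        }
    ]
}
rows = tweets_to_rows(sample)
print(rows[0]["like_count"])
```

Calling save_rows_to_excel(rows) would then create the spreadsheet in the current working directory.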


Next, because the data include variables such as the number of likes and retweets, we can identify which tweets received the most engagement. The number of tweets posted by each user account indicates who the active and potentially stimulated publics on social media are. Sorting users by the number of followers yields a list of influencers around a topic.

The code in Figure 6 filters the top 10 most-retweeted tweets. To get the most-liked tweets, change the variable name passed to sort_values (e.g., from ‘retweet_count’ to ‘like_count’). Additional code filters the users who tweeted the most about the issue and the users with the highest number of followers. Depending on the PR campaign or activity, practitioners can edit the code to yield other valuable information. For example, combining these metrics with the time variable (i.e., created_at) may reveal the best times or weekdays to post on social media. Practitioners may also summarize weekly engagement with publics from social media campaigns by summing or averaging likes, shares, or the number of replies.

Figure 6
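The sorting and summarizing steps can be sketched with pandas as below. The data frame is made up for illustration, and the column names follow the variables named in the text (author_id, created_at, retweet_count, like_count). This is a sketch under those assumptions, not the shared code.

```python
import pandas as pd

# Made-up rows with the column names used in the collected data.
df = pd.DataFrame(
    {
        "author_id": ["a", "b", "a", "c"],
        "created_at": [
            "2022-01-03T09:00:00Z",
            "2022-01-03T15:00:00Z",
            "2022-01-04T09:00:00Z",
            "2022-01-05T20:00:00Z",
        ],
        "retweet_count": [5, 12, 0, 7],
        "like_count": [10, 3, 1, 20],
    }
)

# Top 10 most-retweeted tweets; swap in "like_count" for the most-liked ones.
top_retweeted = df.sort_values(by="retweet_count", ascending=False).head(10)

# Accounts that posted the most tweets about the topic.
most_active = df["author_id"].value_counts().head(10)

# Average likes by weekday, a rough indicator of when posts draw engagement.
df["created_at"] = pd.to_datetime(df["created_at"])
likes_by_weekday = df.groupby(df["created_at"].dt.day_name())["like_count"].mean()

print(top_retweeted[["author_id", "retweet_count"]])
print(most_active)
print(likes_by_weekday)
```

With real data, the same three statements give a first picture of popular tweets, active accounts, and posting times.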

Lesson 2. Text mining 

Text mining (i.e., computational text analysis, natural language processing) is one of the most promising areas in public relations for listening to publics and stakeholders. Digitized communication environments continue to create an ever-growing volume of digital text. Knowledge discovery from text data is recommended to increase an organization’s performance and efficiency beyond data retrieval. Excellence theory posits that listening to publics is more important than disseminating information (Grunig & Grunig, 2009). When PR practitioners bring the voices of publics and stakeholders into an organization, strategic communication becomes more effective, which contributes to organizational success (Kim & Rhee, 2011).

There are also many ways in which text mining can assist public relations practice, such as topic discovery and opinion mining. For example, topic analysis provides insights about the main topics, issues, and trends around an organization based on descriptive analysis (e.g., word frequency, co-occurrence) and algorithms (e.g., topic modeling). Opinion mining, or sentiment analysis, can be used to investigate the reputation of an organization or brand, as well as issues and crises (Liu, 2011).

Text mining covers the collection, preprocessing, analysis, and summary of text data based on mathematical algorithms. Analyzing a large amount of unstructured text requires different statistical methods and tools (Grimmer et al., 2021). Unlike traditional structured data (e.g., data in Excel), texts are unstructured, so data cleaning is necessary to transform them into a structured format. Conventional statistical tools, such as SPSS and SAS, provide limited text mining functions because they were originally designed to analyze structured data. Hence, programming skills in Python and R are preferred for text mining.

Tutorial. Text mining

In the shared Python code, text mining includes (a) loading the tweet data, (b) text data cleaning (e.g., lowercase transformation, stopword removal), (c) creating a word cloud, and (d) word frequency calculation and visualization. The code can also be used to analyze other text data from social media and web pages if the data structure is the same (i.e., the data have the same column names). Otherwise, the column names in the other data should be edited. The first task in text mining is to load the data (see Figure 7). In this example, the code imports the Excel file collected through the Twitter API. Pandas is one of the best Python packages for loading, preprocessing, and analyzing data. The pandas package is imported under the abbreviated name pd with the code “import pandas as pd” in the first code cell.

Figure 7

The next step is text cleaning, or preprocessing. Although any data need some cleaning before analysis, text data require more preprocessing effort because of the complexity of human language. User-generated content tends to include noise such as emojis, URLs, and stopwords. Removing elements that are irrelevant to the analysis is recommended to improve computational efficiency and validity (Hickman et al., 2020; Welbers et al., 2017). Stopwords are function words that carry no substantial meaning, such as articles (e.g., the, a, an), conjunctions (e.g., and, but), and prepositions (e.g., of, in) (see Figure 8).

Figure 8

In addition, because computers are case-sensitive (e.g., unlike a human, a computer does not treat Computer and computer as the same word), text data are often converted to lowercase before analysis. Beyond these simple steps, there are many other text preprocessing and representation methods, such as stemming/lemmatization, dimensionality reduction, bag-of-words, and Word2vec. The appropriate text cleaning depends on which algorithms will be used and what the purpose of the analysis is. The shared code removes URLs, emoticons, special characters (e.g., !, @), and stopwords.
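The cleaning steps described above can be illustrated with a short sketch. It uses only the standard library. The small stopword set stands in for the fuller list that the shared code takes from nltk.corpus, and the function name clean_tweet and the sample tweet are hypothetical.

```python
import re

# Small illustrative stopword list; the shared code relies on nltk.corpus instead.
STOPWORDS = {"the", "a", "an", "and", "but", "of", "in", "to", "is", "for"}


def clean_tweet(text):
    """Lowercase a tweet and drop URLs, special characters, and stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    # Keep only lowercase ASCII letters, digits, and whitespace; this drops
    # emoji, punctuation, @ and #, and also any non-English letters.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = [word for word in text.split() if word not in STOPWORDS]
    return " ".join(tokens)


print(clean_tweet("RT @PRSA: Happy New Year to the PR community! https://t.co/xyz #PR"))
# prints: rt prsa happy new year pr community pr
```

Note that the retweet marker rt survives this cleaning, which is why it can appear prominently in a word cloud unless it is added to the stopword list.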

Next, a word cloud is created to visualize the content. A word cloud is one of the most frequently used visualizations in text mining and plays a role similar to descriptive statistics (e.g., mean, standard deviation). Word clouds often appear as a preliminary analysis in published PR research (e.g., Plessis, 2018; Macnamara, 2016). The font size of each word is proportional to its frequency. The word cloud generated in the example shows that rt, new, year, happy, and prsaroadsafety are prominent in the text data (see Figure 9).

Figure 9

The next code calculates a word frequency and sorts the result in descending order by frequency. Word frequency generates insightful information, such as daily/weekly issues around an organization (see Figure 10). Also, a PR practitioner may evaluate a campaign’s performance by tracking the relevant hashtag frequency over time.

Figure 10
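A word frequency table of this kind can be sketched with the standard library’s Counter. The three cleaned tweets are made up, and the shared code may compute the counts differently (for example, with pandas).

```python
from collections import Counter

# Made-up cleaned tweets standing in for the preprocessed text column.
cleaned_tweets = [
    "rt happy new year",
    "happy new year pr community",
    "new campaign launch",
]

# Count every word across all tweets.
word_counts = Counter(word for tweet in cleaned_tweets for word in tweet.split())

# most_common() returns (word, count) pairs in descending order of frequency.
top_words = word_counts.most_common(3)
print(top_words)
# prints: [('new', 3), ('happy', 2), ('year', 2)]
```

Tracking the count of a campaign hashtag in such a table from week to week is one simple way to follow a campaign’s visibility.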

The last piece of code visualizes the word frequencies. Changing the index range (e.g., from the current 0:20 to 0:50) changes the number of words shown in the graph accordingly (see Figure 11).

Figure 11

Lesson 3. Network analysis

Network analysis is gaining popularity in public relations (Yang & Saffer, 2019). Network analysis deals with “structure and position” (Borgatti et al., 2013, p. 10). A network actor can be an individual, a group, or an organization. For example, companies have different types of relations with one another (Borgatti et al., 2013), such as similarities (e.g., type of business), business relations (e.g., joint venture, alliance), interactions (e.g., trade), and flows (e.g., technology transfer). Network analysis has been applied to various PR topics such as organization-public/stakeholder relations, employee communication, crisis communication, and CSR (Yang & Saffer, 2019).

Centrality, a classical structural property of a network, is one of the most commonly used concepts in network analysis and visualization (Freeman, 1978). Several PR studies have used centrality to investigate key publics/stakeholders (Hellsten et al., 2019; Himelboim & Golan, 2019), issues management (Sommerfeldt & Yang, 2017), agenda-setting (Guo, 2012), content diffusion networks (Himelboim & Golan, 2019), and CSR performance (Jiang & Park, 2022).

Network analysis can also be combined with text mining to examine how words occur together in text. Specifically, PR practitioners can map the brand image and salient issues of an organization by looking at which words co-occur with the organization’s name (Gilpin, 2010). In addition, PR practitioners can identify the community network (e.g., friends, followers) around influencers and target its members to draw their attention to a PR campaign, which, in turn, may motivate the influencers to share the content (Zhang et al., 2016). Another possible application of network analysis in PR is to identify potential publics who show advocacy with positive sentiment toward a related issue but not yet toward a client’s issue. Organizations can target them to foster supportive postings on social media.

Tutorial. Network analysis

Data are loaded in the same way as in the text mining section. Because network analysis is based on relations, the data should contain relational information. Relations can be expressed in many different ways. You may construct a relationship variable between organizations and/or publics from data outside social media, such as joint ventures, alliances, and NGO coalitions. You may also infer relationships from social media data. Follower-following relationships are one example: if User A follows User B, you may use that information for network analysis (e.g., User A → User B). Likewise, if User A mentions User B or retweets User B’s tweet, you may set a tie from User A to User B. The tie direction could be reversed depending on your perspective. For example, some people think that the relation should be User B → User A when User A retweets User B’s tweet because User B’s information flows to User A. The example code shows how to build mention relations. “From” indicates the users who mention a certain account, while “to” is the account mentioned by “from.” If you want retweet relationships, replace the red text in the first line of the code (i.e., the regular expression) with r"RT @([A-Za-z]+[A-Za-z0-9-_]+)". The data then indicate that users in the from column retweeted posts written by the user in the to column (see Figure 12).

Figure 12
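The from/to construction can be sketched with the standard library’s re module. The retweet pattern is the one quoted in the text. The mention pattern is an assumption obtained by dropping the “RT ” prefix, and the function name build_edges and the sample tweets are hypothetical.

```python
import re

# The retweet pattern is quoted in the text; the mention pattern is an assumed
# counterpart without the "RT " prefix.
MENTION_PATTERN = r"@([A-Za-z]+[A-Za-z0-9-_]+)"
RETWEET_PATTERN = r"RT @([A-Za-z]+[A-Za-z0-9-_]+)"


def build_edges(tweets, pattern=MENTION_PATTERN):
    """Return (from, to) pairs: the tweeting account and each account it names."""
    edges = []
    for author, text in tweets:
        for target in re.findall(pattern, text):
            edges.append((author, target))
    return edges


# Made-up (author, text) pairs.
tweets = [
    ("userA", "Great panel with @userB and @userC today"),
    ("userD", "RT @userB: Great panel today"),
]
print(build_edges(tweets))
# prints: [('userA', 'userB'), ('userA', 'userC'), ('userD', 'userB')]
print(build_edges(tweets, RETWEET_PATTERN))
# prints: [('userD', 'userB')]
```

Because a retweet also contains an @ sign, the mention pattern picks up retweets as well. The retweet pattern keeps only ties created by retweeting.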

Figure 13 shows two pieces of code: one that generates the network object (i.e., G) and one that visualizes it in Python. When a network has more than a few nodes (i.e., actors) and edges (i.e., relations), graphs drawn in Python are not visually attractive. Instead, many researchers use other visualization tools such as Gephi (e.g., Raupp, 2019; Yang et al., 2017). The software is free to use on Windows and Mac (download it and see details at https://gephi.org).

Figure 13

The code in Figure 14 transforms the network data into an Excel file for Gephi. To import the Excel spreadsheet into Gephi, click File → Import spreadsheet → open the Excel file (Gephi_df.xlsx) → import as “Edges table” in the general Excel options → Finish.

Figure 14

Here, n indicates the number of relations (i.e., how many times a source mentions a target on Twitter). Compared with a Python graph, Gephi generates visually attractive and easy-to-understand network graphics (see Figure 15).
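An edge table with the count n can be sketched as below. To stay within the standard library, this sketch writes CSV text rather than the Excel file produced by the shared code. Gephi’s spreadsheet importer accepts CSV files with Source and Target columns as an edges table. The sample pairs and the function name write_edge_table are hypothetical.

```python
import csv
import io
from collections import Counter

# Made-up (from, to) mention pairs.
edges = [("userA", "userB"), ("userA", "userB"), ("userC", "userB")]

# n = how many times each source mentions each target.
edge_counts = Counter(edges)


def write_edge_table(edge_counts, handle):
    """Write a Source/Target/n table that Gephi can import as an edges table."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "n"])
    for (source, target), n in edge_counts.items():
        writer.writerow([source, target, n])


# An in-memory buffer keeps the example free of side effects; swap it for
# open("Gephi_df.csv", "w", newline="") to save a file instead.
buffer = io.StringIO()
write_edge_table(edge_counts, buffer)
print(buffer.getvalue())
```

Each row describes one directed tie, so repeated mentions between the same pair of accounts are collapsed into a single row with a larger n.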

Figure 15

Centrality is one of the most frequently used metrics in network analysis. There are many different types of centrality, such as in-degree/out-degree centrality, betweenness centrality, and eigenvector centrality. In the shared code, the NetworkX Python package provides the different centrality calculations. See NetworkX (n.d.) for more network algorithms and their parameters. For example, when “degree_centrality” in the code is replaced with “betweenness_centrality,” the code generates betweenness centrality scores for each node (see Figure 16).

Figure 16
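The centrality step can be sketched with NetworkX as follows. The edges are made up, and treating the mention network as a directed graph (nx.DiGraph) is an assumption, since the shared code may build G differently.

```python
import networkx as nx

# Made-up directed mention edges: (from, to).
edges = [("userA", "userB"), ("userC", "userB"), ("userB", "userD")]

G = nx.DiGraph()
G.add_edges_from(edges)

# Number of ties (incoming plus outgoing) divided by the number of other nodes.
degree = nx.degree_centrality(G)
# Incoming ties only: how often an account is mentioned by others.
in_degree = nx.in_degree_centrality(G)
# Share of shortest paths between other nodes that pass through a node.
betweenness = nx.betweenness_centrality(G)

# Rank accounts from most to least central by degree.
ranking = sorted(degree, key=degree.get, reverse=True)
print(ranking[0], degree[ranking[0]], betweenness[ranking[0]])
```

In this toy network, userB ranks first on every measure because all other accounts are connected through it.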

Suggested Curriculum

If an introductory SMA course is offered over a 16-week semester, the course can be designed as in Table 3. To achieve the learning objectives in this teaching brief, it is critical that students type and edit the shared code themselves rather than just read or watch it. The suggested curriculum therefore focuses on hands-on experience of PR SMA with Python. The first week introduces the course. The next two weeks cover PR in the digital era, social media and its application in PR, and an SMA case study. After this conceptual grounding in SMA, two weeks are devoted to each practical programming section: Python basics, data collection, text mining, and network analysis. Finally, the remaining weeks are used for final projects and presentations. The curriculum can be adjusted to students’ abilities and prerequisite courses to meet the needs of a particular class.

Table 3: Example of PR SMA Course Curriculum

Week 1: Introduction
– Introduction to Course and Python
– PR in the digital era
– Social media and its application in PR
– Understanding social media analytics

Weeks 3-4: Python programming
– Installation and Setup of Python and Jupyter Notebook
– Installing Python packages
– Reading and writing data (e.g., XLSX, CSV)
– Data types (e.g., list, dictionary, tuple, JSON)
– Pandas data structure
– Data cleaning (e.g., data selection, merge, recode)
– Basic functions (e.g., define, for, if-else, while)

Weeks 5-6: Data collection
– See learning outcome in Table 1 for programming contents
– Three different ways of data collection (crawling, API, and industrial platform)
– Introduction to data collection with API
– Data collection assignment

Weeks 7-8: Text mining
– See learning outcome in Table 1 for programming contents
– Conceptual understanding of text mining
– Text mining assignment

Weeks 9-10: Network analysis
– See learning outcome in Table 1 for programming contents
– Conceptual understanding of network analysis
– Network analysis assignment

Weeks 11-13: Applications of social media analytics
– SMA case study
– Social media metrics and evaluations
– Social media campaigns based on SMA

Weeks 14-15: Final project
– Final project introduction
– Group Work days

Week 16: Student presentation
– Final project presentation

What if educators can’t offer a separate class focusing on social media analytics and PR? We suggest a short module within a PR research class. PR research classes generally need to cover many topics, such as qualitative and quantitative research. However, research methods classes in the digital age should also teach how to use social media to solve PR problems. PR educators may present multiple research methods using qualitative skills (e.g., focus group interviews), quantitative skills (e.g., surveys), and social media analytics through Python (e.g., text mining and network analysis). Students can then analyze unstructured data by choosing between text mining and network analysis.

Assessment of Student Learning

In a semester-long class, students can be assessed through three assignments (each worth 15% of the final grade) and a final group project (worth 45% of the final grade), with the remaining 10% allocated to other components (e.g., attendance).

For the data collection assignment, students submit a Python code file edited to collect tweets with their own search queries. If the code runs without error, they receive full credit. Instructors may consider extra credit when students collect data from other social media or through web crawling. The text mining assignment asks students to submit Python code that creates a word cloud and a word frequency visualization from the data collected in the data collection assignment. In addition, students should submit a written report analyzing the text mining results because editing a few lines of code alone is too easy a task for 15% of the grade. Students who conduct additional analysis, such as sentiment analysis or topic modeling, can be given extra credit. Likewise, the network analysis assignment requires the edited Python code and a report. Students can receive extra credit on this assignment when they go beyond the suggested code and present a network visualization created with Gephi.

Lastly, the final project is group work in teams of three. Each team selects a large organization (e.g., an S&P 500 company) so that it can collect a sufficiently large amount of social media data. Teams are asked to complete (1) traditional formative research, (2) data collection, (3) text mining, (4) network analysis, and (5) a social media campaign plan. Table 4 presents an example of the final project rubric.

Table 4: Final Project Rubric

Criteria: Traditional formative research (Weight: 20%)
Contents:
– Organizational history & mission
– Industry background & trend
– Identification of stakeholders, publics, and society
– Traditional news media analysis
– SWOT analysis

Criteria: Data collection (Weight: 20%)
Contents:
– Social media data collection (e.g., tweets, Facebook posts)
– Identification of popular social texts
– Identification of key individuals (e.g., influencers)

Criteria: Text mining (Weight: 20%)
Contents:
– Main topics about company, brand, or products
– Sentiment analysis
– Text mining visualization (e.g., word cloud)

Criteria: Network analysis (Weight: 20%)
Contents:
– Identification and network positions of key publics and stakeholders
– Network visualization with Gephi

Criteria: Social media campaign planning (Weight: 20%)
Contents:
– Discussion of current PR-related problems from formative research and social media analytics
– Making three social media assets/tactics with target audiences
– Presentation of expected outcomes and impact on stakeholders, publics, and society, and a measurement plan for campaign success

This project gives students a chance to apply in practice the skills and knowledge they learn in the suggested SMA class. Through the final project, they should come to recognize the need for SMA alongside traditional PR formative research (e.g., media coverage). The final project can also be adjusted if students in the class have not taken a PR strategy or campaign class.

REFERENCES

Anaconda. (n.d.). Anaconda. https://www.anaconda.com/products/individual

Borgatti, S., Everett, M., & Johnson, J. (2013). Analyzing social networks. SAGE Publications.

Chon, M.-G., & Kim, S. (2022). Dealing with the COVID-19 crisis: Theoretical application of social media analytics in government crisis management. Public Relations Review, 48(3), 102201. https://doi.org/10.1016/j.pubrev.2022.102201

Codeone Publishing. (2022). Python programming for beginners: The #1 Python programming crash course to learn Python coding well and fast.

Commission on Public Relations Education. (2018). Fast forward: The 2017 report on undergraduate public relations education. http://www.commissionpred.org/wp-content/uploads/2018/04/report6-full.pdf

du Plessis, C. (2018). Social media crisis communication: Enhancing a discourse of renewal through dialogic content. Public Relations Review, 44(5), 829-838. https://doi.org/10.1016/j.pubrev.2018.10.003 

Ewing, M., Kim, C. M., Kinsky, E. S., Moore, S., & Freberg, K. (2018). Teaching digital and social media analytics: Exploring best practices and future implications for public relations pedagogy. Journal of Public Relations Education, 4(2), 51-86. https://journalofpreducation.com/2018/08/17/teaching-digital-and-social-media-analytics-exploring-best-practices-and-future-implications-for-public-relations-pedagogy/

Freeman, L. (1978). Centrality in social networks conceptual clarification. Social Networks, 1(3), 215-239. https://doi.org/10.1016/0378-8733(78)90021-7

Gilpin, D. (2010). Organizational image construction in a fragmented online media environment. Journal of Public Relations Research, 22(3), 265-287. https://doi.org/10.1080/10627261003614393 

Grimmer, J., Roberts, M. E., & Stewart, B. M. (2021). Machine learning for social science: An agnostic approach. Annual Review of Political Science, 24(1), 395-419. https://doi.org/10.1146/annurev-polisci-053119-015921 

Grunig, J. E., & Grunig, L. A. (2009). The excellence theory. In C. H. Botan & V. Hazleton (Eds.), Public relations theory II (pp. 21-62). Routledge.

Guo, L. (2012). The application of social network analysis in agenda setting research: A methodological exploration. Journal of Broadcasting & Electronic Media, 56(4), 616-631. https://doi.org/10.1080/08838151.2012.732148 

Hellsten, I., Jacobs, S., & Wonneberger, A. (2019). Active and passive stakeholders in issue arenas: A communication network approach to the bird flu debate on Twitter. Public Relations Review, 45(1), 35-48. https://doi.org/10.1016/j.pubrev.2018.12.009 

Hickman, L., Thapa, S., Tay, L., Cao, M., & Srinivasan, P. (2020). Text preprocessing for text mining in organizational research: Review and recommendations. Organizational Research Methods, 25(1), 114-146. https://doi.org/10.1177/1094428120971683 

Himelboim, I., & Golan, G. J. (2019). A social networks approach to viral advertising: The role of primary, contextual, and low influencers. Social Media + Society, 5(3). https://doi.org/10.1177/2056305119847516 

Jiang, Y., & Park, H. (2022). Mapping networks in corporate social responsibility communication on social media: A new approach to exploring the influence of communication tactics on public responses. Public Relations Review, 48(1), 102143. https://doi.org/10.1016/j.pubrev.2021.102143 

Kent, M. L., Carr, B. J., Husted, R. A., & Pop, R. A. (2011). Learning web analytics: A tool for strategic communication. Public Relations Review, 37(5), 536-543. https://doi.org/10.1016/j.pubrev.2011.09.011 

Kim, C. M. (2021). Social media campaigns: Strategies for public relations and marketing (2nd ed.). Routledge.

Kim, J.-N., & Rhee, Y. (2011). Strategic thinking about employee communication behavior (ECB) in public relations: Testing the models of megaphoning and scouting effects in Korea. Journal of Public Relations Research, 23(3), 243-268. https://doi.org/10.1080/1062726X.2011.582204

Kim, S.-W. (2022, March 21). [Twitter API] AcademicTrack. https://github.com/formulated/PR_education_Python/blob/main/Crawling_Twitter%20Academic%20Track/%5BTwitter%20API%5D%20AcademicTrack.ipynb

Liu, B. (2011). Web data mining: Exploring hyperlinks, contents, and usage data (2nd ed.). Springer. https://doi.org/10.1007/978-3-642-19460-3

Macnamara, J. (2016). Organizational listening: Addressing a major gap in public relations theory and practice. Journal of Public Relations Research, 28(3-4), 146-169. https://doi.org/10.1080/1062726X.2016.1228064 

Matthes, E. (2019). Python crash course: A hands-on, project-based introduction to programming (2nd ed.). No Starch Press.

Meganck, S., Smith, J., & Guidry, J. P. D. (2020). The skills required for entry-level public relations: An analysis of skills required in 1,000 PR job ads. Public Relations Review, 46(5), 101973. https://doi.org/10.1016/j.pubrev.2020.101973 

NetworkX. (n.d.). Algorithms. https://networkx.org/documentation/stable/reference/algorithms/index.html 

Programming with Mosh. (2020, September 16). Python for beginners. Learn Python in 1 hour. https://www.youtube.com/watch?v=kqtD5dpn9C8&t=24s

Raupp, J. (2019). Crisis communication in the rhetorical arena. Public Relations Review, 45(4), 101768. https://doi.org/10.1016/j.pubrev.2019.04.002 

Rust, R. T., Rand, W., Huang, M.-H., Stephen, A. T., Brooks, G., & Chabuk, T. (2021). Real-time brand reputation tracking using social media. Journal of Marketing, 85(4), 21-43. https://doi.org/10.1177/0022242921995173 

Sommerfeldt, E. J., & Yang, A. (2017). Relationship networks as strategic issues management: An issue-stage framework of social movement organization network strategies. Public Relations Review, 43(4), 829-839. https://doi.org/10.1016/j.pubrev.2017.06.012 

Szalacsi, B. (2019). AI and data science understanding is now a critical path for public relations and communications professionals. https://medium.com/infonation-monthly/ai-and-data-science-understanding-is-now-a-critical-path-for-public-relations-and-communications-1617731a99b0

TIOBE. (2022). TIOBE Index for March 2022. Retrieved March 21 from https://www.tiobe.com/tiobe-index/

Trafalgar Strategy. (2022). PR & Python: Why all PRs can benefit from coding experience. Trafalgar Strategy. https://www.trafalgar-strategy.co.uk/pr-python-why-all-prs-can-benefit-from-coding-experience/

Twitter Developer Platform. (n.d.). Twitter academic research access. https://developer.twitter.com/en/products/twitter-api/academic-research

Welbers, K., Van Atteveldt, W., & Benoit, K. (2017). Text analysis in R. Communication Methods and Measures, 11(4), 245-265. https://doi.org/10.1080/19312458.2017.1387238 

Woodie, A. (2021). What’s driving Python’s massive popularity? Retrieved March 21 from https://www.datanami.com/2021/10/20/whats-driving-pythons-massive-popularity/

Yang, A., & Saffer, A. J. (2019). Embracing a network perspective in the network society: The dawn of a new paradigm in strategic public relations. Public Relations Review, 45(4), 101843. https://doi.org/10.1016/j.pubrev.2019.101843 

Yang, A., Wang, R., & Wang, J. (2017). Green public diplomacy and global governance: The evolution of the U.S–China climate collaboration network, 2008–2014. Public Relations Review, 43(5), 1048-1061. https://doi.org/10.1016/j.pubrev.2017.08.001 

Zhang, K., Bhattacharyya, S., & Ram, S. (2016). Large-scale network analysis for online social brand advertising. MIS Quarterly, 40(4), 849-868. https://www.jstor.org/stable/26629679 

© Copyright 2023 AEJMC Public Relations Division

To cite this article: Kim, S. and Chon, M. (2023). Teaching Social Media Analytics in Public Relations Classes: Focusing on the Python Program. Journal of Public Relations Education, 9(1), 117-146. https://journalofpreducation.com/?p=3663
