Rubicon Rehabilitation Center in the Virginia Press 1971-1976

Rubicon Graduates Celebrating: Photo from Rubicon Current vol 2. Box 60 Folder 15, The Papers of Stanley Clay Walker, Special Collections and University Archives, Patricia W. and J. Douglas Perry Library, Old Dominion University Libraries, Norfolk, VA.

By 1971 Rubicon had become the largest in-patient rehabilitation program in the state of Virginia, maintaining extensive partnerships with the Department of Vocational Rehabilitation, the Medical College of Virginia (MCV), the Richmond City Health Department, and the Richmond Public School system. Through its partnership with MCV it became the only federally approved methadone program between Washington, D.C. and Miami.¹ At a time when the merits and demerits of drug abuse treatment were in constant debate internationally, Rubicon became the medium through which newspapers throughout Virginia localized rehabilitation issues. Text-mining tools in R, applied to a corpus of 80 newspapers from five different cities in Virginia, offer a glimpse of this conversation.

The Corpus Over Time

The Danville Register, The Harrisonburg Daily Record, The Norfolk Journal and Guide, The Petersburg Progress Index, The Winchester Star

The graph above shows that mentions of Rubicon generally declined over time. This is probably due to several factors: the decline in novelty, the slowing of intake at Rubicon, and shifting drug control priorities. The graph also reveals the relationship Rubicon had with Petersburg: many of Rubicon’s admits were funneled to it through the Petersburg court system. Interestingly, mentions in the two cities with Rubicon facilities near their localities drop off in 1973. This is in line with larger statewide trends that show a dip in drug arrests in 1973.
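The counts behind a graph like this can be produced in a few lines of base R. The sketch below is only an illustration: the article texts, paper names, and the helper count.mentions() are invented here, and the original analysis imported a prepared CSV of mentions rather than counting them this way.

```r
# Invented sample articles; the real corpus is not reproduced here.
articles <- data.frame(
  Newspaper = c("Paper A", "Paper A", "Paper B", "Paper B"),
  Year      = c(1971, 1971, 1971, 1972),
  Text      = c("Rubicon opened a new house. Rubicon staff spoke.",
                "The court referred him to Rubicon.",
                "No local program was named.",
                "Officials toured Rubicon."),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count case-insensitive whole-word matches per article.
count.mentions <- function(text, term) {
  hits <- gregexpr(paste0("\\b", term, "\\b"), text,
                   ignore.case = TRUE, perl = TRUE)
  # gregexpr() returns -1 when there is no match, so count positive positions only
  vapply(hits, function(h) sum(h > 0), integer(1))
}

articles$Mentions <- count.mentions(articles$Text, "rubicon")

# Total mentions per newspaper and year, ready for a line graph
mentions <- aggregate(Mentions ~ Newspaper + Year, data = articles, FUN = sum)
print(mentions)
```

The resulting data frame uses the same column names (Year, Mentions, Newspaper) as the plotting call in the Code section, so it could be passed to that call in place of the imported CSV.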

Most Characteristic Words in the Corpus

tf-idf-formula

Using the TF-IDF (term frequency-inverse document frequency) statistic to extract key terms from the four newspapers that wrote the most about Rubicon can provide a distant look at the semantic differences between newspapers. For more on TF-IDF see Kan Nashida’s blog.
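To make the statistic concrete, here is a minimal sketch on a toy matrix. The frequencies and paper names are invented for illustration; the weighting, tf * log(N / (df + 1)), is the smoothed variant that appears in the Code section, where N is the number of documents and df is the number of documents containing the word.

```r
# Toy normalized term frequencies: rows are newspapers, columns are words.
# All values are invented for illustration.
tf <- matrix(
  c(0.02, 0.00, 0.00, 0.01,
    0.00, 0.03, 0.00, 0.01,
    0.00, 0.00, 0.01, 0.02),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("paperA", "paperB", "paperC"),
                  c("ceremony", "court", "lsd", "drug"))
)

# Smoothed TF-IDF: term frequency times log(N / (df + 1))
tfidf <- apply(tf, 2, function(x) x * log(nrow(tf) / (sum(x != 0) + 1)))

# The word with the highest score in each row is that paper's most characteristic word
top.word <- colnames(tfidf)[apply(tfidf, 1, which.max)]
names(top.word) <- rownames(tfidf)

print(round(tfidf, 4))
print(top.word)
```

Words that appear in only one paper receive the highest weights, while a word that appears in every paper ("drug" in this toy example) is pushed below zero by the + 1 smoothing, which is why common vocabulary never shows up among the most characteristic terms.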

tfidf

As can be seen, TF-IDF produces some interesting results. The Danville newspaper uses words like “cares”, “ceremony”, and “morals”, showing an interest in the positive impact of Rubicon. It also uses words like “chain”, “officials”, and “mental”, which may reflect an interest in the organizational mechanics of Rubicon. Similarly, Harrisonburg uses words like “experimentation”, “crowded”, and “designing” that imply an interest in how Rubicon was run and maintained. The overlap of words between Harrisonburg and Danville may be due to proximity. The two cities were farther away from Rubicon than Norfolk and Petersburg and likely relied on the same AP reports. The Norfolk Journal and Guide is the only historically black newspaper in the corpus and, according to the TF-IDF metric, discusses the Black Panthers more. It is also the only newspaper that has a drug word (LSD) among its ten most characteristic words. Words like “mediated”, “helped”, and “intervened” point to the expansion of Rubicon into the Norfolk area in 1973. Words in the Petersburg Progress Index reflect a similar closeness between Petersburg and Rubicon. “Unemployment”, “problems”, and the disproportionately frequent use of “their” signify close economic and organizational ties.

Correlations

Correlation matrices are another text-mining tool that can help shed light on Rubicon without a close reading. A correlation measures the strength of the relationship between two variables, here the frequencies of two words across the corpus. A correlation below 0 indicates a negative relationship, while a correlation above 0 indicates a positive relationship.
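As a small illustration of how such a matrix is read, the sketch below correlates three word-frequency columns across four made-up articles. The numbers are invented and chosen only so that one pair of words rises together and another pair moves in opposite directions; they are not drawn from the corpus.

```r
# Invented normalized frequencies: one row per article, one column per word.
freq <- data.frame(
  rubicon = c(0.010, 0.020, 0.030, 0.040),
  men     = c(0.002, 0.004, 0.005, 0.008),
  women   = c(0.006, 0.005, 0.003, 0.001)
)

# Pearson correlation between every pair of columns
cor.matrix <- cor(freq)
print(round(cor.matrix, 2))
```

In this toy data "men" becomes more frequent as "rubicon" does, so their correlation is positive, while "women" becomes less frequent, so its correlation with "rubicon" is negative. The diagonal is always 1 because every word correlates perfectly with itself.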

rubicon-dem-corr

The matrix above shows a close correlation between the word “Rubicon” and the plural “men” across the whole corpus. On the other hand, it also shows a negative correlation between “Rubicon” and the plural “women”. Surprisingly, race did not play a significant role in the coverage of Rubicon in the newspapers, even though it appeared frequently throughout the press during this period.

race

rubicon-and-justice

Rubicon’s relationship with the workings of the justice system is a bit more nuanced. It’s important to remember that all of the newspapers mention Rubicon. The fact that Rubicon does not correlate highly with “rehabilitation” and “treatment” suggests that Rubicon had reached such a level of public notoriety that it no longer had to be described using these terms. Even so, there is still a positive correlation between it and the words “arrested” and “court.”

kelly-and-menken

F. John Kelly, the director of the Governor’s Council on Narcotics and Drug Abuse Control, and Ed Menken, the director of Rubicon, had a sometimes contentious relationship in the press. Menken frequently accused Kelly of taking a soft approach toward drug rehabilitation. The graphic above shows that Kelly correlates more highly than Menken with “treatment” but not with “rehabilitation.” This could simply be a matter of different word choices between the two. After all, Kelly is mentioned in 12 different articles while Menken is mentioned in only 6.

The positive correlation between Kelly and Menken denotes the level of dialogue between the two. From the view of the frequent newspaper reader, Kelly and Menken were locked in constant debate over rehabilitation resources and agendas. This constant pairing would have made Menken seem less like the director of a private rehab and more like Kelly’s political equivalent. Another surprise from Figure 6 is the lack of correlation between Kelly, Menken, and Rubicon and the word “methadone.” Despite their advocacy for rehabilitation and treatment, neither Kelly nor Menken wanted to broach the controversial topic of methadone.

Conclusion

correlation

An analysis of the terms that correlate with Rubicon shows that its institutional identity clearly exceeds its grassroots activist identity. Clinical terms like “detoxification”, “termed”, “outpatient”, “intensive”, “acute”, “provide”, and “offer” speak to the business and medical side of the organization, and perhaps signify its movement toward a rehab run by medical professionals rather than former addicts. Coverage of Rubicon in the Virginia press neutralized the racial and activist components of the organization, thus helping to perpetuate the image of it as a state institution that both engaged in policy discussions and became a component of the justice system.

Code

library(stringr)
 library(corrplot)
 library(ggplot2)
Convert downloaded articles into .txt and place in a data frame
# folder with article PDFs
dest <- "C:\\Users\\virgo\\Desktop\\Rubicon"
# make a vector of PDF file names
myfiles <- list.files(path = dest, pattern = "pdf", full.names = TRUE)
# convert each PDF file that is named in the vector into a text file
# text file is created in the same directory as the PDFs
# use pdftotext.exe
lapply(myfiles, function(i) system(paste('"C:\\Users\\Virgo\\Desktop\\xpdf/bin64/pdftotext.exe"', paste0('"', i, '"')), wait = FALSE) )
# create vector of txt file names
rubiconfiles <- list.files(path = dest, pattern = "txt", full.names = TRUE)
# turn into a list (assumed: each text file is read into a single string)
obj_list <- lapply(rubiconfiles, function(f) paste(readLines(f, warn = FALSE), collapse = " "))
rubicon <- data.frame(obj_list)
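Clean up rubicon
## import rubicon.csv
## convert article text into lowercase and turn it into a string
rubicon$Text <- tolower(rubicon$Text)
rubicon.string <- paste(rubicon$Text, collapse = " ")
## split the string into words
rubicon.string <- unlist(str_split(rubicon.string, "\\W"))
Word.list.df <- data.frame(table(rubicon.string))
colnames(Word.list.df) <- c("word", "count")
## remove blanks, lower, numbers
Word.list.df <- Word.list.df[Word.list.df$word != "", ]
Word.list.df$word <- tolower(Word.list.df[,1])
## assumed filter: drop any token that contains a digit
Word.list.df <- Word.list.df[!grepl("[0-9]", Word.list.df$word), ]
### create DTM
target.list <- paste0("\\b", Word.list.df$word, "\\b")
DTM.df <- data.frame(matrix(data = NA, nrow = nrow(rubicon),
                            ncol = length(target.list)))
for (i in seq_along(target.list))
{
  DTM.df[,i] <- str_count(rubicon$Text, target.list[i])
}
colnames(DTM.df) <- Word.list.df$word
# normalize DTM (assumed: each row is divided by that article's total word count)
total.words <- rowSums(DTM.df)
DTM.matrix <- as.matrix(DTM.df)
DTM.matrix <- DTM.matrix / total.words
DTM.norm.df <- data.frame(DTM.matrix)
# For Figure 2
### import rubicon mentions.csv and create a line graph that shows mentions of Rubicon over time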
 ggplot(yy, aes(Year,Mentions))+geom_line(aes(colour=Newspaper), size=1.5)+labs(title="Mentions of 'Rubicon' Over Time") + xlab("Year") + ylab("Mentions") +theme_bw()
 For Correlations
 ##correlation
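# terms named in the text; the other terms of interest are added the same way
short.list <- c("rubicon", "men", "women")
DTM.norm.mini.df <- DTM.norm.df[, short.list]
# To get the correlation matrix
cor.matrix.mini <- cor(DTM.norm.mini.df)
round(cor.matrix.mini, 2) ## rounds off at 2 places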
 corrplot(cor.matrix.mini, method="shade",shade.col=NA,tl.col="black",tl.srt=45,addCoef.col="black",order="AOE", type="lower",title="Rubicon and Demographic Correlations",mar=c(0,0,2,0) )
 For Figure 8
 #word associations
 findAssocs(DTM, "rubicon", 0.57)
 #build dataframe for plotting
 toi <- "rubicon" # term of interest
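corlimit <- 0.57 # lower correlation bound, assumed to match the findAssocs() call above
rubiconterms <- data.frame(corr = findAssocs(DTM, toi, corlimit)[[1]],
                           Terms = names(findAssocs(DTM, toi, corlimit)[[1]]))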
 ggplot(rubiconterms, aes( y = Terms)) +geom_point(aes(x = corr), data = rubiconterms, size=2) +xlab(paste0("Correlation with the term ", "\"", toi, "\""))
 For Figure 3
 library(tm)
 library(RWeka)
 library(stringr)
 #import rubicon.csv and condense into articles by paper
 by.paper<-NULL
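for (paper in unique(rubicon$X4)) {
  subset <- rubicon[rubicon$X4 == paper, ]
  text <- str_c(subset$Text, collapse = " ")
  row <- data.frame(paper, text)
  by.paper <- rbind(by.paper, row)
}
# create corpus (readTabular() is only available in tm versions before 0.7)
myReader <- readTabular(mapping = list(content = "text", id = "paper"))
corpus <- Corpus(DataframeSource(by.paper), readerControl = list(reader = myReader))
# pre-process text (the five steps below are assumed, typical tm clean-up steps)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)
# create term document matrix
tdm<-TermDocumentMatrix(corpus)
# remove sparse terms (object name and the 0.4 threshold are placeholders)
tdm.dense <- removeSparseTerms(tdm, 0.4)
# save as a simple data frame
count.all <- data.frame(as.matrix(tdm))
count.all$word <- row.names(count.all)
write.csv(count.all, "C:\\Users\\virgo\\Desktop\\folder\\tdm.csv", row.names=FALSE)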
 #normalize
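## paste the text into one long string
big.string <- paste(by.paper$text, collapse = " ")
## split the string into words
big.string <- unlist(str_split(big.string, "\\W"))
## get a dataframe of word frequency
Word.list.df <- data.frame(table(big.string))
## give the dataframe some nice names
colnames(Word.list.df) <- c("word", "count")
## remove blanks
Word.list.df <- Word.list.df[Word.list.df$word != "", ]
## add \\b so the words are ready for regex searches
target.list <- paste0("\\b", Word.list.df$word, "\\b")
## count every word in each paper's combined text
count.matrix <-
  sapply(X = target.list, FUN = function(x) str_count(by.paper$text, x))
## lines below are clean up
DTM.df <- data.frame(count.matrix)
colnames(DTM.df) <- Word.list.df$word
DTM.matrix <- as.matrix(DTM.df)
DTM.matrix <- DTM.matrix / rowSums(DTM.matrix)
DTM.norm.df <- data.frame(DTM.matrix)
## tf-idf: normalized frequency times log(N / (df + 1))
paper.tfidf.df <- data.frame(apply(DTM.norm.df, 2,
  function(x) x*log(nrow(DTM.norm.df)/(sum(x!=0)+1))))
rownames(paper.tfidf.df)<-c("Danville","Harrisonburg","Petersburg","Radford","Winchester","Norfolk")
## index of the paper to rank (row 6 is Norfolk)
x<-6
Tfidf.ten.df <- paper.tfidf.df
## transpose for easier sorting
Tfidf.ten.df <- data.frame(t(Tfidf.ten.df))
## add words
Tfidf.ten.df$words <- Word.list.df$word
## sort and get top ten (assumed: order by the column for paper x)
tfidf.ten <- Tfidf.ten.df[order(Tfidf.ten.df[, x], decreasing = TRUE)[1:10], ]
tfidf.ten$words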
 ###plot tfidf
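## n, p, d, h and mycolors are built at this point; their definitions are missing from this listing
## p needs one row per paper and word with a rank column; mycolors needs one fill colour per paper
colnames(p)[1]<-"paper"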
 colnames(p)[2]<-"word"
 ggplot(p, aes(paper, rank)) +
 geom_point(color="white") +
 geom_label(aes(label=p$word,fill=p$paper), color='white', fontface='bold', size=5) +
 scale_fill_manual(values = mycolors) +
 theme_classic() +
 theme(legend.position=1,plot.title = element_text(size=18), axis.title.y=element_text(margin=margin(0,10,0,0))) +
 labs(title="Most Characteristic Words per Newspaper") +
 xlab("") + ylab("Ranking by TF-IDF") +
 scale_y_continuous(limits=c(-4,10), breaks=c(1,6,10), labels=c("#1","#5", "#10")) +
 annotation_custom(Norfolk, xmin=.5, xmax=1.5, ymin=0, ymax=-4) +
 annotation_custom(Petersburg, xmin=1.5, xmax=2.5, ymin=0, ymax=-4) +
 annotation_custom(Danville, xmin=2.5, xmax=3.5, ymin=0, ymax=-4) +
 annotation_custom(Harrisonburg, xmin=3.5, xmax=4.5, ymin=0, ymax=-4)
 For Figure 5
#import csv of race article numbers
 p<-ggplot(race,aes(x=newspaper, y=articles,fill=as.factor(newspaper))) + geom_bar(stat="identity")+facet_wrap(~word, scales = "free")+theme(axis.text.x = element_text(angle = 45, hjust = 1))