Getting data on your government
September 1, 2012
Tags: R government nytimes sunlightlabs API transparency
I created an R package a while back, govdat, to interact with some APIs that serve up data on what our elected representatives are up to, including the New York Times Congress API and the Sunlight Labs API.
What kinds of things can you do with govdat? Here are a few examples.
How do the two major parties differ in their use of certain words? This example searches the Congressional Record using the Sunlight Labs Capitol Words API.
# install_github('govdat', 'schamberlain')
library(govdat)
library(reshape2)
library(ggplot2)

dems <- sll_cw_dates(phrase = "science", start_date = "1996-01-20", end_date = "2012-09-01",
    granularity = "year", party = "D", printdf = TRUE)
repubs <- sll_cw_dates(phrase = "science", start_date = "1996-01-20", end_date = "2012-09-01",
    granularity = "year", party = "R", printdf = TRUE)
df <- melt(rbind(data.frame(party = rep("D", nrow(dems)), dems), data.frame(party = rep("R",
    nrow(repubs)), repubs)))
df$count <- as.numeric(df$count)

ggplot(df, aes(yearmonth, count, colour = party, group = party)) + geom_line() +
    scale_colour_manual(values = c("blue", "red")) + labs(y = "use of the word 'Science'") +
    theme_bw(base_size = 18) + opts(axis.text.x = theme_text(size = 10), panel.grid.major = theme_blank(),
    panel.grid.minor = theme_blank(), legend.position = c(0.2, 0.8))

Let's get some data on donations to individual elected representatives.
library(plyr)

# Let's get Nancy Pelosi's entity ID
sll_ts_aggregatesearch("Nancy Pelosi")[[1]]
$name
[1] "Nancy Pelosi (D)"
$count_given
[1] 0
$firm_income
[1] 0
$count_lobbied
[1] 0
$seat
[1] "federal:house"
$total_received
[1] 13769274
$state
[1] "CA"
$lobbying_firm
NULL
$count_received
[1] 9852
$party
[1] "D"
$total_given
[1] 0
$type
[1] "politician"
$id
[1] "85ab2e74589a414495d18cc7a9233981"
$non_firm_spending
[1] 0
$is_superpac
NULL
# Her entity ID
sll_ts_aggregatesearch("Nancy Pelosi")[[1]]$id
[1] "85ab2e74589a414495d18cc7a9233981"
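The search is called again each time its result is needed. A minimal sketch of storing the result once and reading fields from it by name follows; `pelosi_search` is a hypothetical stand-in built from a few of the fields printed above, not a live call to `sll_ts_aggregatesearch`, and the variable names are only illustrative.

```r
# Hypothetical stand-in for the value of sll_ts_aggregatesearch("Nancy Pelosi"),
# holding a few of the fields printed above; a real call needs govdat and
# network access.
pelosi_search <- list(
  list(
    name = "Nancy Pelosi (D)",
    seat = "federal:house",
    state = "CA",
    party = "D",
    type = "politician",
    id = "85ab2e74589a414495d18cc7a9233981"
  )
)

# The result is a list of matches; take the first match and read fields by name
first_match <- pelosi_search[[1]]
pelosi_id <- first_match$id

# Guard against grabbing the wrong kind of entity before using the ID downstream
stopifnot(first_match$type == "politician", nzchar(pelosi_id))
pelosi_id
```

Keeping the ID in a variable means later calls can reuse it instead of repeating the search.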
# And search for her top donors by sector
nancy <- ldply(sll_ts_aggregatetopsectors(sll_ts_aggregatesearch("Nancy Pelosi")[[1]]$id))
nancy # but just abbreviations for sectors
sector count amount
1 F 1847 2698672.00
2 P 981 2243050.00
3 H 829 1412700.00
4 K 1345 1409836.00
5 Q 1223 1393154.00
6 N 829 1166187.00
7 B 537 932044.00
8 W 724 760800.00
9 Y 820 664926.00
10 E 201 283575.00
data(sll_ts_sectors) # load sector abbreviations data
nancy2 <- merge(nancy, sll_ts_sectors, by = "sector") # attach full sector names
nancy2_melt <- melt(nancy2[, -1], id.vars = 3)
nancy2_melt$value <- as.numeric(nancy2_melt$value)

# and let's plot some results; stat = "identity" plots the values as given
ggplot(nancy2_melt, aes(sector_name, value)) + geom_bar(stat = "identity") + coord_flip() +
    facet_wrap(~variable, scales = "free", ncol = 1)

## It looks like a lot of individual donations (the count facet) by
## finance/insurance/real estate, but by amount, the most (by a slim margin)
## is from labor organizations.
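The ranking can also be read straight from the table rather than the plot. The sketch below rebuilds the printed `nancy` table by hand and converts the columns before ranking. Storing the values as factors is an assumption about how they might arrive from the API, and `nancy_tbl` and `to_num` are hypothetical names, not part of govdat.

```r
# Values copied from the printed `nancy` table above. They are stored as factors
# here on the assumption that the API values may not arrive as numbers.
nancy_tbl <- data.frame(
  sector = c("F", "P", "H", "K", "Q", "N", "B", "W", "Y", "E"),
  count = c("1847", "981", "829", "1345", "1223", "829", "537", "724", "820", "201"),
  amount = c("2698672.00", "2243050.00", "1412700.00", "1409836.00", "1393154.00",
             "1166187.00", "932044.00", "760800.00", "664926.00", "283575.00"),
  stringsAsFactors = TRUE
)

# as.numeric() on a factor returns its integer level codes, not the printed
# values, so convert through as.character() first
to_num <- function(x) as.numeric(as.character(x))
nancy_tbl$sector <- as.character(nancy_tbl$sector)
nancy_tbl$count <- to_num(nancy_tbl$count)
nancy_tbl$amount <- to_num(nancy_tbl$amount)

# Leading sector code on each measure
top_by_count <- nancy_tbl$sector[which.max(nancy_tbl$count)]
top_by_amount <- nancy_tbl$sector[which.max(nancy_tbl$amount)]

# Full table ordered by amount, largest first
nancy_tbl[order(nancy_tbl$amount, decreasing = TRUE), ]
```

With the values printed above, both `top_by_count` and `top_by_amount` come out as sector code F. If a plot shows a different order, the conversion step is the first thing to check.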
Or we may want to get a bio of a congressperson. Here we get Todd Akin of MO. And some Twitter searching too? Indeed.
out <- nyt_cg_memberbioroles("A000358") # cool, lots of info, output cut off for brevity
out[[3]][[1]][1:2]
$member_id
[1] "A000358"
$first_name
[1] "Todd"
# we can get his twitter id from this bio, and search twitter using
# the twitteR package
akintwitter <- out[[3]][[1]]$twitter_id

# install.packages('twitteR')
library(twitteR)
tweets <- userTimeline(akintwitter, n = 100)
tweets[1:5] # there's some gems in there no doubt
[[1]]
[1] "RepToddAkin: Do you receive my Akin Alert e-newsletter? Pick the issues you’d like to get updates on and sign up here!\nhttp://t.co/nZfiRjTF"
[[2]]
[1] "RepToddAkin: If the 2001 & 2003 tax policies expire, taxes will increase over $4 trillion in the next 10 years. America can't afford it. #stopthetaxhike"
[[3]]
[1] "RepToddAkin: A govt agency's order shouldn't defy constitutional rights. I'm still working for #religiousfreedom and repealing the HHS mandate. #prolife"
[[4]]
[1] "RepToddAkin: I am a cosponsor of the bill being considered today to limit abortions in DC. RT if you agree! #prolife http://t.co/Mesrjl0w"
[[5]]
[1] "RepToddAkin: We need to #StopTheTaxHike. Raising taxes like the President wants would destroy more than 700,000 jobs. #4jobs http://t.co/KUTd0M7U"
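Once the tweets are in hand, a quick tally of hashtags gives a rough picture of what the account talks about. This is a minimal sketch in base R that works on tweet texts copied from the printed output above rather than on the `tweets` object itself; `tweet_text`, `tags`, and `tag_counts` are hypothetical names.

```r
# Tweet texts copied from the printed output above (tweets 2 to 5; the first
# printed tweet contains no hashtags and is left out)
tweet_text <- c(
  "If the 2001 & 2003 tax policies expire, taxes will increase over $4 trillion in the next 10 years. America can't afford it. #stopthetaxhike",
  "A govt agency's order shouldn't defy constitutional rights. I'm still working for #religiousfreedom and repealing the HHS mandate. #prolife",
  "I am a cosponsor of the bill being considered today to limit abortions in DC. RT if you agree! #prolife http://t.co/Mesrjl0w",
  "We need to #StopTheTaxHike. Raising taxes like the President wants would destroy more than 700,000 jobs. #4jobs http://t.co/KUTd0M7U"
)

# Pull every hashtag out of each tweet, then fold case so that
# #StopTheTaxHike and #stopthetaxhike count as the same tag
tags <- regmatches(tweet_text, gregexpr("#[[:alnum:]_]+", tweet_text))
tag_counts <- sort(table(tolower(unlist(tags))), decreasing = TRUE)
tag_counts
```

In these four tweets, #prolife and #stopthetaxhike each appear twice, and #religiousfreedom and #4jobs once each.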