Working across disciplines, university researchers pursue fresh perspectives
Colin Wilder, Matthew Brashears, John Rose sift through earliest books
Posted on: February 17, 2020; Updated on: February 17, 2020
By Chris Horn, chorn@mailbox.sc.edu, 803-777-3687
The whole is greater than the sum of its parts, and two heads are better than one. The clichés apply in a lot of arenas, but they ring particularly true in research, which is seldom a solo endeavor. Perhaps no one on campus knows that better than Prakash Nagarkatti, the University of South Carolina’s vice president for research.
When Nagarkatti became vice president for research in 2011, he encouraged faculty members to focus on challenges unique to the Palmetto State and to assemble university-wide teams to solve them.
To incentivize a transdisciplinary approach, Nagarkatti’s office established the ASPIRE II grant program, which offers internal grant funding of up to $100,000 for research proposals that include faculty members from three or more disciplines. In the eight years since the program launched, it has become quite popular, with some 50 proposals submitted every year, each one listing about five faculty members who want to collaborate. Together, the proposals represent about one-fourth of the university’s tenured and tenure-track faculty.
Since 2012, the Office of the Vice President for Research has invested $16.1 million in ASPIRE awards for faculty and postdoctoral scholars. In the same time period, ASPIRE recipients have garnered more than $171.2 million in subsequent extramural funding, including $71.8 million directly attributable to groundwork laid with an ASPIRE award.
Bookworms and broad questions
Colin Wilder admits the idea probably sounded audacious when he first conceived it — combing through millions of digital library records to explore three centuries of European book publishing.
But with an $85,000 ASPIRE II grant and a team that includes faculty from two other departments, plus students from three disciplines, Wilder’s audacious idea seems doable.
“The thing I most want to do is undertake a broad survey,” says Wilder, an assistant professor of history and associate director of the university’s Center for Digital Humanities. “[I want] to ask very broad questions — 300-year questions — like where were the centers of book publishing from 1500 to 1800? Where did they move from and to and at what times?
“I’d like to create a view of the mountains, a topography of book publishing in its first three centuries.”
John Rose, a computer science professor who specializes in data analytics, joined the project “since it looked like there would be some scope for interesting approaches to storing and efficiently querying the data.”
“As it turns out,” says Rose, “our target data source changed to an even richer data set than we had hoped to work with. From my data analytics perspective, having access to a larger data set than you’d planned on is fantastic.”
The data logistics are daunting, but the team has partnered with the Online Computer Library Center of Dublin, Ohio, one of the leading library and information science foundations in the world and creator of WorldCat. Wilder’s group has just received some 10 million files, each representing a book published between the 16th century and the beginning of the 19th.
Three computer science undergraduates are also part of the research team. They are focused on decluttering the dataset, eliminating duplicate files and preparing the records for the data mining envisioned by Wilder and Matthew Brashears, an associate professor of sociology and the third faculty partner.
“The main payoff of working on this project is the challenge of working with data on this scale, the sheer number of books published over this long span of time,” says Brashears, a quantitative social scientist. “Because we have such a long longitudinal sweep of data, we’re hoping to develop, down the road, an ability to predict when intellectual movements are dying and when circumstances are ripe for a new intellectual movement to arise.
“It’s possible we can detect patterns in all this data that would allow us to predict future shifts. If we can develop this technique with books, then those same techniques could be applied not just to historical data, retrospectively, but to things we can apply to data in a contemporaneous sense.”
If that sounds a little like social media analytics, which sorts through billions of tweets, Facebook entries and other social media posts to detect present trends, well, it is — sort of.
“What we’re proposing to do is like doing social media extraction backwards,” says Victoria Money, a graduate student in sociology who serves on the research team. “Instead of searching for certain terms, this will be like letting it come to us and seeing what emerges from the data. It will be a little bit more inductive than deductive.”
Brashears says mining the centuries-old data also has a distinct advantage over mining today’s social media.
“One advantage we have going back and looking at this older data relative to a social media approach is that there are no bots generating false data, no elaborate botnets sock puppeting,” he says. “What you see in social media is not what you get. So, with our dataset, we won’t be tricked by that background noise.”
For Samyu Comandur, a computer science and statistics junior, the interdisciplinary aspect of the research team is crucial. “When we look at a record of library data, we could probably handle it on our own, but in some cases we have no idea of the context,” Comandur says. “I’m very grateful for the interdisciplinary nature of this work and to be at the Center for Digital Humanities.”
Other students involved in the project include computer science junior Vasco Madrid and Clay Norris, a computer science senior.