Throughout the last year, I have undertaken a series of small, independent projects in order to learn more about digital humanities tools and techniques.
- Designing a WordPress site
- Designing a Scalar site
- Designing an Omeka site
- Designing a Neatline map
- Designing a StoryMap JS
- Designing a Timeline JS for the Five Colleges of Ohio
- Testing out Voyant with Jude the Obscure and some other random texts
- Testing out my ability to use R to create a Structural Topic Model of Jude the Obscure (whatever that means)
- Testing out the Topic Modeling GUI with Jude the Obscure
- Setting up an OHMS account and learning how to use it
- Creating an alternative version of OHMS on WordPress
- Creating an Illustrated Guide to Doing Oral History
- Designing a Scalar infographic
- Creating a list of DH undergraduate majors and minors
- Maintaining a list of everything I’ve read relating to digital scholarship
- Writing and presenting an opinion piece at the Bucknell University Digital Scholarship Conference
- Learning the very basics of HTML, CSS, and jQuery
- Researching Blockchain and Bitcoin to understand their role in digital society
- Recording and editing videos about some of the projects undertaken on the Five Colleges’ campuses
- Creating an interactive version of a text using jQuery on Scalar
- Designing and executing a research project about free speech on the five college campuses
Instead, I’m going to talk about that free speech research project. What ever happened to that? Why did that go out the window?
A short reminder for those of you who don’t know: I designed a research project in which I would interview students around different college campuses about their opinions on free speech, and then I would transcribe those interviews, analyze them using a topic model, and create some data visualizations from these results. The goal was to determine how students from different political backgrounds talked about free speech on college campuses. What went awry?
The short answer is that I simply didn’t have enough time to work on it once I started helping faculty members with their projects. I put this project on the back burner, and by the time I got back around to it, I didn’t have all of the skills I needed to do what I wanted with the data.
The long answer is as follows:
I knew going into this project that it would require a lot of time to complete. In fact, I was kind of counting on that. When I started it in the summer, I did not have any foreseeable projects to work on aside from my own, even once the school year started back up again in the fall. There were a few faculty proposals floating around, but nothing concrete that would require my constant attention. So I designed a research project that would take up a lot of my time, knowing that if a big faculty project did come in, I would probably have to spend my time on that instead. That was okay with me. I had never designed a research project before, so doing that part alone, even without executing the project, would be a learning opportunity.
I also purposefully chose topic modeling as my method of analysis because I knew it hadn’t been frequently used in the projects conducted at the Five Colleges of Ohio. That’s my whole job: trying out new things and testing different digital humanities tools and techniques. I also chose topic modeling because it was so frequently mentioned in all of the DH books and articles I was reading (see reading list). Starting out, I think I thought there would be a “lite” version of topic modeling that I could do. I’ve chronicled my struggles with topic modeling in the past, so I’m not going to entirely rehash that now, but here are the main things to know:
- Voyant is helpful, but in all of my experience, it’s only helpful when I know what I’m hoping to see, and when I know the texts that I’m using extremely well. Maybe I’m just using it wrong or maybe I don’t know what I’m not seeing, but I didn’t see Voyant as a viable option for analyzing the data I was going to collect.
- The Topic Modeling GUI is a very cool tool and sometimes I’m able to get the results I’m looking for out of it, but that’s just it. I’m not sure how to use the GUI without manipulating the results to make something I’m hoping to see. That’s not very in line with the scientific method, is it? Besides, the more I interact with it, the less I understand it. I have tried using all kinds of different texts to find topics, and I have guessed and checked with every logical combination of parameters I know, and I still don’t know what that thing does. I’m 98% sure Willy Wonka had a hand in making it.
- There are tutorials on creating topic models from scratch (no GUI, no Voyant). I’ve only ever been able to follow one of these, and that was the STM one that I mentioned in my other blog post (see link above).
- Even when I did succeed in following one of these tutorials, I didn’t understand what I had made, which isn’t really helpful from a research perspective.
What you need to know is that topic modeling is not for the layman – or at least not for this laywoman. You have to know how to code and/or you have to know a lot about statistics. I’m sure there are people out there who know a lot more about this than me who might think there are ways someone without those skills could do topic modeling. If you’re one of those people, great! Comment below and tell me how to figure this out.
Needless to say, by the time I had collected a few interviews and read everything I could find about topic modeling – well, let’s be fair, I couldn’t read some of those resources because I had no idea what was being said. To quote The Good Place,
Chidi: Aren’t there some parts worth salvaging?
Michael: Honestly, man, I don’t even know. I mean that thing is unreadable. I literally learned what headaches were because that thing gave me a headache.
“Tahani Al Jamil.” The Good Place. NBC. 22 Sep. 2016. Television.
By the time I’d done this, we were nearing the end of the year, and I knew I wouldn’t be able to complete this project with any semblance of statistical validity, follow the scientific method, or produce a functional result of any kind. It was at this point that I abandoned ship.
Thankfully, that’s not the end of this blog post. I still learned a lot from this project, even if it didn’t result in any actual findings. Here are some of the things I got out of doing this project:
First, I learned that there are some things I can’t learn on my own. I talked a lot about this in my Bucknell presentation, so I won’t repeat my entire argument here. However, I will reiterate that much of what we do in the liberal arts is “learn how to learn.” Until this year, I had really overlooked part of that process. I had never struggled to figure out what it was that I was unable to understand. Most of the time, especially throughout my college career, I could point at something and say “I do not understand that thing” and the next steps would easily follow. If I didn’t know how to use a formula in calculus, I would ask to be walked through the proof again. If I didn’t know how to start a short story, I would read a bunch of short stories and steal from one of them (haha, but also, yes, I really did this). I knew how to learn something once I had identified what I did not know and I had rarely struggled to identify those gaps in my knowledge. This project presented a new challenge. How do you learn something when you don’t know what it is you need to learn? The reality is that even as the internet expands and more and more people start doing digital humanities work, nothing can replace interaction with another person. This project taught me new ways to learn and showed me my limitations. That’s a really important thing to know.
Second, I came to appreciate the value of an interview in and of itself even more with this project. I had a couple of wonderful participants who shared interesting and well-thought-out ideas about free speech on campus. I found that their insights were a great contribution to my own understanding of students’ thoughts on this issue.
Finally, I gained a lot of skills. I learned how to:
- Construct a research project (and maybe more importantly, what to avoid when doing so)
- Protect sensitive data and interviewees’ privacy
- Communicate a research and data storage plan to others
- Get through IRB and use ethical research techniques
- Make consent and copyright forms
- Use plugin charts and graphs in Power BI (my chosen medium of data visualization)
- Use Trint, GarageBand, and condenser microphones
To look at this project and see it as a total loss or even a loss at all is a mistake. Sure, there were the sunk costs of time and effort, but that’s what this position is all about. It’s about learning how to learn something. To be completely cheesy and not at all original, we learn more from our failures than from our successes. This position is all about trial and error. It’s iterative. It’s like trying to make your jQuery work when you don’t know jQuery. Eventually, you’ll get it right, but it’s going to take some time. I just didn’t have enough time to try another iteration of this project. The great thing is that next time I want to start a project like this one, I’ll have a head start. (Although maybe next time I’ll leave the topic modeling to the pros.)