Building knowledge tools for the public good

Like many of you, I am interested in learning that extends beyond the teaching & learning that occurs within formalized educational institutions, which is why I am so interested in Wikipedia. I think Wikipedia is, arguably, the greatest knowledge repository human beings have ever built, which is why I get so excited when I see projects from academics that make meaningful contributions to it. Making Wikipedia better makes the world better by making knowledge more accessible to everyone. One such project is Visualizing Complex Science (found via the “Read-only access is not enough” blog post on Creative Commons).

The Visualizing Complex Science project was done by Dr. Daniel Mietchen, a Berlin-based researcher & biophysicist. Dr. Mietchen created a bot that crawls open access science journals looking for multimedia content. When the bot finds an image, video or audio clip, it extracts the content & uploads it to Wikimedia Commons, where it can be used by Wikimedia authors to enhance articles.

The bot has uploaded more than 13,000 files to Wikimedia Commons, and those files have been used in more than 135 English Wikipedia articles that together garnered more than three million views.

Beyond the project itself, what I find interesting is deconstructing all the conditions that had to exist in order for it to happen. For me, the recipe for this specific project breaks down to this:

Academic Researcher + Wikipedia + Open Access + hackathon + structured data = jackpot win for human knowledge.

Dissecting this equation a bit, we have an academic researcher who “gets” Wikipedia on a couple of levels. First, he feels it has enough value and importance as a knowledge repository that he is willing to put time into making it better. Second, he understands the technical aspects of the platform well enough that he can build something that massively improves the collection. Finally, he understands that Wikipedia has a massive reach & is a great tool to disseminate complex scientific research in a manner that makes it accessible to everyone. Wikipedia needs more academics like Dr. Mietchen.

Then we have Wikipedia itself, imbued with the value of open on a number of levels. First, it is open to contributions from anyone. If Wikipedia did not allow anyone to contribute, Dr. Mietchen might very well have had to jump through many bureaucratic layers to make a contribution. Second, those who built the software for Wikipedia made the platform open enough that people like Dr. Mietchen could build bots capable of doing projects like this.

The next critical piece is Open Access. Without openly licensed and openly accessible research articles, the bot wouldn’t have any data to mine. And even if it could technically mine proprietary research journals, their content could not legally be shared to the Wikimedia Commons because it would be protected from reuse by copyright.

Now, there are a few things in that equation that seem especially interesting. First, the hackathon. What role did a hackathon have in the success of this project? Well, when you listen to Dr. Mietchen talk about the project, you’ll hear him explain how he was inspired to create the automated Wikipedia bot after attending hackathons and seeing what programmers could do in a short period of time.

The other bit I find interesting is the role that structured data (everybody’s favorite sexy topic) played in making this happen. Without structured metadata explaining to the machines what that content is, whether it is in the correct technical format, and categorizing it correctly in the Wikimedia Commons, the bot just wouldn’t work.
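
To make the structured data point a bit more concrete, here is a minimal Python sketch of the kind of checks that metadata makes possible. This is not Dr. Mietchen’s actual bot. The record fields, the license list, the format list and the function names are all assumptions I made up for illustration.

```python
from dataclasses import dataclass

# Assumed, illustrative values: NOT the real bot's configuration.
OPEN_LICENSES = {"CC BY", "CC BY-SA", "CC0"}
SUPPORTED_FORMATS = {"image/png", "image/jpeg", "video/ogg", "audio/ogg"}


@dataclass
class MediaRecord:
    """Hypothetical structured metadata for one file attached to an article."""
    title: str
    license: str
    mime_type: str
    keywords: tuple


def is_uploadable(record):
    """True only if the metadata says the file is openly licensed
    and in a format on our assumed supported list."""
    return (record.license in OPEN_LICENSES
            and record.mime_type in SUPPORTED_FORMATS)


def suggest_categories(record):
    """Turn the article keywords into candidate category names."""
    return [kw.strip().title() for kw in record.keywords if kw.strip()]


# Made-up sample records, just to exercise the checks.
records = [
    MediaRecord("Cell division video", "CC BY", "video/ogg",
                ("cell division", "mitosis")),
    MediaRecord("Paywalled figure", "All rights reserved", "image/png",
                ("neurons",)),
    MediaRecord("Odd format clip", "CC BY", "video/x-unknown",
                ("proteins",)),
]

uploadable = [r for r in records if is_uploadable(r)]
for r in uploadable:
    print(r.title, "->", suggest_categories(r))
```

Without the license and format fields, none of these decisions could be made by a machine; someone would have to open every file and check by hand.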


I think it is important to point out that these conditions were not put in place to make this project happen; the project happened because these conditions were already in place. It’s a crucial distinction, and a common story worth repeating when it comes to working with technology. It points to the importance of generativity in both Wikipedia and Open Access.

Generativity is a system’s capacity to produce unanticipated change through unfiltered contributions from broad and varied audiences. Jonathan Zittrain, The Future of the Internet — And How To Stop It

Both Wikipedia and Open Access have high degrees of generativity. And because of that generativity, Dr. Mietchen was able to build a tool that neither could have anticipated when they were created. I am sure that the architects of both Wikipedia and Open Access hoped that projects like this would happen. But neither knew that they would. Instead, they built in the capacity to enable projects like this to emerge from the community. And, as a result, improve knowledge for all. 


Clint Lalonde

Just a guy writing some stuff, mostly for me these days on this particular blog. For my EdTech/OpenEd stuff, check out


5 thoughts on “Building knowledge tools for the public good”

  1. I had a colleague send a Wikipedia article to another colleague, who accused her of being unprofessional and unintelligent for using Wikipedia!! I was shocked.

    Back to OERs, I am doing a review now for a class that uses an OER. It’s horrible. One problem is the teacher is teaching APA when she should be teaching MLA. The information on how to conduct research is scant at best. I don’t know whether the instructor chose her own OER or if the institution did, but I am not impressed.


  2. Most professors I know expressly prohibit students from using Wikipedia as a source because of credibility issues. Your thoughts?

    1. I generally agree with that. But as another avenue to help a student understand a topic, it can be another powerful source, perhaps even adding context the instructor hadn’t provided that helps the student understand the material better.

      What I would ultimately like to see is those same faculty who don’t allow students to use Wikipedia become part of the Wikipedia community and steward those resources that they don’t allow their students to use. Because whether or not you officially sanction the use of Wikipedia in the classroom, your students are using it as a source. Maybe not a citable source in term papers, but as a source nonetheless. So, why not contribute your expertise? Ultimately your students will benefit from it because they are looking there anyway.
