I just came across the MIT Lecture Browser and am a bit smitten.
Essentially, it’s a combination speech-to-text converter and search engine for video lectures at MIT. Enter a word and the search engine will not only find the videos the word is used in, but will also take you to the exact spot within the video where that word was spoken and give you a running transcript.
With more than 100 million videos online and another 100,000 being uploaded each day, there is an awful lot of great content that is, for the most part, hidden from search engines that search nothing but tags, keywords and descriptions.
I imagine this kind of search will be much more common in the next couple of years – in fact, there are already some options emerging in this area.
The speech-to-text recognition isn’t perfect. I used the example search term “wine”. One of the videos returned was from Nicholas Negroponte, talking about the hundred-dollar laptop at the 2005 MIT Emerging Technologies Conference. I was intrigued. Where in that address did Negroponte talk about wine? Well, here’s the text quote:
show you a few slides so wine doubt laptop it does you could that that creek slips out and see sells word eat else can slip and
And here is Negroponte’s actual quote:
show you a few slides. It’s a wind up laptop. (bit of a stumble) that crank slips out and c cells or d cells could slip in
So the speech to text is a work in progress.
But besides the technology, the ability to access and search the MIT lecture library is also very cool. However, it looks like there isn’t a heck of a lot of content there yet. Do a search for “television” in the category “Media” and you get only a single video returned. I suspect the word “television” might be used in a few more classes than that in a Media Studies program.
Maybe part of the reason I am smitten is that it reminded me of a project I was working on about 5 years ago. We have a large collection of digitized audio, and I was trying to build a web-based application using an XML-based language called SMIL that would do something similar with audio clips. That project eventually died, and I was a bit sad that SMIL never really went mainstream, despite being a W3C technology. It looks like the MIT site uses SMIL in some way, and that brought back some warm fuzzies of my bygone project.
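For anyone who never ran into it, SMIL is just XML for describing how pieces of media and text play together in time. Purely as an illustration of the flavour (the file names and timings here are made up, not from my old project or the MIT site), a minimal SMIL 1.0 fragment that plays a slice of a lecture recording alongside its transcript might look roughly like this:

<smil>
  <body>
    <par>
      <!-- play a 15-second slice of the lecture audio -->
      <audio src="lecture042.mp3" clip-begin="npt=120s" clip-end="npt=135s"/>
      <!-- ...while the matching chunk of the transcript is displayed -->
      <text src="transcript-chunk12.html" dur="15s"/>
    </par>
  </body>
</smil>

The <par> element does the interesting work: everything inside it runs in parallel, so the audio and the text stay lined up without any scripting.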
Really, how can you not love a markup language called SMIL?
Funny timing… in Tuesday’s episode of Spark on CBC Radio, Nora Young speaks to Jim Glass about the MIT Lecture Browser.
http://www.cbc.ca/spark/blog/2008/01/show_notes_january_30_2_2008.html
Ah, cell phones. I never even thought of that, but yeah, it’s an obvious fit.
Thanks for this, great find. Speech recognition is going to play an increasing role in many interfaces (think cell phones) over the next few years, and this is a great example of a related benefit.