Every semester I try to work with some students at UTD by facilitating a ‘capstone’ project. It’s another dimension of my support for STEM education. Yesterday, they gave their presentation to their professor and class.
This semester the project was creating an Android-based speech recognition solution to facilitate a Voice-based Inspection and Evaluation Framework. We shied away from using Google’s speech recognition because we wanted offline capability as well as enhanced security and privacy. Meeting this requirement was one of the first issues the team had to conquer.
They were able to identify and implement an open-source library, PocketSphinx, to provide the speech recognition. They also used Android’s android.speech.tts package for text-to-speech interaction with the user.
The team created a visual programming environment to graphically define a flowchart and export it to an XML file that the mobile device used to drive the inspection process. The mobile application could store a number of these flowcharts for later use.
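To make the idea concrete, an exported inspection flowchart might look something like the sketch below. The element and attribute names here are my own invention for illustration, not the team’s actual schema:

```xml
<!-- Hypothetical flowchart export; element/attribute names are illustrative only. -->
<inspection name="Vehicle Walkaround">
  <!-- Each step names its question type and where to branch next. -->
  <step id="1" type="yesno" prompt="Is the windshield free of cracks?"
        onYes="2" onNo="3"/>
  <step id="2" type="number" prompt="State the tread depth in millimeters."
        min="0" max="20" next="4"/>
  <step id="3" type="audio" prompt="Describe the damage." next="4"/>
  <step id="4" type="list" prompt="Which state is the vehicle registered in?">
    <option>Texas</option>
    <option>Oklahoma</option>
  </step>
</inspection>
```

A format along these lines lets the mobile app interpret each stored flowchart at run time, prompting with text-to-speech and branching on the recognized response.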
The end product was able to handle a range of speech recognition needs:
- Yes/no
- Answer from a list of valid responses (e.g., U.S. states)
- Answer with a number (range-checked)
- Free form sound capture
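The first three response types above boil down to validating a recognized utterance before moving to the next step. Here is a minimal sketch of that validation logic in plain Java; the method names and the accepted word lists are my own assumptions, not the team’s code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of validating recognized utterances for three
// of the question types: yes/no, list-of-valid-responses, and number.
public class ResponseValidator {

    // Yes/no: accept a small set of affirmative/negative words.
    // Returns null for an unrecognized answer so the app can re-prompt.
    public static Boolean parseYesNo(String utterance) {
        String s = utterance.trim().toLowerCase(Locale.ROOT);
        if (s.equals("yes") || s.equals("yeah")) return Boolean.TRUE;
        if (s.equals("no") || s.equals("nope")) return Boolean.FALSE;
        return null;
    }

    // List question: the utterance must match one of the valid responses
    // (case-insensitive); returns the canonical form, or null if no match.
    public static String matchFromList(String utterance, List<String> valid) {
        String s = utterance.trim().toLowerCase(Locale.ROOT);
        for (String v : valid) {
            if (v.toLowerCase(Locale.ROOT).equals(s)) return v;
        }
        return null;
    }

    // Numeric question: parse the digits and range-check the result;
    // null means out of range or not a number, so the app re-prompts.
    public static Integer parseNumber(String utterance, int min, int max) {
        try {
            int n = Integer.parseInt(utterance.trim());
            return (n >= min && n <= max) ? n : null;
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseYesNo("Yes"));
        System.out.println(matchFromList("texas", Arrays.asList("Texas", "Ohio")));
        System.out.println(parseNumber("42", 0, 100));
        System.out.println(parseNumber("250", 0, 100));
    }
}
```

The free-form case needs no validation at all: the raw audio is simply captured and attached to the inspection record.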
Overall, I was very impressed with what these students were able to accomplish during the semester and with the quality of the Software Life Cycle work products they produced. Since we didn’t know exactly what they would be able to accomplish, they used a modified agile approach; they still had to produce the work products required for the class on a predefined timetable. We incorporated the idea of designing specific sprints around producing those work products, as well as around the typical need to define, document, and validate requirements.
I started the project while working at HP and Dave Gibson and Cliff Wilke helped facilitate it to the end (they are still with HP).