Capgemini Machine Vision

As part of my MSc dissertation, I collaborated with a team from Capgemini on their “TechChallenge 2020” hackathon, which focused on solutions for people with hidden disabilities. Our team, “The Visionaries,” set out to develop a screen reader powered by machine learning. My role was to research and select the most suitable machine learning service and then build the proof of concept.

I evaluated several options, including Google Cloud Vision, Microsoft Azure Cognitive Services, and Amazon Web Services (AWS). I ultimately chose AWS because its Textract API provided the most accurate and reliable text extraction for our use case.
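To give a sense of what text extraction with Textract looks like, here is a minimal sketch in Python. It is not the code from the hackathon prototype. It assumes the `boto3` SDK, configured AWS credentials, and Textract's synchronous `detect_document_text` call; the helper names `extract_lines` and `read_screenshot`, the `min_confidence` parameter, and the `eu-west-2` region are my own choices for illustration.

```python
from typing import Any, Dict, List


def extract_lines(response: Dict[str, Any], min_confidence: float = 0.0) -> List[str]:
    """Collect the text of every LINE block in a Textract response.

    Textract returns a flat list of blocks (PAGE, LINE, WORD, ...); a screen
    reader mostly cares about LINE blocks, kept here in response order.
    """
    lines: List[str] = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue
        # Skip lines below the (hypothetical) confidence threshold.
        if block.get("Confidence", 0.0) >= min_confidence:
            lines.append(block.get("Text", ""))
    return lines


def read_screenshot(image_bytes: bytes, region_name: str = "eu-west-2") -> List[str]:
    """Send raw image bytes to Textract and return the detected lines.

    Requires boto3 and valid AWS credentials; the region is an assumption.
    """
    import boto3  # imported lazily so extract_lines works without the SDK

    client = boto3.client("textract", region_name=region_name)
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return extract_lines(response)
```

Keeping the parsing in `extract_lines` separate from the network call means it can be tried on a saved response without touching AWS; the resulting lines could then be passed to a text-to-speech step.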

AWS Machine Vision

The project was a great success. We developed a working prototype that accurately extracted text from images and screenshots, and we were awarded 3rd place out of 40 competing teams. The project also formed the basis of my MSc dissertation, for which I received a Distinction Star, the highest attainable grade.