Southern Utah University Junior/Senior EDGE Project (2019)
Junior/Senior Year EDGE Project (SUU 2019): Photo and Voice Recognition software intended to be a very basic Artificial Intelligence with basic Machine Learning.
The software is a Visual Studio project written in C#. It uses a microphone and camera to learn people's faces and names. Once a person is added, it will attempt to recognize that individual the next time it sees them.
This software is currently only designed for Windows systems, specifically Windows 10 with .NET.
Clone or fork tybayn/edge-project to your local machine. Extract the files if necessary.
This software was built in the Microsoft Visual Studio Community 2019 IDE, and is intended to be compiled in the same IDE. It will likely work with newer versions of the same software.
Opening "tybaynEDGEproject.sln" will open the entire solution.
Many parts of this software are linked to or dependent on other libraries and source code. Any source code that is not compatible with the MIT license directly has been excluded from this repository and must be downloaded/included separately.
Any and all external libraries are owned and copyrighted by their respective authors and licensed under their respective licenses and ARE NOT included under the MIT license of this repository. They are used in an unaltered state and simply linked to.
The following libraries can simply be added using Visual Studio's NuGet Package manager:
The following are not available on NuGet and the .dll should be downloaded and linked within Visual Studio:
The "WebCam_Capture.dll" library is licensed under MIT and is included. It is in "tybaynEDGEproject/lib/WebCam_Capture.dll" and just needs to be linked within Visual Studio.
PLEASE READ THE "NOTICE" DOCUMENT FOR MORE INFORMATION ON THE EXTERNAL LIBRARIES REFERENCED AND THEIR LICENSING.
Ensure that the following file structure is maintained as the project is downloaded:
tybaynEDGEproject.sln
tybaynEDGEproject/
|   App.config
|   AudioCompareHandler.cs
|   AudioHandler.cs
|   FileHandler.cs
|   Form1.cs
|   Form1.Designer.cs
|   Form1.resx
|   Helper.cs
|   PitchShift.cs
|   Program.cs
|   tybaynEDGEPproject.csproj
|   VideoCompareHandler.cs
|   VideoHandler.cs
|   WaveFormVisualizer.cs
|   WebCam.cs
|
└── bin/
|   └── data/
|   |   └── resources/
|   |   |   faceOverlay.png
|   |   |   ...
|   |   ...
|
└── lib/
|   WebCam_Capture.dll
|   ...
This project is intended to be run with Visual Studio. To run the software, open the 'tybaynEDGEproject.sln' in Visual Studio and run the project from there.
By using this software you confirm that you have read this document, the NOTICES file pertaining to external and excluded libraries, and the LICENSE document and agree to all the terms within.
This next section contains the documents and research done for the project as required by the University (literally, all of it). So unless you enjoy walls of text and what essentially amounts to journal entries, you are good to leave at this point!
For this project I will design, create, and code an Artificial Intelligence (AI) written in C# (.NET Framework). This artificial intelligence will be able to record a person's face and voice and remember them. At a later time it will be able to recall that data and compare it to a live feed of a person's face and voice to see if the two sources match.
This project encompasses ideas of self-learning and the ability of a computer to distinguish valid input from invalid input. To see the full project description, all the steps along the way, the final product, and my reflections about the project, see the sections below.
For my EDGE project I will design, create, and code an Artificial Intelligence (AI) written in C#. This artificial intelligence will be able to record a person's face and voice and remember them. At a later time it will be able to recall that data and compare it to a live feed of a person's face and voice to see if the two sources match. This AI will allow for the personal discovery of how data is represented and stored in a computer, how to compare audio and video as they are represented in a computer, and how AIs can be used in different scenarios and for different purposes. I plan to build this AI to prove to myself that I have the ability to learn and push myself further than what is required, and to be able to use this as a standout point on my resume. It will also give me experience programming something that is greatly needed in the working world but not widely taught in universities.
I have chosen to build this project to give me experience in writing, coding, and creating an Artificial Intelligence. It will allow me to gain an understanding of how computers view live images and digital conversions of analog audio and how to compare them.
For my EDGE project I will create an Interactive Artificial Intelligence that will correctly identify and recognize faces and voices (video and audio). The AI will take a live audio feed from a microphone and a live video feed from a webcam and compare them to stored audio and video files. The AI will be able to learn new people and recognize them later.
I will deliver a zip file that contains the actual AI software executable files alongside all the needed dependency files and “memory” files. I will also provide the paperwork with the design process and any subprograms used for testing functionality of parts of the software.
Artificial Intelligence is not a subject that is taught at SUU or many other universities, but the ideas behind it are greatly needed. By creating an AI I will gain a deeper understanding of a rapidly expanding field that companies of all types are starting to look toward to optimize and accelerate processes. With this project I will be able to include AI creation on my resume and have a starting point when I need to create other AIs in the future.
This section simply depicts the expected timeline and design requirements. The actual timeline and design documents are outlined below.
Audio Comparator
Compare Audio {
    - Will receive two samples and will use the .NET library to compare the waveforms
    - Returns a double from 0 - 1 representing the percentage of how similar they are
}
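As a hedged illustration of the Compare Audio idea (not the method actually used in the project's AudioCompareHandler.cs), a naive 0 - 1 similarity can be computed as the cosine similarity of two raw sample buffers. The class and method names below are placeholders.

using System;

// Naive 0 - 1 similarity between two audio sample buffers, computed as the
// cosine similarity of the raw waveforms. A placeholder technique only.
public static class AudioCompareSketch
{
    public static double CompareAudio(float[] a, float[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        if (n == 0) return 0.0;

        double dot = 0, magA = 0, magB = 0;
        for (int i = 0; i < n; i++)
        {
            dot += a[i] * b[i];
            magA += a[i] * a[i];
            magB += b[i] * b[i];
        }
        if (magA == 0 || magB == 0) return 0.0;

        // Cosine similarity is -1..1; clamp negatives so the result reads as a
        // 0 - 1 "percentage" the way the design notes describe it.
        double similarity = dot / (Math.Sqrt(magA) * Math.Sqrt(magB));
        return Math.Max(0.0, similarity);
    }
}

A more robust comparator would align and normalize the clips first; this sketch only shows the shape of the interface the design calls for.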
Audio Handler
Initialize Microphone {
    - Call existing library to start live feed
    - Put data into Global Variable
}
Adjust {
    - Receives volume to adjust the microphone object
}
Display Thread {
    - While(true) {
        - Get audio from microphone
        - Display waveform to gui display
        - Refresh gui
    }
}
Get Sample {
    - Get audio from microphone
    - Store in temporary memory
    - Return audio (or audio address)
}
Start New User {
    - Display sentence to screen
    - Get audio from microphone
    - Return audio (or audio address)
}
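The design only says "call existing library" for the live feed. As one hedged example, assuming the NAudio NuGet package were the audio library (the project may well use a different one), a minimal "Initialize Microphone" / "Get Sample" sketch could look like this; all names are illustrative and not taken from AudioHandler.cs.

using System;
using System.Collections.Generic;
using NAudio.Wave; // assumption: NAudio from NuGet; the project may use another library

// Minimal microphone capture: start a live feed and buffer incoming 16-bit PCM
// so a sample can be handed to the comparator later.
public class MicrophoneSketch
{
    private readonly WaveInEvent waveIn;
    private readonly List<byte> buffer = new List<byte>();

    public MicrophoneSketch()
    {
        // 44.1 kHz mono, 16-bit by default; the real handler may differ.
        waveIn = new WaveInEvent { WaveFormat = new WaveFormat(44100, 1) };
        waveIn.DataAvailable += (s, e) =>
        {
            lock (buffer)
            {
                for (int i = 0; i < e.BytesRecorded; i++)
                    buffer.Add(e.Buffer[i]);
            }
        };
    }

    public void Start() => waveIn.StartRecording();
    public void Stop() => waveIn.StopRecording();

    // "Get Sample": convert whatever has been buffered to floats in -1..1
    // and clear the buffer.
    public float[] GetSample()
    {
        byte[] raw;
        lock (buffer)
        {
            raw = buffer.ToArray();
            buffer.Clear();
        }
        float[] samples = new float[raw.Length / 2];
        for (int i = 0; i < samples.Length; i++)
            samples[i] = BitConverter.ToInt16(raw, i * 2) / 32768f;
        return samples;
    }
}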
Data Handler
Constructor {
    - Create Video Comparator
    - Create Audio Comparator
}
Store Image {
    - Receives image
    - Stores image data (maybe with tree?)
    - Associate image with name
}
Store Audio {
    - Receives sample
    - Stores sample data
    - Associate sample with name
}
Compare Video {
    - Passes image to Video Comparator
    - Returns true or false
}
Compare Audio {
    - Passes sample to Audio Comparator
    - Returns true or false
}
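A minimal sketch of the Data Handler idea: in-memory dictionaries keyed by name, with a threshold that turns the comparators' 0 - 1 scores into a yes/no match. The dictionaries, the 0.8 threshold, and all names are assumptions; the comparator delegates stand in for the Audio and Video Comparator sketches elsewhere in this document.

using System;
using System.Collections.Generic;
using System.Drawing; // System.Drawing is available on Windows / .NET Framework

// Holds stored faces and voice samples keyed by name and answers
// "which stored person, if any, matches this live input?"
public class DataHandlerSketch
{
    private readonly Dictionary<string, Bitmap> faces = new Dictionary<string, Bitmap>();
    private readonly Dictionary<string, float[]> voices = new Dictionary<string, float[]>();
    private readonly Func<Bitmap, Bitmap, double> compareVideo;
    private readonly Func<float[], float[], double> compareAudio;
    private const double Threshold = 0.8; // assumed cutoff, not a project value

    public DataHandlerSketch(Func<Bitmap, Bitmap, double> compareVideo,
                             Func<float[], float[], double> compareAudio)
    {
        this.compareVideo = compareVideo;
        this.compareAudio = compareAudio;
    }

    public void StoreImage(string name, Bitmap image) => faces[name] = image;
    public void StoreAudio(string name, float[] sample) => voices[name] = sample;

    // Return the first stored name whose face clears the threshold, or null.
    public string CompareVideo(Bitmap live)
    {
        foreach (var entry in faces)
            if (compareVideo(entry.Value, live) >= Threshold)
                return entry.Key;
        return null;
    }

    // Same idea for voices.
    public string CompareAudio(float[] live)
    {
        foreach (var entry in voices)
            if (compareAudio(entry.Value, live) >= Threshold)
                return entry.Key;
        return null;
    }
}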
Driver
Main {
    - Create Data Handler
    - Create Audio Handler
    - Create Video Handler
    - Start running Listening Thread
}
Listen Thread {
    - While(true) {
        - Get image from webcam
        - If(image contains a new form)
            - Prompt user to speak sentence or say name
            - Call Video Comparator to see if image matches
            - If(image is not found)
                - Goto Add New User
            - Else
                - Call Audio Comparator for audio matches
                - If(audio is not found)
                    - Goto Add New User
                - Else
                    - Get user name
                    - Prompt user if they are this person
                    - If(not the user)
                        - Goto Add New User
    }
}
Add New User {
    - Prompt user if they want to be added
    - If(they want to be added)
        - Prompt user for Name
        - Have user line up on camera, take photo
        - Have user say their name/sentence
        - Call Data Storage to store name, image, and audio
    - Return
}
Speech to Text {
    - Receives a string, passes it to the speech to text handler
}
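Below is a compressed, hypothetical sketch of the Listen Thread flow that wires together the sketches from the surrounding sections: grab a frame, try a face match, fall back to a voice match, and hand off to enrollment when neither succeeds. The actual driver lives in Program.cs / Form1.cs and, being a WinForms project, is presumably GUI-driven rather than a bare loop.

using System;
using System.Drawing;
using System.Threading;

// Ties the hypothetical sketches together; every name is illustrative.
public class DriverSketch
{
    private readonly DataHandlerSketch data;
    private readonly MicrophoneSketch mic;
    private volatile bool running = true;

    public DriverSketch(DataHandlerSketch data, MicrophoneSketch mic)
    {
        this.data = data;
        this.mic = mic;
    }

    // getFrame stands in for the Video Handler's "Get image";
    // addNewUser stands in for the "Add New User" enrollment flow.
    public void ListenLoop(Func<Bitmap> getFrame, Action<Bitmap> addNewUser)
    {
        mic.Start();
        while (running)
        {
            Bitmap frame = getFrame();
            string name = data.CompareVideo(frame);        // face match first

            if (name == null)
                name = data.CompareAudio(mic.GetSample()); // then voice

            if (name == null)
                addNewUser(frame);                         // enroll a new person

            Thread.Sleep(250); // pace the loop so the GUI stays responsive
        }
        mic.Stop();
    }

    public void Stop() => running = false;
}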
Speech to Text
Speak {
    - Receives a string and uses the .NET library to convert it to an audio clip
}
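The routine described here takes a string and produces spoken audio, which is text-to-speech rather than speech-to-text. On Windows, the .NET Framework provides this through System.Speech.Synthesis; a minimal sketch (class name is illustrative) follows.

using System.Speech.Synthesis; // requires a reference to System.Speech on .NET Framework

// Speak a string through the default audio device.
public static class SpeakSketch
{
    public static void Speak(string text)
    {
        using (var synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak(text); // blocks until the sentence has been spoken
        }
    }
}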
Video Comparator
Compare Video {
    - Will receive two photos and will use the .NET video compare library to compare them
    - Returns a double from 0 - 1 of the percentage value of how similar they are
}
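As a hedged stand-in for Compare Video (the project's VideoCompareHandler.cs may work entirely differently), a naive 0 - 1 similarity can be computed by downscaling both photos and averaging their per-pixel brightness differences with System.Drawing; all names below are placeholders.

using System;
using System.Drawing; // GDI+, available on Windows

// Naive photo similarity: scale to a common size, average the brightness
// difference, and invert so 1 means "most similar".
public static class VideoCompareSketch
{
    public static double CompareVideo(Bitmap first, Bitmap second)
    {
        const int size = 64; // downscale so small alignment differences matter less
        using (var a = new Bitmap(first, new Size(size, size)))
        using (var b = new Bitmap(second, new Size(size, size)))
        {
            double totalDiff = 0;
            for (int y = 0; y < size; y++)
            {
                for (int x = 0; x < size; x++)
                {
                    Color ca = a.GetPixel(x, y);
                    Color cb = b.GetPixel(x, y);
                    totalDiff += Math.Abs(ca.GetBrightness() - cb.GetBrightness());
                }
            }
            // Average difference is 0 (identical) to 1 (opposite).
            return 1.0 - totalDiff / (size * size);
        }
    }
}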
Video Handler
Initialize Camera {
    - Call existing library to start live feed
    - Put data into Global Variable
}
Adjust {
    - Reinitialize the camera to adjust for light and focus
}
Display Thread {
    - While(true) {
        - Get image from camera
        - Display image to gui display
        - Refresh gui
    }
}
Get Image {
    - Get image from camera
    - Store in temporary memory
    - Return image (or image address)
}
Start New User {
    - Place face template over camera image
    - Get image from camera
    - Return image
}
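A minimal sketch of the Display Thread idea using a WinForms Timer and PictureBox: poll a frame source and push each image into the GUI. The frame source is left as a delegate because the WebCam_Capture.dll API is not documented here; every name is a placeholder.

using System;
using System.Drawing;
using System.Windows.Forms;

// Poll frames at roughly 30 fps and show them in a PictureBox.
public class VideoDisplaySketch
{
    private readonly Timer timer = new Timer { Interval = 33 };

    public VideoDisplaySketch(PictureBox display, Func<Bitmap> getFrame)
    {
        timer.Tick += (s, e) =>
        {
            Bitmap frame = getFrame();  // "Get image from camera"
            if (frame != null)
                display.Image = frame;  // "Display image to gui display"
        };
    }

    public void Start() => timer.Start();
    public void Stop() => timer.Stop();
}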
GUI Design 1 (Wireframe)
GUI Design 2 (Wireframe)
05/06/2019
05/08/2019
05/13/2019
05/14/2019
05/15/2019
05/16/2019
05/20/2019
05/22/2019
05/23/2019
05/24/2019
05/26/2019
05/27/2019
05/29/2019
06/02/2019
The actual code and software project are located on my GitHub account: tybayn/edge-project
This was one of my first big projects that I did by myself with no guidance or requirements. It allowed me to branch out and figure out what I knew, what I was able to learn, and what I still need to work on. In the end, I am proud of the results and the software created.
I have found that this may not truly be an AI, but rather machine learning in its most basic form. I still have a lot to learn, but I found this project to be enjoyable.
Initial Release