# A.I.dan: ChatGPT Integrated Virtual Assistant

Project #25

TA: Hanyin Shao

Documents: design_document1.pdf, final_paper1.pdf, photo1.jpg, photo2.png, presentation1.pptx, proposal1.pdf
Team Members:
- Andrew Scott (ajscott5)
- Leonardo Garcia (lgarci91)
- Brahmteg Minhas (bminhas2)




# Problem
Current virtual assistants (Amazon's Alexa, Apple's Siri, etc.) all rely on conventional web search, primarily Google, as their main mechanism for answering questions posed to them. While they offer other functionality, such as integration with Amazon.com or Spotify, their core function is answering spoken questions through audio I/O. With the advent of ChatGPT, a plain search engine is an outdated information-gathering mechanism for this purpose and should be replaced within the virtual assistant space.
# Solution
Our solution combines the convenience of a virtual assistant with the power of ChatGPT to create a more capable home assistant for answering questions. We will use a speech-to-text module to convert the user's voice input to text. The interaction is initiated by a cue word such as "Hey A.I.dan": to ask a question, the user says the cue word and then speaks their question. Once they have stopped speaking, A.I.dan sends the transcribed message to ChatGPT and, when the response comes back, uses text-to-speech (TTS) to relay it to the user and also displays it on the screen.
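As a rough sketch of the PC-side question flow (assuming the legacy `openai` Python package and `pyttsx3` for text-to-speech; the speech-to-text step and the Wi-Fi transport are left out here), the core request loop might look like this:

```python
# Sketch only: model name, API usage, and TTS library are assumptions,
# not final design decisions.
import openai
import pyttsx3

openai.api_key = "YOUR_API_KEY"  # placeholder; load from a config file in practice

tts_engine = pyttsx3.init()

def answer_question(question: str) -> str:
    """Send an already-transcribed question to ChatGPT and speak the reply."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question}],
    )
    answer = response["choices"][0]["message"]["content"]
    tts_engine.say(answer)      # spoken reply for the user
    tts_engine.runAndWait()
    return answer               # also returned so it can be forwarded to the screen
```

In the real device, `answer_question` would be called with the output of our speech-to-text module, and the returned text would be sent back to the ESP32 for display.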




## Control Unit
The control unit uses an ESP32 microcontroller together with a Raspberry Pi RP2040. Software on the microcontrollers interfaces with the audio I/O and the screen, and communicates over Wi-Fi with a PC that handles the ChatGPT API as well as the speech-to-text and text-to-speech modules. The microcontroller also receives the information to be output to the screen and speaker from the PC.
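To illustrate the ESP32-to-PC link, here is a minimal sketch of the PC side accepting one streamed question, under the assumption of a plain TCP socket; the port number, framing, and the `handle_audio` stub are placeholders rather than a finalized protocol:

```python
# Sketch only: transport, framing, and port are assumptions for illustration.
import socket

HOST, PORT = "0.0.0.0", 5000    # placeholder port

def handle_audio(audio: bytes) -> bytes:
    # Placeholder: the real handler runs speech-to-text, queries ChatGPT,
    # and returns synthesized speech plus text for the screen.
    return b"OK"

def serve_one_question() -> None:
    """Accept one connection from the ESP32 and process the streamed audio."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen(1)
        conn, _addr = srv.accept()
        with conn:
            audio = bytearray()
            while True:
                chunk = conn.recv(4096)       # raw PCM frames from the microcontroller
                if not chunk:                 # socket closed = end of the question
                    break
                audio.extend(chunk)
            conn.sendall(handle_audio(bytes(audio)))  # reply goes back over the same socket
```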




## Audio I/O
The mechanism through which a user interacts with our device is their voice. To facilitate this, both a speaker and a microphone will be added to our PCB. Any post-processing needed to clean up the audio and improve recognition accuracy will also be done onboard. Audio input from the microphone goes to the RP2040 for wake-word detection. Once the wake word is detected, the microcontroller streams audio to a PC over Wi-Fi. When the PC returns the ChatGPT output, after it has been passed through the text-to-speech module, it is played through the speaker. A device-side sketch of this streaming step is shown below.
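The following is a MicroPython-style sketch of streaming audio after wake-word detection; this is only an assumption about the implementation (the firmware may well end up in C/C++), and the pin assignments, sample rate, PC address, fixed capture length, and the `wake_word_detected` stub are all placeholders:

```python
# MicroPython-style sketch; pins, rate, PC address, and capture length are placeholders.
import socket
from machine import I2S, Pin

PC_ADDR = ("192.168.1.100", 5000)   # placeholder address of the PC

mic = I2S(0, sck=Pin(32), ws=Pin(25), sd=Pin(33),
          mode=I2S.RX, bits=16, format=I2S.MONO, rate=16000, ibuf=8192)

def wake_word_detected() -> bool:
    # Placeholder for the RP2040 wake-word model.
    return False

def stream_question(blocks: int = 160) -> None:
    """After the wake word, forward microphone frames to the PC over Wi-Fi."""
    buf = bytearray(1024)
    sock = socket.socket()
    sock.connect(PC_ADDR)
    try:
        for _ in range(blocks):      # real firmware would stop on silence instead
            n = mic.readinto(buf)    # pull one block of PCM samples
            sock.send(buf[:n])       # ship it to the PC
    finally:
        sock.close()                 # closing the socket marks end of the question

while True:
    if wake_word_detected():
        stream_question()
```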
## Screen
Many of ChatGPT's outputs are not easily understood through an audio description alone. The best example is code segments, which are formatted as Markdown code blocks. To provide this functionality, a screen will be added externally to our assistant, connected to the PCB over SPI.
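As a hedged sketch of how a response could be split so that code blocks are routed to the screen while the remaining prose goes to TTS (the regex and the spoken placeholder phrase are assumptions):

```python
import re

def split_answer(answer: str) -> tuple[str, list[str]]:
    """Separate fenced Markdown code blocks (for the screen) from spoken prose (for TTS)."""
    code_blocks = re.findall(r"```(?:\w+)?\n(.*?)```", answer, flags=re.DOTALL)
    spoken = re.sub(r"```.*?```", " (code shown on screen) ", answer, flags=re.DOTALL)
    return spoken, code_blocks
```

The `spoken` string is what would be handed to TTS, while each entry in `code_blocks` would be rendered on the SPI screen.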
# Criterion For Success
To consider the project fully successful, at least 75% of attempted basic interactions should succeed. Basic interactions are questions composed entirely of words included in our pre-trained speech-to-text model.
Code (Markdown) as well as traditional text answers must be displayed or spoken correctly for a successful question. This can be verified by asking the same question to ChatGPT on a separate device and comparing the results.


# Resources
[Example of ESP32 to PC Audio Streaming](https://github.com/MinePro120/ESP32-Audio-Streamer)


[Example of PC to ESP32 Audio Streaming](https://www.hackster.io/julianfschroeter/stream-your-audio-on-the-esp32-2e4661)
