Project
# | Title | Team Members | TA | Documents | Sponsor |
---|---|---|---|---|---|
30 | Search and Identify | Ruidi Zhou, Shitian Yang, Yilai Liang, Yitao Cai | | design_document2.pdf, design_document3.pdf, proposal1.pdf, proposal2.pdf | Howard Yang |
# Team members:

Yang, Shitian (sy39); Cai, Yitao (yitaoc3); Zhou, Ruidi (ruidi2); Liang, Yilai (yilail2)

# Title:

Search and Identify

# Problem:

There is a noticeable gap in assistive applications for homes, businesses, and individuals with mobility challenges. These groups lack efficient tools for quickly locating everyday items, which is a significant inconvenience. The gap hampers day-to-day functioning in households and workplaces, and it is a considerable barrier to independence for people with physical disabilities. Addressing this need could improve quality of life and operational efficiency by ensuring that vital items can be found quickly, without unnecessary delay or reliance on others.

# Solution overview:

To identify a specific item from a user's spoken request, we are developing an interactive service robot. The robot carries a rotatable wireless camera mounted on a 360° steering engine. The steering engine is controlled by an STM32F103c8t6 microcontroller in the drivetrain and power system, so the camera can visually scan its surroundings when the robot receives the user's voice inquiry. When the software observes objects such as cups, pencil cases, flowers, toys, and helmets, it processes the images to create an attention map, which guides the robot to focus on and identify the requested object. These objects are easy to obtain and provide a suitable testing environment. Once the software component returns its object recognition feedback, the sensing system in the hardware component outputs a 0/1 signal through the signal control module and sends it to the 360° steering engine.

# Solution Component:

## Software Component:

1. Speech recognition: Transform audio instructions given by users into text tasks.
   Prompt key item recognition: Simplify the text task prompt into a keyword or phrase.
2. Vision model: The vision model takes in the text prompt and searches for the object that best matches the description.
3. Algorithm: Our software will use a GitHub project called AbsVit as the baseline. AbsVit is an algorithm and model for language-vision attention. We will remove the noise from its heatmap and try to modify it so that it isolates the target object in detail.

## Hardware Component:

1. Drivetrain and power system: Includes a 360° steering engine, a wireless camera, an STM32F103c8t6 microcontroller, a 12V power source, and a voltage converter. This system rotates the camera to capture pictures of its surroundings.
2. Control system: A PC loads the program onto the STM32F103c8t6 microcontroller, which controls the angle the camera rotates each time and the time interval between rotations.
3. Storing system: Includes an SD card that stores the pictures the camera has captured. After the software component finds the object in the latest picture, we can compare the earlier pictures with the picture that contains the object to verify the result.
4. Sensing system: Includes a signal control module. It converts the software component's output into high-level or low-level signals and feeds the resulting 0/1 signal to the drivetrain and power system, which uses it to decide whether to keep operating or stop.

# Criterion for Success

1. The system can identify objects and navigate in indoor spaces with varying lighting, from bright natural sunlight to dim artificial light, and with obstacles such as furniture and shelves.
2. The voice response system should be easy to use: it must respond in a timely manner and interact with the user in natural language, so users do not need to learn extra commands.
3. When a search request is received from the PC, the STM32F103c8t6 microcontroller should send the correct pulse signal so that the 360° steering engine stops automatically when the desired object is detected by the attached camera. The difference between the camera's direction and the actual direction of the desired object should be within 3°.
4. The steering engine can rotate uniformly, smoothly, and continuously when no commands are given, in a balanced, room-temperature environment.

# Distribution of work

Yang, Shitian and Cai, Yitao: Voice Recognition and Software Development. Responsible for developing and testing the voice recognition system.

Yang, Shitian and Zhou, Ruidi: Vision Module and Software Development. Focus on developing and testing the vision module for object identification.

Zhou, Ruidi and Liang, Yilai: Hardware and Microcontroller Development. Responsible for developing and testing the hardware components, including the steering engine and microcontroller.

Cai, Yitao and Liang, Yilai: Integration and Testing. Oversee the integration of software and hardware components and conduct comprehensive testing of the entire system.
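
The scan-and-stop behavior in the hardware description, together with the 3° accuracy target in the success criteria, can be illustrated with a small PC-side simulation. This is only a sketch under assumptions: `scan_for_object`, `make_fake_detector`, the 2° step size, and the detection window are hypothetical names and values chosen for illustration, and the `detect` callback stands in for the real camera capture plus vision model.

```python
from typing import Callable, Optional


def angular_error(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two headings, in degrees."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def scan_for_object(detect: Callable[[float], bool],
                    step_deg: float = 2.0) -> Optional[float]:
    """Step the camera through one full turn and stop on the first detection.

    detect(angle) plays the role of the 0/1 signal: True (1) means the
    object was found and the steering engine should stop.
    Returns the stop angle, or None if the full turn finds nothing.
    """
    steps = int(360.0 / step_deg)
    for i in range(steps):
        angle = i * step_deg
        if detect(angle):
            return angle  # signal 1: stop rotating
    return None  # signal stayed 0 for the whole turn


def make_fake_detector(target_deg: float,
                       window_deg: float) -> Callable[[float], bool]:
    """Hypothetical detector: fires when the camera points near the target."""
    return lambda angle: angular_error(angle, target_deg) <= window_deg


# Example: object at 91 degrees, detector fires within 1 degree of it.
stop_angle = scan_for_object(make_fake_detector(91.0, 1.0), step_deg=2.0)
print(stop_angle)                       # 90.0
print(angular_error(stop_angle, 91.0))  # 1.0, inside the 3 degree target
```

In this simplified model the worst-case pointing error is bounded by the detection window, so the step size and window would have to be chosen together to meet the 3° criterion on real hardware.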
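
The heatmap noise removal mentioned in the software component could take many forms. The sketch below uses plain relative thresholding followed by a bounding box, which is an assumption made for illustration and not AbsVit's own method. The function names, the 0.5 threshold, and the sample heatmap are hypothetical.

```python
from typing import List, Optional, Tuple

Heatmap = List[List[float]]


def denoise_heatmap(heatmap: Heatmap, rel_threshold: float = 0.5) -> Heatmap:
    """Zero out every cell weaker than rel_threshold times the peak value."""
    peak = max(max(row) for row in heatmap)
    cutoff = peak * rel_threshold
    return [[v if v >= cutoff else 0.0 for v in row] for row in heatmap]


def bounding_box(heatmap: Heatmap) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the nonzero cells, or None."""
    rows = [r for r, row in enumerate(heatmap) if any(v > 0 for v in row)]
    cols = [c for row in heatmap for c, v in enumerate(row) if v > 0]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Hypothetical 4x4 attention map with weak background responses.
raw = [
    [0.1, 0.0, 0.2, 0.0],
    [0.0, 0.8, 0.9, 0.1],
    [0.0, 0.7, 1.0, 0.0],
    [0.3, 0.0, 0.1, 0.0],
]
clean = denoise_heatmap(raw, rel_threshold=0.5)
box = bounding_box(clean)
print(box)  # (1, 1, 2, 2)
```

A fixed relative threshold is the simplest choice; it keeps only the strongest attention region but can drop a faintly attended target, so the cutoff would need tuning on real images.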
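
The prompt key item recognition step, which reduces a transcribed request to a keyword or phrase, can be approximated with a naive stop-word filter. This is a minimal sketch assuming English input; the `STOP_WORDS` set and the `extract_key_item` name are hypothetical, and a real system would likely need a more robust language model.

```python
import re

# Hypothetical list of filler words to strip from a spoken request.
STOP_WORDS = {
    "a", "an", "the", "my", "me", "i", "you", "is", "are", "can", "could",
    "please", "find", "search", "locate", "look", "for", "where", "need",
}


def extract_key_item(request: str) -> str:
    """Drop filler words and return the remaining words as the key phrase."""
    words = re.findall(r"[a-z0-9']+", request.lower())
    kept = [w for w in words if w not in STOP_WORDS]
    return " ".join(kept)


print(extract_key_item("Please find my pencil case"))  # pencil case
print(extract_key_item("Where is the red cup?"))       # red cup
```

The resulting phrase is what would be passed to the vision model as its text prompt.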