# Project 15: Automated Pour-over Coffee Machine with Imitation Learning

**Team Members:** Jie Wang, Jingyuan Huang, Rucheng Ke, William Qiu

**Sponsor:** Said Mikki

**Documents:** design_document3.pdf, final_paper2.pdf, final_paper3.pdf, photo1.jpg, proposal2.pdf
# RFA for Automated Pour-over Coffee Machine with Imitation Learning

# Problem

The art of pour-over coffee brewing, prized for the complex flavor and high quality it produces, depends heavily on the skill and experience of the barista. This reliance on craftsmanship leads to variability in coffee quality due to human inconsistency. It is also challenging for ordinary coffee enthusiasts to replicate professional barista techniques at home or in non-specialized settings.

# Solution Overview

We propose the development of **an intelligent Automated Pour-over Coffee Machine leveraging imitation learning algorithms**. This machine will mimic the techniques of professional baristas, ensuring consistency and high quality in every cup. The project involves designing a mechanical structure integrated with sensors and developing the accompanying software algorithms.

# Solution Components

## Component 1: Mechanical Design

- **Purpose:** To create a machine that can physically replicate the movements and precision of a barista.
- **Features:** An adjustable nozzle for water flow control, a mechanical arm for simulating hand movements, and a stable structure to house the coffee dripper.
- **Challenges:** Ensuring precise movement and durability of moving parts, and integrating the mechanical system with electronic controls for seamless operation.
- **Expectation:** Build a workable, fixed-position coffee machine first, then upgrade it iteratively (a flow-control sketch follows this list).
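
A minimal sketch of closed-loop control for the adjustable nozzle, assuming a flow sensor and a PWM-driven pump; `read_flow_ml_per_s` and `set_pump_pwm` are hypothetical placeholder interfaces and the PID gains are purely illustrative:

```python
import time

class FlowPID:
    """Simple PID controller for tracking a target water flow rate."""

    def __init__(self, kp=0.8, ki=0.2, kd=0.05):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, target, measured, dt):
        error = target - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        out = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp the output to a valid PWM duty cycle [0, 1].
        return max(0.0, min(1.0, out))

def pour(target_flow, duration_s, read_flow_ml_per_s, set_pump_pwm, dt=0.05):
    """Track a target flow rate (ml/s) for a fixed duration."""
    pid = FlowPID()
    elapsed = 0.0
    while elapsed < duration_s:
        set_pump_pwm(pid.update(target_flow, read_flow_ml_per_s(), dt))
        time.sleep(dt)
        elapsed += dt
```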

## Component 2: Sensors and Data Collection

- **Purpose:** To gather precise data on barista techniques for the learning algorithm.
- **Features:** High-precision sensors capturing data on water flow, angle, speed, and trajectory during the pour-over process (a logging sketch follows this list).
- **Challenges:** Accurately capturing the nuanced movements of a professional barista and ensuring sensor durability under varying conditions.
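
A minimal recorder sketch for such barista demonstrations, assuming the sensors are polled at a fixed rate; the signal names and the `read_sensors` callable are illustrative assumptions rather than a settled interface:

```python
import csv
import time
from dataclasses import dataclass, asdict

@dataclass
class PourSample:
    t: float              # seconds since the start of the demonstration
    flow_ml_per_s: float  # measured water flow
    tilt_deg: float       # kettle/nozzle tilt angle
    speed_mm_per_s: float # nozzle movement speed
    x_mm: float           # nozzle position over the dripper
    y_mm: float

def record_demo(path, duration_s, read_sensors, rate_hz=50):
    """Sample the sensors at rate_hz and append one CSV row per sample.

    read_sensors() is assumed to return a dict with every field except t.
    """
    period = 1.0 / rate_hz
    start = time.time()
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(PourSample.__dataclass_fields__))
        writer.writeheader()
        while (now := time.time()) - start < duration_s:
            sample = PourSample(t=now - start, **read_sensors())
            writer.writerow(asdict(sample))
            time.sleep(period)
```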

## Component 3: Imitation Learning Algorithm

- **Purpose:** To analyze and learn from the collected data, enabling the machine to replicate these actions.
- **Features:** Advanced algorithms that process visual and sensor data to mimic barista techniques; this requires reproducing state-of-the-art results from robotics research (a behavioral-cloning sketch follows this list).
- **Challenges:** Developing an algorithm capable of adapting to different styles and ensuring it can be updated as it learns from new data.
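
As a starting point, behavioral cloning, the simplest form of imitation learning, could map the recorded brew state to the barista's pour action. The sketch below uses PyTorch; the 6-D state / 3-D action layout and the network size are illustrative assumptions, not the final design:

```python
import torch
import torch.nn as nn

class PourPolicy(nn.Module):
    """Small MLP mapping the current brew state to the next pour action."""

    def __init__(self, state_dim=6, action_dim=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, action_dim),  # e.g. flow rate, tilt, nozzle speed
        )

    def forward(self, state):
        return self.net(state)

def train_policy(states, actions, epochs=200, lr=1e-3):
    """Behavioral cloning: regress demonstrated actions from states.

    states, actions: float tensors of shape (N, state_dim) / (N, action_dim)
    extracted from the recorded demonstrations.
    """
    policy = PourPolicy(states.shape[1], actions.shape[1])
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(policy(states), actions)
        loss.backward()
        opt.step()
    return policy
```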

## Optional Components:

- **Multimodal Origin Information Pre-Processing:** To adjust settings based on different coffee beans and grind sizes.
- **User Interface Design:** An intuitive interface for user customization and selection of coffee preferences.
- **ChatGPT-Enhanced Custom Coffee Settings:** To make the machine more intelligent and closer to a human barista, state-of-the-art AI such as large language models could be integrated, making it behave more like an agent than a fixed-function machine.

# Criterion for Success

- **Mechanical Precision:** The machine must accurately control water flow and replicate barista movements.
- **Algorithm Effectiveness:** The machine should consistently brew coffee that matches or surpasses the quality of a professional barista's pour.
- **User Experience:** The interface should be user-friendly, allowing customization without overwhelming the user.
- **Reliability and Durability:** The machine should operate consistently over time with minimal maintenance.
- **Taste Test Approval:** The coffee produced must be favorably reviewed in taste tests against traditional pour-over coffee.

# Autonomous Behavior Supervisor

Featured Project

## Team members

- Xiaolu Liu (xiaolul2)

- Zhuping Liu (zhuping2)

- Shengjian Chen (sc54)

- Huili Tao (huilit2)

## Problem:

In many real-life scenarios, we need AI systems not only to detect people but also to monitor their behavior. However, today's systems can typically detect faces yet lack analysis of movements, so their results are not comprehensive enough. For example, in many high-risk laboratories, we need to ensure not only that the person entering the laboratory is identified, but also that he or she acts in accordance with the regulations to avoid danger. Beyond this, such a system can also help supervise students during online exams: combining a student's expressions, gaze, and movements makes it easier to maintain the fairness of the test.

## Solution Overview:

Our solution to the problem above is an Autonomous Behavior Supervisor. The system mainly consists of a camera and an alarm device. Using real-time images taken by the camera, the system performs face verification. When a person is successfully verified, the camera starts to monitor the person's behavior and their interaction with the surroundings, and the system determines whether there is a dangerous action or unreasonable behavior. As soon as the system detects something uncommon, the alarm rings. Conversely, if the person fails verification (i.e., does not have permission), the words "You do not have permission" are displayed on the computer screen.

## Solution Components:

### Identification Subsystem:

- Locate the position of a person's face

- Identify whether that face is recorded in our system

The camera captures facial information as image input to the system. Several Python libraries, such as OpenCV, provide useful tools for this. The identification process has three steps: first, we build a database of facial information and store the encoded faceprints. Second, we use the camera to capture the current face image and generate its face-pattern encoding. Finally, we compare the current encoding with the stored information. This is done by setting a threshold: when the similarity exceeds the threshold, we regard the person as recorded; otherwise, the person is banned from the system unless they enroll their facial information.
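
A possible implementation of these three steps, using the open-source face_recognition library (built on dlib) together with OpenCV for camera capture; the library choice and the 0.6 distance threshold are assumptions rather than final design decisions:

```python
import cv2
import face_recognition

def load_enrolled(photo_paths):
    """Step 1: encode enrolled photos into 128-D 'faceprints'."""
    encodings = []
    for path in photo_paths:
        image = face_recognition.load_image_file(path)
        faces = face_recognition.face_encodings(image)
        if faces:
            encodings.append(faces[0])
    return encodings

def verify_from_camera(enrolled, threshold=0.6):
    """Steps 2-3: capture a frame, encode it, and compare against the database."""
    ok, frame = cv2.VideoCapture(0).read()
    if not ok or not enrolled:
        return False
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    faces = face_recognition.face_encodings(rgb)
    if not faces:
        return False
    # Smaller distance means higher similarity; accept below the threshold.
    distances = face_recognition.face_distance(enrolled, faces[0])
    return bool(min(distances) < threshold)
```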

### Supervising Subsystem

- Capture people's behavior

- Recognize interactions between humans and objects

- Identify what people are doing

This part captures and analyzes people's behavior, i.e., the interactions between people and objects. For the algorithm, we initially decided to build on VSG-Net or other established HOI (human-object interaction) models; to make them suitable for our system, we will need to analyze and adjust them. The algorithm is a multi-branch network:

- Visual Branch: extracts visual features from people, objects, and the surrounding environment.

- Spatial Attention Branch: models the spatial relationship between human-object pairs.

- Graph Convolutional Branch: treats the scene as a graph, with people and objects as nodes, and models the structural interactions.

This is computational work that requires training on a dataset and applying the result to the real system. The accuracy may not be 100%, but we will do our best to improve the performance.
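
Reimplementing VSG-Net is beyond the scope of a sketch, but the pairing step that HOI models generally start from can be shown simply: enumerate candidate human-object pairs from the detector output and compute the spatial feature that the interaction branches consume. The box format and feature layout below are illustrative assumptions:

```python
from itertools import product

import numpy as np

def box_center(box):
    """Center of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0

def spatial_feature(human_box, object_box):
    """Relative offset and scale of the object with respect to the human."""
    hx, hy = box_center(human_box)
    ox, oy = box_center(object_box)
    hw = human_box[2] - human_box[0]
    hh = human_box[3] - human_box[1]
    return np.array([
        (ox - hx) / hw,
        (oy - hy) / hh,
        (object_box[2] - object_box[0]) / hw,
        (object_box[3] - object_box[1]) / hh,
    ])

def candidate_pairs(human_boxes, object_boxes):
    """Enumerate every human-object pair with its spatial feature."""
    return [(h, o, spatial_feature(h, o))
            for h, o in product(human_boxes, object_boxes)]
```

In the full system, the trained multi-branch network would score each candidate pair instead of this hand-crafted feature alone.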

### Alarming Subsystem

- Remain idle when common behaviors are detected

- Alarm when dangerous or non-compliant behaviors are detected

The alarm apparatus is connected to the output of our system and is used to report dangerous actions or behaviors that are not permitted. If the supervising subsystem detects actions such as "harm people", "illegal experimental operation", or "cheating in exams", the alarming subsystem sounds a warning so that people notice. To achieve this, a "dangerous action library" containing dangerous behaviors is prepared in advance; when the actions analyzed by the supervising subsystem match entries in this library, the system raises the alarm.
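
A minimal sketch of this matching logic, assuming the supervising subsystem outputs (action label, confidence) pairs; `sound_alarm` is a hypothetical placeholder for the actual buzzer or speaker driver:

```python
# Pre-defined "dangerous action library" drawn from the examples above.
DANGEROUS_ACTIONS = {"harm people", "illegal experimental operation", "cheating in exams"}

def check_and_alarm(detections, sound_alarm, min_confidence=0.7):
    """Raise the alarm if any detection matches the danger library.

    detections: iterable of (action_label, confidence) pairs from the
    supervising subsystem.
    """
    for action, confidence in detections:
        if action in DANGEROUS_ACTIONS and confidence >= min_confidence:
            sound_alarm(f"Dangerous behavior detected: {action}")
            return True
    return False
```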

## Criteria of Success:

- The system must recognize human faces and determine whether the person is in the backend database

- The system will detect the human and the surrounding objects on screen and analyze the possible interactions between them.

- Based on the interaction, the system could detect the potentially dangerous action and give out warnings.

## DIVISION OF LABOR AND RESPONSIBILITIES

All members contribute to the design and development of the project, and we meet regularly to discuss and push the design forward. Each member is responsible for a certain part, but that is not the only work he or she does.

- Shengjian Chen: Responsible for the facial recognition part of the project.

- Huili Tao: HOI algorithm modification and its application to our project

- Zhuping Liu: Hardware design and system connectivity

- Xiaolu Liu: Detail optimization and functional testing
