
Design Documentation



Document Revision History

version 1.0 (2023/10/1): initial version
version 1.1 (2023/10/15): updated backend class diagram and implementation details
version 1.2 (2023/11/5): updated acceptance testing
version 1.3 (2023/11/16): added multiplayer section
version 1.4 (2023/11/28): added deploying animated drawings section
version 1.5 (2023/12/02): added design patterns section

System Design

System Architecture

[Figure: System architecture diagram]

For the frontend, we used Android (Java); for the backend, Django along with Django REST Framework. Django REST Framework was used to build RESTful APIs, which are critical for efficient, standardized communication between the frontend and backend.

Our server is deployed on an AWS EC2 instance. Drawing and GIF files are stored in an AWS S3 bucket, and the database is managed through AWS RDS with MySQL.

For the animated drawings feature, we implemented an API wrapper around Facebook Research's AnimatedDrawings library. AnimatedDrawings is deployed separately on the Bacchus Kubernetes cluster on an A100 GPU to keep the animation process fast; the GPU's compute capabilities provide quick and reliable inference.

A key feature of our system is multiplayer drawing. We reduced synchronization latency and server load by using a socket channel, which enables real-time, responsive interactions among multiple users. Further implementation details are given in the later section "Making Multiplayer More Reliable".

Class Diagrams and Data Models

Frontend Class Diagram

[Figure: Frontend class diagram]

Our Android project primarily consists of two main layers: the UI Layer and the Data Layer.

The UI Layer is responsible for constructing the interface visible to users. It handles tasks such as displaying information, managing user interactions, and processing inputs. Its role is to ensure seamless interaction between users and the app's functionalities.

On the other hand, the Data Layer is responsible for providing the data required by the UI Layer. This layer performs tasks such as fetching data from servers, processing information, and storing data within the app's local storage. Essentially, it acts as a bridge connecting the UI and the underlying data sources.

By segregating responsibilities between these layers, our Android app maintains a clear separation of concerns, enabling efficient development, testing, and maintenance of the application.

Backend Class Diagram

[Figure: Backend class diagram]

Data Models

[Figure: Data model diagram]

The above diagram shows the data models of our project. The main tables are Drawing, User, and Family. The Drawing table has image and GIF URL fields that store the S3 bucket addresses of the final image and GIF files, and its type field indicates the current stage of the drawing. The User table contains the information needed for authentication and the user's family role. The Family table groups users into family units. The DrawingUser and FamilyUser tables establish many-to-many relationships between Drawing and User, and between Family and User: DrawingUser stores the participants of a drawing, while FamilyUser stores the members of a family.
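As a rough illustration of these tables, the corresponding Django models could look like the sketch below. This is a hedged sketch only; field names, lengths, and on-delete behavior are assumptions, not the exact implementation.

# models.py -- minimal sketch of the data models described above (illustrative only)
from django.db import models

class Family(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)

class User(models.Model):
    username = models.CharField(max_length=150, unique=True)
    password = models.CharField(max_length=128)   # stored as a sha256 hash
    gender = models.CharField(max_length=10)      # 'Male' | 'Female' | 'Other'
    type = models.CharField(max_length=10)        # 'Parent' | 'Child'
    family = models.ForeignKey(Family, on_delete=models.SET_NULL, null=True)
    created_at = models.DateTimeField(auto_now_add=True)

class Drawing(models.Model):
    title = models.CharField(max_length=100, blank=True)
    description = models.TextField(blank=True)
    image_url = models.URLField(blank=True)       # S3 address of the final image
    gif_url = models.URLField(blank=True)         # S3 address of the animated GIF
    type = models.CharField(max_length=10)        # current stage: 'raw' | 'processed' | 'animated'
    host = models.ForeignKey(User, on_delete=models.CASCADE, related_name="hosted_drawings")
    participants = models.ManyToManyField(User, through="DrawingUser", related_name="drawings")
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

class DrawingUser(models.Model):
    drawing = models.ForeignKey(Drawing, on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE)

class FamilyUser(models.Model):
    family = models.ForeignKey(Family, on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE)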

Implementation Details

| Description | Endpoint | Method | Details |
| --- | --- | --- | --- |
| List Drawings | /drawing | GET | Retrieve multiple drawings. |
| Create Drawing | /drawing | POST | Create a new drawing. |
| Get Single Drawing | /drawing/{id} | GET | Retrieve a single drawing by its ID. |
| Join Drawing | /drawing/{id}/join | POST | Join a drawing created by a different user. |
| Submit Single Drawing | /drawing/{id}/submit | PUT | Submit a drawing after completion. |
| Upload Real-Time Drawing | /drawing/{id}/canvas | POST | Upload real-time drawing updates. |
| Register User | /user | POST | Register a new user. |
| Login User | /user/login | POST | Log in as a user. |
| Logout User | /user/logout | POST | Log out the current user. |
| List Family Members | /family | GET | Retrieve a list of the user's family members. |

GET /drawing

Request
interface DrawingListRequest {
    user_id: number;
}
Response

Retrieve multiple drawings.

interface DrawingListResponse {
    drawings: Drawing[];
}

interface Drawing {
    id: number;  // Primary Key
    title: string;
    description: string;
    image_url: string;
    ai_image_url: string;
    gif_url: string;
    type: 'raw' | 'processed' | 'animated';
    host_id: number;
    participants: FamilyMemberResponse[];
    voice_id: number;
    created_at: Date;
    updated_at: Date;
}

interface FamilyMemberResponse {
    username: string;
    gender: 'Male' | 'Female' | 'Other';
    type: 'Parent' | 'Child';
}

POST /drawing

Request
interface DrawingCreateRequest {
    host_id: number;
}
Response

Create a new drawing.

interface DrawingCreateResponse {
    id: number; 
    host_id: number; 
    invitation_code: string; // generated by hashing the drawing's id
    created_at: Date;
}

GET /drawing/{id}

Response

Retrieve a single drawing by its ID. Returns a Drawing object.

interface Drawing {
    id: number;  // Primary Key
    title: string;
    description: string;
    image_url: string;
    ai_image_url: string;
    gif_url: string;
    type: 'raw' | 'processed' | 'animated';
    user_id: number;
    voice_id: number;
    created_at: Date;
    updated_at: Date;
}

POST /drawing/{id}/join

Request
interface DrawingJoinRequest {
    user_id: number; 
    invitation_code: string;
}

POST /drawing/{id}/submit

Request
interface DrawingSubmitRequest {
    file: File; 
    title: string;
    description: string;
    host_id: number;
    voice_id: number;
}
Response

Submit a drawing after completion.

interface Drawing {
    id: number;  // Primary Key
    title: string;
    description: string;
    image_url: string;
    ai_image_url: string;
    gif_url: string; 
    type: 'RAW' | 'PROCESSED' | 'ANIMATED';
    host_id: number;
    voice_id: number;
    created_at: Date;
    updated_at: Date;
}

POST /drawing/{id}/canvas

Request
interface DrawingCanvasRequest {
    id: number;  // Primary Key
    title: string;
    description: string;
    image_url: string;
    ai_image_url: string;
    gif_url: string;
    type: 'raw' | 'processed' | 'animated';
    user_id: number;
    voice_id: number;
    created_at: Date;
    updated_at: Date;
}

POST /user

Request
interface UserCreateRequest {
    username: string;
    password: string;  // sha256
    gender: 'Male' | 'Female' | 'Other';
    type: 'Parent' | 'Child';
}
Response

Register a new user.

interface User {
    id: number;  // Primary Key
    username: string;
    password: string;  // sha256
    gender: 'Male' | 'Female' | 'Other';
    type: 'Parent' | 'Child';
    family_id: number;
    created_at: Date;
}

POST /user/login

Request
interface UserLoginRequest {
    username: string;
    password: string;
}
Response

Log in as a user. Returns a User object.

interface User {
    id: number;  // Primary Key
    username: string;
    password: string;  // sha256
    gender: 'Male' | 'Female' | 'Other';
    type: 'Parent' | 'Child';
    family_id: number;
    created_at: Date;
}

POST /user/logout

Response

Log out the current user.

GET /family

Response

Retrieve a list of the user's family members.

interface FamilyResponse {
    users: User[];
}

interface User {
    id: number;  // Primary Key
    username: string;
    password: string;  // sha256
    gender: 'Male' | 'Female' | 'Other';
    type: 'Parent' | 'Child';
    family_id: number;
    created_at: Date;
}

Technical Complexity

Making Multiplayer More Reliable

The technical challenge we faced when designing this service was implementing multiplayer collaboration on drawings. How can two or more people view the same, synchronized drawing at the same time? How can we minimize the real-time delay while placing a light burden on the server?

[Figure: Initial API polling design]

Our first thought was API polling. The idea was to store the drawing on the server, and the client would request the drawing from the server at regular intervals. When a client modified the drawing, it would send a request to the server to upload the modified part, and the server would combine the layers of the drawing from different clients into one and store it in an S3 bucket.

However, this method had two critical drawbacks. The first was slow real-time updates: the client has no way of knowing whether the picture has been modified, so it has to make API requests to the server at regular intervals, which inevitably introduces a delay. The second was that these requests put too much load on the server: if we decrease the polling interval to reduce the delay, the server is overwhelmed with requests, and the request volume grows quickly as the number of participants increases. The merge operation over several drawing layers was also very heavy, and we did not like that intermediate stages of the drawings kept piling up in storage. Given these clear drawbacks, we thought about how to improve the design.

To solve the first problem of slow real-time updates, we decided to use sockets. Unlike HTTP communication, a socket maintains a connection over a port, which allows for real-time, two-way communication. This makes sockets much better suited to real-time updates than API polling.

We implemented this using the WebSocket-based Pusher library. The workflow is as follows.

  1. Create a Pusher channel on the server.
  2. A client subscribes to the channel.
  3. The server delivers an event to the channel.
  4. All subscribed clients receive the event.

Drawing modifications were passed to the server via the POST API.
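On the server side, delivering an event to a channel is a single call to the Pusher server SDK, roughly as in the minimal sketch below. The credentials, cluster, channel name, and payload fields are placeholders, not our actual configuration.

# Minimal sketch: triggering a Pusher event from the Django backend (placeholder values).
import pusher

pusher_client = pusher.Pusher(
    app_id="APP_ID",
    key="APP_KEY",
    secret="APP_SECRET",
    cluster="CLUSTER",
    ssl=True,
)

# Deliver a stroke event to the channel for drawing 42;
# every client subscribed to "drawing-42" receives it.
pusher_client.trigger("drawing-42", "stroke", {"points": [[0, 0], [10, 12]], "color": "#000000"})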

To solve the second problem, the heavy load on the server, we decided to send only the drawing stroke data instead of the entire drawing file as originally planned, and not to store the drawing on the server at all until it is complete.

If you think about it, when the client has rendered all the strokes from start to finish, there is no need to store a separate drawing file. If we send and share the data for one stroke each time it is drawn, instead of polling the API at fixed intervals, there is no need to process the drawing on the server at all. We then no longer have to worry about merging existing drawings and layers in storage, and we do not have to store intermediate drawing files.

[Figure: Final stroke-based synchronization flow]

The final flow we arrived at with this approach is as follows.

  1. When a user draws a stroke, the client passes the stroke information to the server via the POST API.
  2. The server passes the stroke information as an event to the Pusher channel.
  3. All clients subscribed to the channel receive the stroke information.

By keeping the transferred data light and reducing server-side work, a stroke drawn on device 1 is immediately visible on device 2.
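Putting the pieces together, the real-time canvas endpoint can stay very small. The sketch below is an illustration under assumptions (the view name, stroke fields, and credentials are placeholders), not the exact implementation.

# views.py -- sketch of POST /drawing/{id}/canvas: relay a stroke, store nothing.
import pusher
from rest_framework import status
from rest_framework.response import Response
from rest_framework.views import APIView

pusher_client = pusher.Pusher(app_id="APP_ID", key="APP_KEY", secret="APP_SECRET", cluster="CLUSTER")

class DrawingCanvasView(APIView):
    def post(self, request, id):
        stroke = request.data  # e.g. {"points": [...], "color": "#000000", "width": 4}
        # Relay the stroke to every subscriber of this drawing's channel;
        # no intermediate drawing state is stored on the server.
        pusher_client.trigger(f"drawing-{id}", "stroke", stroke)
        return Response(status=status.HTTP_200_OK)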

Deploying Animated Drawings On Kubernetes

We incorporated Facebook Research's AnimatedDrawings to animate the drawings. The repository provides two parts: 1) a TorchServe Dockerfile for humanoid pose detection, and 2) graphics code that creates motion GIFs based on the detected pose.

Instead of having the backend server handle machine learning and graphics workloads, we decided it was a better design to create a separate inference server and have the two communicate. Our team implemented a Flask API wrapper on top of AnimatedDrawings. Because Amazon EC2 was not an efficient option for our AI model, we deployed the inference server on an A100 GPU server provided by Bacchus. We created Docker images of TorchServe and the API wrapper and uploaded them to a Harbor registry.
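The wrapper can be thought of as a thin HTTP layer in front of the AnimatedDrawings pipeline, roughly like the sketch below. This is a hedged illustration only: the /animate route, the run_animated_drawings helper, and the file handling are assumed names, not the exact wrapper we deployed.

# app.py -- minimal sketch of a Flask wrapper in front of the AnimatedDrawings pipeline.
import io
import tempfile
from flask import Flask, request, send_file

app = Flask(__name__)

def run_animated_drawings(image_path: str, out_dir: str) -> str:
    """Placeholder: annotate the drawing via the TorchServe container, then render
    the motion GIF with the AnimatedDrawings graphics code. Returns the GIF path."""
    raise NotImplementedError

@app.route("/animate", methods=["POST"])
def animate():
    drawing = request.files["file"]  # the finished drawing uploaded by the backend
    with tempfile.TemporaryDirectory() as out_dir:
        image_path = f"{out_dir}/drawing.png"
        drawing.save(image_path)
        gif_path = run_animated_drawings(image_path, out_dir)
        with open(gif_path, "rb") as f:
            gif_bytes = f.read()
    return send_file(io.BytesIO(gif_bytes), mimetype="image/gif")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)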

We faced several difficulties incorporating AnimatedDrawings into our service. The most critical ones were environment setup issues with Kubernetes and with the CUDA version.

The first main problem was setting up Kubernetes. Within AnimatedDrawings, the two parts communicate over localhost, so we created a single pod with two containers. One container hosted the TorchServe image and the other hosted the API-wrapped image, and the two communicated via port forwarding. External requests were handled by the API wrapper container, which called the TorchServe container for ML inference; after receiving the inference response, the API wrapper container created the GIF and returned the result.

There was also an issue with exposing the pod's port so that the inference server could be accessed from outside the cluster. To solve this, we created service and deployment YAML files for Kubernetes and applied them with the following commands.

kubectl apply -f service.yaml
kubectl apply -f deployment.yaml

We configured the node port and target port in service.yaml so that requests are properly delivered to the inference server.

The second main problem was properly setting up the TorchServe container. Although we built a TorchServe image and uploaded it to the Harbor registry, TorchServe constantly failed and did not return inference responses properly. The cause turned out to be a missing CUDA compiler and version compatibility problems: some of the library versions listed in the repository were inaccurate or outdated, which led to tedious debugging.

We first checked the state of the GPU with the following command:

nvidia-smi
[Figure: nvidia-smi output]

The command above displays the GPU's model name and driver version. We then installed CUDA, driver, torch, mmcv, and Python versions that were compatible with the GPU and with each other:

conda install cuda -c nvidia/label/cuda-11.6.0
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
pip install mmcv-full==1.6.2 -f https://download.openmmlab.com/mmcv/dist/cu116/torch1.12/index.html

Finally, there was a problem with PyOpenGL, a computer graphics library for Python. We set the error checker to None within the library file and installed the following packages:

sudo apt-get install libosmesa6 libosmesa6-dev

After this environment setup process, our inference server finally produced animated GIFs for our drawings reliably. We also made several optimizations to frame rate, image size, and data preprocessing to reduce inference time.

Testing Plan

Unit Test & Integration Testing

The following frameworks will be used for unit tests.

  • Android: JUnit

  • Django: pytest-django

Integration tests will be conducted through Espresso.

On the client, we aim for 70% code coverage across unit and integration tests. On the server, we aim for 75% code coverage of all API endpoints in integration tests.
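As an illustration of the server-side test setup, a pytest-django integration test for one endpoint might look like the following sketch. The test case, payload, and expected status codes are assumptions, not part of our actual suite.

# tests/test_user_api.py -- minimal pytest-django sketch (illustrative only)
import pytest
from rest_framework.test import APIClient

@pytest.mark.django_db
def test_register_user():
    client = APIClient()
    payload = {"username": "alice", "password": "s3cret", "gender": "Female", "type": "Parent"}
    response = client.post("/user", payload, format="json")
    # Registration should succeed and return the created user.
    assert response.status_code in (200, 201)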

Acceptance Testing

We plan to test the following user stories.

  1. As an end user, I want to 1) sign up 2) login, so that I can access and make use of the services provided by LittleStudio.
  • Scenario: The end user clicks on the 1) “Sign Up” 2) “Login” button
    • Given: The end user is in the 1) sign up 2) login page
    • When: The end user 1) completes the registration process 2) inputs the correct credentials
    • Then: The end user should login and be redirected to the gallery page

2a. As an end user, I want to create a drawing and draw in real time, so that I can build shared memories and artworks.

  • Scenario: The end user is in the “My Gallery” page
    • Given: The end user successfully logs in
    • When: 1) The end user clicks on the “plus” button via the menu bar, then clicks on the “Create a drawing” button 2) All of the (desired) collaborators join the waiting room and the end user clicks on the “Start Drawing” button
    • Then: The end user should be 1) redirected to the waiting room page where they can see a list of collaborators 2) able to draw simultaneously in real time with the collaborators

2b. As an end user, I want to join a drawing and draw in real time, so that I can build shared memories and artworks.

  • Scenario: The end user is in the “My Gallery” page
    • Given: The end user successfully logs in
    • When: 1) The end user clicks on the “plus” button via the menu bar, clicks on the “Join a drawing” button, and inputs the invitation code. 2) All of the (desired) collaborators join the waiting room and the host (the end user who created the drawing, refer to scenario 2a) clicks on the “Start Drawing” button
    • Then: The end user should be 1) redirected to the waiting room page where they can see a list of collaborators 2) able to draw simultaneously in real time with the collaborators
  3. As an end user, I want to add a title and description to a finished drawing, so that I can provide context and narration for my drawing.
  • Scenario: The end user should be able to add context to their drawing
    • Given: The end user finishes drawing
    • When: The end user 1) clicks on the “Finish” button 2) adds a title and description of the drawing and clicks on the “Submit” button
    • Then: The end user should be 1) redirected to the submit drawing page 2) redirected to the view drawing page (refer to the “Then” section in scenario 4) and the submitted drawing should be added to the gallery page
  4. As an end user, I want to view a specific drawing, so that I can appreciate the creativity provided by the animated drawing and view the details of that specific drawing.
  • Scenario: The end user should be able to see their drawing with the details
    • Given: The end user is in the “My Gallery” page
    • When: The end user clicks on a specific drawing
    • Then: The end user should be able to see the original collaborative drawing, animated versions of the drawing, and the information about the drawing, including the title, date created, participants, and description
  5. As an end user, I want to see a list of my family members and logout.
  • Scenario: The end user has successfully logged in
    • Given: The end user clicks on the “My Family Page” button on the navigation bar
    • When: 1) The end user is in the “My Family Page” page 2) clicks on the “Logout” button
    • Then: The end user should be 1) able to see a list of their family members 2) redirected to the login page

Design Patterns

Structural Patterns

Adapter Pattern

The Adapter pattern is used when a client with an incompatible interface wants to use another interface; the adapter acts as an intermediary. In our project, when displaying the people in the gallery and waiting room, the data is variable and the total length is hard to predict. In addition, the items that display each person and image are independent and have different shapes. The Adapter pattern is well suited to rendering each of these items in a RecyclerView with unified logic. After defining the item type we want, each item can implement the RecyclerView.Adapter interface to display the screen with its own optimization code in the Android view. This reduces performance concerns by relying on an already proven library, while still allowing us to apply arbitrary views through the adapter.

If we create adapters to make our views compatible with the format RecyclerView expects, via a JoinAdapter and a DrawAdapter that implement RecyclerView.Adapter as shown below, we can use RecyclerView on Android.

[Figures: JoinAdapter and DrawAdapter implementations]

The structure of this Adapter pattern is shown below: the upper diagram shows the general structure of the pattern, and the lower one shows the actual implementation in our application.

[Figure: Adapter pattern structure, general and as implemented in our app]
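For readers without the diagram at hand, the same structure can be summarized in a short, language-agnostic sketch (Python here, with hypothetical names; our actual adapters are the Android RecyclerView.Adapter classes shown above):

# Minimal sketch of the general Adapter pattern (hypothetical names, not our Android code).
class Target:
    """Interface the client expects (the role RecyclerView.Adapter plays in our app)."""
    def get_item_count(self) -> int: ...
    def bind_item(self, position: int) -> str: ...

class FamilyMemberList:
    """Existing data source with its own, incompatible interface (the adaptee)."""
    def __init__(self, members):
        self.members = members

class PersonListAdapter(Target):
    """Adapter: translates the adaptee's interface into the one the client expects."""
    def __init__(self, adaptee: FamilyMemberList):
        self.adaptee = adaptee
    def get_item_count(self) -> int:
        return len(self.adaptee.members)
    def bind_item(self, position: int) -> str:
        return f"Render item for {self.adaptee.members[position]}"

# The client depends only on Target, not on FamilyMemberList.
adapter = PersonListAdapter(FamilyMemberList(["mom", "dad", "child"]))
print(adapter.get_item_count(), adapter.bind_item(0))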

Behavioral Patterns

Observer Pattern

The Observer pattern is a design pattern in which observers are notified, and take action, whenever the state of the subject they observe changes. Our project has a lot of real-time processing that needs careful handling: whenever a user's state changes, the server needs to recognize it and synchronize it in real time. This includes synchronizing the users in the waiting room, drawing strokes, submissions, and more. For these reasons, the observer pattern is very useful in our project; without it, we would have had problems displaying a synchronized drawing screen to multiple users and keeping latency low. We implemented the observer pattern through socket communication using the Pusher library. The synchronization process is as follows:

  1. Each user subscribes to the Pusher channel, which is pre-implemented on the server.

[Figure: Clients subscribe to the Pusher channel]

  2. Each user reports its state changes to the server's observer via a POST request.

[Figure: A client reports a state change via a POST request]

  3. The server receives the POST request and notifies the participants subscribed to the channel over the socket.

[Figure: The server notifies subscribed participants over the socket]

  4. Each user receives notifications through socket communication and synchronizes in real time.

[Figure: Clients synchronize in real time]

With the observer pattern above, we were able to keep the socket communication implementation clean.
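In observer-pattern terms, the Pusher channel plays the role of the subject and the subscribed clients are the observers. The sketch below is a hedged illustration of that mapping on the server side; the class name, method names, and credentials are illustrative, not our exact code.

# Sketch: the Pusher channel as the subject in the Observer pattern (illustrative names).
import pusher

class DrawingSubject:
    """Subject for one drawing session; the observers are the clients subscribed
    to the corresponding Pusher channel."""
    def __init__(self, drawing_id: int, client: pusher.Pusher):
        self.channel = f"drawing-{drawing_id}"
        self.client = client

    def notify(self, event: str, payload: dict) -> None:
        # Pusher fans the event out to every subscribed client (observer).
        self.client.trigger(self.channel, event, payload)

# Usage: whenever a POST request reports a state change, notify all observers.
client = pusher.Pusher(app_id="APP_ID", key="APP_KEY", secret="APP_SECRET", cluster="CLUSTER")
subject = DrawingSubject(drawing_id=42, client=client)
subject.notify("stroke", {"points": [[0, 0], [5, 7]], "color": "#ff0000"})
subject.notify("submit", {"user_id": 7})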
