This repository has been archived by the owner on Dec 24, 2023. It is now read-only.

Jason:

Functionality Of Web Application

  • Login and Register

login and register feature

  • 1:1 Chat

chat feature

  • Search (for users, comments, and posts)

search feature

  • Notifications

notifications feature

  • File uploading logic
    • Including chunked uploading due to the 32MB limit per request for Google Cloud Run.
  • HTML embeds for popular websites like YouTube

embeds feature

spoof content moderation feature
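The chunked uploading mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual upload code: the 8 MB `CHUNK_SIZE` and the helper names `iter_chunks` and `content_range` are assumptions chosen only to stay well below the 32MB per-request limit.

```python
from typing import Iterator, Tuple

# Assumed chunk size: kept well below the 32 MB Cloud Run request limit.
CHUNK_SIZE = 8 * 1024 * 1024


def iter_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (start, end_exclusive, chunk) so that each request stays small."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        yield start, start + len(chunk), chunk


def content_range(start: int, end_exclusive: int, total: int) -> str:
    """Build a Content-Range header value (the end byte is inclusive)."""
    return f"bytes {start}-{end_exclusive - 1}/{total}"
```

The server side would then reassemble the chunks in order using the byte ranges before passing the complete file on to storage.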

Implemented

  • Asynchronous-capable Python code for Google Cloud Platform (GCP)
    • Since most of GCP's Python API libraries are not asynchronous-capable, I wrote my own asynchronous-capable Python code for the various APIs to improve the performance of the web application.
    • Additionally, even where a library had async support, it had some issues with Python's asyncio event loop.
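To illustrate why blocking client libraries hurt an asyncio application, here is a minimal sketch of one common workaround: offloading a blocking call to a worker thread with `asyncio.to_thread`. This is a simpler technique than writing dedicated asynchronous API clients as described above, and `blocking_api_call` is a hypothetical stand-in for a synchronous GCP client call.

```python
import asyncio
import time
from typing import List


def blocking_api_call(name: str) -> str:
    """Hypothetical stand-in for a synchronous (blocking) client call."""
    time.sleep(0.05)
    return f"payload for {name}"


async def fetch_one(name: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(blocking_api_call, name)


async def fetch_many(names: List[str]) -> List[str]:
    # The calls overlap instead of running one after another.
    return list(await asyncio.gather(*(fetch_one(n) for n in names)))
```

Thread offloading keeps the event loop free but still consumes one thread per in-flight call, which is one reason a natively asynchronous client can perform better.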

Data Security

  • Storage of Secrets
    • All secrets of the web application, such as API access tokens and database credentials, are stored in GCP Secret Manager.
  • Encryption of Data

    • Using Google Cloud Platform (GCP) Key Management Service (KMS).
    • Encrypted on the Application Layer using AES-256-GCM.
    • User's data that are encrypted:
      • Phone numbers
        • As it is only used for SMS 2FA
      • Argon2id password hashes
        • The encryption acts as a pepper
      • Shared Time-based One-Time Password (TOTP) secrets
        • Used for 2FA with Google Authenticator or other compatible apps
      • Chat messages
    gcp kms encryption keys
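For context on the shared TOTP secrets listed above, the sketch below shows how a time-based one-time password is derived from such a secret, following RFC 6238 with HMAC-SHA1. It is an illustration of the standard algorithm using only the standard library, not the project's code; the function name `totp` and its defaults (30-second step, 6 digits) are assumptions matching what authenticator apps commonly use.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, for_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code from a Base32-encoded shared secret."""
    key = base64.b32decode(secret_b32, casefold=True)
    now = time.time() if for_time is None else for_time
    counter = int(now // step)
    # HOTP (RFC 4226): HMAC over the 8-byte big-endian counter.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects the offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Because anyone holding the shared secret can generate valid codes, it makes sense that the secret is among the encrypted fields.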
  • Middlewares
    • CSRF
      • HMAC-SHA256
      • Uses the header and cookie for CSRF validation to prevent CSRF attacks.
      • Uses GCP KMS Cloud HSM to generate high-entropy bytes for the CSRF token.
    • Session
      • HMAC-SHA256
      • More flexible than the built-in FastAPI/Starlette session middleware.
        • Becomes a session cookie (no expiry date, although the session ID is only valid for a day) when the user does not check the "Stay signed in" checkbox.
        • If the user checks the "Stay signed in" checkbox, the session becomes a persistent cookie that expires after 2 weeks.
        • Uses GCP KMS Cloud HSM to generate high-entropy bytes for the session ID.
    • Cache control middleware for the web application endpoints.
      • For better performance and availability.
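The HMAC-SHA256 signing and the header-plus-cookie CSRF check described above can be sketched as follows. This is a minimal illustration under stated assumptions: `SIGNING_KEY`, `sign`, `verify`, and `csrf_ok` are hypothetical names, the `value.signature` token layout is invented for the example, and `secrets.token_bytes` stands in for the random bytes that the application obtains from GCP KMS Cloud HSM.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Assumption: a locally generated key; the real key would come from a secret store.
SIGNING_KEY = secrets.token_bytes(32)


def sign(value: str, key: bytes = SIGNING_KEY) -> str:
    """Append an HMAC-SHA256 signature to the value."""
    mac = hmac.new(key, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify(token: str, key: bytes = SIGNING_KEY) -> Optional[str]:
    """Return the original value if the signature is valid, else None."""
    value, sep, mac = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(key, value.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing leaks.
    return value if hmac.compare_digest(mac.encode(), expected.encode()) else None


def csrf_ok(cookie_token: str, header_token: str, key: bytes = SIGNING_KEY) -> bool:
    """The cookie must be validly signed and match the header copy exactly."""
    if verify(cookie_token, key) is None:
        return False
    return hmac.compare_digest(cookie_token.encode(), header_token.encode())
```

Signing the token means an attacker who can plant a cookie on the victim's browser still cannot forge a value that passes verification without the key.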
  • Role-based Access Controls (RBAC) logic
    • Uses FastAPI's dependency injection feature.
    • Clears invalid sessions
    • Redirects users to their default endpoint if they are not authorised.
    • For sensitive routes like the admin pages, it raises an HTTP 404 error if the user is not authorised.
  • URL redirect confirmation for external links posted by users.

    • Integrated with Calvin's URL analysis feature for suspicious or malicious URLs.
    url redirect confirmation demo
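A redirect confirmation page first has to decide whether a posted link leaves the site. The sketch below shows one way to make that decision with the standard library; `needs_confirmation` and the `own_host` parameter are hypothetical names, and the rule of treating every non-HTTP(S) scheme as requiring confirmation is an assumption of this example, not a documented behaviour of the application.

```python
from urllib.parse import urlsplit


def needs_confirmation(url: str, own_host: str) -> bool:
    """Return True if the link points away from our own host."""
    parts = urlsplit(url.strip())
    # Unusual schemes (javascript:, data:, ...) are never treated as internal.
    if parts.scheme and parts.scheme not in ("http", "https"):
        return True
    # No network location means a relative link within the site.
    if not parts.netloc:
        return False
    return (parts.hostname or "").lower() != own_host.lower()
```

For example, `needs_confirmation("/posts/1", "app.example.com")` is false, while a protocol-relative link such as `//evil.example/x` is flagged because it carries its own host.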
  • Account Security

    • Google and Facebook OAuth2 login.
    google and facebook oauth2 demo
    • Forgot Password
    reset password demo
    • Voluntary revocation of the user's sessions.
    sessions page
    • Alerts users when their passwords have been leaked in data breaches, using the reCAPTCHA Enterprise API.
      • Takes the username (or a canonicalised email) and the user's password, passes them through a Scrypt hash function, and sends the result to the reCAPTCHA Enterprise API to check whether it is in their database of compromised passwords.
    password breach alert demo
    • Added password policy.
    password policy demo
    • 2FA using Authenticator app or SMS (using Twilio API).
    sms and authenticator 2fa setup demo 2fa sms login demo 2fa authenticator app login demo
    • Single-use 2FA backup code that lets users disable their 2FA if they lose access to their device.
    2fa backup code demo
    • Location-based login 2FA if the user is logging in from a new location and does not have 2FA enabled.
    location-based 2fa demo
  • Cloud Functions

    • Create Signed URL (Golang)
      • Uses Go's Google Cloud Storage (GCS) library to create a signed URL for the user to view the file.
        • The signed URL can include an expiry time, making it short-lived: once it expires, the URL becomes invalid.
      • Used in posts and chat messages for confidentiality.
    signed url demo
    • Sending Emails (Golang)
      • With the aiosmtplib Python library, users took a while (~5 mins) to receive emails, so I wrote a Cloud Function in Go to send them instead, which reduced the delivery time to ~15 seconds.
  • Scheduled Cloud Functions

    gcp cloud scheduler configurations
    • Re-encrypt Database (Golang)
      • Automated the re-encryption of the user's data when the encryption key in GCP Key Management Service (KMS) is rotated.
    • Database Cleanups (Golang)
      • As per data retention policy
        • Delete expired chat messages.
        • Delete orphan comments (comments that are not attached to any posts as the post was deleted).
        • Delete the user's data if the user has not logged in for 2 years.
        • Delete the user's data if the user has not verified their email for a month.
        • Delete the admin's account if the admin has been inactive for more than a month.
          • This is also for security reasons, such as minimising the risk of the admin's account being compromised.
    cloud scheduler
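The account-related retention rules above can be expressed as a small decision function. The project's cleanup job is written in Go; this Python sketch only illustrates the logic. The `Account` fields and the `should_delete` helper are hypothetical, and "a month" and "2 years" are approximated as 30 and 730 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed approximations of the retention periods.
INACTIVE_USER = timedelta(days=730)   # "2 years"
UNVERIFIED_EMAIL = timedelta(days=30)  # "a month"
INACTIVE_ADMIN = timedelta(days=30)    # "more than a month"


@dataclass
class Account:
    is_admin: bool
    email_verified: bool
    created_at: datetime
    last_login: datetime


def should_delete(account: Account, now: datetime) -> bool:
    """Apply the retention rules to one account."""
    if account.is_admin:
        return now - account.last_login > INACTIVE_ADMIN
    if not account.email_verified and now - account.created_at > UNVERIFIED_EMAIL:
        return True
    return now - account.last_login > INACTIVE_USER
```

A scheduled job would evaluate a rule like this (or an equivalent database query) for every account on each run.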
  • Chat Security

    chat data security settings
    • Allow users to add a chat password for extra security.
      • If the user forgets the password, they can reset it by clicking on the "Forgot Password" button which will send an email to the user's email address with a link to disable their chat password protection.
    chat password demo forgot chat password demo
    • Disappearing messages that can be configured by either the sender or the receiver.
      • Will take the one with the shortest duration to be used for the message's self-destruct timer.
    disappearing messages demo
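The "shortest duration wins" rule for disappearing messages can be sketched as below. The function names are hypothetical, and representing a disabled timer as `None` is an assumption of this example.

```python
from datetime import datetime, timedelta
from typing import Optional


def effective_timer(sender: Optional[timedelta],
                    receiver: Optional[timedelta]) -> Optional[timedelta]:
    """Pick the shorter of the two configured timers; None means disabled."""
    timers = [t for t in (sender, receiver) if t is not None]
    return min(timers) if timers else None


def expires_at(sent_at: datetime, sender: Optional[timedelta],
               receiver: Optional[timedelta]) -> Optional[datetime]:
    """Return when the message self-destructs, or None if it never does."""
    timer = effective_timer(sender, receiver)
    return sent_at + timer if timer is not None else None
```

Taking the minimum means either party can tighten the retention of a conversation, but neither can extend it beyond what the other has chosen.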
  • End-to-end integrity

    • Chat messages are verified with CRC32C and MD5 checksums, which were chosen with performance in mind.
      • SHA256 is not needed because the messages are already sent via WebSocket Secure (WSS), which encrypts the data and ensures its integrity in transit.
    • Web application/API server to GCS server integrity checks are done by sending the file's MD5 and CRC32C checksums to GCS for file integrity validations on Google's end.
    gcs integrity check python codes
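As a rough illustration of the checksums involved, the sketch below computes a CRC32C and an MD5 for a payload and Base64-encodes them, which is the form GCS uses for these hashes (the CRC32C as 4 big-endian bytes). The pure-Python CRC32C is written out only to keep the example dependency-free; a real deployment would more likely use an optimised library, and `gcs_checksums` is a hypothetical helper name.

```python
import base64
import hashlib
import struct
from typing import Dict, List


def _make_crc32c_table() -> List[int]:
    """Build the lookup table for the reflected Castagnoli polynomial."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c(data: bytes) -> int:
    """Compute the CRC32C (Castagnoli) checksum of data."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def gcs_checksums(data: bytes) -> Dict[str, str]:
    """Return Base64-encoded CRC32C and MD5 values for an upload."""
    return {
        "crc32c": base64.b64encode(struct.pack(">I", crc32c(data))).decode(),
        "md5": base64.b64encode(hashlib.md5(data).digest()).decode(),
    }
```

When these values accompany an upload, the storage service can recompute them on its side and reject the object if they do not match.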
  • Pagination

    • Implemented in the chat to prevent:
      • Too many chat messages from overloading the memory of the server or the client's browser, which could cause either of them to crash.
      • Overloading or getting rate limited by the GCP KMS API.
    • Search results are also paginated to prevent:
      • Too many search results from overloading the memory of the server or the client's browser, which could cause either of them to crash.
    chat pagination demo
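One common way to paginate a chat history is with a cursor: the client asks for the messages older than the oldest one it already has. The sketch below shows the idea on an in-memory list; it is an assumption-laden illustration, not the project's query logic. The `page_before` helper, the integer `id` field, and the default page size of 20 are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def page_before(messages: Sequence[dict], before_id: Optional[int],
                limit: int = 20) -> Tuple[List[dict], Optional[int]]:
    """Return up to `limit` messages older than `before_id`, plus the next cursor.

    `messages` must be sorted by ascending id. A cursor of None means
    "start from the newest message" on input and "no more pages" on output.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    older = [m for m in messages if before_id is None or m["id"] < before_id]
    page = older[-limit:]
    next_cursor = page[0]["id"] if len(older) > limit else None
    return page, next_cursor
```

Since only one page of messages is loaded per request, only that page needs to be decrypted, which is how pagination also limits the number of KMS calls.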
  • Image Validation & Compression

    • Helps to reduce the size of the image files uploaded by the user.
    • Also checks for decompression bomb attacks, which can cause the client's browser to crash.
      • Uses Python's Pillow library to compress the images and to perform the decompression bomb checks.
      • The attack is carried out by uploading an image with a very large resolution (e.g. 10,000 x 10,000 pixels), which can cause severe lag or even crash the client's browser.
      • For posts, the large resolution image will be blocked from being uploaded and the user will be notified.
      • For chat messages, the large resolution image will be treated as a normal file and will not be displayed as an image in the user's browser.
        • This approach is safer, as users can still download the image and view it directly on their device without crashing their browser.
    image payload (20098 x 13280) chat message image validation demo post image validation demo
    • Original images can be viewed by clicking on the "View Original" button or removing the ?compress=true query parameter from the image URL.
    original image demo compressed image demo
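The core of a decompression bomb check is comparing the declared pixel count against a limit before decoding anything. The project uses Pillow for this; the sketch below instead reads the dimensions straight from a PNG header with the standard library, purely to show the idea without a dependency. It handles PNG only, and `MAX_PIXELS` is a hypothetical limit chosen for illustration.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_PIXELS = 50_000_000  # hypothetical limit for this example


def png_dimensions(header: bytes) -> Tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def is_oversized(header: bytes, max_pixels: int = MAX_PIXELS) -> bool:
    """Flag images whose declared resolution exceeds the pixel budget."""
    width, height = png_dimensions(header)
    return width * height > max_pixels
```

Only the first 24 bytes of the file are needed, so the check runs before any pixel data is decompressed.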

Integration

  • Helped to deploy the Cloud Functions developed by my group members to GCP.
  • Integrated Eden's PassportEye OCR with the file uploading logic.
  • Integrated Eden's data security enhancements
    • Separate database servers for user-related data and admin-related data
    • RBAC configurations for the web application endpoints
  • Helped to clean up the code and fix bugs in the web application and API.
  • Helped to develop asynchronous-capable Python code for the GCP APIs where my group members' features needed it, such as the GCP Web Risk API for Calvin's URL analysis.

Research

  • MongoDB Configurations for Data Security
    • MongoDB sharding which allows the database to scale horizontally by splitting the data into chunks and distributing them across multiple servers. This helps to provide higher availability and scalability.
    • Encryption at rest for the MongoDB database.
    • Automatic backups for the MongoDB database (which is also encrypted at rest).
    • Multiple nodes for the MongoDB database for automatic failover to provide higher availability.