Samsung-PRISM-Audio-Captioning-Engine

A demonstration of an Audio Captioning Engine for Samsung PRISM worklet VI70, designed to collect audio from multiple sources.

Source code for our paper titled "Web Framework for Enhancing Automated Audio Captioning Performance for Domestic Environment".

Tech Stack

Client: HTML, CSS, JS

Server: Node.js, Express.js

Other libraries: multiparty, multer, object-to-csv, browserify

Environment Variables

To run this project, add the following environment variable to your .env file:

`PORT` (default: 3000)
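As an illustration of the default, the sketch below shows one way a Node.js server could resolve its port, falling back to 3000 when `PORT` is unset or invalid. The helper name `resolvePort` is hypothetical and is not taken from server.js; how the project actually loads the .env file is not described here.

```javascript
// Hypothetical helper: resolve the listening port from an environment object,
// falling back to the documented default of 3000.
function resolvePort(env = process.env) {
  const parsed = Number.parseInt(env.PORT, 10);
  // Accept only valid TCP port numbers; anything else uses the default.
  const isValid = Number.isInteger(parsed) && parsed > 0 && parsed < 65536;
  return isValid ? parsed : 3000;
}
```

A server could then pass `resolvePort()` to its listen call instead of hard-coding the port.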

Installation

Install dependencies with npm

  npm install

Build the libraries with the build script

  npm run build

Create a folder called uploads in the project's root folder.
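If you prefer not to create the folder by hand, a small script along these lines could do it. This is only a sketch: `ensureUploadsDir` is a hypothetical helper, not part of the repository, and it assumes you pass the project root yourself.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: create the uploads folder under the given project root
// if it does not exist yet, and return its absolute path.
function ensureUploadsDir(projectRoot) {
  const uploadsDir = path.resolve(projectRoot, 'uploads');
  // recursive: true makes this a no-op when the folder already exists.
  fs.mkdirSync(uploadsDir, { recursive: true });
  return uploadsDir;
}
```

For example, `ensureUploadsDir(__dirname)` from a script in the project root would create the folder next to server.js.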

Deployment

To deploy this project run

  npm start

API Reference

Login

  GET /

Returns the landing page for the web app

  POST /file_upload

| Parameter | Type | Description |
| :-------- | :--- | :---------- |
| `file` | `multipart/form-data` | (optional) encoded data of file upload |
| `cap1` | `multipart/form-data` | (optional) user caption |
| `cap2` | `multipart/form-data` | (optional) user caption |
| `cap3` | `multipart/form-data` | (optional) user caption |
| `cap4` | `multipart/form-data` | (optional) user caption |
| `cap5` | `multipart/form-data` | (optional) user caption |
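To make the request shape concrete, here is a minimal sketch of how a client might assemble the multipart body for `POST /file_upload`. The field names `file` and `cap1` to `cap5` come from the parameter table; the helper name `buildUploadForm`, the sample filename, and the localhost URL in the usage comment are assumptions. It relies on the global `FormData` and `Blob`, which are available in browsers and in Node.js 18 or later.

```javascript
// Hypothetical helper: build the multipart body for POST /file_upload.
// Field names (file, cap1..cap5) follow the parameter table; the rest is
// an assumption made for illustration.
function buildUploadForm(audioBlob, filename, captions = []) {
  if (captions.length > 5) {
    throw new RangeError('at most 5 captions are supported');
  }
  const form = new FormData();
  if (audioBlob) {
    form.append('file', audioBlob, filename); // the file is optional
  }
  captions.forEach((caption, index) => {
    form.append(`cap${index + 1}`, caption); // cap1, cap2, ...
  });
  return form;
}

// Usage sketch (not executed here); assumes the server listens on port 3000:
// const blob = new Blob([wavBytes], { type: 'audio/wav' });
// const form = buildUploadForm(blob, 'clip.wav', ['a dog barks']);
// await fetch('http://localhost:3000/file_upload', { method: 'POST', body: form });
```

Leaving the `Content-Type` header unset lets `fetch` add the multipart boundary itself.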

Appendix

The project meets the following requirements:

Data Collection Page

Design and development of a web app that can be used for labelled data collection. ✅
Users can record or upload audio and play it back on the web page. ✅
Users can write 1-5 captions for the audio and save them. ✅
Similar to the Clotho dataset, we restrict storage to only lossless file types. ✅
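The lossless-only restriction could be enforced with an extension check like the sketch below. The exact list of accepted types is not stated in this README, so the `.wav` and `.flac` entries and the helper name `isLosslessAudio` are assumptions for illustration only.

```javascript
const path = require('path');

// Assumed set of lossless extensions; the project's actual list is not stated.
const LOSSLESS_EXTENSIONS = new Set(['.wav', '.flac']);

// Hypothetical helper: true when the filename has one of the assumed
// lossless extensions (case-insensitive), false otherwise.
function isLosslessAudio(filename) {
  return LOSSLESS_EXTENSIONS.has(path.extname(filename).toLowerCase());
}
```

An extension check only inspects the name; a stricter server would also inspect the file contents.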

Demo Page

On this page, users can record or upload audio, and a pre-trained model generates a predicted caption. (Model training in progress) ✅
The predicted caption can then be shown on the web page. (In progress) ✅

Notes

The path separator on Windows is \ (written as \\ inside JavaScript string literals) and on Linux is /
The uploaded file details can be seen on the console
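Regarding the path separator note, Node's built-in `path` module can build paths without hard-coding either separator. The sketch below is one way to do that; `uploadPath` is a hypothetical helper and is not taken from server.js.

```javascript
const path = require('path');

// Hypothetical helper: build the storage path for an uploaded file without
// hard-coding a separator. path.join uses "\" on Windows and "/" on Linux.
function uploadPath(uploadsDir, filename) {
  // path.basename drops any directory part included in the filename.
  return path.join(uploadsDir, path.basename(filename));
}
```

The platform-specific variants `path.win32.join` and `path.posix.join` show the difference directly, regardless of the operating system the code runs on.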

Authors

@Arushi
Navaneeth B R
