2024 SDSC Research Experience for High School Students Project

Project Title: Developing Interactive Jupyter Notebooks to run on the “AI institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment” (ICICLE) project.

Part of the 2024 Research Experience for High School Students Program: https://education.sdsc.edu/studenttech/rehs/


Participants

  • Project Lead: Mary Thomas, Ph.D., SDSC HPC/CI Training lead, and Computational Data Scientist in the Data-Enabled Scientific Computing Division.

  • REHS Students:

    • ParJupNB-Expanse Collection, David Bond
    • ParJupNB-AI, Anusha Khobare
    • ParJupNB-AI, Ronit Thomas
    • ParJupNB-AI, Ryan Leschensky
    • ICICLE Training Catalog, Soham Kamat
  • SDSC Collaborators/Mentors:

    • Martin Kandes, Ph.D. (Computational Data Scientist, SDSC)
    • Peter Rose, Ph.D. (Director of the Structural Bioinformatics Laboratory, SDSC)
    • Paul Rodriguez, Ph.D. (Computational Data Scientist, SDSC)
    • Mahidhar Tatineni, Ph.D. (User Support Group Lead, SDSC)
  • External Collaborators/Mentors:

    • Christian Garcia, Engineering Scientist Associate (Texas Advanced Computing Center)
    • Carlos Guzman, ICICLE project, Ohio State University
    • David Lee, Senior, XXX High School
    • Sahil Samar, Undergraduate, Georgia Tech
    • Joe Stubbs, Ph.D. (Manager, Cloud & Interactive Computing, Texas Advanced Computing Center)



Project Description

Overview

This project involves running Jupyter Notebooks on the NSF-funded Expanse high-performance computing (HPC) system [1] and testing software to be used on the NSF-funded ICICLE AI project [3]. Expanse is SDSC's newest supercomputer. The result of a $10M National Science Foundation (NSF) award, Expanse delivers over 5.2 peak petaflops of computing power to scientists, engineers, and researchers all around the world [2]. Expanse provides three kinds of HPC/CI resources: general computing nodes, NVIDIA GPU nodes, and the petascale Lustre filesystem. Thousands of users have accessed these HPC resources through traditional command-line runs and batch queuing systems.

The NSF-funded AI institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE) [3] will build the next generation of cyberinfrastructure to make Artificial Intelligence (AI) more accessible to everyone and drive its further democratization in the larger society. ICICLE will develop intelligent cyberinfrastructure with transparent and high-performance execution on diverse and heterogeneous environments. It will advance plug-and-play AI that is easy to use by scientists across a wide range of domains.

Scientists using HPC systems work with interactive tools such as Jupyter notebooks to implement computational and data analysis functions and workflows [5]. Jupyter notebooks are web applications that allow you to create and share documents that contain live code, equations, visualizations, and narrative text. These notebooks are part of a general trend in research computing away from command-line interfaces and towards browser-based and graphical interfaces. Jupyter notebooks are especially useful for interactive work: developing, testing, and exploring data sets, or serving as an instructional resource [6]. Users working interactively expect a timely response, both at initial application startup and during the course of a session.

The goals of this research project will be to:

  1. Learn the basics of High Performance Computing on Expanse, using Jupyter notebooks
  2. Learn the basics of AI models running in Jupyter notebooks on Expanse
  3. Contribute Jupyter Notebooks to the ICICLE Model Commons Collection

The research components will be to:

  1. Contribute to the body of knowledge needed for hosting live, dynamic, interactive services that interface to HPC systems
  2. Develop interactive AI notebooks that run on Expanse and potentially the ICICLE system
  3. Optionally, publish results on an open share site such as arXiv.org (see REHS22 publications [7])

Current project details and examples from previous projects can be found in [8], [9], and [10].

Description of the plan to integrate the student researcher into the group's activities:

Before the REHS program begins, the selected student team members will be given recommended programming exercises to help build the skills they will need to complete this project. Dr. Thomas and other mentors will be available via email to guide the students on how to approach these exercises. During the first week of the REHS program, the student team will work closely with Dr. Thomas and other mentors to build a research plan that clearly defines the milestones needed to meet the project's goals. In addition, the students will have the opportunity to interact with other REHS students and with undergraduate or graduate interns who will be working on similar projects.

List of student prerequisites for the research project:

  • Applicants must have a demonstrated interest in computer science and mathematics.
  • Previous experience with Jupyter Notebooks.
  • Some exposure to Artificial Intelligence (AI) methods.
  • Programming experience in Python (preferred); exposure to the Linux/Unix operating system.

Publications and Presentations

  • TBD



References:

  1. https://www.sdsc.edu/News%20Items/PR20190716_Expanse.html
  2. https://www.sdsc.edu/services/hpc/expanse/
  3. The Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE) Project: https://icicle.osu.edu/
  4. https://www.tacc.utexas.edu/home
  5. The Jupyter Notebook Project Website, https://jupyter.org/
  6. Zonca, A. and R.S. Sinkovits, Deploying Jupyter Notebooks at scale on XSEDE for Science Gateways and workshops. Available at: https://zonca.github.io/docs/pearc18_slides_zonca_sinkovits.pdf
  7. Samar, S., Chen, M., Garcia, C., Karpinski, J., Lange, M., Ray, M., … Thomas, M. (2023). Development of Authenticated Clients and Applications for ICICLE CI Services. ArXiv, 1–8. Retrieved from https://arxiv.org/abs/2304.11086
  8. https://github.com/sdsc-hpc-students/REHS2024
  9. https://github.com/sdsc-hpc-students/REHS2023
  10. https://github.com/sdsc-hpc-students/REHS2022


