Commit 4882956

Streamlit poc
1 parent e0cf8df commit 4882956

5 files changed: +648 -93 lines changed

README.md

Lines changed: 20 additions & 80 deletions
@@ -1,97 +1,37 @@
 # Sequence Extractor

-Implemented as a simple UI application built with Python and [Tkinter](https://docs.python.org/3/library/tkinter.html#module-tkinter).
+A Streamlit application for processing Excel files.

-## Requirements
+## Description

-- Python 3.13 or later
-- [uv](https://github.com/astral-sh/uv) - Fast Python package installer and resolver
-
-## Installing uv
-
-Before you can use this project, you need to install uv on your Mac:
-
-```bash
-# Using Homebrew (recommended)
-brew install uv
-
-# Or using pip
-pip install uv
-```
-
-For more installation options, see the [uv installation guide](https://github.com/astral-sh/uv#installation).
+This application allows users to:
+- Upload a single "RS totales" Excel file
+- Upload multiple "Variant tables" Excel files
+- Process these files and view statistics on the data

 ## Installation

-1. Clone this repository:
-```bash
-git clone <repository-url>
-cd sequence_extractor
-```
-
-2. Create a virtual environment using uv:
-```bash
-uv venv
-```
-
-3. Activate the virtual environment:
-```bash
-source .venv/bin/activate
-```
-
-4. Install dependencies:
-```bash
-uv pip install -e .
-```
-
-## Running the Application
+### Requirements
+- Python 3.13 or higher

-After installation, you can run the application:
+### Setup with uv

+Declare new dependencies:
 ```bash
-uv run main.py
+uv add streamlit pandas openpyxl
+uv add --dev pytest
 ```

-## Managing Dependencies
+## Usage

-### Adding New Dependencies
-
-To add a new dependency:
-
-1. Add it to the `dependencies` list in `pyproject.toml`:
-```toml
-dependencies = [
-"package-name>=1.0.0",
-]
-```
-
-2. Install the updated dependencies:
-```bash
-uv pip install -e .
-```
-
-### Adding Development Dependencies
-
-For development-only dependencies:
-
-1. Add a `[project.optional-dependencies]` section to `pyproject.toml`:
-```toml
-[project.optional-dependencies]
-dev = [
-"pytest>=7.0.0",
-"black>=23.0.0",
-]
-```
-
-2. Install development dependencies:
-```bash
-uv pip install -e ".[dev]"
-```
-
-## Running Tests
+Run the application:
+```bash
+uv run -m streamlit run main.py
+```

-If you've added pytest as a development dependency:
+## Development

+Run tests:
 ```bash
-uv run pytest
+pytest
 ```

main.py

Lines changed: 46 additions & 12 deletions
@@ -1,16 +1,50 @@
-from tkinter import *
-from tkinter import ttk
+import streamlit as st
+import pandas as pd
+import time

-# def main():
-# print("Hello from sequence-extractor!")
+def read_excel_file(file):
+    """Read an Excel file and return the DataFrame."""
+    return pd.read_excel(file)

+def count_total_rows(dataframes):
+    """Count total rows across all dataframes."""
+    return sum(len(df) for df in dataframes if df is not None)

-# if __name__ == "__main__":
-# main()
+def main():
+    st.title("Excel File Sequence Extractor")

-root = Tk()
-frm = ttk.Frame(root, padding=10)
-frm.grid()
-ttk.Label(frm, text="Hello World!").grid(column=0, row=0)
-ttk.Button(frm, text="Quit", command=root.destroy).grid(column=1, row=0)
-root.mainloop()
+    # First input for single file
+    st.subheader("Input Files")
+    rs_totales_file = st.file_uploader("RS totales", type=["xlsx"])
+
+    # Second input for multiple files
+    variant_tables_files = st.file_uploader("Variant tables", type=["xlsx"], accept_multiple_files=True)
+
+    # Submit button
+    if st.button("Submit"):
+        if rs_totales_file is None or not variant_tables_files:
+            st.error("Please upload all required files.")
+        else:
+            # Show spinner while processing
+            with st.spinner("Processing files..."):
+                # Read the RS totales file
+                rs_df = read_excel_file(rs_totales_file)
+
+                # Read all variant tables files
+                variant_dfs = []
+                for file in variant_tables_files:
+                    variant_dfs.append(read_excel_file(file))
+
+                # Combine all dataframes for counting
+                all_dfs = [rs_df] + variant_dfs
+
+                # Simulate processing time
+                time.sleep(1)
+
+            # Display results
+            total_rows = count_total_rows(all_dfs)
+            st.success("Processing complete!")
+            st.metric("Total rows parsed across all files", total_rows)
+
+if __name__ == "__main__":
+    main()
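
The submit check in `main()` reports one generic error whichever input is missing. A minimal sketch of moving that check into a pure helper, so it can be unit-tested without Streamlit, might look like the following. `missing_inputs` is a hypothetical name for illustration and is not part of this commit; the labels are taken from the two `st.file_uploader` calls above.

```python
def missing_inputs(rs_totales_file, variant_tables_files):
    """Return the labels of required inputs that have not been provided.

    Hypothetical helper (not in the commit). It mirrors the condition
    `rs_totales_file is None or not variant_tables_files` from main().
    """
    missing = []
    if rs_totales_file is None:
        missing.append("RS totales")
    # An empty list means no variant tables were uploaded.
    if not variant_tables_files:
        missing.append("Variant tables")
    return missing
```

In `main()`, a non-empty result could feed the existing `st.error` call, for example `st.error("Please upload: " + ", ".join(missing))`, so the user sees which uploader still needs a file.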

pyproject.toml

Lines changed: 10 additions & 1 deletion
@@ -4,7 +4,11 @@ version = "0.1.0"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.13"
-dependencies = []
+dependencies = [
+    "openpyxl>=3.1.5",
+    "pandas>=2.2.3",
+    "streamlit>=1.43.2",
+]

 [project.optional-dependencies]
 dev = [
@@ -13,3 +17,8 @@ dev = [
     "black>=23.9.1",
     "flake8>=6.1.0",
 ]
+
+[dependency-groups]
+dev = [
+    "pytest>=8.3.5",
+]

tests/test_main.py

Lines changed: 11 additions & 0 deletions
@@ -1,6 +1,17 @@
 import pytest
+import pandas as pd
+from main import read_excel_file, count_total_rows

 def test_basic_sum():
     """A simple placeholder test that demonstrates how to write a test."""
     # This is just a placeholder - replace with actual tests
     assert 1 + 1 == 2
+
+def test_count_total_rows():
+    """Test the function that counts total rows across dataframes."""
+    df1 = pd.DataFrame({'A': [1, 2, 3]})
+    df2 = pd.DataFrame({'B': [1, 2, 3, 4, 5]})
+    df3 = pd.DataFrame({'C': [1, 2]})
+
+    total = count_total_rows([df1, df2, df3])
+    assert total == 10
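
The test above does not exercise the `if df is not None` filter in `count_total_rows`, nor the empty-input case. A minimal sketch of two extra tests follows. To stay dependency-free, the function body is copied from `main.py` and plain lists stand in for DataFrames; that substitution is an assumption for illustration and relies only on `len()` giving a row count in both cases. The test names are hypothetical.

```python
def count_total_rows(dataframes):
    """Count total rows across all dataframes (body copied from main.py)."""
    return sum(len(df) for df in dataframes if df is not None)


def test_count_total_rows_skips_none():
    # Plain lists stand in for DataFrames here (illustrative assumption).
    tables = [[1, 2, 3], None, [4, 5]]
    assert count_total_rows(tables) == 5


def test_count_total_rows_empty_input():
    # No inputs at all should yield zero rather than raise.
    assert count_total_rows([]) == 0
```

In the real test module, the stand-in lists would presumably be replaced with `pd.DataFrame` objects and the function imported from `main` rather than redefined.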
