Commit 05b8e30

Update DemoPage.jsx
1 parent 66e70ad commit 05b8e30

1 file changed: +2 -2 lines changed

src/components/DemoPage.jsx

Lines changed: 2 additions & 2 deletions
@@ -101,7 +101,7 @@ if __name__ == "__main__":
 </div>

 <div style={{ display: 'flex', justifyContent: 'center', gap: '15px', marginBottom: '20px' }}>
-<a href="https://arxiv.org" target="_blank" rel="noopener noreferrer" className="start-chat-button" style={{ padding: '12px 12px', fontSize: '16px', display: 'flex', alignItems: 'center' }}>
+<a href="https://arxiv.org/abs/2505.10831" target="_blank" rel="noopener noreferrer" className="start-chat-button" style={{ padding: '12px 12px', fontSize: '16px', display: 'flex', alignItems: 'center' }}>
 <FaFileAlt style={{ marginRight: '0.5rem', fontSize: '18px' }} /> Paper
 </a>

@@ -376,7 +376,7 @@ if __name__ == "__main__":
 Figure: GUMs are generally well calibrated. When errors occur, GUMs are underconfident in their propositions---the actual model's predictions lie above perfect calibration. In the user modeling setting, this is ideal. We should underestimate propositions to avoid eroding user trust.
 </p>

-We then deploy GUMBO with N=5 participants for 5 days, with the system observing the participants' screens. This longitudinal evaluation replicated our results with the underlying GUM. Additionally, participants identified a meaningful number of useful and well-executed suggestions completed by GUMBO. Two of the five participants found particularly high value in the system and asked to continue running it on their computer after the study concluded. Our evaluations also highlight limitations and boundary conditions of GUM and GUMBO, including privacy considerations and overly candid propositions. Please read our <a href="https://arxiv.org" target="_blank" rel="noopener noreferrer" style={{ color: '#ff9d9d' }}>paper</a> for more details!
+We then deploy GUMBO with N=5 participants for 5 days, with the system observing the participants' screens. This longitudinal evaluation replicated our results with the underlying GUM. Additionally, participants identified a meaningful number of useful and well-executed suggestions completed by GUMBO. Two of the five participants found particularly high value in the system and asked to continue running it on their computer after the study concluded. Our evaluations also highlight limitations and boundary conditions of GUM and GUMBO, including privacy considerations and overly candid propositions. Please read our <a href="https://arxiv.org/abs/2505.10831" target="_blank" rel="noopener noreferrer" style={{ color: '#ff9d9d' }}>paper</a> for more details!

 </p>
