[iOS] Dictation #2119

Open
2 tasks done
raineorshine opened this issue Jul 7, 2024 · 0 comments
Labels
feature New feature or request

raineorshine commented Jul 7, 2024

Add dictation support leveraging iOS’s built-in dictation feature.

(Some support was previously added in f3560eb but is now broken. An alternative design is now proposed.)

1. Demonstrate feasibility

UPDATE: I confirmed that it's possible to modify innerHTML and switch focus without interrupting dictation mode.

Source: https://codesandbox.io/p/sandbox/dftgtf
Direct link: https://dftgtf.csb.app

  • Write a small demo that proves that the front of a contenteditable can be edited during dictation without interrupting it. The caret will need to be moved back to the end of the contenteditable so that dictation continues there. This demonstrates that 2. Basic Functionality and 3. Newline/New paragraph are feasible.

  • Write a small demo that proves that the caret can be moved to a different contenteditable without interrupting dictation. This demonstrates that 4. Commands is feasible.

2. Basic Functionality

UPDATE: After altering the innerHTML during dictation, new words stop being rendered until dictation mode ends. The original demo did not alter the innerHTML enough to trigger this behavior. Continuous dictation in the same contenteditable while activating New Thought does not appear to be possible. Therefore, it should be treated like any other command, and will result in a small pause as the browser selection moves to the new thought in the usual way.

Steps to Reproduce

  1. Create a new thought.
  2. Enter Dictation mode by pressing the microphone icon in the bottom right corner of the virtual keyboard.
  3. Say "Hello"
  4. Say "New Thought"
  5. Say "Goodbye"
  6. Close the keyboard.

Current Behavior

Dictated text is added literally to a single thought.

- Hello New Thought Goodbye

Expected Behavior

When "New Thought" is dictated, create a new thought above the current thought as follows. This keeps the caret on the cursor thought, which allows for continuous dictation. If New Thought were executed in the usual way, the caret would move to the new thought.

  1. Create a new thought above without moving the cursor or caret. Its value should be all the text up to, but excluding, “New Thought.”
  2. Delete the text up to, and including, “New Thought” in the cursor thought.
  3. If there is any text after “New Thought” (which may be possible when speaking quickly), preserve it so that dictation can continue.

- Hello
- Goodbye
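
The three steps above amount to splitting the dictated text around the spoken phrase. Here is a minimal sketch, assuming the dictated text is available as a plain string. `splitAtPhrase` and `PhraseSplit` are hypothetical names for illustration, not part of the codebase.

```typescript
/** Result of splitting dictated text around a spoken command phrase. */
interface PhraseSplit {
  /** Text before the phrase; becomes the value of the new thought above. */
  before: string
  /** Text after the phrase; stays in the cursor thought so dictation can continue. */
  after: string
}

/** Escapes characters that have a special meaning in a regular expression. */
const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

/**
 * Splits text at the first case-insensitive occurrence of phrase.
 * Returns null if the phrase does not occur.
 */
const splitAtPhrase = (text: string, phrase: string): PhraseSplit | null => {
  const match = new RegExp(escapeRegExp(phrase), 'i').exec(text)
  if (!match) return null
  return {
    before: text.slice(0, match.index).trim(),
    after: text.slice(match.index + match[0].length).trim(),
  }
}

const split = splitAtPhrase('Hello New Thought Goodbye', 'New Thought')
// split is { before: 'Hello', after: 'Goodbye' }
```

This sketch does not check word boundaries, so a phrase embedded in a longer word would also match. Whether that matters depends on how the dictated text is normalized.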

3. Newline/New paragraph

iOS has a number of built-in dictation commands.

Detect "newline" and "new paragraph" (from the text that is generated) and replace them with the following commands:

“newline” → New Thought
“new paragraph” → New Subthought
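
The detection could look roughly like the sketch below. It rests on an unverified assumption: that iOS renders "newline" as a single `"\n"` and "new paragraph" as `"\n\n"` in the generated text. Inside a contenteditable the actual output may differ (for example, `<br>` or `<div>` elements), so this needs checking on a device. `findLineBreakCommand` and `LineBreakMatch` are hypothetical names.

```typescript
type LineBreakCommand = 'newThought' | 'newSubthought'

interface LineBreakMatch {
  command: LineBreakCommand
  /** Position of the line break in the dictated text. */
  index: number
  /** Number of characters to remove from the text. */
  length: number
}

/**
 * Finds the first dictated line break and maps it to a command.
 * ASSUMPTION: "newline" produces "\n" and "new paragraph" produces "\n\n".
 */
const findLineBreakCommand = (text: string): LineBreakMatch | null => {
  const index = text.indexOf('\n')
  if (index === -1) return null
  // Check for the longer marker first so "\n\n" is not read as two newlines.
  return text.startsWith('\n\n', index)
    ? { command: 'newSubthought', index, length: 2 }
    : { command: 'newThought', index, length: 1 }
}
```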

4. Commands

Generalize dictation to any command.

Add a dictationPhrase?: string | string[] property to Shortcut. It represents the dictated phrase that will trigger the command. An array of strings specifies multiple utterances that can trigger the command (e.g. “New Subthought” or “New Sub Thought”). Matching should be case-insensitive.

Add a dictationExec?: ... property that matches the exec type. This will allow custom execution behavior for a given command (for now, just New Thought). If dictationExec is not defined, default to exec.
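
A rough sketch of the two properties and the fallback follows. The real `exec` type is elided above, so `Exec` is a hypothetical stand-in. The `id` field and the helpers `matchesDictation` and `execDictation` are also assumed for illustration only.

```typescript
// Hypothetical stand-in for the app's real exec type, which the proposal elides ("...").
type Exec = () => void

// Only the fields relevant to dictation; `id` is assumed for illustration.
interface Shortcut {
  id: string
  exec: Exec
  /** Dictated phrase(s) that trigger the command. */
  dictationPhrase?: string | string[]
  /** Custom behavior when triggered by dictation. Defaults to exec. */
  dictationExec?: Exec
}

/** Case-insensitive comparison of a spoken phrase against a shortcut's phrase(s). */
const matchesDictation = (shortcut: Shortcut, spoken: string): boolean => {
  const phrases = ([] as string[]).concat(shortcut.dictationPhrase ?? [])
  const normalized = spoken.trim().toLowerCase()
  return phrases.some(phrase => phrase.toLowerCase() === normalized)
}

/** Runs dictationExec if defined, otherwise falls back to exec. */
const execDictation = (shortcut: Shortcut): void => {
  const run = shortcut.dictationExec ?? shortcut.exec
  run()
}
```

A shortcut without a dictationPhrase never matches, so commands opt in simply by adding the property.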

Enable default dictation for the following commands. They should not require any custom logic, just the addition of the dictationPhrase property:

[label] → [dictationPhrase]

  • archive → "Archive Thought"
  • bindContext → "Bind Context"
  • bold → "Format Bold"
  • bumpThoughtDown → "Bump Thought Down"
  • clearThought → "Clear Thought"
  • collapseContext → "Collapse Context"
  • commandPalette
  • copyCursor → "Copy Cursor"
  • cursorBack → "Cursor Back"
  • cursorDown → "Cursor Down"
  • cursorForward → "Cursor Forward"
  • cursorNext → "Cursor Next"
  • cursorPrev → "Cursor Previous"
  • cursorUp → "Cursor Up"
  • customizeToolbar
  • delete → "Remove Thought"
  • deleteEmptyThoughtOrOutdent
  • devices
  • exportContext → "Export Thought"
  • extractThought → "Extract Thought"
  • favorite
  • fontSizeDown → "Font Size Down"
  • fontSizeUp → "Font Size Up"
  • generateThought → "Generate Thought"
  • headings
  • help
  • home
  • indent → "Indent Thought"
  • italic → "Format Italics"
  • join → "Join Thought"
  • jumpBack → "Cursor Jump Back"
  • jumpForward → "Cursor Jump Forward"
  • moveCursorBackward → "Move Cursor Backward"
  • moveCursorForward → "Move Cursor Forward"
  • moveThoughtDown → "Move Thought Down"
  • moveThoughtUp → "Move Thought Up"
  • newGrandChild → "New Grandchild Thought"
  • newSubthought → ["New Subthought", "New Sub Thought"]
  • newSubthoughtTop → ["New Top Subthought", "New Top Sub Thought"]
  • newThought → "New Thought"
  • newThoughtAbove → "New Above Thought"
  • newUncle → "New Uncle"
  • note → "Thought Note"
  • outdent → "Outdent Thought"
  • pin → "Pin Thought"
  • pinAll → "Pin All Thoughts"
  • proseView
  • redo → "Redo Command"
  • search
  • selectAll → "Select All Thoughts"
  • settings
  • splitSentences → "Split Thought Sentences"
  • strikethrough → "Format Strikethrough"
  • subcategorizeAll → ["Subcategorize All Thoughts", "Sub categorize All Thoughts"]
  • subcategorizeOne → ["Subcategorize One Thought", "Sub categorize One Thought"]
  • swapNote → "Swap Thought Note"
  • swapParent → "Swap Thought Parent"
  • textColor
  • toggleContextView → "Context View"
  • toggleDone → "Thought Done"
  • toggleHiddenThoughts
  • toggleSidebar
  • toggleSort → "Sort Thoughts"
  • toggleSplitView
  • toggleTableView → "Table View"
  • underline → "Format Underline"
  • undo → "Undo Command"
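
Some phrases in this list end with another command's phrase (e.g. "Swap Thought Note" ends with "Thought Note"), so a matcher may need to prefer the longest phrase. The sketch below assumes that commands are detected at the end of the dictated text, which the proposal does not specify. `dictationPhrases` is a small excerpt of the list, and `findTrailingCommand` is a hypothetical helper.

```typescript
/** Maps command ids to dictation phrases; a small excerpt of the list above. */
const dictationPhrases: Record<string, string | string[]> = {
  note: 'Thought Note',
  swapNote: 'Swap Thought Note',
  newSubthought: ['New Subthought', 'New Sub Thought'],
  undo: 'Undo Command',
}

/**
 * Returns the id of the command whose phrase ends the dictated text,
 * preferring the longest matching phrase. Returns null if none match.
 */
const findTrailingCommand = (text: string): string | null => {
  const normalized = text.trim().toLowerCase()
  let bestId: string | null = null
  let bestLength = 0
  for (const [id, value] of Object.entries(dictationPhrases)) {
    for (const phrase of ([] as string[]).concat(value)) {
      if (phrase.length > bestLength && normalized.endsWith(phrase.toLowerCase())) {
        bestId = id
        bestLength = phrase.length
      }
    }
  }
  return bestId
}
```

With this excerpt, text ending in "swap thought note" resolves to `swapNote` rather than `note`.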
@raineorshine raineorshine added the feature New feature or request label Jul 7, 2024
@raineorshine raineorshine added this to the 📱 iOS milestone Jul 7, 2024
@raineorshine raineorshine changed the title from "[iOS] Speech-to-text v2" to "[iOS] Dictation" Nov 11, 2024
mykhailonehelia added a commit to mykhailonehelia/em that referenced this issue Dec 18, 2024