diff --git a/README.md b/README.md
index da3fec700..c9dbd223c 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,8 @@ See also:
 - https://github.com/OpenAdaptAI/pynput
 - https://github.com/OpenAdaptAI/atomacos
 
-# OpenAdapt: AI-First Process Automation with Large Multimodal Models (LMMs).
+# OpenAdapt: Open Source Generative Process Automation.
+## AI-First Process Automation with Large Multimodal Models (LMMs).
 
 **OpenAdapt** is the **open** source software **adapt**er between Large Multimodal Models (LMMs) and traditional desktop and web Graphical User Interfaces (GUIs).
 
@@ -35,9 +36,8 @@ with the power of Large Multimodal Modals (LMMs) by:
 - Recording screenshots and associated user input
 - Aggregating and visualizing user input and recordings for development
 - Converting screenshots and user input into tokenized format
-- Generating synthetic input via transformer model completions
-- Generating task trees by analyzing recordings (work-in-progress)
-- Replaying synthetic input to complete tasks (work-in-progress)
+- Generating and replaying synthetic input via transformer model completions
+- Generating process graphs by analyzing recording logs (work-in-progress)
 
 The goal is similar to that of
 [Robotic Process Automation](https://en.wikipedia.org/wiki/Robotic_process_automation),
@@ -165,37 +165,6 @@ pointing the cursor and left or right clicking, as described in this
 [open issue](https://github.com/OpenAdaptAI/OpenAdapt/issues/145)
 
 
-### Capturing Browser Events
-
-To capture (record) browser events in Chrome, follow these steps:
-
-1. Go to: [Chrome Extension Page](chrome://extensions/)
-
-2. Enable `Developer mode` (located at the top right):
-
-![image](https://github.com/OpenAdaptAI/OpenAdapt/assets/65433817/c97eb9fb-05d6-465d-85b3-332694556272)
-
-3. Click `Load unpacked` (located at the top left).
-
-![image](https://github.com/OpenAdaptAI/OpenAdapt/assets/65433817/00c8adf5-074a-4655-b132-fd87644007fc)
-
-4. Select the `chrome_extension` directory:
-
-![image](https://github.com/OpenAdaptAI/OpenAdapt/assets/65433817/71610ed3-f8d4-431a-9a22-d901127b7b0c)
-
-5. You should see the following confirmation, indicating that the extension is loaded:
-
-![image](https://github.com/OpenAdaptAI/OpenAdapt/assets/65433817/7ee19da9-37e0-448f-b9ab-08ef99110e85)
-
-6. Set the flag to `true` if it is currently `false`:
-
-![image](https://github.com/user-attachments/assets/8eba24a3-7c68-4deb-8fbe-9d03cece1482)
-
-7. Start recording. Once recording begins, navigate to the Chrome browser, browse some pages, and perform a few clicks. Then, stop the recording and let it complete successfully.
-
-8. After recording, check the `openadapt.db` table `browser_event`. It should contain all your browser activity logs. You can verify the data's correctness using the `sqlite3` CLI or an extension like `SQLite Viewer` in VS Code to open `data/openadapt.db`.
-
-
 ### Visualize
 
 Quickly visualize the latest recording you created by running the following command:
@@ -243,6 +212,7 @@ Other replay strategies include:
 - [`StatefulReplayStrategy`](https://github.com/OpenAdaptAI/OpenAdapt/blob/main/openadapt/strategies/stateful.py): Early proof-of-concept which uses the OpenAI GPT-4 API with prompts constructed via OS-level window data.
 - (*)[`VisualReplayStrategy`](https://github.com/OpenAdaptAI/OpenAdapt/blob/main/openadapt/strategies/visual.py): Uses [Fast Segment Anything Model (FastSAM)](https://github.com/CASIA-IVA-Lab/FastSAM) to segment active window.
 - (*)[`VanillaReplayStrategy`](https://github.com/OpenAdaptAI/OpenAdapt/blob/main/openadapt/strategies/vanilla.py): Assumes the model is capable of directly reasoning on states and actions accurately. With future frontier models, we hope that this script will suddenly work a lot better.
+- (*)[`BrowserReplayStrategy`](https://github.com/OpenAdaptAI/OpenAdapt/blob/main/openadapt/strategies/browser.py): Uses the browser extension to read the visible DOM, and refers to recorded browser events to identify target elements.
 
 
 The (*) prefix indicates strategies which accept an "instructions" parameter that is used to modify the recording, e.g.:
@@ -253,6 +223,22 @@ python -m openadapt.replay VanillaReplayStrategy --instructions "calculate 9-8"
 
 See https://github.com/OpenAdaptAI/OpenAdapt/tree/main/openadapt/strategies for a complete list. More ReplayStrategies coming soon! (see [Contributing](#Contributing)).
 
+### Browser integration
+
+To record browser events in Google Chrome (required by the `BrowserReplayStrategy`), follow these steps:
+
+1. Go to your Chrome extensions page by entering [chrome://extensions](chrome://extensions/) in your address bar.
+
+2. Enable `Developer mode` (located at the top right).
+
+3. Click `Load unpacked` (located at the top left).
+
+4. Select the `chrome_extension` directory in the OpenAdapt repo.
+
+5. Make sure the Chrome extension is enabled (the switch to the right of the OpenAdapt extension widget is turned on).
+
+6. Set the `RECORD_BROWSER_EVENTS` flag to `true` in `openadapt/data/config.json`.
+
 ## Features
 
 ### State-of-the-art GUI understanding via [Segment Anything in High Quality](https://github.com/SysCV/sam-hq):
@@ -306,13 +292,6 @@ We're looking forward to your contributions. Let's build the future 🚀
 
 ## Contributing
 
-### Notable Works-in-progress (incomplete, see https://github.com/OpenAdaptAI/OpenAdapt/pulls and https://github.com/OpenAdaptAI/OpenAdapt/issues/ for more)
-
-- [Video Recording Hardware Acceleration](https://github.com/OpenAdaptAI/OpenAdapt/issues/570) (help wanted)
-- [Audio Narration](https://github.com/OpenAdaptAI/OpenAdapt/pull/346) (help wanted)
-- [Chrome Extension](https://github.com/OpenAdaptAI/OpenAdapt/pull/364) (help wanted)
-- [Gemini Vision](https://github.com/OpenAdaptAI/OpenAdapt/issues/551) (help wanted)
-
 ### Replay Problem Statement
 
 Our goal is to automate the task described and demonstrated in a `Recording`.
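Review note on step 6 of the new "Browser integration" section in README.md: the flag can also be set from a script. The sketch below is a minimal illustration, not OpenAdapt code. It assumes `config.json` is a flat JSON object; the helper name `enable_browser_recording` is hypothetical, and the demo writes to a temporary file rather than to `openadapt/data/config.json`.

```python
import json
import tempfile
from pathlib import Path


def enable_browser_recording(config_path: Path) -> dict:
    """Set RECORD_BROWSER_EVENTS to true, preserving any other keys (hypothetical helper)."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config["RECORD_BROWSER_EVENTS"] = True
    config_path.write_text(json.dumps(config, indent=4))
    return config


# Demonstrated on a throwaway file; in a real checkout the README points at
# openadapt/data/config.json instead.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "config.json"
    demo_path.write_text(json.dumps({"RECORD_BROWSER_EVENTS": False}))
    updated = enable_browser_recording(demo_path)
    reloaded = json.loads(demo_path.read_text())
```

Editing the file by hand, as the README describes, has the same effect.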
diff --git a/chrome_extension/background.js b/chrome_extension/background.js
index a747b8669..24e6203fb 100644
--- a/chrome_extension/background.js
+++ b/chrome_extension/background.js
@@ -1,33 +1,28 @@
 /**
  * @file background.js
- * @description Creates a new background script that listens for messages from the content script
- * and sends them to a WebSocket server.
-*/
+ * @description Background script that maintains the current mode and communicates with content scripts.
+ */
 
 let socket;
+let currentMode = null; // Maintain the current mode here
 let timeOffset = 0; // Global variable to store the time offset
 
-/* 
- * TODO: 
-  * Ideally we read `WS_SERVER_PORT`, `WS_SERVER_ADDRESS` and 
-  * `RECONNECT_TIMEOUT_INTERVAL` from config.py, 
-  * or it gets passed in somehow. 
-*/
+/*
+ * Note: these need to match the corresponding values in config[.defaults].json
+ */
 let RECONNECT_TIMEOUT_INTERVAL = 1000; // ms
 let WS_SERVER_PORT = 8765;
 let WS_SERVER_ADDRESS = "localhost";
 let WS_SERVER_URL = "ws://" + WS_SERVER_ADDRESS + ":" + WS_SERVER_PORT;
 
-
 function socketSend(socket, message) {
   console.log({ message });
   socket.send(JSON.stringify(message));
 }
 
-
 /*
  * Function to connect to the WebSocket server.
-*/
+ */
 function connectWebSocket() {
   socket = new WebSocket(WS_SERVER_URL);
 
@@ -38,11 +33,34 @@ function connectWebSocket() {
   socket.onmessage = function(event) {
     console.log("Message from server:", event.data);
     const message = JSON.parse(event.data);
+
+    // Handle mode messages
+    if (message.type === 'SET_MODE') {
+      currentMode = message.mode; // Update the current mode
+      console.log(`Mode set to: ${currentMode}`);
+
+      // Send the mode to all active tabs
+      chrome.tabs.query(
+        {
+          active: true,
+        },
+        function(tabs) {
+          tabs.forEach(function(tab) {
+            chrome.tabs.sendMessage(tab.id, message, function(response) {
+              if (chrome.runtime.lastError) {
+                console.error("Error sending message to content script in tab " + tab.id, chrome.runtime.lastError.message);
+              } else {
+                console.log("Message sent to content script in tab " + tab.id, response);
+              }
+            });
+          });
+        }
+      );
+    }
   };
 
   socket.onclose = function(event) {
     console.log("WebSocket connection closed", event);
-    // Reconnect after 5 seconds if the connection is lost
     setTimeout(connectWebSocket, RECONNECT_TIMEOUT_INTERVAL);
   };
 
@@ -66,3 +84,32 @@ chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
     sendResponse({ status: "WebSocket connection not open" });
   }
 });
+
+/* Listen for tab activation */
+chrome.tabs.onActivated.addListener((activeInfo) => {
+  // Send current mode to the newly active tab if it's not null
+  if (currentMode) {
+    const message = { type: 'SET_MODE', mode: currentMode };
+    chrome.tabs.sendMessage(activeInfo.tabId, message, function(response) {
+      if (chrome.runtime.lastError) {
+        console.error("Error sending message to content script in tab " + activeInfo.tabId, chrome.runtime.lastError.message);
+      } else {
+        console.log("Message sent to content script in tab " + activeInfo.tabId, response);
+      }
+    });
+  }
+});
+
+/* Listen for tab updates to handle new pages or reloading */
+chrome.tabs.onUpdated.addListener((tabId, changeInfo, tab) => {
+  if (changeInfo.status === 'complete' && currentMode) {
+    const message = { type: 'SET_MODE', mode: currentMode };
+    chrome.tabs.sendMessage(tabId, message, function(response) {
+      if (chrome.runtime.lastError) {
+        console.error("Error sending message to content script in tab " + tabId, chrome.runtime.lastError.message);
+      } else {
+        console.log("Message sent to content script in tab " + tabId, response);
+      }
+    });
+  }
+});
diff --git a/chrome_extension/content.js b/chrome_extension/content.js
index a08daabb8..f9cb163cf 100644
--- a/chrome_extension/content.js
+++ b/chrome_extension/content.js
@@ -1,4 +1,116 @@
 const DEBUG = true;
+
+if (!DEBUG) {
+  console.debug = function() {};
+}
+
+let currentMode = "idle"; // Default mode is 'idle'
+let recordListenersAttached = false; // Track if record listeners are currently attached
+let replayObserversAttached = false; // Track if replay observers are currently attached
+
+// Listen for messages from the background script (which relays them from the Python process)
+chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
+  console.log("Received message:", message);
+  if (message.type === 'SET_MODE') {
+    currentMode = message.mode;
+    console.log(`Mode set to: ${currentMode}`);
+
+    // Attach or detach listeners based on mode
+    if (currentMode === 'record') {
+      if (!recordListenersAttached) attachRecordListeners();
+      if (replayObserversAttached) disconnectReplayObservers(); // Detach replay observers if needed
+    } else if (currentMode === 'replay') {
+      debounceSendVisibleHTML('setmode');
+      if (!replayObserversAttached) attachReplayObservers();
+      if (recordListenersAttached) detachRecordListeners(); // Detach record listeners if needed
+    } else if (currentMode === 'idle') {
+      if (recordListenersAttached) detachRecordListeners();
+      if (replayObserversAttached) disconnectReplayObservers();
+    }
+  }
+});
+
+// Attach event listeners for recording mode
+function attachRecordListeners() {
+  if (!recordListenersAttached) {
+    attachUserEventListeners();
+    attachInstrumentationEventListeners();
+    recordListenersAttached = true;
+  }
+}
+
+// Attach user-generated event listeners
+function attachUserEventListeners() {
+  console.log("attachUserEventListeners()");
+  const eventsToCapture = ['click', 'keydown', 'keyup'];
+
+  eventsToCapture.forEach(eventType => {
+    document.body.addEventListener(eventType, handleUserEvent, true);
+  });
+}
+
+// Attach instrumentation event listeners
+function attachInstrumentationEventListeners() {
+  console.log("attachInstrumentationEventListeners()");
+  const eventsToCapture = ['mousedown', 'mouseup', 'mousemove'];
+
+  eventsToCapture.forEach(eventType => {
+    document.body.addEventListener(eventType, trackMouseEvent, true);
+  });
+}
+
+// Detach all event listeners for recording mode
+function detachRecordListeners() {
+  const eventsToCapture = [
+    'click', 'keydown', 'keyup', 'mousedown', 'mouseup', 'mousemove'
+  ];
+
+  eventsToCapture.forEach(eventType => {
+    document.body.removeEventListener(eventType, handleUserEvent, true);
+    document.body.removeEventListener(eventType, trackMouseEvent, true);
+  });
+
+  recordListenersAttached = false;
+}
+
+// Attach observers for replay mode
+function attachReplayObservers() {
+  if (!replayObserversAttached) {
+    setupIntersectionObserver();
+    setupMutationObserver();
+    setupScrollAndResizeListeners();
+    replayObserversAttached = true;
+  }
+}
+
+// Disconnect observers for replay mode
+function disconnectReplayObservers() {
+  if (window.intersectionObserverInstance) {
+    window.intersectionObserverInstance.disconnect();
+  }
+  if (window.mutationObserverInstance) {
+    window.mutationObserverInstance.disconnect();
+  }
+  window.removeEventListener('scroll', handleScrollEvent, { passive: true });
+  window.removeEventListener('resize', handleResizeEvent, { passive: true });
+
+  replayObserversAttached = false;
+}
+
+// Handle scroll events
+function handleScrollEvent(event) {
+  debounceSendVisibleHTML(event.type);
+}
+
+// Handle resize events
+function handleResizeEvent(event) {
+  debounceSendVisibleHTML(event.type);
+}
+
+/*
+ * Record
+ */
+
 const RETURN_FULL_DOCUMENT = false;
 const MAX_COORDS = 3;
 const SET_SCREEN_COORDS = false;
@@ -123,19 +235,25 @@ function sendMessageToBackgroundScript(message) {
 }
 
 function generateElementIdAndBbox(element) {
+  console.debug(`[generateElementIdAndBbox] Processing element: ${element.tagName}`);
+
   // ignore invisible elements
   if (!isVisible(element)) {
+    console.debug(`[generateElementIdAndBbox] Element is not visible: ${element.tagName}`);
     return;
   }
 
   // set id
   if (!elementIdMap.has(element)) {
     const newId = `elem-${elementIdCounter++}`;
+    console.debug(`[generateElementIdAndBbox] Generated new ID: ${newId} for element: ${element.tagName}`);
     elementIdMap.set(element, newId);
     idToElementMap.set(newId, element); // Reverse mapping
     element.setAttribute('data-id', newId);
   }
 
+  // TODO: store bounding boxes in a map instead of in DOM attributes
+
   // set client bbox
   let { top, left, bottom, right } = element.getBoundingClientRect();
   let bboxClient = `${top},${left},${bottom},${right}`;
@@ -143,6 +261,7 @@ function generateElementIdAndBbox(element) {
 
   // set screen bbox
   if (SET_SCREEN_COORDS) {
+    // XXX TODO: support in replay mode, or remove altogether
     ({ top, left, bottom, right } = getScreenCoordinates(element));
     if (top == null) {
       // not enough data points to get screen coordinates
@@ -214,17 +333,17 @@ function cleanDomTree(node) {
   }
 }
 
-function getVisibleHtmlString() {
+function getVisibleHTMLString() {
   const startTime = performance.now();
 
   // Step 1: Instrument the live DOM with data-id and data-bbox attributes
   instrumentLiveDomWithBbox();
 
   if (RETURN_FULL_DOCUMENT) {
-    const visibleHtmlDuration = performance.now() - startTime;
-    console.log({ visibleHtmlDuration });
-    const visibleHtmlString = document.body.outerHTML;
-    return { visibleHtmlString, visibleHtmlDuration };
+    const visibleHTMLDuration = performance.now() - startTime;
+    console.log({ visibleHTMLDuration });
+    const visibleHTMLString = document.body.outerHTML;
+    return { visibleHTMLString, visibleHTMLDuration };
   }
 
   // Step 2: Clone the body
@@ -234,12 +353,12 @@ function getVisibleHtmlString() {
   cleanDomTree(clonedBody);
 
   // Step 4: Serialize the modified clone to a string
-  const visibleHtmlString = clonedBody.outerHTML;
+  const visibleHTMLString = clonedBody.outerHTML;
 
-  const visibleHtmlDuration = performance.now() - startTime;
-  console.log({ visibleHtmlDuration });
+  const visibleHTMLDuration = performance.now() - startTime;
+  console.debug({ visibleHTMLDuration });
 
-  return { visibleHtmlString, visibleHtmlDuration };
+  return { visibleHTMLString, visibleHTMLDuration };
 }
 
 /**
@@ -277,20 +396,20 @@ function validateCoordinates(event, eventTarget, attrType, coordX, coordY) {
   }
 }
 
-function handleUserGeneratedEvent(event) {
+function handleUserEvent(event) {
   const eventTarget = event.target;
   const eventTargetId = generateElementIdAndBbox(eventTarget);
   const timestamp = Date.now() / 1000;  // Convert to Python-compatible seconds
 
-  const { visibleHtmlString, visibleHtmlDuration } = getVisibleHtmlString();
+  const { visibleHTMLString, visibleHTMLDuration } = getVisibleHTMLString();
 
   const eventData = {
     type: 'USER_EVENT',
     eventType: event.type,
     targetId: eventTargetId,
     timestamp: timestamp,
-    visibleHtmlString,
-    visibleHtmlDuration,
+    visibleHTMLString,
+    visibleHTMLDuration,
   };
 
   if (event instanceof KeyboardEvent) {
@@ -324,7 +443,7 @@ function attachUserEventListeners() {
   ];
 
   eventsToCapture.forEach(eventType => {
-    document.body.addEventListener(eventType, handleUserGeneratedEvent, true);
+    document.body.addEventListener(eventType, handleUserEvent, true);
   });
 }
 
@@ -339,6 +458,118 @@ function attachInstrumentationEventListeners() {
   });
 }
 
-// Initial setup
-attachUserEventListeners();
-attachInstrumentationEventListeners();
+/*
+ * Replay
+ */
+
+let debounceTimeoutId = null; // Timeout ID for debouncing
+const DEBOUNCE_DELAY = 10;
+
+function setupIntersectionObserver() {
+  window.intersectionObserverInstance = new IntersectionObserver(handleIntersection, {
+    root: null, // Use the viewport as the root
+    threshold: 0 // Consider an element visible if any part of it is in view
+  });
+
+  document.querySelectorAll('*').forEach(element => window.intersectionObserverInstance.observe(element));
+}
+
+function handleIntersection(entries) {
+  let shouldSendUpdate = false;
+  entries.forEach(entry => {
+    if (entry.isIntersecting) {
+      shouldSendUpdate = true;
+    }
+  });
+  if (shouldSendUpdate) {
+    debounceSendVisibleHTML('intersection');
+  }
+}
+
+function setupMutationObserver() {
+  window.mutationObserverInstance = new MutationObserver(handleMutations);
+  window.mutationObserverInstance.observe(document.body, {
+    childList: true,
+    // XXX this results in continuous DOM_EVENT messages on some websites (e.g. ChatGPT)
+    subtree: true,
+    attributes: true
+  });
+}
+
+function handleMutations(mutationsList) {
+  const startTime = performance.now(); // Capture start time for the instrumentation
+  console.debug(`[handleMutations] Start handling ${mutationsList.length} mutations at ${startTime}`);
+
+  let shouldSendUpdate = false;
+
+  for (const mutation of mutationsList) {
+    console.debug(`[handleMutations] Mutation type: ${mutation.type}, target: ${mutation.target.tagName}`);
+    for (const node of mutation.addedNodes) {
+      if (node.nodeType === Node.ELEMENT_NODE) {
+        console.debug(`[handleMutations] Added node: ${node.tagName}`);
+
+        // Uncommenting this freezes some websites (e.g. ChatGPT).
+        // It should not be necessary to call this here since it is also called in
+        // getVisibleHTMLString.
+        //generateElementIdAndBbox(node); // Generate a new ID and bbox for the added node
+
+        if (isVisible(node)) {
+          shouldSendUpdate = true;
+          break; // Exit the loop early
+        }
+      }
+    }
+    if (shouldSendUpdate) break; // Exit outer loop if update is needed
+
+    for (const node of mutation.removedNodes) {
+      console.debug(`[handleMutations] Removed node: ${node.tagName}`);
+      if (node.nodeType === Node.ELEMENT_NODE && idToElementMap.has(node.getAttribute('data-id'))) {
+        shouldSendUpdate = true;
+        break; // Exit the loop early
+      }
+    }
+    if (shouldSendUpdate) break; // Exit outer loop if update is needed
+  }
+
+  const endTime = performance.now();
+  console.debug(`[handleMutations] Finished handling mutations. Duration: ${endTime - startTime}ms`);
+
+  if (shouldSendUpdate) {
+    debounceSendVisibleHTML('mutation');
+  }
+}
+
+function debounceSendVisibleHTML(eventType) {
+  // Clear the previous timeout, if any
+  if (debounceTimeoutId) {
+    clearTimeout(debounceTimeoutId);
+  }
+
+  console.debug(`[debounceSendVisibleHTML] Debouncing visible HTML send for event: ${eventType}`);
+  // Set a new timeout
+  debounceTimeoutId = setTimeout(() => {
+    sendVisibleHTML(eventType);
+  }, DEBOUNCE_DELAY);
+}
+
+function sendVisibleHTML(eventType) {
+  console.debug(`Handling DOM event: ${eventType}`);
+  const timestamp = Date.now() / 1000;  // Convert to Python-compatible seconds
+
+  const { visibleHTMLString, visibleHTMLDuration } = getVisibleHTMLString();
+
+  const eventData = {
+    type: 'DOM_EVENT',
+    eventType: eventType,
+    timestamp: timestamp,
+    visibleHTMLString,
+    visibleHTMLDuration,
+  };
+
+  sendMessageToBackgroundScript(eventData);
+}
+
+function setupScrollAndResizeListeners() {
+  window.addEventListener('scroll', handleScrollEvent, { passive: true });
+  window.addEventListener('resize', handleResizeEvent, { passive: true });
+}
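Review note on chrome_extension/content.js: the `SET_MODE` handler attaches and detaches listeners depending on the requested mode and on what is already attached. The sketch below restates that decision logic in Python so it can be checked in isolation. It is an illustration only: `plan_mode_change` is a hypothetical name, and the returned strings simply echo the JavaScript function names from the diff.

```python
def plan_mode_change(mode: str, record_attached: bool, replay_attached: bool) -> list:
    """Return, in order, the content-script functions the handler would call."""
    actions = []
    if mode == "record":
        if not record_attached:
            actions.append("attachRecordListeners")
        if replay_attached:
            actions.append("disconnectReplayObservers")
    elif mode == "replay":
        # Replay always sends the visible HTML once when the mode is set.
        actions.append("debounceSendVisibleHTML")
        if not replay_attached:
            actions.append("attachReplayObservers")
        if record_attached:
            actions.append("detachRecordListeners")
    elif mode == "idle":
        if record_attached:
            actions.append("detachRecordListeners")
        if replay_attached:
            actions.append("disconnectReplayObservers")
    # Unknown modes fall through and change nothing, as in the handler.
    return actions


record_plan = plan_mode_change("record", record_attached=False, replay_attached=True)
replay_plan = plan_mode_change("replay", record_attached=True, replay_attached=False)
```

Writing the transitions out this way makes it easy to see that an unrecognized mode updates `currentMode` in the extension but leaves every listener as it was.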
diff --git a/openadapt/browser.py b/openadapt/browser.py
index aa8f1cd6b..c8894bcae 100644
--- a/openadapt/browser.py
+++ b/openadapt/browser.py
@@ -1,16 +1,17 @@
 """Utilities for working with BrowserEvents."""
 
 from statistics import mean, median, stdev
+import json
 
-from bs4 import BeautifulSoup
 from copy import deepcopy
 from dtaidistance import dtw, dtw_ndim
-from loguru import logger
 from sqlalchemy.orm import Session as SaSession
 from tqdm import tqdm
 import numpy as np
+import websockets.sync.server
 
 from openadapt import models, utils
+from openadapt.custom_logger import logger
 from openadapt.db import crud
 
 # action to browser
@@ -79,6 +80,18 @@
 ]
 
 
+def set_browser_mode(
+    mode: str, websocket: websockets.sync.server.ServerConnection
+) -> None:
+    """Send a message to the browser extension to set the mode."""
+    logger.info(f"{type(websocket)=}")
+    VALID_MODES = ("idle", "record", "replay")
+    assert mode in VALID_MODES, f"{mode=} not in {VALID_MODES=}"
+    message = json.dumps({"type": "SET_MODE", "mode": mode})
+    logger.info(f"sending {message=}")
+    websocket.send(message)
+
+
 def add_screen_tlbr(browser_events: list[models.BrowserEvent]) -> None:
     """Computes and adds the 'data-tlbr-screen' attribute for each element.
 
@@ -96,29 +109,17 @@ def add_screen_tlbr(browser_events: list[models.BrowserEvent]) -> None:
 
     # Iterate over the events in reverse order
     for event in reversed(browser_events):
-        message = event.message
-
-        event_type = message.get("eventType")
-        if event_type != "click":
-            continue
-
-        visible_html_string = message.get("visibleHtmlString")
-        if not visible_html_string:
-            logger.warning("No visible HTML data available for event.")
+        try:
+            soup, target_element = event.parse()
+        except AssertionError as exc:
+            logger.warning(exc)
             continue
 
-        # Parse the visible HTML using BeautifulSoup
-        soup = BeautifulSoup(visible_html_string, "html.parser")
-
-        # Fetch the target element using its data-id
-        target_id = message.get("targetId")
-        target_element = soup.find(attrs={"data-id": target_id})
-
         if not target_element:
-            logger.warning(f"No target element found for targetId: {target_id}")
             continue
 
         # Extract coordMappings from the message
+        message = event.message
         coord_mappings = message.get("coordMappings", {})
         x_mappings = coord_mappings.get("x", {})
         y_mappings = coord_mappings.get("y", {})
@@ -195,7 +196,7 @@ def add_screen_tlbr(browser_events: list[models.BrowserEvent]) -> None:
         target_element["data-tlbr-screen"] = new_screen_coords
 
         # Write the updated element back to the message
-        message["visibleHtmlString"] = str(soup)
+        message["visibleHTMLString"] = str(soup)
 
     logger.info("Finished processing all browser events for screen coordinates.")
 
@@ -235,7 +236,7 @@ def identify_and_log_smallest_clicked_element(
     Args:
         browser_event: The browser event containing the click details.
     """
-    visible_html_string = browser_event.message.get("visibleHtmlString")
+    visible_html_string = browser_event.message.get("visibleHTMLString")
     message_id = browser_event.message.get("id")
     logger.info("*" * 10)
     logger.info(f"{message_id=}")
@@ -246,8 +247,7 @@ def identify_and_log_smallest_clicked_element(
         logger.warning("No visible HTML data available for click event.")
         return
 
-    # Parse the visible HTML using BeautifulSoup
-    soup = BeautifulSoup(visible_html_string, "html.parser")
+    soup = utils.parse_html(visible_html_string, "html.parser")
     target_element = soup.find(attrs={"data-id": target_id})
     target_area = None
     if not target_element:
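Review note on openadapt/browser.py: `set_browser_mode` sends a small JSON control message to the extension. The sketch below shows that message format without a running server. `FakeConnection` and `build_set_mode_message` are hypothetical names introduced for illustration and are not part of OpenAdapt or the `websockets` package. Unlike the diff, the sketch raises `ValueError` rather than using `assert`, since assertions are skipped when Python runs with `-O`.

```python
import json

VALID_MODES = ("idle", "record", "replay")


class FakeConnection:
    """Hypothetical stand-in for a websocket connection; records sent text."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, message: str) -> None:
        self.sent.append(message)


def build_set_mode_message(mode: str) -> str:
    """Serialize a SET_MODE message, rejecting unknown modes."""
    if mode not in VALID_MODES:
        raise ValueError(f"{mode=} not in {VALID_MODES=}")
    return json.dumps({"type": "SET_MODE", "mode": mode})


connection = FakeConnection()
connection.send(build_set_mode_message("record"))
```

The background script forwards the decoded object unchanged to the content scripts, so the `type` and `mode` keys are the whole contract between the Python side and the extension.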
diff --git a/openadapt/events.py b/openadapt/events.py
index 5866a3656..c90edeba4 100644
--- a/openadapt/events.py
+++ b/openadapt/events.py
@@ -180,7 +180,6 @@ def make_parent_event(
     children = extra.get("children", [])
     browser_events = [child.browser_event for child in children if child.browser_event]
     if browser_events:
-        assert len(browser_events) <= 1, len(browser_events)
         browser_event = browser_events[0]
         event_dict["browser_event"] = browser_event
 
diff --git a/openadapt/models.py b/openadapt/models.py
index 1df82c45e..faf72bca0 100644
--- a/openadapt/models.py
+++ b/openadapt/models.py
@@ -8,6 +8,7 @@
 import io
 import sys
 
+from bs4 import BeautifulSoup
 from oa_pynput import keyboard
 from PIL import Image, ImageChops
 import numpy as np
@@ -147,6 +148,14 @@ class ActionEvent(db.Base):
         "available_segment_descriptions",
         sa.String,
     )
+    _active_browser_element = sa.Column(
+        "active_browser_element",
+        sa.String,
+    )
+    _available_browser_elements = sa.Column(
+        "available_browser_elements",
+        sa.String,
+    )
     mouse_button_name = sa.Column(sa.String)
     mouse_pressed = sa.Column(sa.Boolean)
     key_name = sa.Column(sa.String)
@@ -193,6 +202,7 @@ def __init__(self, **kwargs: dict) -> None:
         for key, value in properties.items():
             setattr(self, key, value)
 
+    # TODO: rename "available" to "target"
     @property
     def available_segment_descriptions(self) -> list[str]:
         """Gets the available segment descriptions."""
@@ -210,6 +220,53 @@ def available_segment_descriptions(self, value: list[str]) -> None:
             value
         )
 
+    @property
+    def active_browser_element(self) -> BeautifulSoup | None:
+        if not self._active_browser_element:
+            return None
+        return utils.parse_html(self._active_browser_element)
+
+    @active_browser_element.setter
+    def active_browser_element(self, value: BeautifulSoup) -> None:
+        if not value:
+            logger.warning(f"{value=}")
+            return
+        self._active_browser_element = str(value)
+
+    @property
+    def available_browser_elements(self) -> BeautifulSoup | None:
+        # https://www.crummy.com/software/BeautifulSoup/bs4/doc/#navigating-the-tree
+        # The value True matches every tag it can. This code finds all the tags in the
+        # document, but none of the text strings
+        if not self._available_browser_elements:
+            return None
+        return utils.parse_html(self._available_browser_elements)
+
+    @available_browser_elements.setter
+    def available_browser_elements(self, value: BeautifulSoup | None) -> None:
+        if not value:
+            logger.warning(f"{value=}")
+            return
+        try:
+            self._available_browser_elements = str(value)
+        except Exception as exc:
+            # something mysterious is going on, because this works:
+            #   self._available_browser_elements = value
+            # and so does this:
+            #   self._available_browser_elements = 'foo'
+            # but sometimes this:
+            #   self._available_browser_elements = str(value)
+            # produces:
+            #   'NoneType' object is not callable
+            # apparently, so does this:
+            #   BeautifulSoup(soup.prettify())
+            # XXX TODO: fix this
+            # Until the root cause is found, log the failure and leave the column
+            # unset rather than dropping into a debugger.
+            logger.error(f"Failed to serialize available_browser_elements: {exc}")
+            self._available_browser_elements = None
+            return
+
     children = sa.orm.relationship("ActionEvent")
     # TODO: replacing the above line with the following two results in an error:
     #     AttributeError: 'list' object has no attribute '_sa_instance_state'
@@ -482,6 +539,8 @@ def to_prompt_dict(self) -> dict[str, Any]:
         Returns:
             dictionary containing relevant properties from the ActionEvent.
         """
+        if self.active_browser_element:
+            logger.debug(f"{self._active_browser_element=}")
         action_dict = deepcopy(
             {
                 key: val
@@ -497,10 +556,20 @@ def to_prompt_dict(self) -> dict[str, Any]:
             for key in ("mouse_x", "mouse_y", "mouse_dx", "mouse_dy"):
                 if key in action_dict:
                     del action_dict[key]
+        # TODO XXX: add target_segment_description?
+
+        # Manually add properties to the dictionary
         if self.available_segment_descriptions:
             action_dict["available_segment_descriptions"] = (
                 self.available_segment_descriptions
             )
+        if self.active_browser_element:
+            action_dict["active_browser_element"] = str(self.active_browser_element)
+        if self.available_browser_elements:
+            action_dict["available_browser_elements"] = str(self.available_browser_elements)
+
+        if self.active_browser_element:
+            logger.debug(f"{action_dict=}")
         return action_dict
 
 
@@ -649,10 +718,10 @@ def __str__(self) -> str:
         # Create a copy of the message to avoid modifying the original
         message_copy = copy.deepcopy(self.message)
 
-        # Truncate the visibleHtmlString in the copied message if it exists
-        if "visibleHtmlString" in message_copy:
-            message_copy["visibleHtmlString"] = utils.truncate_html(
-                message_copy["visibleHtmlString"], max_len=100
+        # Truncate the visibleHTMLString in the copied message if it exists
+        if "visibleHTMLString" in message_copy:
+            message_copy["visibleHTMLString"] = utils.truncate_html(
+                message_copy["visibleHTMLString"], max_len=100
             )
 
         # Get all attributes except 'message'
@@ -668,6 +737,41 @@ def __str__(self) -> str:
         # Return the complete representation including the truncated message
         return f"BrowserEvent({base_repr}, message={message_copy})"
 
+    def parse(self) -> tuple[BeautifulSoup, BeautifulSoup | None]:
+        """Parses the visible HTML and optionally extracts the target element.
+
+        This method processes the browser event to parse the visible HTML and,
+        if the event type is "click", extracts the target HTML element that was
+        clicked.
+
+        Returns:
+            A tuple containing:
+            - BeautifulSoup: The parsed soup of the visible HTML.
+            - BeautifulSoup | None: The target HTML element if the event type is
+                "click"; otherwise, None.
+
+        Raises:
+            AssertionError: If the necessary data is missing.
+        """
+        message = self.message
+
+        visible_html_string = message.get("visibleHTMLString")
+        assert visible_html_string, "Cannot parse without visibleHTMLString"
+
+        # Parse the visible HTML using BeautifulSoup
+        soup = BeautifulSoup(visible_html_string, "html.parser")
+
+        event_type = message.get("eventType")
+        target_element = None
+
+        if event_type == "click":
+            # Fetch the target element using its data-id
+            target_id = message.get("targetId")
+            target_element = soup.find(attrs={"data-id": target_id})
+            assert target_element, f"No target element found for targetId: {target_id}"
+
+        return soup, target_element
+
     # # TODO: implement
     # @classmethod
     # def get_active_browser_event(
diff --git a/openadapt/record.py b/openadapt/record.py
index 27eb9e578..eef25c7c8 100644
--- a/openadapt/record.py
+++ b/openadapt/record.py
@@ -24,6 +24,7 @@
 from pympler import tracker
 import av
 
+from openadapt.browser import set_browser_mode
 from openadapt.build_utils import redirect_stdout_stderr
 from openadapt.custom_logger import logger
 from openadapt.models import Recording
@@ -1192,24 +1193,27 @@ def read_browser_events(
     """
     utils.set_start_time(recording.timestamp)
 
+    # set the browser mode
+    set_browser_mode("record", websocket)
+
     logger.info("Starting Reading Browser Events ...")
 
     while not terminate_processing.is_set():
-        for message in websocket:
-            if not message:
-                continue
-
-            timestamp = utils.get_timestamp()
-
-            data = json.loads(message)
-
-            event_q.put(
-                Event(
-                    timestamp,
-                    "browser",
-                    {"message": data},
-                )
+        try:
+            message = websocket.recv(0.01)
+        except TimeoutError:
+            continue
+        timestamp = utils.get_timestamp()
+        data = json.loads(message)
+        event_q.put(
+            Event(
+                timestamp,
+                "browser",
+                {"message": data},
             )
+        )
+
+    set_browser_mode("idle", websocket)
 
 
 @logger.catch
diff --git a/openadapt/strategies/__init__.py b/openadapt/strategies/__init__.py
index fecc7c056..916843f3b 100644
--- a/openadapt/strategies/__init__.py
+++ b/openadapt/strategies/__init__.py
@@ -5,6 +5,7 @@
 # flake8: noqa
 
 from openadapt.strategies.base import BaseReplayStrategy
+from openadapt.strategies.browser import BrowserReplayStrategy
 
 # disabled because importing is expensive
 # from openadapt.strategies.demo import DemoReplayStrategy
diff --git a/openadapt/strategies/base.py b/openadapt/strategies/base.py
index ea7fb68c5..aac897504 100644
--- a/openadapt/strategies/base.py
+++ b/openadapt/strategies/base.py
@@ -21,6 +21,7 @@ def __init__(
         self,
         recording: models.Recording,
         max_frame_times: int = MAX_FRAME_TIMES,
+        include_a11y_data: bool = True,
     ) -> None:
         """Initialize the BaseReplayStrategy.
 
@@ -34,6 +35,7 @@ def __init__(
         self.screenshots = []
         self.window_events = []
         self.frame_times = []
+        self.include_a11y_data = include_a11y_data
 
     @abstractmethod
     def get_next_action_event(
@@ -67,7 +69,10 @@ def run(self) -> None:
                     continue
 
             self.screenshots.append(screenshot)
-            window_event = models.WindowEvent.get_active_window_event()
+            window_event = models.WindowEvent.get_active_window_event(
+                # TODO: rename
+                include_window_data=self.include_a11y_data,
+            )
             self.window_events.append(window_event)
             try:
                 action_event = self.get_next_action_event(
diff --git a/openadapt/strategies/browser-extended.py b/openadapt/strategies/browser-extended.py
new file mode 100644
index 000000000..bd37786c0
--- /dev/null
+++ b/openadapt/strategies/browser-extended.py
@@ -0,0 +1,412 @@
+"""
+Implements a replay strategy for browser recordings.
+
+TODO:
+- re-use approach from visual.py: segment each screenshot, prompt for descriptions
+"""
+
+from pprint import pformat
+from threading import Event, Thread
+import json
+import queue
+
+from bs4 import BeautifulSoup
+from websockets.sync.server import ServerConnection
+
+from openadapt import adapters, config, models, utils, strategies
+from openadapt.browser import set_browser_mode
+from openadapt.custom_logger import logger
+
+ws_server_instance = None  # set by run_browser_event_server()
+
+# Define a whitelist of essential attributes
+WHITELIST_ATTRIBUTES = [
+    'id', 'class', 'href', 'src', 'alt', 'name', 'type', 'value', 'title', 'data-*', 'aria-*'
+]
+
+
+class BrowserReplayStrategy(strategies.BaseReplayStrategy):
+    """ReplayStrategy using HTML and replay instructions."""
+
+    def __init__(
+        self,
+        recording: models.Recording,
+        instructions: str,
+    ) -> None:
+        """Initialize the BrowserReplayStrategy.
+
+        Args:
+            recording (models.Recording): The recording object.
+            instructions (str): Natural language instructions for how recording
+                should be replayed.
+        """
+        super().__init__(recording, include_a11y_data=False)
+        self.event_q = queue.Queue()
+        self.terminate_processing = Event()
+        self.recent_visible_html = ""
+        add_browser_elements(recording.processed_action_events)
+        self.browser_event_reader = Thread(
+            target=run_browser_event_server,
+            args=(self.event_q, self.terminate_processing),
+        )
+        self.browser_event_reader.start()
+
+        self.instructions = instructions
+        self.action_history = []
+        self.modified_actions = self.apply_replay_instructions(
+            recording.processed_action_events,
+            instructions
+        )
+        # Ensure browser elements are set for modified actions
+        add_browser_elements(self.modified_actions)
+        self.action_event_idx = 0
+
+    def get_recent_visible_html(self) -> str:
+        """Get the most recent visible DOM from the event queue.
+
+        Returns:
+            str: The most recent visible DOM.
+        """
+        num_messages_read = 0
+        while not self.event_q.empty():
+            event = self.event_q.get()
+            num_messages_read += 1
+            self.recent_visible_html = event.message["visibleHTMLString"]
+
+        if num_messages_read:
+            logger.info(f"{num_messages_read=} {len(self.recent_visible_html)=}")
+        return self.recent_visible_html
+
+    def get_next_action_event(
+        self,
+        screenshot: models.Screenshot,
+        window_event: models.WindowEvent,
+    ) -> models.ActionEvent | None:
+        """Get the next ActionEvent for replay.
+
+        Args:
+            screenshot (models.Screenshot): The screenshot object.
+            window_event (models.WindowEvent): The window event object.
+
+        Returns:
+            models.ActionEvent or None: The next ActionEvent for replay or None
+              if there are no more events.
+        """
+        # First, try the direct approach based on planned sequence.
+        try:
+            action = self._execute_planned_action(
+                screenshot=screenshot,
+                current_window_event=window_event
+            )
+            if action:
+                return action
+        except Exception as e:
+            logger.warning(f"Direct generation approach failed: {e}")
+
+        # Fallback to the planning approach if the direct approach fails.
+        try:
+            action = self._generate_next_action_plan(
+                screenshot=screenshot,
+                window_event=window_event,
+                recorded_actions=self.recording.processed_action_events,
+                replayed_actions=self.action_history,
+                instructions=self.instructions,
+            )
+            return action
+        except Exception as e:
+            logger.error(f"Planning approach also failed: {e}")
+            return None
+
+    def _execute_planned_action(
+        self,
+        screenshot: models.Screenshot,
+        current_window_event: models.WindowEvent,
+    ) -> models.ActionEvent | None:
+        """Try to execute the next planned action assuming it matches reality.
+
+        Args:
+            screenshot (models.Screenshot): The current state screenshot.
+            current_window_event (models.WindowEvent): The current state window data.
+
+        Returns:
+            models.ActionEvent or None: The next action event if the planned target exists.
+        """
+        if self.action_event_idx >= len(self.modified_actions):
+            return None  # No more actions to replay.
+
+        planned_action = self.modified_actions[self.action_event_idx]
+        self.action_event_idx += 1
+
+        # Find target element in the current DOM.
+        recent_visible_html = self.get_recent_visible_html()
+        soup, target_element = self._find_element_in_dom(planned_action, recent_visible_html)
+
+        if target_element:
+            planned_action.active_browser_element = target_element
+            self.action_history.append(planned_action)
+            return planned_action
+        else:
+            raise ValueError("Target element not found in the current DOM.")
+
+    def _generate_next_action_plan(
+        self,
+        screenshot: models.Screenshot,
+        window_event: models.WindowEvent,
+        recorded_actions: list[models.ActionEvent],
+        replayed_actions: list[models.ActionEvent],
+        instructions: str,
+    ) -> models.ActionEvent | None:
+        """Fallback method to dynamically plan the next action event.
+
+        Args:
+            screenshot (models.Screenshot): The current state screenshot.
+            window_event (models.WindowEvent): The current state window data.
+            recorded_actions (list[models.ActionEvent]): List of action events from the recording.
+            replayed_actions (list[models.ActionEvent]): List of actions produced during current replay.
+            instructions (str): Proposed modifications in natural language instructions.
+
+        Returns:
+            models.ActionEvent or None: The next action event if successful, otherwise None.
+        """
+        prompt_adapter = adapters.get_default_prompt_adapter()
+        system_prompt = utils.render_template_from_file("prompts/system.j2")
+        prompt = utils.render_template_from_file(
+            "prompts/generate_action_event--browser.j2",  # Updated template file name
+            current_window=window_event.to_prompt_dict(),
+            recorded_actions=[action.to_prompt_dict() for action in recorded_actions],
+            replayed_actions=[action.to_prompt_dict() for action in replayed_actions],
+            replay_instructions=instructions,
+        )
+
+        content = prompt_adapter.prompt(
+            prompt,
+            system_prompt=system_prompt,
+            images=[screenshot.image],
+        )
+        action_dict = utils.parse_code_snippet(content)
+        logger.info(f"{action_dict=}")
+        if not action_dict:
+            return None
+
+        return models.ActionEvent.from_dict(action_dict)
+
+    def apply_replay_instructions(
+        self,
+        action_events: list[models.ActionEvent],
+        replay_instructions: str,
+    ) -> list[models.ActionEvent]:
+        """Modify the given ActionEvents according to the given replay instructions.
+
+        Args:
+            action_events: list of action events to be modified in place.
+            replay_instructions: instructions for how action events should be modified.
+
+        Returns:
+            list[models.ActionEvent]: The modified list of action events.
+        """
+        action_dicts = [action.to_prompt_dict() for action in action_events]
+        actions_dict = {"actions": action_dicts}
+        system_prompt = utils.render_template_from_file("prompts/system.j2")
+        prompt = utils.render_template_from_file(
+            "prompts/apply_replay_instructions--browser.j2",  # Updated template file name
+            actions=actions_dict,
+            replay_instructions=replay_instructions,
+            # TODO: remove
+            exceptions=[],
+        )
+        logger.debug(f"prompt=\n{prompt}")
+        # Send the rendered prompt to the default prompt adapter.
+        prompt_adapter = adapters.get_default_prompt_adapter()
+        content = prompt_adapter.prompt(prompt, system_prompt=system_prompt)
+        content_dict = utils.parse_code_snippet(content)
+
+        try:
+            action_dicts = content_dict["actions"]
+        except TypeError as exc:
+            logger.warning(exc)
+            action_dicts = content_dict  # OpenAI sometimes returns a list of dicts directly.
+
+        modified_actions = []
+        for action_dict in action_dicts:
+            action = models.ActionEvent.from_dict(action_dict)
+            modified_actions.append(action)
+        return modified_actions
+
+    def _find_element_in_dom(self, planned_action: models.ActionEvent, html: str):
+        """Locate the target element in the current HTML DOM.
+
+        Args:
+            planned_action (models.ActionEvent): The planned action with target element info.
+            html (str): The current HTML content.
+
+        Returns:
+            Tuple[BeautifulSoup, Tag or None]: Parsed HTML and the target element or None.
+        """
+        soup = BeautifulSoup(html, 'html.parser')
+        target_selector = planned_action.active_browser_element  # Assuming selector or similar identifier is used.
+        target_element = soup.select_one(target_selector)  # Simplify finding elements.
+
+        return soup, target_element
+
+    def __del__(self) -> None:
+        """Clean up resources and log action history."""
+        self.terminate_processing.set()
+        action_history_dicts = [action.to_prompt_dict() for action in self.action_history]
+        logger.info(f"action_history=\n{pformat(action_history_dicts)}")
+
+
+def clean_html_attributes(element: BeautifulSoup) -> str:
+    """Retain only essential attributes from an HTML element based on a whitelist.
+
+    Args:
+        element: A BeautifulSoup tag element.
+
+    Returns:
+        A string representing the cleaned HTML element.
+    """
+    whitelist_attrs = []
+
+    # Go through each attribute in the element and keep only whitelisted ones
+    for attr_name, attr_value in element.attrs.items():
+        if attr_name in WHITELIST_ATTRIBUTES or attr_name.startswith('data-') or attr_name.startswith('aria-'):
+            whitelist_attrs.append((attr_name, attr_value))
+        else:
+            logger.debug(f"Removing attribute from <{element.name}>: {attr_name}='{attr_value}'")
+    
+    # Update the element with only whitelisted attributes
+    element.attrs = dict(whitelist_attrs)
+    return str(element)
+
+
+def filter_and_clean_html(soup: BeautifulSoup) -> str:
+    """Filter out irrelevant elements, clean attributes, and log removed elements.
+
+    Args:
+        soup: BeautifulSoup object of the parsed HTML.
+
+    Returns:
+        A string representing the cleaned HTML.
+    """
+    # Define relevant elements for action replay
+    relevant_tags = ['a', 'button', 'div', 'span', 'input', 'img', 'form', 'iframe']
+    relevant_elements = []
+
+    # Find relevant elements and log removal of irrelevant ones
+    for el in soup.find_all():
+        if el.name in relevant_tags:
+            relevant_elements.append(el)
+        else:
+            logger.debug(f"Removing element <{el.name}> with attributes: {el.attrs}")
+
+    # Clean each relevant element
+    cleaned_elements = [clean_html_attributes(el) for el in relevant_elements]
+
+    # Recreate a simplified HTML structure with only the cleaned elements
+    return ''.join(cleaned_elements)
+
+
+def add_browser_elements(action_events: list) -> None:
+    """Set the ActionEvent.active_browser_element where appropriate and log actions.
+
+    Args:
+        action_events: list of ActionEvents to modify in-place.
+    """
+    action_browser_tups = [
+        (action, action.browser_event)
+        for action in action_events
+        if action.browser_event
+    ]
+    for action, browser in action_browser_tups:
+        soup, target_element = browser.parse()
+        if not target_element:
+            logger.warning(f"{target_element=}")
+            continue
+
+        # Convert BeautifulSoup object to cleaned HTML strings
+        action.active_browser_element = clean_html_attributes(target_element)
+        action.available_browser_elements = filter_and_clean_html(soup)
+
+        # Verify the cleaned elements
+        assert action.active_browser_element, action.active_browser_element
+        assert action.available_browser_elements, action.available_browser_elements
+
+        logger.debug(f"{action.active_browser_element=}")
+        logger.debug(f"{len(action.available_browser_elements)=}")
+
+
+def run_browser_event_server(
+    event_q: queue.Queue,
+    terminate_processing: Event,
+) -> None:
+    """Run the browser event server.
+
+    Params:
+        event_q: A queue for adding browser events.
+        terminate_processing: An event to signal the termination of the process.
+
+    Returns:
+        None
+    """
+    from websockets.sync.server import serve
+
+    def run_server() -> None:
+        global ws_server_instance
+        with serve(
+            lambda ws: read_browser_events(ws, event_q, terminate_processing),
+            config.BROWSER_WEBSOCKET_SERVER_IP,
+            config.BROWSER_WEBSOCKET_PORT,
+            max_size=config.BROWSER_WEBSOCKET_MAX_SIZE,
+        ) as server:
+            ws_server_instance = server
+            logger.info("WebSocket server started")
+            server.serve_forever()
+
+    server_thread = Thread(target=run_server)
+    server_thread.start()
+    terminate_processing.wait()
+    logger.info("Termination signal received, shutting down server")
+
+    if ws_server_instance:
+        ws_server_instance.shutdown()
+
+    server_thread.join()
+
+
+def read_browser_events(
+    websocket: ServerConnection,
+    event_q: queue.Queue,
+    terminate_processing: Event,
+) -> None:
+    """Read browser events and add them to the event queue.
+
+    Params:
+        websocket: The websocket object.
+        event_q: A queue for adding browser events.
+        terminate_processing: An event to signal the termination of the process.
+
+    Returns:
+        None
+    """
+    set_browser_mode("replay", websocket)
+    utils.set_start_time()
+    logger.info("Starting Reading Browser Events ...")
+
+    try:
+        while not terminate_processing.is_set():
+            try:
+                message = websocket.recv(0.01)
+            except TimeoutError:
+                continue
+            timestamp = utils.get_timestamp()
+            logger.info(f"{len(message)=}")
+            data = json.loads(message)
+            assert data["type"] == "DOM_EVENT", data["type"]
+            event_q.put(
+                models.BrowserEvent(
+                    timestamp=timestamp,
+                    message=data,
+                )
+            )
+    finally:
+        set_browser_mode("idle", websocket)
+
diff --git a/openadapt/strategies/browser.py b/openadapt/strategies/browser.py
new file mode 100644
index 000000000..f6771f856
--- /dev/null
+++ b/openadapt/strategies/browser.py
@@ -0,0 +1,407 @@
+"""Implements a replay strategy for browser recordings.
+
+TODO: re-use approach from visual.py: segment each screenshot, prompt for descriptions
+"""
+
+from pprint import pformat
+from threading import Event, Thread
+import json
+import queue
+
+from bs4 import BeautifulSoup
+from websockets.sync.server import ServerConnection
+
+from openadapt import adapters, config, models, utils, strategies
+from openadapt.browser import set_browser_mode
+from openadapt.custom_logger import logger
+
+ws_server_instance = None  # set by run_browser_event_server()
+
+class BrowserReplayStrategy(strategies.BaseReplayStrategy):
+    """ReplayStrategy using HTML and replay instructions."""
+
+    def __init__(
+        self,
+        recording: models.Recording,
+        instructions: str,
+    ) -> None:
+        """Initialize the BrowserReplayStrategy.
+
+        Args:
+            recording (models.Recording): The recording object.
+            instructions (str): Natural language instructions for how recording
+                should be replayed.
+        """
+        super().__init__(recording, include_a11y_data=False)
+        self.event_q = queue.Queue()
+        self.terminate_processing = Event()
+        self.recent_visible_html = ""
+        add_browser_elements(recording.processed_action_events)
+        self.browser_event_reader = Thread(
+            target=run_browser_event_server,
+            args=(self.event_q, self.terminate_processing),
+        )
+        self.browser_event_reader.start()
+
+        self.instructions = instructions
+        self.action_history = []
+        self.modified_actions = self.apply_replay_instructions(
+            recording.processed_action_events,
+            instructions
+        )
+        # Ensure browser elements are set for modified actions
+        add_browser_elements(self.modified_actions)
+        self.action_event_idx = 0
+
+    def get_recent_visible_html(self) -> str:
+        """Get the most recent visible DOM from the event queue.
+
+        Returns:
+            str: The most recent visible DOM.
+        """
+        num_messages_read = 0
+        while not self.event_q.empty():
+            event = self.event_q.get()
+            num_messages_read += 1
+            self.recent_visible_html = event.message["visibleHTMLString"]
+
+        if num_messages_read:
+            logger.info(f"{num_messages_read=} {len(self.recent_visible_html)=}")
+        return self.recent_visible_html
+
+    def get_next_action_event(
+        self,
+        screenshot: models.Screenshot,
+        window_event: models.WindowEvent,
+    ) -> models.ActionEvent | None:
+        """Get the next ActionEvent for replay.
+
+        Args:
+            screenshot (models.Screenshot): The screenshot object.
+            window_event (models.WindowEvent): The window event object.
+
+        Returns:
+            models.ActionEvent or None: The next ActionEvent for replay or None
+              if there are no more events.
+        """
+        # First, try the direct approach based on planned sequence.
+        try:
+            action = self._execute_planned_action(
+                screenshot=screenshot,
+                current_window_event=window_event
+            )
+            if action:
+                return action
+        except Exception as e:
+            logger.warning(f"Direct generation approach failed: {e}")
+
+        # Fallback to the planning approach if the direct approach fails.
+        try:
+            action = self._generate_next_action_plan(
+                screenshot=screenshot,
+                window_event=window_event,
+                recorded_actions=self.recording.processed_action_events,
+                replayed_actions=self.action_history,
+                instructions=self.instructions,
+            )
+            return action
+        except Exception as e:
+            logger.error(f"Planning approach also failed: {e}")
+            return None
+
+    def _execute_planned_action(
+        self,
+        screenshot: models.Screenshot,
+        current_window_event: models.WindowEvent,
+    ) -> models.ActionEvent | None:
+        """Try to execute the next planned action assuming it matches reality.
+
+        Args:
+            screenshot (models.Screenshot): The current state screenshot.
+            current_window_event (models.WindowEvent): The current state window data.
+
+        Returns:
+            models.ActionEvent or None: The next action event if the planned target exists.
+        """
+        if self.action_event_idx >= len(self.modified_actions):
+            return None  # No more actions to replay.
+
+        planned_action = self.modified_actions[self.action_event_idx]
+        self.action_event_idx += 1
+
+        # Find target element in the current DOM.
+        recent_visible_html = self.get_recent_visible_html()
+        soup, target_element = self._find_element_in_dom(planned_action, recent_visible_html)
+
+        if target_element:
+            planned_action.active_browser_element = target_element
+            self.action_history.append(planned_action)
+            return planned_action
+        else:
+            raise ValueError("Target element not found in the current DOM.")
+
+    def _generate_next_action_plan(
+        self,
+        screenshot: models.Screenshot,
+        window_event: models.WindowEvent,
+        recorded_actions: list[models.ActionEvent],
+        replayed_actions: list[models.ActionEvent],
+        instructions: str,
+    ) -> models.ActionEvent | None:
+        """Fallback method to dynamically plan the next action event.
+
+        Args:
+            screenshot (models.Screenshot): The current state screenshot.
+            window_event (models.WindowEvent): The current state window data.
+            recorded_actions (list[models.ActionEvent]): List of action events from the recording.
+            replayed_actions (list[models.ActionEvent]): List of actions produced during current replay.
+            instructions (str): Proposed modifications in natural language instructions.
+
+        Returns:
+            models.ActionEvent or None: The next action event if successful, otherwise None.
+        """
+        prompt_adapter = adapters.get_default_prompt_adapter()
+        system_prompt = utils.render_template_from_file("prompts/system.j2")
+        prompt = utils.render_template_from_file(
+            "prompts/generate_action_event--browser.j2",  # Updated template file name
+            current_window=window_event.to_prompt_dict(),
+            recorded_actions=[action.to_prompt_dict() for action in recorded_actions],
+            replayed_actions=[action.to_prompt_dict() for action in replayed_actions],
+            replay_instructions=instructions,
+        )
+
+        content = prompt_adapter.prompt(
+            prompt,
+            system_prompt=system_prompt,
+            images=[screenshot.image],
+        )
+        action_dict = utils.parse_code_snippet(content)
+        logger.info(f"{action_dict=}")
+        if not action_dict:
+            return None
+
+        return models.ActionEvent.from_dict(action_dict)
+
+    def apply_replay_instructions(
+        self,
+        action_events: list[models.ActionEvent],
+        replay_instructions: str,
+    ) -> list[models.ActionEvent]:
+        """Modify the given ActionEvents according to the given replay instructions.
+
+        Args:
+            action_events: list of action events to be modified in place.
+            replay_instructions: instructions for how action events should be modified.
+
+        Returns:
+            list[models.ActionEvent]: The modified list of action events.
+        """
+        action_dicts = [action.to_prompt_dict() for action in action_events]
+        actions_dict = {"actions": action_dicts}
+        system_prompt = utils.render_template_from_file("prompts/system.j2")
+        prompt = utils.render_template_from_file(
+            "prompts/apply_replay_instructions--browser.j2",  # Updated template file name
+            actions=actions_dict,
+            replay_instructions=replay_instructions,
+            # TODO: remove
+            exceptions=[],
+        )
+        logger.debug(f"prompt=\n{prompt}")
+        # Send the rendered prompt to the default prompt adapter.
+        prompt_adapter = adapters.get_default_prompt_adapter()
+        content = prompt_adapter.prompt(prompt, system_prompt=system_prompt)
+        content_dict = utils.parse_code_snippet(content)
+
+        try:
+            action_dicts = content_dict["actions"]
+        except TypeError as exc:
+            logger.warning(exc)
+            action_dicts = content_dict  # OpenAI sometimes returns a list of dicts directly.
+
+        modified_actions = []
+        for action_dict in action_dicts:
+            action = models.ActionEvent.from_dict(action_dict)
+            modified_actions.append(action)
+        return modified_actions
+
+    def _find_element_in_dom(self, planned_action: models.ActionEvent, html: str):
+        """Locate the target element in the current HTML DOM.
+
+        Args:
+            planned_action (models.ActionEvent): The planned action with target element info.
+            html (str): The current HTML content.
+
+        Returns:
+            Tuple[BeautifulSoup, Tag or None]: Parsed HTML and the target element or None.
+        """
+        soup = BeautifulSoup(html, 'html.parser')
+        target_selector = planned_action.active_browser_element  # Assuming selector or similar identifier is used.
+        target_element = soup.select_one(target_selector)  # Simplify finding elements.
+
+        return soup, target_element
+
+    def __del__(self) -> None:
+        """Clean up resources and log action history."""
+        self.terminate_processing.set()
+        action_history_dicts = [action.to_prompt_dict() for action in self.action_history]
+        logger.info(f"action_history=\n{pformat(action_history_dicts)}")
+
+
+# Define a whitelist of essential attributes
+WHITELIST_ATTRIBUTES = [
+    'id', 'class', 'href', 'src', 'alt', 'name', 'type', 'value', 'title', 'data-*', 'aria-*'
+]
+
+def clean_html_attributes(element: BeautifulSoup) -> str:
+    """Retain only essential attributes from an HTML element based on a whitelist.
+
+    Args:
+        element: A BeautifulSoup tag element.
+
+    Returns:
+        A string representing the cleaned HTML element.
+    """
+    whitelist_attrs = []
+
+    # Go through each attribute in the element and keep only whitelisted ones
+    for attr_name, attr_value in element.attrs.items():
+        if attr_name in WHITELIST_ATTRIBUTES or attr_name.startswith('data-') or attr_name.startswith('aria-'):
+            whitelist_attrs.append((attr_name, attr_value))
+        else:
+            logger.debug(f"Removing attribute from <{element.name}>: {attr_name}='{attr_value}'")
+    
+    # Update the element with only whitelisted attributes
+    element.attrs = dict(whitelist_attrs)
+    return str(element)
+
+def filter_and_clean_html(soup: BeautifulSoup) -> str:
+    """Filter out irrelevant elements, clean attributes, and log removed elements.
+
+    Args:
+        soup: BeautifulSoup object of the parsed HTML.
+
+    Returns:
+        A string representing the cleaned HTML.
+    """
+    # Define relevant elements for action replay
+    relevant_tags = ['a', 'button', 'div', 'span', 'input', 'img', 'form', 'iframe']
+    relevant_elements = []
+
+    # Find relevant elements and log removal of irrelevant ones
+    for el in soup.find_all():
+        if el.name in relevant_tags:
+            relevant_elements.append(el)
+        else:
+            logger.debug(f"Removing element <{el.name}> with attributes: {el.attrs}")
+
+    # Clean each relevant element
+    cleaned_elements = [clean_html_attributes(el) for el in relevant_elements]
+
+    # Recreate a simplified HTML structure with only the cleaned elements
+    return ''.join(cleaned_elements)
+
+def add_browser_elements(action_events: list) -> None:
+    """Set the ActionEvent.active_browser_element where appropriate and log actions.
+
+    Args:
+        action_events: list of ActionEvents to modify in-place.
+    """
+    action_browser_tups = [
+        (action, action.browser_event)
+        for action in action_events
+        if action.browser_event
+    ]
+    for action, browser in action_browser_tups:
+        soup, target_element = browser.parse()
+        if not target_element:
+            logger.warning(f"{target_element=}; skipping browser event")
+            continue
+
+        # Convert BeautifulSoup object to cleaned HTML strings
+        action.active_browser_element = clean_html_attributes(target_element)
+        action.available_browser_elements = filter_and_clean_html(soup)
+
+        # Verify the cleaned elements
+        assert action.active_browser_element, action.active_browser_element
+        assert action.available_browser_elements, action.available_browser_elements
+
+        logger.debug(f"{action.active_browser_element=}")
+        logger.debug(f"{len(action.available_browser_elements)=}")
+
+
+def run_browser_event_server(
+    event_q: queue.Queue,
+    terminate_processing: Event,
+) -> None:
+    """Run the browser event server.
+
+    Args:
+        event_q: A queue for adding browser events.
+        terminate_processing: An event to signal the termination of the process.
+
+    Returns:
+        None
+    """
+    global ws_server_instance
+
+    def run_server() -> None:
+        global ws_server_instance
+        with ServerConnection(
+            lambda ws: read_browser_events(ws, event_q, terminate_processing),
+            config.BROWSER_WEBSOCKET_SERVER_IP,
+            config.BROWSER_WEBSOCKET_PORT,
+            max_size=config.BROWSER_WEBSOCKET_MAX_SIZE,
+        ) as server:
+            ws_server_instance = server
+            logger.info("WebSocket server started")
+            server.serve_forever()
+
+    server_thread = Thread(target=run_server)
+    server_thread.start()
+    terminate_processing.wait()
+    logger.info("Termination signal received, shutting down server")
+
+    if ws_server_instance:
+        ws_server_instance.shutdown()
+
+    server_thread.join()
+
+
+def read_browser_events(
+    websocket: ServerConnection,
+    event_q: queue.Queue,
+    terminate_processing: Event,
+) -> None:
+    """Read browser events and add them to the event queue.
+
+    Args:
+        websocket: The websocket object.
+        event_q: A queue for adding browser events.
+        terminate_processing: An event to signal the termination of the process.
+
+    Returns:
+        None
+    """
+    set_browser_mode("replay", websocket)
+    utils.set_start_time()
+    logger.info("Reading browser events...")
+
+    try:
+        while not terminate_processing.is_set():
+            try:
+                message = websocket.recv(0.01)
+            except TimeoutError:
+                continue
+            timestamp = utils.get_timestamp()
+            logger.info(f"{len(message)=}")
+            data = json.loads(message)
+            assert data["type"] == "DOM_EVENT", data["type"]
+            event_q.put(
+                models.BrowserEvent(
+                    timestamp=timestamp,
+                    message=data,
+                )
+            )
+    finally:
+        set_browser_mode("idle", websocket)
+
diff --git a/openadapt/utils.py b/openadapt/utils.py
index 82279c0d4..3949094a6 100644
--- a/openadapt/utils.py
+++ b/openadapt/utils.py
@@ -16,6 +16,7 @@
 import threading
 import time
 
+from bs4 import BeautifulSoup
 from jinja2 import Environment, FileSystemLoader
 from PIL import Image, ImageEnhance
 from posthog import Posthog
@@ -992,6 +993,48 @@ def truncate_html(html_str: str, max_len: int) -> str:
     return html_str
 
 
+def parse_html(html: str, parser: str = "html.parser") -> BeautifulSoup:
+    """Parse an HTML string into a BeautifulSoup object."""
+    soup = BeautifulSoup(html, parser)
+    return soup
+
+
+# TODO: add html2text as a project dependency (`poetry add html2text`)
+# so that get_html_prompt can convert HTML to Markdown.
+def get_html_prompt(html: str, convert_to_markdown: bool = False) -> str:
+    """Convert an HTML string to a processed version suitable for LLM prompts.
+
+    Args:
+        html: The input HTML string.
+        convert_to_markdown: If True, converts the HTML to Markdown. Defaults to False.
+
+    Returns:
+        A string with preserved semantic structure and interactable elements.
+        If convert_to_markdown is True, the string is in Markdown format.
+    """
+    # Parse HTML with BeautifulSoup
+    soup = parse_html(html)
+
+    # Remove non-interactive and unnecessary elements
+    for tag in soup(['style', 'script', 'noscript', 'meta', 'head', 'iframe']):
+        tag.decompose()
+
+    if convert_to_markdown:
+        # html2text is an optional dependency (install with `poetry add html2text`)
+        import html2text
+
+        converter = html2text.HTML2Text()
+        converter.ignore_links = False  # Keep all links
+        converter.ignore_images = False  # Keep all images
+        converter.body_width = 0  # Disable line wrapping
+
+        # Convert the cleaned HTML to Markdown
+        return converter.handle(str(soup))
+
+    # Return processed HTML as a string if Markdown conversion is not required
+    return str(soup)
+
+
 class WrapStdout:
     """Class to be used a target for multiprocessing.Process."""