AutoGUI interaction (#36)
After trying out the devtools protocols in BugHog, I found that the two
simple implementations I had prepared worked only in Chromium 70+ and
Firefox 127+. Older browser versions implement other protocols, but it
would take a lot of effort to implement all of them and match each one to
the browser versions that support it. Further, there would likely be no
way to target versions in the range of approximately 20-50.

As a result, I decided to use PyAutoGUI as you suggested. It works in
all browsers and versions and for practical usage it seems almost as
good as the devtools protocols.

Instead of `url_queue.txt`, users can choose to provide the experiment
configuration in `interaction_script.cmd`. These are the commands
implemented so far:

- `NAVIGATE url` Terminates the previous browser window and opens a new
browser window on the specified URL. Further, it waits some time (1 sec
for Chromium, 2 secs for Firefox) for the page to load.
- `CLICK_POSITION x y` Clicks on specific coordinates on the screen (not
necessarily inside the browser window). The argument values can be absolute
in pixels, a percentage of the screen, or a combination of both, e.g.,
`CLICK_POSITION 100 50%` clicks 100px from the left screen border and
at 50% of the screen height.
- `CLICK element_id` Clicks on an element with the specified ID.
Currently, the ID must be one of `one`, `two`, `three`, `four`, `five`,
`six`. PyAutoGUI locates an element by searching the screen for a visual
match, so I prepared styles in `res/bughog.css` that render elements with
these IDs as boxes of distinct colors. This removes the need to know the
exact screen coordinates of an element.
- `WRITE text` Types the text into the focused element.
- `PRESS key` Presses and releases the key.
- `HOLD key` Holds the key down until it is released.
- `RELEASE key` Releases a held key.
- `HOTKEY key1 key2 ...` A combination of `HOLD` and `RELEASE` for
multiple keys. E.g., `HOTKEY ctrl c`.
- `SLEEP seconds` Pauses for the given number of seconds. Usually not
necessary, because `NAVIGATE` already includes a sleep.
- `SCREENSHOT file_name` Captures the screen and stores the result in
`logs/screenshots/{PROJECT}-{EXPERIMENT}-{file_name}-{BROWSER}-{VERSION}.jpg`.
Very useful for debugging.
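
For illustration, the sketch below shows what a script using these commands might look like and one way it could be split into commands and arguments. The URL, the chosen commands, and the `parse_script` helper are assumptions made for this example; they are not taken from the commit.

```python
# Hypothetical script; the URL and the command sequence are illustrative only.
SCRIPT = """\
# Open the page and interact with it
NAVIGATE https://a.test/Support/AutoGUI/main
CLICK one
WRITE hello
HOTKEY ctrl a
CLICK_POSITION 100 50%
SCREENSHOT after-click
"""


def parse_script(script: str) -> list[tuple[str, list[str]]]:
    """Split each non-empty, non-comment line into (command, arguments)."""
    statements = []
    for line in script.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue
        # Whitespace separates the command from its arguments.
        command, *args = stripped.split()
        statements.append((command.upper(), args))
    return statements


for command, args in parse_script(SCRIPT):
    print(command, args)
```

Because arguments are separated by whitespace in this sketch, a `WRITE` argument containing spaces would be split into several arguments.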

A simple experiment can be found in `Support/AutoGUI`. It was
reproduced successfully in all versions of both browsers.

We could also implement some browser-specific behaviour, e.g.,
bookmarking: the script would include only `BOOKMARK text`, and the
correct shortcut would be pressed and the right screen positions clicked
based on the browser version.
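
A rough sketch of how such a command might expand into the commands that already exist is shown below. Everything in it is hypothetical: the `BOOKMARK_SHORTCUTS` table, its entries and version thresholds, the `expand_bookmark` helper, and the assumed key sequence would all need to be verified per browser and version.

```python
# Hypothetical lookup table: browser -> list of (minimum major version, keys).
# The entries are placeholders, not verified shortcuts for specific versions.
BOOKMARK_SHORTCUTS: dict[str, list[tuple[int, list[str]]]] = {
    'chromium': [(0, ['ctrl', 'd'])],
    'firefox': [(0, ['ctrl', 'd'])],
}


def expand_bookmark(browser: str, major_version: int, text: str) -> list[str]:
    """Expand a hypothetical `BOOKMARK text` into already implemented commands."""
    candidates = [c for c in BOOKMARK_SHORTCUTS[browser] if c[0] <= major_version]
    # Pick the entry with the highest minimum version not exceeding the target.
    _, keys = max(candidates, key=lambda c: c[0])
    # Assumed flow: open the bookmark dialog, type the name, confirm.
    return [f'HOTKEY {" ".join(keys)}', f'WRITE {text}', 'PRESS enter']


print(expand_bookmark('firefox', 100, 'example'))
```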

Besides this, I made the following changes:
- Extracted the default file templates to separate files and added
templates for all file types
- Implemented adding and modifying `interaction_script.cmd` and
`url_queue.txt` from the web UI
- Implemented a custom highlighting mode for `interaction_script.cmd` in
the experiment editor
- Fixed loading resources from `/res/`
GJFR authored Nov 13, 2024
2 parents e7cfbd9 + 792447e commit 2d1ad6d
Showing 39 changed files with 667 additions and 86 deletions.
8 changes: 4 additions & 4 deletions .devcontainer/devcontainer.json
@@ -24,7 +24,10 @@
"Vue.volar"
]
}
}
},

// Install pip requirements
"postCreateCommand": "pip install -r requirements.txt"

// Features to add to the dev container. More info: https://containers.dev/features.
// "features": {},
@@ -35,9 +38,6 @@
// Uncomment the next line if you want to keep your containers running after VS Code shuts down.
// "shutdownAction": "none",

// Uncomment the next line to run commands after the container is created.
// "postCreateCommand": "cat /etc/os-release",

// Configure tool-specific properties.
// "customizations": {},
}
11 changes: 11 additions & 0 deletions .gitignore
@@ -6,6 +6,17 @@ nginx/ssl/keys/*
!**/.gitkeep
**/node_modules
**/junit.xml

# Screenshots
logs/screenshots/*
!logs/screenshots/.gitkeep

# Fish shell
$HOME

# JetBrains IDEs
.idea

# Created by https://www.toptal.com/developers/gitignore/api/intellij,python,flask,macos
# Edit at https://www.toptal.com/developers/gitignore?templates=intellij,python,flask,macos

4 changes: 4 additions & 0 deletions Dockerfile
@@ -54,7 +54,11 @@ RUN cp /app/scripts/daemon/xvfb /etc/init.d/xvfb
# Install python packages
COPY requirements.txt /app/requirements.txt
RUN pip install --user -r /app/requirements.txt
RUN apt-get install python3-tk python3-xlib gnome-screenshot -y

# Initiate PyAutoGUI
RUN touch /root/.Xauthority && \
xauth add ${HOST}:0 . $(xxd -l 16 -p /dev/urandom)

FROM base AS core
# Copy rest of source code
30 changes: 17 additions & 13 deletions bci/browser/automation/terminal.py
@@ -7,31 +7,35 @@


class TerminalAutomation:

@staticmethod
def run(url: str, args: list[str], seconds_per_visit: int):
logger.debug("Starting browser process...")
def visit_url(url: str, args: list[str], seconds_per_visit: int):
args.append(url)
proc = TerminalAutomation.open_browser(args)
logger.debug(f'Visiting the page for {seconds_per_visit}s')
time.sleep(seconds_per_visit)
TerminalAutomation.terminate_browser(proc, args)

@staticmethod
def open_browser(args: list[str]) -> subprocess.Popen:
logger.debug('Starting browser process...')
logger.debug(f'Command string: \'{" ".join(args)}\'')
with open('/tmp/browser.log', 'a') as file:
proc = subprocess.Popen(
args,
stdout=file,
stderr=file
)
with open('/tmp/browser.log', 'a+') as file:
proc = subprocess.Popen(args, stdout=file, stderr=file)
return proc

time.sleep(seconds_per_visit)
@staticmethod
def terminate_browser(proc: subprocess.Popen, args: list[str]) -> None:
logger.debug('Terminating browser process using SIGINT...')

logger.debug(f'Terminating browser process after {seconds_per_visit}s using SIGINT...')
# Use SIGINT and SIGTERM to end process such that cookies remain saved.
proc.send_signal(signal.SIGINT)
proc.send_signal(signal.SIGTERM)

try:
stdout, stderr = proc.communicate(timeout=5)
except subprocess.TimeoutExpired:
logger.info("Browser process did not terminate after 5s. Killing process through pkill...")
logger.info('Browser process did not terminate after 5s. Killing process through pkill...')
subprocess.run(['pkill', '-2', args[0].split('/')[-1]])

proc.wait()
logger.debug("Browser process terminated.")
logger.debug('Browser process terminated.')
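
The termination pattern in this file (signal first, then fall back if the process does not exit within a timeout) can be tried in isolation with the sketch below. It is a simplified variant, not the code from the diff: it assumes a POSIX system, sends only `SIGINT`, and falls back to `Popen.kill()` instead of `pkill`. The `terminate_gracefully` name and the sleeping child process are made up for the example.

```python
import signal
import subprocess
import sys


def terminate_gracefully(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Ask the process to exit with SIGINT, and kill it if it does not comply."""
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Last resort; a browser killed this way may not persist its state.
        proc.kill()
        proc.wait()
    return proc.returncode


if __name__ == '__main__':
    # Stand-in for a browser: a child process that would otherwise run for 30s.
    child = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(30)'])
    print('exit code:', terminate_gracefully(child))
```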
31 changes: 27 additions & 4 deletions bci/browser/configuration/browser.py
@@ -1,6 +1,7 @@
from __future__ import annotations

import os
import subprocess
from abc import abstractmethod

import bci.browser.binary.factory as binary_factory
@@ -15,9 +16,13 @@


class Browser:
process: subprocess.Popen | None

def __init__(self, browser_config: BrowserConfiguration, eval_config: EvaluationConfiguration, binary: Binary) -> None:
def __init__(
self, browser_config: BrowserConfiguration, eval_config: EvaluationConfiguration, binary: Binary
) -> None:
self.browser_config = browser_config
self.process = None
self.eval_config = eval_config
self.binary = binary
self.state = binary.state
@@ -34,10 +39,22 @@ def visit(self, url: str):
match self.eval_config.automation:
case 'terminal':
args = self._get_terminal_args()
TerminalAutomation.run(url, args, self.eval_config.seconds_per_visit)
TerminalAutomation.visit_url(url, args, self.eval_config.seconds_per_visit)
case _:
raise AttributeError('Not implemented')

def open(self, url: str) -> None:
args = self._get_terminal_args()
args.append(url)
self.process = TerminalAutomation.open_browser(args)

def terminate(self):
if self.process is None:
return

TerminalAutomation.terminate_browser(self.process, self._get_terminal_args())
self.process = None

def pre_evaluation_setup(self):
self.__fetch_binary()

@@ -80,11 +97,17 @@ def _get_executable_file_path(self) -> str:
return os.path.join(self.__get_execution_folder_path(), self.binary.executable_name)

@abstractmethod
def _get_terminal_args(self):
def _get_terminal_args(self) -> list[str]:
pass

@abstractmethod
def get_navigation_sleep_duration(self) -> int:
pass

@staticmethod
def get_browser(browser_config: BrowserConfiguration, eval_config: EvaluationConfiguration, state: State) -> Browser:
def get_browser(
browser_config: BrowserConfiguration, eval_config: EvaluationConfiguration, state: State
) -> Browser:
from bci.browser.configuration.chromium import Chromium
from bci.browser.configuration.firefox import Firefox

3 changes: 3 additions & 0 deletions bci/browser/configuration/chromium.py
@@ -32,6 +32,9 @@

class Chromium(Browser):

def get_navigation_sleep_duration(self) -> int:
return 1

def _get_terminal_args(self) -> list[str]:
assert self._profile_path is not None

8 changes: 6 additions & 2 deletions bci/browser/configuration/firefox.py
@@ -2,10 +2,9 @@

from bci import cli
from bci.browser.configuration.browser import Browser
from bci.browser.configuration.options import Default, BlockThirdPartyCookies, PrivateBrowsing, TrackingProtection
from bci.browser.configuration.options import BlockThirdPartyCookies, Default, PrivateBrowsing, TrackingProtection
from bci.browser.configuration.profile import prepare_firefox_profile


SUPPORTED_OPTIONS = [
Default(),
BlockThirdPartyCookies(),
@@ -21,11 +20,15 @@

class Firefox(Browser):

def get_navigation_sleep_duration(self) -> int:
return 2

def _get_terminal_args(self) -> list[str]:
assert self._profile_path is not None

args = [self._get_executable_file_path()]
args.extend(['-profile', self._profile_path])
args.append('-setDefaultBrowser')
user_prefs = []

def add_user_pref(key: str, value: str | int | bool):
@@ -45,6 +48,7 @@ def add_user_pref(key: str, value: str | int | bool):
# add_user_pref('network.proxy.type', 1)

add_user_pref('app.update.enabled', False)
add_user_pref('browser.shell.checkDefaultBrowser', False)
if 'default' in self.browser_config.browser_setting:
pass
elif 'btpc' in self.browser_config.browser_setting:
Empty file.
Binary file added bci/browser/interaction/elements/five.png
Binary file added bci/browser/interaction/elements/four.png
Binary file added bci/browser/interaction/elements/one.png
Binary file added bci/browser/interaction/elements/six.png
Binary file added bci/browser/interaction/elements/three.png
Binary file added bci/browser/interaction/elements/two.png
57 changes: 57 additions & 0 deletions bci/browser/interaction/interaction.py
@@ -0,0 +1,57 @@
import logging
from inspect import signature

from bci.browser.configuration.browser import Browser as BrowserConfig
from bci.browser.interaction.simulation import Simulation
from bci.evaluations.logic import TestParameters

logger = logging.getLogger(__name__)


class Interaction:
browser: BrowserConfig
script: list[str]
params: TestParameters

def __init__(self, browser: BrowserConfig, script: list[str], params: TestParameters) -> None:
self.browser = browser
self.script = script
self.params = params

def execute(self) -> None:
simulation = Simulation(self.browser, self.params)

if self._interpret(simulation):
simulation.sleep(str(self.browser.get_navigation_sleep_duration()))
simulation.navigate('https://a.test/report/?bughog_sanity_check=OK')

def _interpret(self, simulation: Simulation) -> bool:
for statement in self.script:
if statement.strip() == '' or statement[0] == '#':
continue

cmd, *args = statement.split()
method_name = cmd.lower()

if method_name not in Simulation.public_methods:
raise Exception(
f'Invalid command `{cmd}`. Expected one of {", ".join(map(lambda m: m.upper(), Simulation.public_methods))}.'
)

method = getattr(simulation, method_name)
method_params = list(signature(method).parameters.values())

# Allow different number of arguments only for variable argument number (*)
if len(method_params) != len(args) and (len(method_params) < 1 or str(method_params[0])[0] != '*'):
raise Exception(
f'Invalid number of arguments for command `{cmd}`. Expected {len(method_params)}, got {len(args)}.'
)

logger.debug(f'Executing interaction method `{method_name}` with the arguments {args}')

try:
method(*args)
except:
return False

return True
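
The argument-count check above inspects the string form of the first parameter to detect `*args`. As a possible alternative, the sketch below checks `Parameter.kind` instead. The `accepts` helper and the `Demo` class are made up for this example, and parameters with default values are deliberately not handled.

```python
from inspect import Parameter, signature


def accepts(method, args: list[str]) -> bool:
    """Return True if `method` can be called with the given positional arguments."""
    params = list(signature(method).parameters.values())
    if any(p.kind is Parameter.VAR_POSITIONAL for p in params):
        # A `*keys`-style parameter accepts any number of extra arguments.
        required = sum(1 for p in params if p.kind is not Parameter.VAR_POSITIONAL)
        return len(args) >= required
    return len(args) == len(params)


class Demo:
    """Stand-in with the same method shapes as two of the commands."""

    def click_position(self, x: str, y: str): ...

    def hotkey(self, *keys: str): ...


demo = Demo()
print(accepts(demo.click_position, ['100', '50%']))  # True
print(accepts(demo.hotkey, ['ctrl', 'c']))           # True
print(accepts(demo.click_position, ['100']))         # False
```

Since `signature` is applied to a bound method, `self` is not counted among the parameters.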
85 changes: 85 additions & 0 deletions bci/browser/interaction/simulation.py
@@ -0,0 +1,85 @@
import os
from time import sleep

import pyautogui as gui
import Xlib.display
from pyvirtualdisplay.display import Display

from bci.browser.configuration.browser import Browser as BrowserConfig
from bci.evaluations.logic import TestParameters


class Simulation:
browser_config: BrowserConfig
params: TestParameters

public_methods: list[str] = [
'navigate',
'click_position',
'click',
'write',
'press',
'hold',
'release',
'hotkey',
'sleep',
'screenshot',
]

def __init__(self, browser_config: BrowserConfig, params: TestParameters):
self.browser_config = browser_config
self.params = params
disp = Display(visible=True, size=(1920, 1080), backend='xvfb', use_xauth=True)
disp.start()
gui._pyautogui_x11._display = Xlib.display.Display(os.environ['DISPLAY'])

def __del__(self):
self.browser_config.terminate()

def parse_position(self, position: str, max_value: int) -> int:
# Screen percentage
if position[-1] == '%':
return round(max_value * (int(position[:-1]) / 100))

# Absolute value in pixels
return int(position)

# --- PUBLIC METHODS ---
def navigate(self, url: str):
self.browser_config.terminate()
self.browser_config.open(url)
self.sleep(str(self.browser_config.get_navigation_sleep_duration()))

def click_position(self, x: str, y: str):
max_x, max_y = gui.size()

gui.moveTo(self.parse_position(x, max_x), self.parse_position(y, max_y))
gui.click()

def click(self, el_id: str):
el_image_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), f'elements/{el_id}.png')
x, y = gui.locateCenterOnScreen(el_image_path)
self.click_position(str(x), str(y))

def write(self, text: str):
gui.write(text, interval=0.1)

def press(self, key: str):
gui.press(key)

def hold(self, key: str):
gui.keyDown(key)

def release(self, key: str):
gui.keyUp(key)

def hotkey(self, *keys: str):
gui.hotkey(*keys)

def sleep(self, duration: str):
sleep(float(duration))

def screenshot(self, filename: str):
filename = f'{self.params.evaluation_configuration.project}-{self.params.mech_group}-{filename}-{type(self.browser_config).__name__}-{self.browser_config.version}.jpg'
filepath = os.path.join(os.path.dirname(os.path.realpath(__file__)), '../../../logs/screenshots', filename)
gui.screenshot(filepath)
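
The coordinate handling in `parse_position` can be exercised without a display. The sketch below is a standalone variation, not the committed code: the `resolve_position` name, the use of `float` for percentages, and the range check are additions for illustration.

```python
def resolve_position(position: str, max_value: int) -> int:
    """Convert '50%' or '100' into a pixel coordinate along one screen axis."""
    if position.endswith('%'):
        percentage = float(position[:-1])
        if not 0 <= percentage <= 100:
            raise ValueError(f'Percentage out of range: {position}')
        return round(max_value * percentage / 100)
    # Anything else is treated as an absolute value in pixels.
    return int(position)


# Assuming the 1920x1080 virtual display configured in `Simulation.__init__`.
print(resolve_position('100', 1920))  # 100
print(resolve_position('50%', 1080))  # 540
```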