Number of rounds filter #10600
Replies: 2 comments 1 reply
-
It sounds like you’ve been working on a great project to analyze competition data; a website redesign breaking a scraper is a common challenge. Here's how you might approach it:

1. Quick Fix for the Scraper
If the frontend has changed but the underlying data structure (API or page source) hasn’t, you might be able to update your script. Use tools like requests and BeautifulSoup:

```python
import requests
from bs4 import BeautifulSoup

# Example scraping process
url = "https://example.com/competitions"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

# Update selectors based on the new frontend
rounds_data = soup.find_all("div", class_="round-info")  # Update selector logic
for round_info in rounds_data:
    print(round_info.text)
```

2. Requesting API Features
You could request the website admins to:
This is beneficial for everyone and reduces strain from scrapers.

3. Using JavaScript Scraping (If API Isn't Available)
If the site uses JavaScript to render data dynamically, tools that only fetch the static HTML (such as requests) won’t see that data. In that case:

Example with Selenium:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/competitions")

# Adjust based on site elements
# (find_elements_by_class_name was removed in Selenium 4; use By locators instead)
round_elements = driver.find_elements(By.CLASS_NAME, "round-info")
for round_elem in round_elements:
    print(round_elem.text)
driver.quit()
```

4. Rebuild the Script for Your Specific Use Case
If your original goal was identifying competitions with top solvers:

5. Long-Term Solution: Extend API Usage
If you find yourself frequently relying on such data, consider:

6. Troubleshooting New Frontend
If the site now uses a more complex frontend:
-
Hi, you can do what you intend by reading the WCIF for the respective competition. The events block shows how many rounds there are per event, so there is no need to scrape the website. A Python script builds the following page each day for all upcoming competitions: WCA-Rounds-at-Comps. You can click the column headers to sort. Bonus: the WCIF can also be used to compare psych sheets. Locally I used the following path from the WCIF JSON: persons -> registration -> eventIds you're interested in -> personalBests
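
As a rough illustration of the two lookups above (rounds per event from the events block, and a psych sheet from persons -> registration -> eventIds -> personalBests), here is a minimal sketch. It is not the script that builds the page mentioned above. It assumes the WCIF JSON has already been loaded into a Python dict, and that the field names (`events`, `rounds`, `persons`, `registration`, `eventIds`, `personalBests`, `eventId`, `type`, `best`) match the WCIF you receive. The helper names and the sample data are hypothetical.

```python
def rounds_per_event(wcif):
    """Map each event id to its number of rounds, read from the 'events' block."""
    return {
        event["id"]: len(event.get("rounds") or [])
        for event in wcif.get("events", [])
    }


def events_with_round_count(wcif, wanted_rounds):
    """Return the ids of events that have exactly `wanted_rounds` rounds."""
    counts = rounds_per_event(wcif)
    return sorted(eid for eid, n in counts.items() if n == wanted_rounds)


def psych_sheet(wcif, event_id, result_type="single"):
    """List (best, name) for people registered in `event_id`, best result first."""
    rows = []
    for person in wcif.get("persons", []):
        registration = person.get("registration") or {}
        if event_id not in registration.get("eventIds", []):
            continue  # not registered for this event
        for pb in person.get("personalBests") or []:
            if pb.get("eventId") == event_id and pb.get("type") == result_type:
                rows.append((pb["best"], person.get("name", "")))
    return sorted(rows)


# Hypothetical, heavily trimmed WCIF used only to exercise the helpers.
SAMPLE_WCIF = {
    "id": "ExampleOpen2025",
    "events": [
        {"id": "333", "rounds": [{"id": "333-r1"}, {"id": "333-r2"}, {"id": "333-r3"}]},
        {"id": "222", "rounds": [{"id": "222-r1"}]},
    ],
    "persons": [
        {
            "name": "Alice",
            "registration": {"eventIds": ["333", "222"]},
            "personalBests": [{"eventId": "333", "type": "single", "best": 650}],
        },
        {
            "name": "Bob",
            "registration": {"eventIds": ["333"]},
            "personalBests": [{"eventId": "333", "type": "single", "best": 580}],
        },
        {
            "name": "Carol",
            "registration": {"eventIds": ["222"]},
            "personalBests": [{"eventId": "333", "type": "single", "best": 500}],
        },
        {"name": "Dave", "registration": None, "personalBests": []},
    ],
}

print(rounds_per_event(SAMPLE_WCIF))
print(events_with_round_count(SAMPLE_WCIF, 3))
print(psych_sheet(SAMPLE_WCIF, "333"))
```

Running the same helpers over the WCIF of each upcoming competition would give the "filter by number of rounds" view without scraping any HTML.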
-
I’ve always been interested in finding out how many rounds there are in a competition for a specific event. The only way to do this is by checking each event individually. I once wrote a Python script to scrape data from the website, allowing me to quickly calculate and identify events with a specific number of rounds. After the site frontend change it doesn't work anymore.
It would be very useful to have a filter for the number of rounds on the competition page and directly in the API, so that the information can be reused in other scripts (my specific one found the comps with registered world-class solvers, based on top-100 results in a specific event).