Release v0.11.1 #248

Closed
sundarshankar89
Collaborator

  • Expose the number of available CPUs for concurrent processing (#244). The library now provides a method that reports how many logical CPUs are available to the current process. It attempts several approaches: the os module's process_cpu_count attribute, the sched_getaffinity function on Linux systems, or the total number of CPUs in the system, falling back to 1 if all else fails. Parallel task execution now uses this count to choose the number of threads automatically unless a thread count is specified explicitly, which makes concurrent processing more accurate and flexible, particularly in containerized environments with imposed CPU quotas.
  • Improve support for reading text files that contain a Unicode BOM at the start (#243). Text file reading now handles a Unicode byte order mark (BOM) at the start of local and Workspace files. Two new methods, _detect_encoding_bom and decode_with_bom, detect the encoding from the BOM and decode the text accordingly. The new _read_text_from_binary_io and read_text methods read text from binary IO and from file paths, respectively, taking BOMs and non-seekable files into account. The existing open method now uses decode_with_bom when opening files in text mode. The read_text function handles UTF-8, UTF-16 LE, UTF-16 BE, UTF-32 LE, and UTF-32 BE BOMs, supports sized reads, and raises a ValueError for non-seekable files when a size is specified.
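The CPU-count fallback chain described for #244 could look roughly like the sketch below. This is not the library's code: the function name `available_cpu_count` is hypothetical, and the order of the attempts is an assumption based on the description. Only standard-library calls are used; `os.process_cpu_count` exists on Python 3.13 and later, and `os.sched_getaffinity` is only present on some platforms such as Linux.

```python
import os


def available_cpu_count() -> int:
    """Best-effort count of logical CPUs usable by this process.

    Hypothetical helper for illustration; not the library's actual API.
    """
    # Python 3.13+: count of logical CPUs usable by the current process.
    process_cpu_count = getattr(os, "process_cpu_count", None)
    if process_cpu_count is not None:
        count = process_cpu_count()
        if count:
            return count

    # Linux and some other Unix systems: size of the CPU affinity set.
    sched_getaffinity = getattr(os, "sched_getaffinity", None)
    if sched_getaffinity is not None:
        try:
            return max(1, len(sched_getaffinity(0)))
        except OSError:
            pass  # Fall through to the system-wide count.

    # Total CPUs in the system; os.cpu_count() may return None.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"Using up to {available_cpu_count()} worker threads")
```

A caller would typically pass the result as the default worker count for a thread pool, and let an explicitly supplied thread count override it.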

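As a rough illustration of the BOM handling described for #243, the sketch below detects a BOM in a byte string and decodes the remainder. The names `detect_bom` and `decode_bytes_with_bom` are hypothetical and are not the library's `_detect_encoding_bom` or `decode_with_bom`; the UTF-8 default for files without a BOM is also an assumption. One detail worth noting: the UTF-32 LE BOM starts with the same two bytes as the UTF-16 LE BOM, so the longer marks have to be checked first.

```python
import codecs
from typing import Optional, Tuple

# UTF-32 entries come before UTF-16 because BOM_UTF32_LE begins with BOM_UTF16_LE.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def detect_bom(data: bytes) -> Tuple[Optional[str], int]:
    """Return (encoding, BOM length), or (None, 0) if no BOM is present."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0


def decode_bytes_with_bom(data: bytes, default_encoding: str = "utf-8") -> str:
    """Decode bytes using the BOM-indicated encoding, stripping the BOM itself."""
    encoding, bom_length = detect_bom(data)
    return data[bom_length:].decode(encoding or default_encoding)


if __name__ == "__main__":
    sample = codecs.BOM_UTF16_LE + "hello".encode("utf-16-le")
    print(decode_bytes_with_bom(sample))
```

Because detection needs the first few bytes before decoding can start, a reader over a non-seekable stream cannot peek and rewind, which is consistent with the note that sized reads on non-seekable files raise a `ValueError`.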

✅ 40/40 passed, 1 flaky, 2 skipped, 2m32s total

Flaky tests:

  • 🤪 test_upgrades_works (10.012s)

Running from acceptance #325

asnare added a commit to databrickslabs/lsql that referenced this pull request Jun 25, 2025
The 26.30.0 release of sqlglot introduced a breaking change that affects
our tests; this PR is a quick-fix to prevent that version from being
used.

This PR also includes type-hinting fixes that newer versions of mypy
need, along with accompanying fixes for issues that the improved
type-hints expose.

For now this is intended to:

 - Unblock databrickslabs/blueprint#248.
 - Supersede #409.
@sundarshankar89 sundarshankar89 deleted the prepare/0.11.1 branch June 25, 2025 08:40
@asnare asnare removed the request for review from nfx June 25, 2025 08:41