Skip to content

Conversation

@BrknRobot
Copy link
Contributor

@BrknRobot BrknRobot commented Aug 8, 2025

Goals

  • Centralize file read logic
  • Create a single owner of file content
  • Remove duplication in the path table
  • Preserve tracking of overridden files to enable future checks against missing items to determine which file overwrite is to blame

@amtep
Copy link
Owner

amtep commented Aug 9, 2025

This PR gives a significant speedup (6.5s to 5.0s for ck3) and I can't figure out why.

@amtep
Copy link
Owner

amtep commented Aug 9, 2025

There's something wrong with the loca parsing, which is probably why it's faster. There are no reports about localization files in the output.

@amtep
Copy link
Owner

amtep commented Aug 9, 2025

I got as far as discovering that the dummy localization files from mod-test-vanilla (such as dummy_l_english.yml) are never passed to the localization FileHandler. This leaves mod_langs empty.

@BrknRobot
Copy link
Contributor Author

Oop, I'll track that down. Any design concerns?

@amtep
Copy link
Owner

amtep commented Aug 9, 2025

I was concerned about possible slowdown due to all the locking around Arc<FileEntry>, which is why I tried to benchmark it :)

In one of my early designs before the PathTable, I tried to make pathname in Loc an Arc, and it resulted in significant slowdown.

Other than that, I'll have to read it a few more times.

@BrknRobot
Copy link
Contributor Author

BrknRobot commented Aug 10, 2025

I've made sure that the Arc is only cloned when needed. Most places pass the file entry around as &Arc<FileEntry> which allows the callee to decide if they need to increase the ref count, and only one place outside of building the fileset actually does this. Dereferencing to just read a field doesn't require any atomic operations.

My most recent tests show very similar performance to before this change. Some area's have improved (notably Errors.cache.get_line()) while others have regressed slightly. Though as I continue to cleanup the change set, there seem to be fewer regressions. I keep finding extra PathBuf clones that don't need to happen.

Clean up Fileset construction and fix bug

Don't bother normalising paths

fix tests
Copy link
Owner

@amtep amtep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly comment changes, but the use of PathIdx::UNTRACKED needs to be revisited. The entries that use it can definitely have errors reported. The launcher file via errors emitted by the parser, and the internal strings sometimes have errors that I need to fix. It would be awkward to not get those errors.

/// and examine the mod files.
///
/// `mod_root` is the path to the mod files. The config file will also be looked for there.
///
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function comment needs updating now

#[derive(Debug)]
pub struct FileContent {
/// Full content of the file. It must never be moved or modified.
// full: Cow<'static, str>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stray commented-out line

impl FileContent {
pub fn new(mut s: String) -> Self {
s.shrink_to_fit();
//Self { full: Cow::from(s), lines: OnceLock::new() }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray commented-out line

}

pub fn new_static(s: &'static str) -> Self {
//Self { full: Cow::from(s), lines: OnceLock::new() }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray commented-out line


impl FileDb {
/// Gets a reference to a file entry.
/// If the fileset has not been finalized, a new entry will be created and returned if needed.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"If the fileset has not been finalized" is no longer accurate

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, "gets a reference" was a bit confusing because I read it as "receives" but you meant "fetches"

}
}

/// # Panics
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

let msg = format!("file in unexpected directory {}", dirname.display());
let info = format!("did you mean common/{} ?", dirname.display());
err(ErrorKey::Filename).msg(msg).info(info).loc(entry).push();
err(ErrorKey::Filename).msg(msg).info(info).loc(&**entry).push();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the &** still needed? You added a From for the Arc

}

/// # Panics
/// Will panic if a lock on self.used can not be obtained
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is inaccurate :) See above

Comment on lines +248 to +249
self.db.files.entry(entry.path().to_path_buf()).or_insert_with(|| {
FileEntry::new(inner_path.to_path_buf(), kind, entry.path().to_path_buf())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling entry.path().to_path_buf() twice here

let content = leak(content);
store_source_file(entry.fullpath().to_path_buf(), &content[offset..]);
parse_pdx(entry, &content[offset..], parser)
pub fn parse_pdx_file(entry: &FileEntry, parser: &ParserMemory) -> Option<Block> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this function can be removed and the callers just call parse_pdx

@amtep
Copy link
Owner

amtep commented Aug 12, 2025

Btw I've been thinking that the internal pdx script may have been a mistake and a table-based approach would be better. But getting rid of them is a longer term task.

(Reasoning: errors in the pdx script cause runtime failures, which I mjght not catch before release. This has happened once.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants