no definition of 'val_files' in the data_processing.py #31

Open
despairisa opened this issue Apr 18, 2024 · 1 comment

I ran the preprocessing script data_processing.py, which is in the CUB folder, with Python, and it fails with a strange error. The command and the error message are below.

Code:

python3 data_processing.py -data_dir datasets/CUB_200_2011 -save_dir CUB_processed

Error:

Number of train images from official train test split: 5994
Traceback (most recent call last):
  File "/home/chiayilai/Desktop/ConceptBottleneck-master/CUB/data_processing.py", line 85, in <module>
    train_data, val_data, test_data = extract_data(args.data_dir)
  File "/home/chiayilai/Desktop/ConceptBottleneck-master/CUB/data_processing.py", line 64, in extract_data
    if val_files is not None:
NameError: name 'val_files' is not defined

The strange thing is that I can’t find a definition of “val_files” anywhere in data_processing.py, and the readme file doesn’t mention it either. I tried adding a new folder named “val_files”, but that made no difference. I then renamed “val_files” to “val_data” in the script. With that change it successfully produced three .pkl files, but I don’t know whether this is the correct fix.


ellemcfarlane commented Nov 9, 2024

Hey, I'm not an author, but it looks like val_files was already undefined in the original commit (i.e. it wasn't accidentally deleted later on), so I'm not sure what it's doing there: 9f6ee6a#diff-2dd9fd989bcf4f6b4bdf75e647ebf91b6d0dc6ae23130e1b38fbfafddb067a5cR65

But they end up returning this anyway:

train_data = train_val_data[split :]
val_data = train_val_data[: split]

so you can probably just comment out all of these lines:

if val_files is not None:
    if img_path in val_files:
        val_data.append(metadata)
    else:
        train_data.append(metadata)

That's what I did anyway. Output from my run:

(python36-env) bash-5.1$ python3 data_processing.py -save_dir full_data -data_dir CUB_200_2011
Number of train images from official train test split: 5994
Size of train set: 4796
Processing train set
Processing val set
Processing test set
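
For anyone wondering why dropping the val_files branch should be harmless, here is a minimal sketch of the slicing quoted above. It is not the repository's code: the helper name split_train_val, the shuffle step, the seed, and val_ratio=0.2 are all assumptions. The 0.2 ratio is picked only because it reproduces the numbers in the log (5994 official train images, 4796 in the final train set).

```python
import random


def split_train_val(train_val_data, val_ratio=0.2, seed=0):
    """Hypothetical helper: shuffle a copy, then slice off the front as validation.

    val_ratio=0.2 is an assumption inferred from the log above, not a value
    confirmed by the repository.
    """
    items = list(train_val_data)          # copy so the caller's list is untouched
    random.Random(seed).shuffle(items)    # assumed shuffle; seed is illustrative
    split = int(val_ratio * len(items))
    train_data = items[split:]            # same slicing as the lines quoted above
    val_data = items[:split]
    return train_data, val_data


# Stand-in metadata records; the real script stores per-image metadata instead.
records = [{"id": i} for i in range(5994)]
train_data, val_data = split_train_val(records)
print(len(train_data), len(val_data))  # 4796 1198
```

Because train_data and val_data are assigned from slices of train_val_data at the end, anything appended to them earlier inside the val_files branch would be discarded, which is consistent with the branch being safe to comment out.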
