Nan in party list causes an IndexError with predict_partyHistory() #1

Open
antoinecomp opened this issue May 6, 2020 · 0 comments

antoinecomp commented May 6, 2020

There is a NaN in the party list, which causes an IndexError when calculating the probability based on party history. When predict_partyHistory() iterates over the list of parties to make a prediction from each party's history in previous polls, one of the entries in that list is NaN:

C:\ProgramData\Anaconda3\envs\homework3\python.exe C:/Users/antoi/Documents/Programming/GE2018/main.py
Starting..
...
  0%|          | 0/270 [00:00<?, ?it/s]
...
 72%|███████▏  | 194/270 [00:41<00:16,  4.73it/s]
Traceback (most recent call last):
  File "C:/Users/antoi/Documents/Programming/GE2018/main.py", line 47, in <module>
    data1 = compare_methods("L2-EX")
  File "C:\Users\antoi\Documents\Programming\GE2018\comparison.py", line 118, in compare_methods
    party_wise_result, seat_wise_result = final_model(paras[:12])
  File "C:\Users\antoi\Documents\Programming\GE2018\model.py", line 55, in final_model
    candidate_prob += para6*np.array(predict_partyHistory(current_constituency_data))
  File "C:\Users\antoi\Documents\Programming\GE2018\predict.py", line 124, in predict_partyHistory
    votes = party_prob[0]
IndexError: list index out of range

Process finished with exit code 1

Unlike all the other iterations, this one has a NaN in its list of parties:

list_parties:  ['National Party', 'MMA', 'Allah-o-Akbar Tehreek', 'IND', 'PML-N', nan, 'IND', 'APML', 'PPPP', 'TLP', 'PTI']

The corresponding party_prob is empty when you look it up for the NaN party:

    for party in list_parties:
        party_prob = df_probability[df_probability["Party"] == party]["Probability"].tolist()
        # if party is in gallup survey or it has zero rating
        is_in_history = (df_probability[df_probability["Party"].isin([party])].index).tolist()
        # if party is in gallup (not not empty list is false)
        if( not not is_in_history ):
            votes = party_prob[0]

Printing the intermediate values for that iteration gives:

party:  nan
df_probability[df_probability["Party"] == party]: 
 Empty DataFrame
Columns: [Party, Probability, Unnamed: 2, Unnamed: 3, Unnamed: 4, Unnamed: 5, Unnamed: 6]
Index: []
party_prob:  []
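
A possible explanation, sketched below on made-up data: an `==` comparison never matches NaN, while `isin()` does match NaN against NaN. If `df_probability` contains a row with a missing Party (an assumption on my part, suggested by the `Unnamed:` columns above), `is_in_history` would be non-empty while `party_prob` stays empty, which would explain why `party_prob[0]` is reached and fails. The DataFrame contents here are hypothetical and only illustrate the mechanism.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_probability; the NaN row mimics a blank line in the survey file.
df_probability = pd.DataFrame(
    {"Party": ["PTI", "PML-N", np.nan], "Probability": [0.4, 0.3, np.nan]}
)

party = np.nan

# Equality comparison: NaN never compares equal to NaN, so no row is selected.
party_prob = df_probability[df_probability["Party"] == party]["Probability"].tolist()

# isin() treats NaN as matching NaN, so the row with the missing Party is selected.
is_in_history = df_probability[df_probability["Party"].isin([party])].index.tolist()

print("party_prob:", party_prob)
print("is_in_history:", is_in_history)
```

Under that assumption the `if( not not is_in_history )` check passes for the NaN party even though there is nothing in `party_prob` to index.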

My attempt

If I filter the NaN entries out of the party list, I get a ValueError instead. I tried:

    # find probability of winning for each candidate from gallup survey
    candidate_prob = []
    list_parties = [x for x in list_parties if str(x) != 'nan']
    for party in list_parties:

But got:

C:\ProgramData\Anaconda3\envs\homework3\python.exe C:/Users/antoi/Documents/Programming/GE2018/main.py
Starting..
....
  0%|          | 1/270 [00:00<00:47,  5.72it/s]C:\Users\antoi\Documents\Programming\GE2018\predict.py:135: RuntimeWarning: divide by zero encountered in double_scalars
  prob_extra = 0.5*float(remaining_prob/remaining_candidates)
C:\Users\antoi\Documents\Programming\GE2018\predict.py:174: RuntimeWarning: divide by zero encountered in double_scalars
  prob_extra = 0.5*float(remaining_prob/remaining_candidates)
  7%|▋         | 19/270 [00:03<00:46,  5.45it/s]C:\Users\antoi\Documents\Programming\GE2018\predict.py:57: RuntimeWarning: divide by zero encountered in double_scalars
  prob_extra = 0.5*float(remaining_prob/remaining_candidates)
 72%|███████▏  | 194/270 [00:37<00:14,  5.12it/s]
Traceback (most recent call last):
  File "C:/Users/antoi/Documents/Programming/GE2018/main.py", line 47, in <module>
    data1 = compare_methods("L2-EX")
  File "C:\Users\antoi\Documents\Programming\GE2018\comparison.py", line 118, in compare_methods
    party_wise_result, seat_wise_result = final_model(paras[:12])
  File "C:\Users\antoi\Documents\Programming\GE2018\model.py", line 55, in final_model
    candidate_prob += para6*np.array(predict_partyHistory(current_constituency_data))
ValueError: operands could not be broadcast together with shapes (11,) (10,) (11,) 
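
The shapes in this ValueError suggest that dropping the NaN entry makes predict_partyHistory() return 10 values while final_model still expects one per candidate (11). A minimal sketch of an alternative, assuming it is acceptable to give a NaN or unknown party a placeholder probability instead of removing it: the helper names `party_history_probs` and `safe_extra_share` and the default of 0.0 are hypothetical, and this is not the repository's actual logic. The second helper guards the `remaining_prob/remaining_candidates` division that triggers the RuntimeWarning above.

```python
import numpy as np
import pandas as pd


def party_history_probs(list_parties, df_probability, default=0.0):
    """Return one probability per entry of list_parties, in the same order."""
    # Build a party -> probability lookup, ignoring rows with a missing Party.
    known = df_probability.dropna(subset=["Party"]).drop_duplicates("Party")
    lookup = dict(zip(known["Party"], known["Probability"]))

    probs = []
    for party in list_parties:
        if pd.isna(party) or party not in lookup:
            # Placeholder keeps the output aligned with the candidate list.
            probs.append(default)
        else:
            probs.append(float(lookup[party]))
    return probs


def safe_extra_share(remaining_prob, remaining_candidates):
    """Half of the per-candidate share of the leftover probability, or 0.0 if no candidates remain."""
    if remaining_candidates == 0:
        return 0.0
    return 0.5 * float(remaining_prob / remaining_candidates)


# Made-up data for illustration only.
df_probability = pd.DataFrame(
    {"Party": ["PTI", "PML-N", np.nan], "Probability": [0.4, 0.3, np.nan]}
)
list_parties = ["PTI", np.nan, "IND", "PML-N"]

result = np.array(party_history_probs(list_parties, df_probability))
print(len(result) == len(list_parties))
```

Because the output length always equals `len(list_parties)`, the `candidate_prob += para6*np.array(...)` line in model.py would no longer hit a shape mismatch; whether 0.0 is the right value for a candidate with no party is a modelling decision for the maintainers.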
@antoinecomp antoinecomp changed the title Nan in party list causes an IndexError with predict_partyHistory Nan in party list causes an IndexError with predict_partyHistory() May 6, 2020