KeyError: "['QN_9' 'RF_TU' 'eor'] not found in axis" in execution of "3-dwd_konverter_build_df.ipynb" #27

Open
slowtoaccept opened this issue Apr 8, 2021 · 5 comments

@slowtoaccept

As instructed, all ipynb files run in sequence.

'Finished file: import\produkt_tu_stunde_20190409_20201231_00096.txt'
'This is file 10'
'Shape of the main_df is: (851261, 1)'

KeyError Traceback (most recent call last)
in
25 df = pd.read_csv(file, delimiter=";")
26 # Prepare the df befor merging (Drop obsolete, convert to datetime, filter to date, set index)
---> 27 df.drop(columns=obsolete_columns, inplace=True)
28 df["MESS_DATUM"] = pd.to_datetime(df["MESS_DATUM"], format="%Y%m%d%H")
29 df = df[df['MESS_DATUM']>= "2007-01-01"]

~\Anaconda3\envs\tide\lib\site-packages\pandas\core\frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
4306 weight 1.0 0.8
4307 """
-> 4308 return super().drop(
4309 labels=labels,
4310 axis=axis,

~\Anaconda3\envs\tide\lib\site-packages\pandas\core\generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
4151 for axis, labels in axes.items():
4152 if labels is not None:
-> 4153 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
4154
4155 if inplace:

~\Anaconda3\envs\tide\lib\site-packages\pandas\core\generic.py in _drop_axis(self, labels, axis, level, errors)
4186 new_axis = axis.drop(labels, level=level, errors=errors)
4187 else:
-> 4188 new_axis = axis.drop(labels, errors=errors)
4189 result = self.reindex(**{axis_name: new_axis})
4190

~\Anaconda3\envs\tide\lib\site-packages\pandas\core\indexes\base.py in drop(self, labels, errors)
5589 if mask.any():
5590 if errors != "ignore":
-> 5591 raise KeyError(f"{labels[mask]} not found in axis")
5592 indexer = indexer[~mask]
5593 return self.delete(indexer)

KeyError: "['QN_9' 'RF_TU' 'eor'] not found in axis"

@chris1610
Owner

It looks like those columns are not in your data set. Since you're trying to drop them, it shouldn't matter.

You could try replacing the drop code with this:

df.drop(columns=obsolete_columns, inplace=True, errors='ignore')

This will tell pandas to ignore the error that's being raised because the columns are not in the DataFrame.
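
A minimal sketch of the difference, assuming a small made-up DataFrame (the sample row and the contents of `obsolete_columns` are assumptions based on the column names in the error message, not the notebook's actual data). By default `drop` raises `KeyError` for labels that are missing, while `errors='ignore'` skips them:

```python
import pandas as pd

# Hypothetical frame that lacks the columns we want to drop.
df = pd.DataFrame(
    {"STATIONS_ID": [3], "MESS_DATUM": [1950040101], "TT_TU": [5.7]}
)
# Assumed to match the notebook's list, judging by the error message.
obsolete_columns = ["QN_9", "RF_TU", "eor"]

# Default behaviour: missing labels raise KeyError.
try:
    df.drop(columns=obsolete_columns)
    raised = False
except KeyError:
    raised = True

# With errors="ignore", missing labels are skipped and nothing is raised.
cleaned = df.drop(columns=obsolete_columns, errors="ignore")
print(raised, list(cleaned.columns))
```

Keep in mind that `errors='ignore'` only suppresses the symptom. If none of the columns to drop are present, the file was probably not parsed the way the notebook expects.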

@slowtoaccept
Author

Hi Chris
I ran the example code as provided, without modifications. I then applied your suggested change (line 28) and got a different error, shown below. I'm not an experienced "Pandite" and am relying only on the provided code.
Thanks for your help

'Finished file: import\produkt_tu_stunde_20190409_20201231_00096.txt'
'This is file 10'
'Shape of the main_df is: (851261, 1)'

KeyError Traceback (most recent call last)
~\Anaconda3\envs\tide\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3079 try:
-> 3080 return self._engine.get_loc(casted_key)
3081 except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'MESS_DATUM'

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
in
27 # df.drop(columns=obsolete_columns, inplace=True)
28 df.drop(columns=obsolete_columns, inplace=True, errors='ignore')
---> 29 df["MESS_DATUM"] = pd.to_datetime(df["MESS_DATUM"], format="%Y%m%d%H")
30 df = df[df['MESS_DATUM']>= "2007-01-01"]
31 df.set_index(['MESS_DATUM', 'STATIONS_ID'], inplace=True)

~\Anaconda3\envs\tide\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
3022 if self.columns.nlevels > 1:
3023 return self._getitem_multilevel(key)
-> 3024 indexer = self.columns.get_loc(key)
3025 if is_integer(indexer):
3026 indexer = [indexer]

~\Anaconda3\envs\tide\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3080 return self._engine.get_loc(casted_key)
3081 except KeyError as err:
-> 3082 raise KeyError(key) from err
3083
3084 if tolerance is not None:

KeyError: 'MESS_DATUM'

@chris1610
Owner

Hmm. I'm not sure what's going on. It's likely there's an error earlier in the script and the files are not being downloaded or processed properly. You should look at the downloaded files and make sure they are placed in the correct directories and have the right content.

I realize that's a little vague for a new user, but I think it's likely something changed and the files are stored differently.

@slowtoaccept
Author

Hi Chris
Here's a snippet from the imported file list. All have a MESS_DATUM column. Is MESS_DATUM format the problem? It is rejected by df["MESS_DATUM"] = pd.to_datetime(df["MESS_DATUM"], format="%Y%m%d%H")
17 2 Dir(s) 434,812,313,600 bytes...
   STATIONS_ID  MESS_DATUM  QN_9  TT_TU  RF_TU  eor
0            3  1950040101     5    5.7   83.0  eor
1            3  1950040102     5    5.6   83.0  eor
2            3  1950040103     5    5.5   83.0  eor
3            3  1950040104     5    5.5   83.0  eor
4            3  1950040105     5    5.8   85.0  eor

@chris1610
Owner

I re-ran this on my local machine and the file I see looks like yours so I think the date format is ok.
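
As a quick sanity check, here is a small sketch that parses the `MESS_DATUM` values from the snippet above with the notebook's format string. The explicit `astype(str)` cast is my own addition so the sketch does not depend on how pandas handles integer input. Note also that the traceback shows the `KeyError` being raised in `__getitem__`, that is, while looking up the column, before `pd.to_datetime` is ever called:

```python
import pandas as pd

# Values copied from the posted snippet; read_csv would load them as integers.
sample = pd.Series([1950040101, 1950040102, 1950040103])

# Cast to str explicitly (an assumption of this sketch), then parse as
# year, month, day, hour.
parsed = pd.to_datetime(sample.astype(str), format="%Y%m%d%H")
print(parsed)
```

If this parses cleanly, the format string is fine and the problem is that some DataFrame has no `MESS_DATUM` column at all.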

Is it possible that there is an extra file in your import directory? Look at each of the files and make sure they are all formatted the same.
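
One way to check this without opening every file by hand is sketched below. It reads only the first line of each file in a folder and reports any file whose header is missing one of the expected columns. The function names, the `EXPECTED_COLUMNS` set (taken from the snippet posted above), and the `;` delimiter are assumptions for illustration, not part of the notebook:

```python
from pathlib import Path

# Column names as they appear in the snippet posted in this thread.
EXPECTED_COLUMNS = {"STATIONS_ID", "MESS_DATUM", "QN_9", "TT_TU", "RF_TU", "eor"}


def read_header(path, delimiter=";"):
    """Return the raw column names from the first line of a text file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        first_line = handle.readline()
    # Names are deliberately not stripped, so padded names such as
    # " MESS_DATUM" are reported as mismatches too.
    return first_line.rstrip("\r\n").split(delimiter)


def find_unexpected_files(folder, expected=EXPECTED_COLUMNS):
    """Map each file lacking an expected column to the header it actually has."""
    mismatched = {}
    for path in sorted(Path(folder).glob("*")):
        if not path.is_file():
            continue
        header = read_header(path)
        if not expected.issubset(header):
            mismatched[path.name] = header
    return mismatched
```

Calling `find_unexpected_files("import")` (assuming the files live in a folder named `import`, as the log output suggests) returns a dictionary; an empty result means every header contains the expected columns.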
