Filtering by condition in Python dataset
I'm struggling with a filtering operation on a Stata file in Python 3. I was asked to keep only the households without kids from the dataset/table.
I used a filtering condition to filter these rows out of the table:
filtering_condition = df["kids"] > 0
df_nokids = df.loc[filtering_condition,"kids"]
This, however, gives me an error I don't understand:
KeyError                                  Traceback (most recent call last)
/opt/anaconda/anaconda3/lib/python3.5/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
   1944             try:
-> 1945                 return self._engine.get_loc(key)
   1946             except KeyError:

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)()
pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4018)()
pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368)()
pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322)()

KeyError: 'kids'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-321-e72cd8a67065> in <module>()
      1 #keep households without kids , use dataset rest of assignment
----> 2 filtering_condition = df["kids"] > 0
      3 df_nokids = df.loc[filtering_condition,"kids"]

/opt/anaconda/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in __getitem__(self, key)
   1995             return self._getitem_multilevel(key)
   1996         else:
-> 1997             return self._getitem_column(key)
   1998
   1999     def _getitem_column(self, key):

/opt/anaconda/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in _getitem_column(self, key)
   2002         # column
   2003         if self.columns.is_unique:
-> 2004             return self._get_item_cache(key)
   2005
   2006         # duplicate columns & possible reduce dimensionality

/opt/anaconda/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
   1348         res = cache.get(item)
   1349         if res is None:
-> 1350             values = self._data.get(item)
   1351             res = self._box_item_values(item, values)
   1352             cache[item] = res

/opt/anaconda/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py in get(self, item, fastpath)
   3288
   3289             if not isnull(item):
-> 3290                 loc = self.items.get_loc(item)
   3291             else:
   3292                 indexer = np.arange(len(self.items))[isnull(self.items)]

/opt/anaconda/anaconda3/lib/python3.5/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
   1945                 return self._engine.get_loc(key)
   1946             except KeyError:
-> 1947                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   1948
   1949         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)()
pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4018)()
pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368)()
pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322)()

KeyError: 'kids'
Any explanation of what I am doing wrong?
Thanks!
Do you mean this:
df_kids = df[df['kids']>0]
This selects the rows where the 'kids' column is greater than zero.
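Two further points may be relevant. The KeyError: 'kids' is raised while evaluating df["kids"], which usually means no column has exactly that label. And a condition of > 0 keeps the households that do have kids, so the condition for households without kids would be == 0. Below is a minimal sketch of checking the column labels and then filtering. The small DataFrame, the hh_id column and the " Kids" spelling are made up for illustration, and the guess that the mismatch is only whitespace or capitalisation is an assumption, so check what the first print shows for your own file.

```python
import pandas as pd

# Hypothetical stand-in for the Stata data; a real file would be
# loaded with pd.read_stata(...) instead.
df = pd.DataFrame({"hh_id": [1, 2, 3, 4], " Kids": [0, 2, 0, 1]})

# KeyError: 'kids' means no column has exactly that label,
# so look at the actual labels first.
print(df.columns.tolist())

# Assumption: the mismatch is only stray whitespace or capital letters.
df.columns = df.columns.str.strip().str.lower()

# Keep whole rows for households without kids.
df_nokids = df[df["kids"] == 0]
print(df_nokids)
```

Note that df.loc[filtering_condition, "kids"] would return only the 'kids' column for the matching rows, while df[filtering_condition] keeps every column.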