Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error in DecisionTree #58

Open
Gewissta opened this issue Aug 27, 2020 · 0 comments
Open

error in DecisionTree #58

Gewissta opened this issue Aug 27, 2020 · 0 comments
Labels
bug Something isn't working

Comments

@Gewissta
Copy link

data = pd.read_csv('Data/Bankloan.csv', sep=';')
for i in ['debtinc', 'creddebt', 'othdebt']:
data[i] = data[i].str.replace(',', '.').astype('float')
train, test, y_train, y_test = train_test_split(data.drop('default', axis=1),
data['default'],
test_size=0.3,
stratify=data['default'],
random_state=42)
X_train = pd.get_dummies(train)
X_test = pd.get_dummies(test)
tree = DecisionTree(seed=42, max_depth=4, n_feats=2)
tree.fit(X_train.values, y_train.values)


ValueError Traceback (most recent call last)
in
1 tree = DecisionTree(seed=42, max_depth=4, n_feats=2)
----> 2 tree.fit(X_train.values, y_train.values)

in fit(self, X, Y)
78 self.n_classes = max(Y) + 1 if self.classifier else None
79 self.n_feats = X.shape[1] if not self.n_feats else min(self.n_feats, X.shape[1])
---> 80 self.root = self._grow(X, Y)
81
82 def predict(self, X):

in _grow(self, X, Y, cur_depth)
138
139 # grow the children that result from the split
--> 140 left = self._grow(X[l, :], Y[l], cur_depth)
141 right = self._grow(X[r, :], Y[r], cur_depth)
142 return Node(left, right, (feat, thresh))

in _grow(self, X, Y, cur_depth)
139 # grow the children that result from the split
140 left = self._grow(X[l, :], Y[l], cur_depth)
--> 141 right = self._grow(X[r, :], Y[r], cur_depth)
142 return Node(left, right, (feat, thresh))
143

in _grow(self, X, Y, cur_depth)
139 # grow the children that result from the split
140 left = self._grow(X[l, :], Y[l], cur_depth)
--> 141 right = self._grow(X[r, :], Y[r], cur_depth)
142 return Node(left, right, (feat, thresh))
143

in _grow(self, X, Y, cur_depth)
133
134 # greedily select the best split according to criterion
--> 135 feat, thresh = self._segment(X, Y, feat_idxs)
136 l = np.argwhere(X[:, feat] <= thresh).flatten()
137 r = np.argwhere(X[:, feat] > thresh).flatten()

in _segment(self, X, Y, feat_idxs)
155 gains = np.array([self._impurity_gain(Y, t, vals) for t in thresholds])
156
--> 157 if gains.max() > best_gain:
158 split_idx = i
159 best_gain = gains.max()

/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py in _amax(a, axis, out, keepdims, initial, where)
28 def _amax(a, axis=None, out=None, keepdims=False,
29 initial=_NoValue, where=True):
---> 30 return umr_maximum(a, axis, None, out, keepdims, initial, where)
31
32 def _amin(a, axis=None, out=None, keepdims=False,

ValueError: zero-size array to reduction operation maximum which has no identity

Link to dataset https://drive.google.com/file/d/1lj7qUyG7BOV6cAGm8-tDNUqS62IEgk5p/view?usp=sharing

@ddbourgin ddbourgin added the bug Something isn't working label May 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants