I ran pip install imblearn in cmd. It reported no error, but it never printed "Successfully installed" either (huh? very confusing). That convinced me the installed imblearn package was broken, so I deleted the two imblearn-related folders (imblearn and imbalanced_learn-0.7.0.dist-info) under C:\Users\Lenovo\Anaconda3\Lib\site-packages and looked up other ways to install imblearn:

imbalanced-learn is currently available on the PyPi's repository and you can install it via pip:

    pip install -U imbalanced-learn

The package is also released on the Anaconda Cloud platform:

    conda install -c glemaitre imbalanced-learn

If you prefer, you can clone it and run the setup.py file. Use the following commands to get a copy from GitHub and install all dependencies:

    git clone https://github.com/scikit-learn-contrib/imbalanced-learn.git
    cd imbalanced-learn
    pip install .

Or install using pip and GitHub:

    pip install -U git+https://github.com/scikit-learn-contrib/imbalanced-learn.git

Source: https://pypi.org/project/imbalanced-learn/

The first and second methods do not need git, while the third and fourth do, so I started with the ones that do not. I opened Anaconda Prompt and ran

    conda install -c glemaitre imbalanced-learn

which failed straight away with:

    The procedure entry point OPENSSL_sk_new_reserve could not be located in the dynamic link library C:\Users\Lenovo\Anaconda3\Library\bin\libssl-1_1-x64.dll.

A programmer's basic instinct is to search Baidu when in doubt. Plenty of blog posts had already solved this one, and I used the following fix: go to the Anaconda3 install path. The creation date and file size of libssl-1_1-x64.dll in C:\Users\Lenovo\Anaconda3\DLLs should match those of libssl-1_1-x64.dll in C:\Users\Lenovo\Anaconda3\Library\bin. If they do not, copy the file from DLLs into bin and overwrite the original. Crude but effective, love it.

Source: https://blog.csdn.net/weixin_40093242/article/details/104125273
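The DLL comparison above can also be done from Python instead of by eye. The sketch below is only an illustration: it compares file size and a SHA-256 hash rather than the creation date the fix mentions, the helper names file_fingerprint and same_file_contents are made up for this post, and the install root is simply the path quoted above.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path):
    """Return (size in bytes, SHA-256 hex digest) for the file at `path`."""
    data = Path(path).read_bytes()
    return len(data), hashlib.sha256(data).hexdigest()


def same_file_contents(path_a, path_b):
    """True when both files have the same size and the same contents."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)


if __name__ == "__main__":
    # Assumed install root, copied from the paths quoted in this post.
    root = Path(r"C:\Users\Lenovo\Anaconda3")
    dll_copy = root / "DLLs" / "libssl-1_1-x64.dll"
    bin_copy = root / "Library" / "bin" / "libssl-1_1-x64.dll"
    if dll_copy.exists() and bin_copy.exists():
        print("identical:", same_file_contents(dll_copy, bin_copy))
    else:
        print("one of the two DLLs was not found; adjust `root`")
```

If it prints identical: False, the two copies differ and the copy-and-overwrite step described above is the next thing to try.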
Then I tried conda install -c glemaitre imbalanced-learn again. Happily, there was a brand-new error. This time it was an HTTP error, and two files failed to download:

    Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x0000018D8B602160>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed',)': /simple/pip/

This is a network problem: the connection timed out, or the external network could not be reached. The fix is as follows. Enter the following four commands into cmd, one at a time:
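3. Modifying the imblearn package to fix the 'SMOTE' object has no attribute '_validate_data' error

This post is partly a log of my own troubleshooting, so take from it whatever you need. Looking back, the first problem came from my being a beginner: I installed Python and Anaconda separately and then installed yet another Python through Anaconda, which left the paths in a mess. Anyone with a normal installation should not run into it. The second problem, the HTTP error, should not affect readers who switched their download source long ago. The third problem, solved below, is the heart of this post. It is original content, so please credit the source when reposting.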
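With several Python installations side by side, it helps to see which interpreter is actually running and which copy of a package it would import. The sketch below uses only the standard library. The helper name locate_module is made up for this post, and the two package names are just the ones relevant here.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # not importable from this interpreter
    return spec.origin


if __name__ == "__main__":
    # The interpreter that is running this script.
    print("interpreter:", sys.executable)
    for name in ("imblearn", "sklearn"):
        print(name, "->", locate_module(name))
```

If the interpreter path and the package paths point into different installations, pip and conda are probably not installing into the Python that is being run.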
Here is an example from the official documentation:
>>> from collections import Counter
>>> from sklearn.datasets import make_classification
>>> from imblearn.over_sampling import SMOTE  # doctest: +NORMALIZE_WHITESPACE
>>> X, y = make_classification(n_classes=2, class_sep=2,
...     weights=[0.1, 0.9], n_informative=3, n_redundant=1, flip_y=0,
...     n_features=20, n_clusters_per_class=1, n_samples=1000, random_state=10)
>>> print('Original dataset shape %s' % Counter(y))
Original dataset shape Counter({1: 900, 0: 100})
>>> import imblearn
>>> sm = SMOTE(random_state=42)
>>> X_res, y_res = sm.fit_resample(X, y)
>>> print('Resampled dataset shape %s' % Counter(y_res))
Running it gives:
AttributeError                            Traceback (most recent call last)
in ()
      1 oversample = SMOTE(random_state=0)
----> 2 X_os, y_os = oversample.fit_resample(X_train, y_train)

~\Anaconda3\lib\site-packages\imblearn\base.py in fit_resample(self, X, y)
     75         check_classification_targets(y)
     76         arrays_transformer = ArraysTransformer(X, y)
---> 77         X, y, binarize_y = self._check_X_y(X, y)
     78
     79         self.sampling_strategy = check_sampling_strategy(

~\Anaconda3\lib\site-packages\imblearn\base.py in _check_X_y(self, X, y, accept_sparse)
    132             accept_sparse = ["csr", "csc"]
    133         y, binarize_y = check_target_type(y, indicate_one_vs_all=True)
--> 134         X, y = self._validate_data(
    135             X, y, reset=True, accept_sparse=accept_sparse
    136         )

AttributeError: 'SMOTE' object has no attribute '_validate_data'
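Running the official example produces exactly the same error: 'SMOTE' object has no attribute '_validate_data'. We have good reason to believe that (1) our code is fine, (2) the installation is fine, and (3) the data is fine, so we suspect the imblearn package itself. The traceback points to the file \Anaconda3\lib\site-packages\imblearn\base.py, so we opened base.py at that path in Notepad to look for the X, y = self._validate_data call and work out why the object has no _validate_data attribute. This is the code we found at the location of the error: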
class BaseSampler(SamplerMixin):
    """Base class for sampling algorithms.

    Warning: This class should not be used directly. Use the derive classes
    instead.
    """

    def __init__(self, sampling_strategy="auto"):
        self.sampling_strategy = sampling_strategy

    def _check_X_y(self, X, y, accept_sparse=None):
        if accept_sparse is None:
            accept_sparse = ["csr", "csc"]
        y, binarize_y = check_target_type(y, indicate_one_vs_all=True)
        return X, y, binarize_y
        X, y = self._validate_data(
            X, y, reset=True, accept_sparse=accept_sparse
        )


def _identity(X, y):
    return X, y
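Nothing in base.py defines _validate_data, so the method has to come from a parent class. As far as I understand it, imblearn samplers inherit from scikit-learn's BaseEstimator, which is where that lookup ends up. That makes the installed scikit-learn version worth a quick look. The sketch below is a diagnostic only, not the fix used in this post, and the helper names installed_version and base_estimator_has_validate_data are made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def base_estimator_has_validate_data():
    """Report whether scikit-learn's BaseEstimator defines `_validate_data`.

    Returns None when scikit-learn cannot be imported at all.
    """
    try:
        from sklearn.base import BaseEstimator
    except ImportError:
        return None
    return hasattr(BaseEstimator, "_validate_data")


if __name__ == "__main__":
    for dist in ("imbalanced-learn", "scikit-learn"):
        print(dist, "->", installed_version(dist))
    print("BaseEstimator._validate_data present:",
          base_estimator_has_validate_data())
```

If the last line prints False, the installed scikit-learn does not provide the method that this imblearn release calls, which would match the AttributeError above.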