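Kernelization shows up in many machine learning methods, such as SVM and PCA. Here I use KPCA in sklearn to show how a kernel function is actually used.
KPCA and PCA are both unsupervised data-processing methods, but they differ in one respect. PCA reduces dimensionality, taking m-dimensional data down to k dimensions. KPCA, as used here, goes the other way: the kernel implicitly maps the m-dimensional data into a higher-dimensional feature space, and we keep k components, where k may be larger than m. Both share the goal of making the data (linearly) separable in the target space, i.e. the maximum-separability property of PCA.
In sklearn, kpca and pca are used in essentially the same way and share the same interface. kpca takes a kernel argument; if you omit it, the linear kernel is used by default.
First, we generate a dataset with the code below.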
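To make the shared interface concrete, here is a minimal sketch. The toy data and the variable names are my own assumptions and are not the dataset used in this post. It fits both estimators the same way and checks that KernelPCA falls back to the linear kernel when none is given.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

rng = np.random.default_rng(0)                   # fixed seed, only for reproducibility
data = rng.normal(size=(100, 2)) * [3.0, 1.0]    # toy data, not the post's dataset

pca = PCA(n_components=2)
kpca = KernelPCA(n_components=2)                 # no kernel given, so the default applies

z_pca = pca.fit_transform(data)                  # same fit_transform call on both
z_kpca = kpca.fit_transform(data)

print(kpca.kernel)
# With a linear kernel the two projections should agree up to the sign of each component.
same_up_to_sign = np.allclose(np.abs(z_pca), np.abs(z_kpca), atol=1e-6)
print(same_up_to_sign)
```

The comparison uses absolute values because the sign of a principal component is arbitrary.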
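import math

import numpy as np
from sklearn.decomposition import PCA, KernelPCA
import matplotlib.pyplot as plt

x = []
y = []
N = 500
for i in range(N):
    # Random integer angle in degrees, converted to radians for sin/cos
    deg = np.random.randint(0, 360)
    rad = math.radians(deg)
    if np.random.randint(0, 2) % 2 == 0:
        # Positive sample: a point on the circle of radius 6
        x.append([6 * math.sin(rad), 6 * math.cos(rad)])
        y.append(1)
    else:
        # Negative sample: a point on the circle of radius 15
        x.append([15 * math.sin(rad), 15 * math.cos(rad)])
        y.append(0)
y = np.array(y)
x = np.array(x)
print('ok')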
The data can be visualized as in the figure below.
[Figure: scatter plot of the generated samples]
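As intended, the positive samples (blue) all lie on a circle of radius 6, and the negative samples (red) all lie on a circle of radius 15.
Data like this are clearly not linearly separable. If we insist on using a linear classifier, we can first run the raw data through kpca.
The code is as follows: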
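The two radii are easy to check numerically. The sketch below is a vectorized variant of the generator (the names `points`, `labels` and `dist` are mine, and the seed is an arbitrary assumption). It measures each sample's distance from the origin.

```python
import numpy as np

rng = np.random.default_rng(0)                       # arbitrary seed for reproducibility
N = 500
angles = np.radians(rng.integers(0, 360, size=N))    # one random angle per sample
labels = rng.integers(0, 2, size=N)                  # 1 = inner circle, 0 = outer circle
radius = np.where(labels == 1, 6.0, 15.0)
points = np.column_stack([radius * np.sin(angles), radius * np.cos(angles)])

dist = np.linalg.norm(points, axis=1)                # distance of each sample from the origin
print(np.unique(np.round(dist, 6)))                  # the distinct radii present in the data
```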
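To see what "not linearly separable" means in practice, one can fit a linear SVM directly on the raw 2-D points. This is only a sketch under my own assumptions (uniform random angles, a fixed seed, accuracy measured on the training data itself). It is not part of the original experiment.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N = 500
angles = rng.uniform(0.0, 2.0 * np.pi, size=N)
labels = rng.integers(0, 2, size=N)                  # 1 = radius 6, 0 = radius 15
radius = np.where(labels == 1, 6.0, 15.0)
points = np.column_stack([radius * np.sin(angles), radius * np.cos(angles)])

# A straight line cannot put one circle entirely on each side,
# so even the training accuracy stays well below 1.
raw_clf = SVC(kernel="linear").fit(points, labels)
raw_acc = raw_clf.score(points, labels)
print(f"linear SVM on raw 2-D data: {raw_acc:.2f}")
```

Geometrically, the best a line can do is touch the inner circle and cut off an arc of the outer one, which leaves most of the outer samples on the wrong side.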
kpca = KernelPCA(kernel="rbf", n_components=14)
x_kpca = kpca.fit_transform(x)
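We use the rbf kernel and set the number of dimensions to 14 (why 14 is explained later). In other words, the data in x (2-dimensional) are mapped into a 14-dimensional space.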
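For reference, the rbf kernel between two points a and b is exp(-gamma * ||a - b||^2). The sketch below computes it by hand and compares it with sklearn's `rbf_kernel`. The helper `rbf_by_hand` and the three sample points are hypothetical, chosen only for illustration. The gamma value assumes the documented default of 1 / n_features that applies when gamma is left as None.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf_by_hand(a, b, gamma):
    """k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2) for every pair of rows."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

# Three illustrative points: two on the inner circle, one on the outer circle
pts = np.array([[6.0, 0.0], [0.0, 6.0], [15.0, 0.0]])
gamma = 1.0 / pts.shape[1]          # 1 / n_features, the default when gamma is None

K_manual = rbf_by_hand(pts, pts, gamma)
K_sklearn = rbf_kernel(pts, pts, gamma=gamma)
print(np.allclose(K_manual, K_sklearn))
```

KernelPCA never builds the high-dimensional mapping explicitly. It works from this matrix of pairwise kernel values.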
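Next we classify the mapped data with a linear SVM, using 80% of the samples for training and 20% for testing. The samples were generated in random order, so taking the first 80% amounts to a random split.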
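from sklearn import svm

n_train = int(0.8 * N)  # slice indices must be integers
clf = svm.SVC(kernel='linear')
clf.fit(x_kpca[:n_train], y[:n_train])
y0 = y[n_train:]                    # true labels of the test set
y1 = clf.predict(x_kpca[n_train:])  # predicted labels of the test set
# L1 norm of the difference = number of misclassified test samples
print(np.linalg.norm(y0 - y1, 1))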
The output is 0, which means the predicted labels match the true labels exactly, without a single error. On this dataset the classifier is fully effective.
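A variant of the same experiment, sketched under my own assumptions: it shuffles explicitly with `train_test_split`, fits kpca on the training part only by wrapping it in a pipeline, and reports an accuracy instead of an error count. The seed, the uniform angles and the variable names are not from the original code, so the printed number may differ from the result above.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N = 500
angles = rng.uniform(0.0, 2.0 * np.pi, size=N)
labels = rng.integers(0, 2, size=N)                  # 1 = radius 6, 0 = radius 15
radius = np.where(labels == 1, 6.0, 15.0)
points = np.column_stack([radius * np.sin(angles), radius * np.cos(angles)])

# Shuffled 80/20 split
X_train, X_test, y_train, y_test = train_test_split(
    points, labels, test_size=0.2, random_state=0)

# kpca is fitted on the training data only, then applied to the test data
model = make_pipeline(KernelPCA(kernel="rbf", n_components=14),
                      SVC(kernel="linear"))
model.fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {acc:.3f}")
```

Fitting kpca inside the pipeline keeps the test samples out of the feature extraction step, which is the usual choice when the test score is meant to estimate performance on unseen data.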
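In summary, kernelization can map data that are not linearly separable into a higher-dimensional space where they become linearly separable.
The process is not quite that simple, though. Different numbers of dimensions can give very different results, which is the classic parameter-tuning problem.
I tested several values of k and recorded the classification accuracy for each. The results are shown in the figure below.
As the figure shows, the value of k does affect the result. The best value may differ from one dataset to another, so it has to be tuned. On my data, k = 14 gave a perfect result.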
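The tuning loop can be sketched as follows. The helper `accuracy_for_k`, the candidate values of k and the seed are my own assumptions, and I make no claim that the printed accuracies match the ones I reported above. The procedure follows the post: kpca on all samples, then a linear SVM trained on the first 80%.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N = 500
angles = rng.uniform(0.0, 2.0 * np.pi, size=N)
labels = rng.integers(0, 2, size=N)                  # 1 = radius 6, 0 = radius 15
radius = np.where(labels == 1, 6.0, 15.0)
points = np.column_stack([radius * np.sin(angles), radius * np.cos(angles)])

def accuracy_for_k(points, labels, k, train_frac=0.8):
    """Map to k kpca components, train on the first 80%, score on the rest."""
    n_train = int(train_frac * len(points))
    feats = KernelPCA(kernel="rbf", n_components=k,
                      eigen_solver="dense").fit_transform(points)
    clf = SVC(kernel="linear").fit(feats[:n_train], labels[:n_train])
    return clf.score(feats[n_train:], labels[n_train:])

results = {k: accuracy_for_k(points, labels, k) for k in (2, 6, 10, 14)}
for k, acc in results.items():
    print(f"k={k:2d}  accuracy={acc:.3f}")
```

The dense eigen solver is requested explicitly so that every k goes through the same deterministic code path. For larger datasets, a cross-validated search over both k and gamma would be the more thorough option.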