1. Climate monitoring dataset: [url]http://cdiac.ornl.gov/ftp/ndp026b[/url]
Association rules:
[url]http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar[/url]
[url]http://www.almaden.ibm.com/software/quest/Resources/datasets/syndata.html#assocSynData[/url]
Download the Financial Data (~17.5M zipped file, ~67M unzipped data)
Download the Medical Data (~2M zipped file, ~6M unzipped data)
[url]http://lisp.vse.cz/pkdd99/Challenge/chall.htm[/url]
2. Several useful sites for downloading test datasets
[url]http://www.cs.toronto.edu/~roweis/data.html[/url]
[url]http://kdd.ics.uci.edu/summary.task.type.html[/url]
[url]http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/[/url]
[url]http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/[/url]
[url]http://www.phys.uni.torun.pl/~duch/software.html[/url]
The Reuters dataset can be found at: [url]http://www.research.att.com/~lewis/reuters21578.html[/url]
The following site lists a variety of datasets:
[url]http://kdd.ics.uci.edu/summary.data.type.html[/url]
For text classification, another usable dataset is the one distributed with rainbow:
[url]http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html[/url]
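3. I collected a lot of test datasets. Anyone writing a paper will need them, at least for checking how well an algorithm performs.
Some of the links may no longer be reachable, but at least some of them should still work: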
Machine learning datasets collected by UCI
[url]ftp://pami.sjtu.edu.cn/[/url]
[url]http://www.ics.uci.edu/~mlearn//MLRepository.htm[/url]
StatLib
[url]http://liama.ia.ac.cn/SCILAB/scilabindexgb.htm[/url]
[url]http://lib.stat.cmu.edu/[/url]
A site about data mining on investment funds
[url]http://www.gotofund.com/index.asp[/url]
Various datasets:
[url]http://kdd.ics.uci.edu/summary.data.type.html[/url]
[url]http://www.mlnet.org/cgi-bin/mlnetois.pl/?File=datasets.html[/url]
[url]http://lib.stat.cmu.edu/datasets/[/url]
[url]http://dctc.sjtu.edu.cn/adaptive/datasets/[/url]
[url]http://fimi.cs.helsinki.fi/data/[/url]
[url]http://www.almaden.ibm.com/software/quest/Resources/index.shtml[/url]
[url]http://miles.cnuce.cnr.it/~palmeri/datam/DCI/[/url]
[url]http://www.w3.org/TR/WD-logfile-960221.html[/url]
[url]http://www.w3.org/Daemon/User/Config/Logging.html#AccessLog[/url]
[url]http://www.w3.org/1998/11/05/WC-workshop/Papers/bala2.html[/url]
[url]http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/[/url]
[url]http://www.web-caching.com/traces-logs.html[/url]
[url]http://www-2.cs.cmu.edu/webkb[/url]
[url]http://www.cs.auc.dk/research/DP/tdb/TimeCenter/TimeCenterPublications/TR-75.pdf[/url]
[url]http://www.cs.cornell.edu/projects/kddcup/index.html[/url]
Links to data generators
[url]http://www.cse.cuhk.edu.hk/~kdd/data_collection.html[/url]
[url]http://www.almaden.ibm.com/cs/quest/syndata.html[/url]
Association rules:
[url]http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar[/url]
[url]http://www.almaden.ibm.com/software/quest/Resources/datasets/syndata.html#assocSynData[/url]
WEKA:
[url]http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar[/url]
1. A jarfile containing 37 classification problems, originally obtained from the UCI repository
[url]http://prdownloads.sourceforge.net/weka/datasets-UCI.jar[/url]
2. A jarfile containing 37 regression problems, obtained from various sources
[url]http://prdownloads.sourceforge.net/weka/datasets-numeric.jar[/url]
3. A jarfile containing 30 regression datasets collected by Luis Torgo
[url]http://prdownloads.sourceforge.net/weka/regression-datasets.jar[/url]
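Since a jar is just a zip archive, the WEKA jarfiles above can be inspected without Java. The following is only a rough sketch in Python, assuming a jar has already been downloaded and that its datasets are stored as .arff entries (not verified here for these particular jars). The function names list_arff_members and read_arff_attributes are made up for this example.

```python
import io
import zipfile


def list_arff_members(jar_file):
    """Return the names of all .arff entries in a jar (a jar is a zip archive)."""
    with zipfile.ZipFile(jar_file) as archive:
        return sorted(
            name for name in archive.namelist() if name.lower().endswith(".arff")
        )


def read_arff_attributes(jar_file, member):
    """Return the attribute names declared in the header of one ARFF entry.

    Simplification: quoted attribute names containing spaces are not handled.
    """
    names = []
    with zipfile.ZipFile(jar_file) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
            for line in text:
                stripped = line.strip()
                lowered = stripped.lower()
                if lowered.startswith("@data"):
                    break  # the header ends where the data section begins
                if lowered.startswith("@attribute"):
                    # Header lines look like: @attribute <name> <type>
                    names.append(stripped.split()[1])
    return names
```

For example, with a local copy saved as regression-datasets.jar (the file name is an assumption), list_arff_members("regression-datasets.jar") gives the dataset entries, and read_arff_attributes can then be called on any one of them to see its columns before loading it into a tool of your choice.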
A list provided by someone else:
[url]http://www.cs.toronto.edu/~roweis/data.html[/url]
[url]http://kdd.ics.uci.edu/summary.task.type.html[/url]
[url]http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/[/url]
[url]http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/[/url]
[url]http://www.phys.uni.torun.pl/~duch/software.html[/url]
The Reuters dataset can be found at:
[url]http://www.research.att.com/~lewis/reuters21578.html[/url]
For text classification, another usable dataset is the one distributed with rainbow:
[url]http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html[/url]
Download the Financial Data (~17.5M zipped file, ~67M unzipped data)
Download the Medical Data (~2M zipped file, ~6M unzipped data)
[url]http://lisp.vse.cz/pkdd99/Challenge/chall.htm[/url]
Another very good resource is [url]http://kdd.ics.uci.edu/[/url], which hosts the following datasets (grouped by application area):
Direct Marketing
  KDD CUP 1998 Data
GIS
  Forest CoverType
Indexing
  Corel Image Features
  Pseudo Periodic Synthetic Time Series
Intrusion Detection
  KDD CUP 1999 Data
Process Control
  Synthetic Control Chart Time Series
Recommendation Systems
  Entree Chicago Recommendation Data
Robots
  Pioneer-1 Mobile Robot Data
  Robot Execution Failures
Sign Language Recognition
  Australian Sign Language Data
  High-quality Australian Sign Language Data
Text Categorization
  20 Newsgroups Data
  Reuters-21578 Text Categorization Collection
  NSF Research Awards Abstracts 1990-2003
World Wide Web
  Microsoft Anonymous Web Data
  MSNBC Anonymous Web Data
  Syskill Webert Web Data
Here is one more, found on a foreign blogger's site (the day before Children's Day):
[url]http://www.fs.fed.us/fire/fuelman/[/url]
This article is reposted from yuwenhu's 51CTO blog. Original link: http://blog.51cto.com/yuwenhu/136551. To repost it, please contact the original author.