This post compares the performance of insert, update (both updating a single document and updating all documents), and find on a MongoDB collection with and without an index.
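Test method: first insert about 1000 seed documents into the collection, then run 2000 inserts or updates against that base and time them. Because find is fast, the find test actually uses 10000 seed documents and 100000 repetitions. The Ruby test code is as follows: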
#!/usr/bin/env ruby
# 20140309, index_test.rb
###
# test index
###
require "rubygems"
require "mongo"
require "benchmark"

class MongoConnection
  def initialize(host, port)
    @mongoconn = Mongo::MongoClient.new(host, port)
    @db = @mongoconn.db("test")
  end

  # Runs the four timings on a collection that has no index on "num".
  def test()
    @coll = @db.collection("test")

    # insert test
    @coll.insert({ "num" => 0 })
    (1..1000).each { |i| @coll.insert({ num: i }) }
    tm_start = Time.now.to_f
    Benchmark.bm do |t|
      t.report { (1001..3000).each { |i| @coll.insert({ num: i }) } }
    end
    # p @coll.find.to_a
    tm_used = Time.now.to_f - tm_start
    puts "Insert Time used(s): #{tm_used}"
    @coll.drop

    # update test ("update all": empty selector)
    # Note: no :multi => true is passed, so each call changes only the
    # first matching document.
    @coll.insert({ "num" => 0 })
    (1..1000).each { |i| @coll.insert({ num: i }) }
    tm_start = Time.now.to_f
    Benchmark.bm do |t|
      t.report { (1001..3000).each { |i| @coll.update({}, { "$set" => { "num" => i } }) } }
    end
    # p @coll.find.to_a
    tm_used = Time.now.to_f - tm_start
    puts "update_all Time used(s): #{tm_used}"
    @coll.drop

    # update test (single document)
    # Note: after the first iteration no document has num 300 any more,
    # so the remaining calls time a lookup that matches nothing.
    @coll.insert({ "num" => 0 })
    (1..1000).each { |i| @coll.insert({ num: i }) }
    tm_start = Time.now.to_f
    Benchmark.bm do |t|
      t.report { (1001..3000).each { |i| @coll.update({ num: 300 }, { "$set" => { "num" => i } }) } }
    end
    # p @coll.find.to_a
    tm_used = Time.now.to_f - tm_start
    puts "update Time used(s): #{tm_used}"
    @coll.drop

    # find test
    # Note: find returns a cursor, and nothing here iterates it.
    @coll.insert({ "num" => 0 })
    (1..10000).each { |i| @coll.insert({ num: i }) }
    tm_start = Time.now.to_f
    Benchmark.bm do |t|
      t.report { (1..100000).each { |i| x = @coll.find({ num: 5000 }) } }
    end
    # p @coll.find.to_a
    tm_used = Time.now.to_f - tm_start
    puts "find Time used(s): #{tm_used}"
    @coll.drop
  end

  # Same four timings, with an ascending index on "num" created after seeding.
  def test_index()
    @coll = @db.collection("test")

    # insert test
    @coll.insert({ "num" => 0 })
    (1..1000).each { |i| @coll.insert({ num: i }) }
    @coll.create_index(:num => Mongo::ASCENDING)
    tm_start = Time.now.to_f
    Benchmark.bm do |t|
      t.report { (1001..3000).each { |i| @coll.insert({ num: i }) } }
    end
    # p @coll.find.to_a
    tm_used = Time.now.to_f - tm_start
    puts "insert Time used(s): #{tm_used}"
    @coll.drop

    # update test ("update all": empty selector, see note in test())
    @coll.insert({ "num" => 0 })
    (1..1000).each { |i| @coll.insert({ num: i }) }
    @coll.create_index(:num => Mongo::ASCENDING)
    tm_start = Time.now.to_f
    Benchmark.bm do |t|
      t.report { (1001..3000).each { |i| @coll.update({}, { "$set" => { "num" => i } }) } }
    end
    # p @coll.find.to_a
    tm_used = Time.now.to_f - tm_start
    puts "update_all Time used(s): #{tm_used}"
    @coll.drop

    # update test (single document, see note in test())
    @coll.insert({ "num" => 0 })
    (1..1000).each { |i| @coll.insert({ num: i }) }
    @coll.create_index(:num => Mongo::ASCENDING)
    tm_start = Time.now.to_f
    Benchmark.bm do |t|
      t.report { (1001..3000).each { |i| @coll.update({ num: 300 }, { "$set" => { "num" => i } }) } }
    end
    # p @coll.find.to_a
    tm_used = Time.now.to_f - tm_start
    puts "update Time used(s): #{tm_used}"
    @coll.drop

    # find test (cursor is not iterated, see note in test())
    @coll.insert({ "num" => 0 })
    (1..10000).each { |i| @coll.insert({ num: i }) }
    @coll.create_index(:num => Mongo::ASCENDING)
    tm_start = Time.now.to_f
    Benchmark.bm do |t|
      t.report { (1..100000).each { |i| x = @coll.find({ num: 5000 }) } }
    end
    # p @coll.find.to_a
    tm_used = Time.now.to_f - tm_start
    puts "find Time used(s): #{tm_used}"
    @coll.drop
  end
end

mongo_conn = MongoConnection.new("localhost", 27017)
puts "======TEST WITHOUT INDEX========================"
mongo_conn.test()
puts "======TEST WITH INDEX==========================="
mongo_conn.test_index()
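The script repeats the same seed, time, clean-up sequence eight times. A minimal sketch of how that pattern might be factored into one helper is shown here. `timed_case` is a hypothetical name, not part of the original script or the driver, and a plain Array stands in for the collection so the sketch runs without a MongoDB server.

```ruby
require "benchmark"

# Hypothetical helper: seed the store, time the given block, then clean up.
# `store` is a plain Array standing in for a collection (an assumption made
# so this runs without a server).
def timed_case(label, store, seed_count)
  store.clear
  # Same seeding as the script: one document per num in 0..seed_count.
  (0..seed_count).each { |i| store << { num: i } }
  elapsed = Benchmark.realtime { yield store }
  puts format("%-12s %.6f s", label, elapsed)
  elapsed
ensure
  # Mirrors the @coll.drop at the end of every case.
  store.clear
end

store = []
seen = nil
elapsed = timed_case("find", store, 1000) do |docs|
  seen = docs.size
  # Linear scan of the in-memory stand-in, repeated 2000 times.
  2000.times { docs.find { |d| d[:num] == 300 } }
end
```

With the real driver, the seeding and clean-up lines would be the `insert` and `drop` calls already used in the script, and the index case would add a `create_index` call after seeding.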
The test results are as follows:
======TEST WITHOUT INDEX========================
       user     system      total        real
   1.000000   0.080000   1.080000 (  1.401279)
Insert Time used(s): 1.4016945362091064
       user     system      total        real
   1.050000   0.090000   1.140000 (  1.442167)
update_all Time used(s): 1.442460298538208
       user     system      total        real
   1.250000   0.180000   1.430000 (  4.492881)
update Time used(s): 4.493251085281372
       user     system      total        real
   2.740000   0.000000   2.740000 (  2.743044)
find Time used(s): 2.7432751655578613
======TEST WITH INDEX===========================
       user     system      total        real
   0.980000   0.110000   1.090000 (  1.461577)
insert Time used(s): 1.4617598056793213
       user     system      total        real
   1.020000   0.130000   1.150000 (  1.601074)
update_all Time used(s): 1.6013922691345215
       user     system      total        real
   1.140000   0.140000   1.280000 (  1.784649)
update Time used(s): 1.7848689556121826
       user     system      total        real
   2.710000   0.000000   2.710000 (  2.699411)
find Time used(s): 2.6996209621429443
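The wall-clock totals above can be reduced to relative differences with a few lines of arithmetic. This is a small sketch: `TIMINGS` and `relative_change` are names introduced here for illustration, and the values are the "Time used(s)" figures rounded to three decimals.

```ruby
# Wall-clock seconds copied from the "Time used(s)" lines, rounded.
TIMINGS = {
  "insert"     => { no_index: 1.402, index: 1.462 },
  "update_all" => { no_index: 1.442, index: 1.601 },
  "update"     => { no_index: 4.493, index: 1.785 },
  "find"       => { no_index: 2.743, index: 2.700 },
}

# Percentage change of `other` relative to `base`.
# Positive means the indexed run took longer than the unindexed one.
def relative_change(base, other)
  (other - base) / base * 100.0
end

TIMINGS.each do |op, t|
  pct = relative_change(t[:no_index], t[:index])
  puts format("%-10s %+6.1f%%", op, pct)
end
```

A positive percentage means the indexed run was slower, a negative one that it was faster. These are single runs, so differences of a few percent may well be within run-to-run noise.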
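In these results, with an index, inserting a single document and the "update all" case are both slightly slower than without one: about 4% and 11% slower respectively in the data above. (The script's `update({}, ...)` call passes no `:multi => true`, so each call actually modifies only the first matching document.) Updating a single document is far faster with an index, taking about 60% less time. The surprise is find, where the difference is barely visible: the indexed case is only marginally faster, by less than 5%.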
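One likely reason for the flat find numbers: in the 1.x Ruby driver used here, `find` without a block returns a cursor, and the query is only sent once the cursor is iterated. The benchmark loop assigns the cursor to `x` and never reads from it, so the loop probably measures object creation rather than query execution, which would also fit the 0.000000 system time in both find rows. The toy class below illustrates the effect in plain Ruby. `ToyCursor` is a hypothetical stand-in written for this sketch, not the driver's cursor class.

```ruby
# Toy stand-in for a lazily evaluated cursor (NOT the driver's class).
# The "query" only runs when the cursor is iterated.
class ToyCursor
  include Enumerable

  attr_reader :queries_sent

  def initialize(docs, selector)
    @docs = docs
    @selector = selector
    @queries_sent = 0
  end

  # Iteration is the only place where matching work happens.
  def each(&block)
    @queries_sent += 1
    @docs.select { |d| @selector.all? { |k, v| d[k] == v } }.each(&block)
  end
end

docs = (0..10_000).map { |i| { num: i } }

# Like the benchmark loop: build the cursor, never read from it.
lazy = ToyCursor.new(docs, { num: 5000 })
puts "queries after building only: #{lazy.queries_sent}"

# Forcing evaluation: to_a iterates the cursor, so the query runs once.
forced = ToyCursor.new(docs, { num: 5000 })
result = forced.to_a
puts "queries after to_a: #{forced.queries_sent}, matches: #{result.size}"
```

In the script, a comparable change would be to iterate the cursor, for example `@coll.find({ num: 5000 }).to_a`, or to call `@coll.find_one({ num: 5000 })`. How large the gap between the indexed and unindexed find becomes after that change was not measured here.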