
MongoDB vs Redis

Iulian Lazar
SSA
University Politehnica of Bucharest
Email: iulianlaz@gmail.com

Abstract: NoSQL databases have become, over time, a plausible replacement for SQL databases thanks to their flexibility. Choosing the right one for a given need requires a detailed analysis of their performance. The experimental results show that each of the databases tested (MongoDB and Redis) has advantages and disadvantages. The context in which the database is used is the key factor in determining which one offers better performance.

I. INTRODUCTION

This article presents a comparison between two open source NoSQL databases: MongoDB [6] and Redis. MongoDB is a document-oriented database, whilst Redis is a key-value database. Document-oriented databases encapsulate data in JSON format.

They impose no restrictions in terms of schema (schema-free databases), so each document can have different attributes. A key-value store can be viewed as a dictionary: each value is identified by a unique key, and the data is retrieved using the associated key. If the key is unknown, the data cannot be retrieved.

Redis provides many data structures [1] for manipulating data efficiently, depending on what the user wants to accomplish. A more detailed description of these structures and of the queries these databases support can be found in [2].

Redis typically holds the whole dataset in memory. Versions up to 2.4 could be configured to use what they refer to as virtual memory [17], in which some of the dataset is stored on disk, but this feature is deprecated. The data is written from memory to disk from time to time in order to make it persistent.

MongoDB is a document-oriented database written in C++ [11]. Objects are stored serialized as BSON. They do not need to have the same structure or fields, and common fields do not need to have the same type, which allows flexible schema storage. MongoDB supports auto-sharding, in which it partitions the data collections and stores the partitions across the available servers. This results in a dynamically balanced load [5].

II. RELATED WORK

NoSQL databases emerged to ease certain storage problems that SQL databases have. Web applications have become very dynamic, so it is hard to define relational schemas when each entry in the database can differ from the previous ones.

In such cases it is recommended to use non-relational databases. This reduces the overhead of SQL databases (e.g. fewer fields for each input, simplified mappings etc.). Another important aspect of NoSQL databases is that they can be distributed more easily than their rivals, the classical SQL databases.

A very good analysis of the MongoDB database system, covering replication methods, partitioning (in terms of architecture) and data access, can be found in [3]. The first NoSQL database is believed to be Google's BigTable [4]. Its success initiated the development of a number of other open-source NoSQL databases. As mentioned before, NoSQL databases are schema-free, in contrast to SQL databases.

According to the tests in [5], which compare NoSQL and SQL databases, not all NoSQL databases perform much better than SQL databases. The focus of the tests was to compare key-value store implementations on NoSQL and SQL databases. While NoSQL databases are generally designed to be optimized key-value stores, SQL databases are not. Yet the results suggest that not all NoSQL databases perform better than SQL databases. The tests compare read, write, delete and instantiate operations on the key-value storage. They show that even among NoSQL databases there is a wide variation in the performance of these operations, and that there is little correlation between performance and the data model each database uses.

A performance overview and evaluation of the execution speed of five popular NoSQL databases (MongoDB, Redis, Cassandra [13], HBase [15] and OrientDB [16]) was carried out by a team from the University of Coimbra [7]. For the performance analysis they used the Yahoo! Cloud Serving Benchmark [8], because it makes it easy to generate data and create a set of performance tests. Each test scenario is called a workload and is defined by a set of features, including the percentage of read and update operations, the total number of operations, and the number of records used. The benchmark package provides a set of default workloads that can be executed and that are defined by read, update, scan and insert percentages.

A very interesting comparison of Cassandra, HBase and MySQL [21] can be found in [9]. It also includes some performance measurements, in particular of read and update latency in a read/write intensive environment.

For a deeper look into the structure of the presented databases, the article [10] by Clarence J M Tauro and Aravindh S is recommended. According to the authors, the biggest challenge for NoSQL databases is scaling to size. In today's world, scaling to size means scaling horizontally, that is, adding new machines. There are a number of techniques to achieve this: master-slave replication, sharding, and the Dynamo model.

The goal of horizontal scalability is linear scalability: when the number of machines is doubled, the query performance of the storage system doubles as well [14]. Sharding should be used only when it is absolutely necessary, because it adds system complexity. Where possible, it is recommended to use read replicas or caching instead of sharding.

III. ARCHITECTURE

To test the performance of the two NoSQL systems, the tools described below were used. To create a distributed system, three OpenStack virtual machines were used, each with the following configuration: Ubuntu 14.04 operating system, one single-core processor, 1 GB RAM and 24 GB disk.

On each of these machines the following were installed: the Redis database, the MongoDB database, a PHP [12] client for Redis and a PHP client for MongoDB. To measure CPU usage, overall I/O activity and bandwidth, the following tools were used: the sar [18] command and nload [19] (to measure real-time bandwidth traffic).

For testing Redis performance, two data structures were used: sorted sets and lists. The database was populated with a maximum of 5 million entries. The initial test tried to insert 10 million entries, but 1 GB of RAM was not enough and the database crashed. The maximum number of simultaneously connected clients was 400. Entries were also inserted into two Redis databases using client-side partitioning [20].

To allow a clear comparison between Redis and MongoDB performance, the same tests were run for MongoDB. The experimental results described below attempt to show the strengths, weaknesses and limitations of each database.

The data model used for the two databases has the following classic SQL structure:
- Users table with the following columns: id, Name
- Group table with the following columns: id, Group name
- Data table: id, user id, group id, data sensors, time
The entries inserted into the Redis and MongoDB databases are mapped to the following structures:
- For the Redis database: the key is computed from the user id and the group id as userId:groupId (this will be unique). The key then holds the payload. For the sorted set structure, the score represents the creation time of the entry.
- For the MongoDB database: a document with the following fields: user, group, data sensors, time.
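The mapping above can be sketched as follows. This is a minimal illustration in Python, not the PHP client code used in the tests; the Reading class, the function names, the use of a Unix timestamp for the time field and the data_sensors spelling of the field name are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Reading:
    """One row of the Data table (hypothetical field names)."""
    user_id: int
    group_id: int
    data_sensors: str  # sensor payload, treated as an opaque string here
    time: int          # creation time; a Unix timestamp is assumed


def redis_key(r: Reading) -> str:
    """Build the Redis key in the userId:groupId form."""
    return f"{r.user_id}:{r.group_id}"


def sorted_set_entry(r: Reading) -> Tuple[str, str, float]:
    """Return (key, member, score); the score is the creation time."""
    return redis_key(r), r.data_sensors, float(r.time)


def mongo_document(r: Reading) -> dict:
    """Build a MongoDB document with the four fields listed above."""
    return {
        "user": r.user_id,
        "group": r.group_id,
        "data_sensors": r.data_sensors,
        "time": r.time,
    }
```

With a real client, the tuple returned by sorted_set_entry would be the input of a sorted-set add operation, and the dictionary returned by mongo_document would be passed to an insert call.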
The tests focus on the performance of the databases for insert (write) operations, but results for read and delete operations are described as well.

IV. EXPERIMENTAL RESULTS

To provide the best possible comparison between Redis and MongoDB, the same test scenarios were used for both databases. The tests show the overall CPU usage during insertion, the overall I/O activity, and the bandwidth used.

A comparison of the number of entries inserted in a fixed period of time is also presented. For the insertion tests, the number of clients making requests was varied from a minimum of 1 to a maximum of 400, and the clients ran their queries from different machines. For the searching tests, the Redis structure was changed: the results for a list structure are presented first, followed by those for a sorted set structure.

The first test scenario uses 200 clients making requests in parallel (all clients run on a machine other than the one where the Redis and MongoDB databases are installed). Each client inserts entries into the database for a period of 30 seconds. When the test finished, 202180 keys had been inserted into the Redis database and 239736 into the MongoDB database. CPU usage was monitored during insertion; figure 1 shows the system and user CPU usage.

Fig. 1: CPU usage (system and user) during insertion, first scenario.

As can be seen, the CPU usage for MongoDB peaks at almost 70% (user) and 45% (system), whereas for Redis it peaks at almost 35% (system) and 10% (user). In conclusion, MongoDB uses more resources to insert almost the same amount of data with the same number of clients.

The overall I/O activity is shown in figure 2.

Fig. 2: Overall I/O activity during insertion, first scenario.

I/O activity for Redis insertion is almost zero, because Redis is an in-memory database and keeps all the data in RAM. To keep the data persistent, it is saved to disk from time to time (the highest peak in the diagram, 8016 bwrtn/s, marks the moment when the data is written to disk). MongoDB stores data on disk; the highest peak in the diagram is almost 30000 bytes written per second.

During insertion, data was transferred at a minimum speed of 1 Kb/s for both databases and at a maximum speed of 17 Mb/s for MongoDB and 7.5 Mb/s for Redis. A more detailed view is given in figure 3.

Fig. 3: Bandwidth during insertion, first scenario.

The second test scenario uses 200 clients making requests in parallel, distributed over two machines with 100 clients on each (all clients run on machines other than the one where the Redis and MongoDB databases are installed). Each client inserts entries into the database for a period of 30 seconds. When the test finished, 472673 keys had been inserted into the Redis database and 237752 into the MongoDB database. CPU usage was monitored during insertion; figure 4 shows the system and user CPU usage.

Fig. 4: CPU usage (system and user) during insertion, second scenario.

As discussed earlier, I/O activity for Redis insertion is almost zero, because Redis is an in-memory database and keeps all the data in RAM. To keep the data persistent, it is saved to disk from time to time (the highest peak in the diagram, 136 bwrtn/s, is the highest speed at which the data is written to disk). MongoDB stores data on disk; the highest peak in the diagram is almost 28752 bytes written per second.

The overall I/O activity is shown in figure 5.

Fig. 5: Overall I/O activity during insertion, second scenario.

During insertion, data was transferred at a minimum speed of 1 Kb/s for both databases and at a maximum speed of 17 Mb/s for MongoDB and 31.42 Mb/s for Redis. A more detailed view is given in figure 6.

Fig. 6: Bandwidth during insertion, second scenario.

The third test scenario uses 400 clients making requests in parallel, distributed over two machines with 200 clients on each (all clients run on machines other than the one where the Redis and MongoDB databases are installed). Each client inserts entries into the database for a period of 30 seconds. When the test finished, 661042 keys had been inserted into the Redis database and 282624 into the MongoDB database. CPU usage was monitored during insertion. As figure 7 shows, CPU usage while keys were inserted into MongoDB was 100 percent (60 percent user CPU and almost 40 percent system CPU). While keys were inserted into Redis, CPU usage was almost 50 percent (40 percent system CPU and 10 percent user CPU).

Fig. 7: CPU usage (system and user) during insertion, third scenario.

As discussed earlier, I/O activity for Redis insertion is almost zero, because Redis is an in-memory database and keeps all the data in RAM. To keep the data persistent, it is saved to disk from time to time (the highest peak in the diagram, 123 bwrtn/s, is the highest speed at which the data is written to disk). MongoDB stores data on disk; the highest peak in the diagram is almost 12452 bytes written per second.

The overall I/O activity is shown in figure 8.

Fig. 8: Overall I/O activity during insertion, third scenario.

During insertion, data was transferred at a minimum speed of 1 Kb/s for both databases and at a maximum speed of 18 Mb/s for MongoDB and 30 Mb/s for Redis. A more detailed view is given in figure 9.

Fig. 9: Bandwidth during insertion, third scenario.

The conclusion from these tests, in which entries are inserted for 30 seconds with different numbers of clients, is that the CPU is used more intensively when inserting into MongoDB, and that insertion into MongoDB also uses the disk more intensively.

Next, one million entries were inserted into each database, with a different number of clients each time. The results can be seen in table 1 and in the diagram in figure 10. The data transfer is faster in the Redis case.

No. of clients    Redis time     MongoDB time
1                 4 min 49 s     5 min 51 s
10                1 min 20 s     2 min 1 s
20                1 min 16 s     2 min 12 s
40                1 min 20 s     1 min 52 s
Table 1: Time needed to insert 1 million entries

Fig. 10: Time needed to insert 1 million entries, by number of clients.

As can be seen from figure 10 and table 1, the insertion time for Redis is lower than the insertion time for MongoDB. Thus, Redis is faster on insertion. An advantage of Redis is that it keeps the data in memory.

An attempt was made to partition the data between two Redis servers using client-side partitioning. Client-side partitioning means that the clients directly select the right node on which to write or read a given key.

Many Redis clients implement client-side partitioning. The results were not impressive. For one million entries and ten clients, the insertion time was 1 minute 38 seconds, compared with 1 minute 20 seconds on a single machine. It should be noted that the implementation of client-side partitioning and its method of choosing the right server can add a major overhead.

The time needed to search for a specific entry in the database is analysed next, starting with Redis. A single client makes the requests, and one million entries are inserted under a single key.

In the first approach, a list structure is used to store the data. To search for a time interval (for the data model described in the architecture section), all the records must be fetched and decoded, and the matching entries then extracted. In this case the entire search is performed by the client. The searching time in this case was 12 seconds.

The second approach uses a sorted set structure. With this kind of structure, the time field (described in the architecture section) can be kept as the score of the entry, so the search is performed by the database. The searching time in this case was less than a second.

With the same configuration for MongoDB (1 client, 1 million entries in the database), the search can also be performed by the database. The searching time in this case was about two seconds.

Figure 11 shows a comparison of these searches. The insertion time was about 4 minutes and 36 seconds for the Redis entries and 6 minutes and 42 seconds for the MongoDB entries. To conclude, searching time is better on Redis when the sorted set structure is used.

Fig. 11: Comparison of searching times.

Fig. 12: Time needed for the delete operation.

Figure 12 shows the time needed for the delete operation in both databases. For one million entries in the database, the deletion time was almost 2 seconds for Redis and 20 seconds for MongoDB.

For 5 million entries, the deletion time was almost 11 seconds for Redis and 2 minutes and 50 seconds for MongoDB. In conclusion, deletion time is better on Redis.
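The difference between the two Redis search approaches compared in this section, a list filtered by the client and a sorted set queried by score, can be illustrated with a small in-memory sketch. This is plain Python that only imitates the two access patterns; it does not call Redis, and the sample entries, the JSON encoding and the function names are assumptions made for the example. In Redis itself the two approaches would presumably correspond to reading the whole list with LRANGE versus a score range query such as ZRANGEBYSCORE.

```python
import bisect
import json

# Hypothetical entries stored under one key: (creation_time, payload).
entries = [(t, f"reading-{t}") for t in range(0, 1000, 10)]

# List approach: values are encoded strings in insertion order.
encoded_list = [json.dumps({"time": t, "data": p}) for t, p in entries]


def search_list(encoded, start, end):
    """Client-side search: fetch all items, decode each, filter by time."""
    decoded = (json.loads(item) for item in encoded)
    return [d["data"] for d in decoded if start <= d["time"] <= end]


# Sorted-set approach: members are kept ordered by score (creation time).
scores = [t for t, _ in entries]
members = [p for _, p in entries]


def search_sorted_set(scores, members, start, end):
    """Range search on ordered scores: two binary searches, then a slice."""
    lo = bisect.bisect_left(scores, start)
    hi = bisect.bisect_right(scores, end)
    return members[lo:hi]
```

The list version touches and decodes every stored item regardless of the interval, while the sorted-set version only locates the two ends of the interval, which is consistent with the gap between the two measured search times.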
When an attempt was made to add 10 million entries to the Redis database, the Redis instance crashed after only 7-8 million entries had been added. For this test, 200 clients were used and each client tried to add 50000 entries. The RAM was exhausted, as can be seen in figure 13.

Fig. 13: RAM usage while inserting 10 million entries into Redis.

Another test scenario adds 5 million entries to each database using 200 clients, each of which creates 25000 entries. For Redis, the insertion time was 5 minutes and 15 seconds; the fastest client finished its requests in 5 minutes and 1 second, whilst the slowest finished in 5 minutes and 28 seconds. For MongoDB, the insertion time was 11 minutes and 44 seconds; the fastest client finished its requests in 11 minutes and 38 seconds, whilst the slowest finished in 11 minutes and 50 seconds. Figure 14 shows the time spent in each case.

Fig. 14: Time needed to insert 5 million entries with 200 clients.

A comparison between the number of entries inserted into the MongoDB database and the Redis database is shown in figure 15.

Fig. 15: Number of entries inserted into MongoDB and Redis.

As can be seen, when the number of clients increases, the number of entries inserted into Redis increases as well. The same cannot be said about MongoDB: the number of entries inserted into MongoDB increases very slowly as the number of clients grows.

V. CONCLUSION

This paper has presented a performance analysis of two NoSQL databases, Redis and MongoDB, which have become very popular as alternatives to relational (SQL) databases, together with the advantages and disadvantages of using each of them.

As shown in this paper, Redis can be a very good solution for insertion operations because it is faster than MongoDB. A disadvantage is that, because it is an in-memory database, a lot of data arriving in a short time interval can exhaust the RAM and crash the Redis instance. Redis is recommended for read-intensive workloads, for example storing user sessions (a frequently requested object for a web page). If the proper structure is used, a very good response time can be obtained for search queries.

MongoDB is slower than Redis for insertion and deletion, but it should be kept in mind that Redis is an in-memory database and cannot hold large amounts of data. MongoDB is a good solution for storing large chunks of data, because search queries are executed very fast.

In conclusion, if you want to store large amounts of data and search through it quickly, MongoDB is a good solution. If you have a limited amount of data and want to insert it very fast, you could use Redis. If you are keen on document databases, you should use MongoDB; if you are fond of key-value databases, you should use Redis. It all depends on your needs and your decisions.

REFERENCES
[1] http://redis.io/topics/data-types-intro
[2] NoSQL Evaluation: A Use Case Oriented Survey, 2011 International Conference on Cloud and Service Computing.
[3] Christof Strauch, NoSQL Databases, Stuttgart Media University.
[4] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, Bigtable: a distributed storage system for structured data, in Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7, ser. OSDI '06. Berkeley, CA, USA: USENIX Association, 2006, pp. 15-15.
[5] Yishan Li and Sathiamoorthy Manoharan, A performance comparison of SQL and NoSQL databases, Department of Computer Science, University of Auckland, New Zealand.
[6] http://www.mongodb.org/
[7] Veronika Abramova, Jorge Bernardino, Pedro Furtado, Which NoSQL Database? A Performance Overview.
[8] B. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, Benchmarking cloud serving systems with YCSB, in Proceedings of the 1st ACM Symposium on Cloud Computing (SoCC '10). ACM, New York, NY, USA, 2010, pp. 143-154.
[9] Bogdan George Tudorica, Cristian Bucur, A comparison between several NoSQL databases with comments and notes.
[10] Clarence J M Tauro, Aravindh S, Shreeharsha A.B, Comparative Study of the New Generation, Agile, Scalable, High Performance NOSQL Databases.
[11] K. Chodorow and M. Dirolf, MongoDB: The Definitive Guide. O'Reilly Media, September 2010.
[12] http://php.net/
[13] http://cassandra.apache.org/
[14] Adam Marcus, The NoSQL Ecosystem.
[15] http://hbase.apache.org/
[16] http://www.orientechnologies.com/orientdb/
[17] http://redis.io/topics/virtual-memory
[18] http://linux.die.net/man/1/sar
[19] http://linux.die.net/man/1/nload
[20] http://redis.io/topics/partitioning
[21] http://www.mysql.com/
