We are dealing with both relational and NoSQL databases and we are often involved in advising our customers on choosing better product for their unique circumstances. One of our favorites, MariaDB ColumnStore has number of immediate advantages for utilization in BI systems:
For companies already using MySQL it may offer additional benefit of not leaving MySQL universe.
However, there are other good NoSQL vendors around and we are always on the lookout for better product or features. This time we noticed number of discussions and blogs that claimed MariaDB ColumnStore as having significant performance issues comparing to competition.
For example, here they compare the following vendors (and MariaDB ColumnStore appears the slowest):
We decided to make our own test – and if necessary, to find a solution. Below is the description of our experiment.
We always approach optimization process using two principal methods:
- Data Model optimization
- Parameters optimization
Data Model Optimization
We looked at the tables used in the original tests and decided to replace VARCHAR and DECIMAL to INT UNSIGNED wherever possible.
Our environment: Centos 7 Guest on Oracle Virtual Box, MacBook Pro.
SQL statement: SELECT AVG(fact_id) FROM fact_ratings;
|HPE Vertica||0.39 sec|
|MariaDB ColumnStore||28.37 sec|
Our result for MariaDB ColumnStore: 1.34 sec.
SQL statement: SELECT rating, count(*) from fact_ratings group by rating order by rating;
|HPE Vertica||0.95 sec|
|MariaDB ColumnStore||29.24 sec|
Our result for MariaDB ColumnStore: 2.02 sec.
SQL statement: select dim_movies.title, avg(fact_ratings.rating) from dim_movies join fact_ratings on dim_movies.movieId = fact_ratings.”movieId” group by dim_movies.title DESC order by dim_movies.title limit 20;
|HPE Vertica||14.94 sec|
|MariaDB ColumnStore||5 min 22 sec|
Our result for MariaDB ColumnStore: 10.98 sec.
SQL statement: select dim_movies.title, fact_ratings.rating, dim_links.imdbid
left outer join fact_ratings on dim_movies.movieId = fact_ratings.”movieId”
left outer join dim_links on dim_movies.movieId = dim_links.movieid
where fact_ratings.rating > (select avg(fact_ratings.rating) from fact_ratings)
order by fact_ratings.rating DESC
|HPE Vertica||10.12 sec|
|MariaDB ColumnStore||6 min 4 sec|
Our result for MariaDB ColumnStore: 17.55 sec.
We experimented using very simple optimization method and common hardware – and still results are pretty good. If we were using the same hardware as described in the original comparison or applied parameter optimization – we tend to believe our results would be even better.
We are glad to be able to confirm MariaDB ColumnStore capabilities and will continue recommending it to our customers.