ClickHouse - column-oriented database


Apache
Linux
C/C++

Software overview

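ClickHouse is a column-oriented database developed by Yandex, Russia's largest search engine. Notably, its performance far exceeds that of many commercial MPP databases, such as Vertica and InfiniDB.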

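Compared with traditional database software, ClickHouse is 100-1000x faster:

100-million-row dataset:

  • ClickHouse is about 5 times faster than Vertica, 279 times faster than Hive, and 801 times faster than MySQL

1-billion-row dataset:

  • ClickHouse is about 5 times faster than Vertica; MySQL and Hive could not complete the task at all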

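The project currently has some shortcomings:

  • Pre-built packages are available only for Ubuntu, and the project currently has no architecture documentation

  • Only the C++ source code on GitHub is available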

Main features

  • True column-oriented

  • Vectorized query execution

  • Data compression

  • Parallel and distributed query execution

  • Real-time data ingestion

  • On-disk locality of reference

  • Real-time query processing

  • Cross-datacenter replication

  • High availability

  • SQL support

  • Local and distributed joins

  • Pluggable external dimension tables

  • Arrays and nested data types

  • Approximate query processing

  • Probabilistic data structures

  • Full support of IPv6

  • Features for web analytics

  • State-of-the-art algorithms

  • Detailed documentation

  • Clean documented code
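
To make the "column-oriented" item in the list above more concrete, here is a minimal Python sketch of the general idea. It is not ClickHouse code and does not reflect its internal storage format; the table, the field names (user_id, country, duration_ms) and the helper functions are all hypothetical, chosen only for illustration. The point it tries to show is that when each column is stored as its own contiguous sequence, an aggregate over one field only has to read that field.

```python
from array import array

# Row-oriented layout: every record keeps all of its fields together.
# (Hypothetical sample data, not taken from any ClickHouse benchmark.)
rows = [
    {"user_id": 1, "country": "RU", "duration_ms": 120},
    {"user_id": 2, "country": "US", "duration_ms": 340},
    {"user_id": 3, "country": "RU", "duration_ms": 90},
]


def to_columns(records):
    """Pivot a list of row dicts into one sequence per column."""
    return {
        "user_id": array("q", (r["user_id"] for r in records)),
        "country": [r["country"] for r in records],
        "duration_ms": array("q", (r["duration_ms"] for r in records)),
    }


def total_duration(cols):
    """Sum one column; the other columns are never touched."""
    return sum(cols["duration_ms"])


def total_duration_for(cols, country):
    """Filter on one column and aggregate another, reading only those two."""
    return sum(
        d for c, d in zip(cols["country"], cols["duration_ms"]) if c == country
    )


columns = to_columns(rows)
print(total_duration(columns))
print(total_duration_for(columns, "RU"))
```

In a real columnar engine each column would typically also be compressed and processed in batches, which is what the "Data compression" and "Vectorized query execution" items refer to; the sketch leaves both out.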

Use cases

  • Web and App analytics

  • Advertising networks and RTB

  • Telecommunications

  • E-commerce

  • Information security

  • Monitoring and telemetry

  • Business intelligence

  • Online games

  • Internet of Things