Alexander Zeier, Visiting Professor at the Massachusetts Institute of Technology, delivered his keynote “SAP HANA®: In-Memory Data Management for Enterprise Applications” in San Antonio, Texas. He shared insights about in-memory computing to help faculty complement their lectures and help students understand this breakthrough technology.
More Details on Facebook – SAP UA : http://www.facebook.com/media/set/?set=a.10150782259887498.465877.367939082497&type=3
Hi Alexander,
I really liked your presentation at the University Alliance keynote. Unfortunately, I did not fully understand how columnar storage helps to reduce the amount of memory required. Could you elaborate on that?
Bests,
Dr. Antonio Banderas
Hi Antonio,
Please check whether the following helps to explain compression in HANA.
Even though main memory sizes have grown rapidly, data compression techniques are still necessary if we are to keep all the data of large enterprise applications in main memory.
Compression also minimizes the amount of data that needs to be transferred between non-volatile storage and main memory on, for example, system startup. In HANA, data values are not stored directly, but rather as references to a compressed, sorted dictionary containing the distinct column values. These references are encoded and can even be further compressed to save space. This dictionary-compression technique offers excellent compression rates in an enterprise environment where many values, for example, country names, are repeated. Read performance is also improved, because many operations can be performed directly on the compressed data.
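To make the idea concrete, here is a minimal Python sketch of dictionary encoding. It only illustrates the principle described above and is not HANA's actual implementation; the function names (dictionary_encode, bits_per_value, count_equal) and the sample country column are assumptions made up for this example.

```python
from bisect import bisect_left


def dictionary_encode(column):
    """Return (sorted dictionary of distinct values, list of value IDs)."""
    dictionary = sorted(set(column))
    # Each value is replaced by its position in the sorted dictionary.
    value_ids = [bisect_left(dictionary, v) for v in column]
    return dictionary, value_ids


def bits_per_value(dictionary):
    """Minimum number of bits needed to address every dictionary entry."""
    return max(1, (len(dictionary) - 1).bit_length())


def count_equal(dictionary, value_ids, value):
    """Count matching rows by comparing integer IDs instead of strings."""
    pos = bisect_left(dictionary, value)
    if pos == len(dictionary) or dictionary[pos] != value:
        return 0  # the value does not occur in the column at all
    return sum(1 for vid in value_ids if vid == pos)


countries = ["Germany", "USA", "Germany", "France", "USA", "Germany"]
dictionary, value_ids = dictionary_encode(countries)

print(dictionary)                   # distinct values, sorted
print(value_ids)                    # one small integer per row
print(bits_per_value(dictionary))   # bits needed per encoded value
print(count_equal(dictionary, value_ids, "Germany"))
```

Each country name is stored once in the dictionary, and every row holds only a small integer. The equality count runs on those integers without decoding the strings, which is the sense in which operations can work directly on the compressed data.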
Best Alexander
Hi Alexander,
Thanks for your reply. Do you know whether dictionary encoding can be used in row-oriented database systems as well?
Bests,
Antonio
Hi Antonio,
Short answer: in row-oriented, disk-based databases this is not possible in the same way, because the access latency would be far too long. Only an in-memory column database can offer this.
Here are more details on why compression works so well for in-memory data management:
Data compression techniques exploit redundancy within data and knowledge about the data domain. Compression applies particularly well to columnar storage in an enterprise data management scenario, since (a) all data within a column have the same data type and (b) in many cases there are few distinct values, for example in a country column or a status column. In column stores, compression is used for two reasons: to save space and to increase performance.
Efficient use of space is of particular importance to in-memory data management because, even though the cost of main memory has dropped considerably, it is still relatively expensive compared to disk. Compression within the columns increases the density of information relative to the space consumed. As a result, more relevant information can be loaded for processing at a time, thereby increasing performance. Fewer load operations are necessary than with row storage, where even columns of no relevance to the query are loaded.
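The difference in load operations can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of how HANA or any real database lays out memory: the sample table, the column names, and the two helper functions are hypothetical, and "values loaded" simply counts how many attribute values each scan has to touch.

```python
# Hypothetical sample table, stored row by row.
COLUMN_NAMES = ("id", "country", "status", "amount")
rows = [
    (1, "Germany", "open", 120.0),
    (2, "USA", "closed", 80.0),
    (3, "Germany", "open", 45.5),
]

# The same table in a column layout: one list per attribute.
columns = {
    name: [row[i] for row in rows] for i, name in enumerate(COLUMN_NAMES)
}


def sum_amount_row_store(rows):
    """Sum 'amount' by scanning whole rows; every attribute is loaded."""
    total = 0.0
    values_loaded = 0
    for row in rows:
        values_loaded += len(row)  # id, country, status come along too
        total += row[3]
    return total, values_loaded


def sum_amount_column_store(columns):
    """Sum 'amount' by scanning only the column the query needs."""
    amount = columns["amount"]
    return sum(amount), len(amount)


print(sum_amount_row_store(rows))
print(sum_amount_column_store(columns))
```

Both scans return the same total, but the row scan touches all four attributes of every row while the column scan touches only the amount values.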
I hope these details help.
Best Alexander