The short answer is "whenever possible."
HEAVY.AI uses a dictionary coder to optimize storage and performance. When you import data with Immerse, HEAVY.AI scans text-based fields for duplicate values. If 70% of the values appear more than once, the field is defined as TEXT ENCODING DICT. A dictionary table stores the common values, while the database stores references to the full values. This can result in substantial savings in storage and processing time.
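To make the idea concrete, here is a minimal Python sketch of dictionary encoding in general. It is not HEAVY.AI's implementation: the function names `dictionary_encode` and `duplicate_fraction` are hypothetical, and the way the 70% check is computed (share of rows whose value occurs more than once) is only one possible reading of the rule described above.

```python
from collections import Counter


def dictionary_encode(values):
    """Return (dictionary, codes). Each distinct string is stored once;
    the column itself keeps only small integer IDs."""
    dictionary = []  # position in this list is the integer ID
    index = {}       # string -> ID lookup used while encoding
    codes = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


def duplicate_fraction(values):
    """Fraction of rows whose value occurs more than once in the column.
    Assumption: this is one way to interpret the 70% rule, not the
    documented algorithm."""
    counts = Counter(values)
    repeated = sum(n for n in counts.values() if n > 1)
    return repeated / len(values) if values else 0.0


column = ["US", "DE", "US", "FR", "US", "DE", "JP", "US", "DE", "US"]
dictionary, codes = dictionary_encode(column)

# Decoding looks each ID up in the dictionary to recover the original text.
decoded = [dictionary[c] for c in codes]
```

In this example the ten-row column is reduced to a four-entry dictionary plus ten integer IDs, and 8 of the 10 rows hold a value that repeats, so under the assumed reading it would pass a 70% threshold.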
Also, if your queries use TEXT columns in GROUP BY or DISTINCT, these columns must be dictionary encoded.
You can further improve performance and memory savings by sharing dictionaries between columns and tables. You can read about shared dictionaries here.
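As a rough illustration of why sharing helps, the Python sketch below encodes two columns against a single dictionary. The `SharedDictionary` class and the sample airport codes are hypothetical and only model the concept; they do not reflect how HEAVY.AI stores dictionaries internally.

```python
class SharedDictionary:
    """A single string-to-ID mapping that several columns encode against."""

    def __init__(self):
        self.strings = []  # ID -> string
        self._ids = {}     # string -> ID

    def encode(self, value):
        # Assign the next free ID the first time a string is seen.
        if value not in self._ids:
            self._ids[value] = len(self.strings)
            self.strings.append(value)
        return self._ids[value]


shared = SharedDictionary()

# Two columns that draw from the same set of values (hypothetical data).
origin_codes = [shared.encode(v) for v in ["SFO", "JFK", "SFO", "ORD"]]
dest_codes = [shared.encode(v) for v in ["JFK", "SFO", "ORD", "ORD"]]

# Because both columns use the same IDs, equality between them can be
# checked on the integers without touching the strings.
same_airport = [o == d for o, d in zip(origin_codes, dest_codes)]
```

Each string is stored once for both columns, and comparisons across the columns work directly on the integer IDs.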