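Big Data Spark "Mushroom Cloud" Campaign, Lesson 98: Hive Performance Tuning with Compression and the Distributed Cache
Hive compression typically uses Snappy, LZO, or GZIP.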
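The codecs available to the cluster are registered in core-site.xml, as the comma-separated value of the io.compression.codecs property: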
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.BZip2Codec,
org.apache.hadoop.io.compress.DeflateCodec,
org.apache.hadoop.io.compress.SnappyCodec,
org.apache.hadoop.io.compress.Lz4Codec
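To compress the intermediate data passed between stages of a Hive job, enable it for the session and choose a codec for the map output:
set hive.exec.compress.intermediate=true;
set mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
Note that com.hadoop.compression.lzo.LzoCodec comes from the separate hadoop-lzo library and does not appear in the codec list above, so it has to be installed and registered before this setting will work.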
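As a rough sketch of how the session settings above might look on a Hadoop 2+ cluster: the older mapred.map.output.compression.codec name is deprecated there in favor of mapreduce.map.output.compress.codec. The choice of Snappy for intermediate data and Gzip for the final output is only an illustrative assumption, not something the lesson prescribes.

```sql
-- Compress data passed between the stages of a multi-stage Hive job.
SET hive.exec.compress.intermediate=true;

-- Compress map output with a fast codec (Snappy is an assumed choice here;
-- it is one of the codecs listed in io.compression.codecs above).
SET mapreduce.map.output.compress=true;
SET mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;

-- Optionally compress the final query output as well (Gzip is an assumed choice).
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;
```

A fast codec usually suits intermediate data, which is written and read once, while a codec with a higher compression ratio may suit final output that is stored for longer.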
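Files that every task needs locally can be shipped to the worker nodes through Hadoop's distributed cache, using DistributedCache.addCacheFile().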
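In Hive, extra JARs (for example those containing UDFs or SerDes) are made available to jobs through the hive.aux.jars.path property.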
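Within a Hive session, the same distributed-cache mechanism can be reached with the ADD FILE and ADD JAR commands, which register a resource so that it is shipped to the tasks of subsequent queries. The sketch below is illustrative only: the paths, the script name lookup.py, and the table users with its columns are hypothetical placeholders.

```sql
-- Register a script and a JAR as session resources; Hive distributes them
-- to the task nodes when a query runs. Both paths are hypothetical.
ADD FILE /tmp/lookup.py;
ADD JAR /tmp/my-udfs.jar;

-- Show what has been registered in this session.
LIST FILES;
LIST JARS;

-- Use the cached script from a query; it is referenced by file name only.
-- The table "users" and its columns are hypothetical.
SELECT TRANSFORM (id, name)
  USING 'python lookup.py'
  AS (id, label)
FROM users;
```

ADD JAR applies only to the current session, whereas hive.aux.jars.path is set in the configuration before Hive starts and therefore applies to every session.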