Preface
By default, a freshly deployed Hive (or Spark SQL) installation does not support transactions; additional configuration is required to enable ACID in Hive.
Hive's DELETE and UPDATE statements likewise only work once ACID is enabled. cuiyaonan2000@163.com
References:
- LanguageManual DML - Apache Hive - Apache Software Foundation
- Hive Transactions - Apache Hive - Apache Software Foundation
Again, this is just additional configuration in hive-site.xml:

```xml
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

Database
If you are enabling ACID on a metastore that has already been initialized, go to the Hive directory /soft/hadoop/apache-hive-3.1.2-bin/scripts/metastore/upgrade/mysql, locate the appropriate upgrade scripts there, and execute them in the MySQL database. cuiyaonan2000@163.com
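A minimal sketch of running one of those scripts from the mysql client. The database name `metastore` and the specific script file are assumptions; pick the script that matches the schema version you are upgrading from:

```sql
-- A sketch, assuming the metastore database is named "metastore" and you are
-- upgrading from 2.3.0; choose the script matching your current schema version.
USE metastore;
SOURCE /soft/hadoop/apache-hive-3.1.2-bin/scripts/metastore/upgrade/mysql/upgrade-2.3.0-to-3.0.0.mysql.sql;
```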
If the metastore database has not been initialized yet, simply configure hive-site.xml first and then initialize the schema directly. The command is:

```shell
schematool -dbType mysql -initSchema
```

Table settings
From the official documentation (and community translations of it), a table must meet all of the following requirements to support transactions, DELETE, and UPDATE:
- The table must be stored as ORC (STORED AS ORC);
- The table must be bucketed (CLUSTERED BY (col_name, col_name, ...) INTO num_buckets BUCKETS);
- The table property transactional must be set to true (TBLPROPERTIES ('transactional'='true')).
The example from the official documentation:

```sql
CREATE TABLE table_name (
  id   int,
  name string
)
CLUSTERED BY (id) INTO 2 BUCKETS STORED AS ORC
TBLPROPERTIES (
  "transactional"="true",
  "compactor.mapreduce.map.memory.mb"="2048",                     -- specify compaction map job properties
  "compactorthreshold.hive.compactor.delta.num.threshold"="4",    -- trigger minor compaction if there are more than 4 delta directories
  "compactorthreshold.hive.compactor.delta.pct.threshold"="0.5"   -- trigger major compaction if the ratio of size of delta
                                                                  -- files to size of base files is greater than 50%
);
```
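Once such a table exists, transactional DML works as expected. A minimal sketch (the rows here are made up for illustration, and the session is assumed to use the DbTxnManager configured earlier):

```sql
-- Assumes the ACID settings from hive-site.xml above are active in the session.
INSERT INTO table_name VALUES (1, 'alice'), (2, 'bob');
UPDATE table_name SET name = 'carol' WHERE id = 2;  -- note: bucketing columns (id) cannot be updated
DELETE FROM table_name WHERE id = 1;
```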