Hive
Can Hive on Spark be used when executing SQL tasks in Hive is too slow?
UHadoop does not currently support Hive on Spark. You can use Spark SQL instead, or enable the Spark Thrift Server and connect to it through beeline, as shown in the sketch below.
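As a minimal sketch, assuming the Spark Thrift Server has been started and is listening on the default HiveServer2-compatible port 10000 (replace the host and port with your cluster's values), a beeline connection looks like:
beeline -u "jdbc:hive2://<thrift-server-host>:10000/default" -n hadoop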
What to do if a map/reduce task runs out of memory when executing SQL statements?
If the log file contains a "running beyond physical memory limits" error, you can increase the memory available to map or reduce tasks with:
set mapreduce.map.memory.mb=XXX
set mapreduce.reduce.memory.mb=XXX
and so on. If this happens in many tasks, add the corresponding configuration to /home/hadoop/hive/conf/hive-site.xml so it applies by default (a sketch of the entries follows).
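For example, the hive-site.xml entries could look like this; the 4096 MB values are placeholders and should be sized for your workload:
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>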
HiveServer2 reports insufficient file permissions when submitting tasks through JDBC
You can connect to Hive as the hadoop user, for example, in Java code:
Connection con = DriverManager.getConnection("jdbc:hive2://ip:10000/default", "hadoop", "");
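A fuller sketch of the same connection, assuming the Hive JDBC driver is on the classpath; the ip, port and the test query are placeholders to adapt:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcExample {
    public static void main(String[] args) throws Exception {
        // Load the HiveServer2 JDBC driver (optional with JDBC 4+ drivers).
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Connect as the hadoop user; the password is left empty.
        try (Connection con = DriverManager.getConnection(
                     "jdbc:hive2://ip:10000/default", "hadoop", "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM spark_testtable LIMIT 10")) {
            while (rs.next()) {
                // Print the first column of each returned row.
                System.out.println(rs.getString(1));
            }
        }
    }
}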
What to do when executing SQL is too slow?
By default, Hive optimizes simple queries such as plain selects into fetch tasks and does not start map tasks. You can force map tasks to start by specifying the hive.fetch.task.conversion=none parameter on the command line, or by configuring it in /home/hadoop/hive/conf/hive-site.xml (a sketch follows the example).
Example:
hive -e "set hive.fetch.task.conversion=none;select * from spark_testtable;"
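To make the setting permanent, a sketch of the corresponding hive-site.xml entry:
<property>
  <name>hive.fetch.task.conversion</name>
  <value>none</value>
</property>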