US3 Support
Note:
- Only newly created instances of the uhadoop-2.2 version are supported.
1. Use Cases
- Cold-data or low-frequency analytics scenarios
- Small-scale compute and analysis scenarios
- Data backup
2. Access Steps
- UHadoop resource creation: create a uhadoop-2.2 version UHadoop instance via the console.
- US3 resource creation:
  - Create the required Bucket on the US3 console; if it already exists, skip this step. For details, refer to: [US3 - Create Storage Space](/docs/ufile/guide/space#Creating a Storage Space)
  - Create a token on the US3 console and associate it with the Bucket; if one already exists, skip this step. For details, refer to: US3 - Token Management
- UHadoop configuration modification:
  - Select the created UHadoop instance, go to the Cluster Management -> Service Management page, and click the Hadoop -> Parameter Configuration button.
  - Modify the US3 token information:
    - fs.us3.access.key: US3 token public key
    - fs.us3.secret.key: US3 token private key
  - After these two parameters are modified, click the OK button, then check the restart reminder option to restart the Hadoop service.
- Hive adaptation: after the Hadoop service restarts, restart the Hive service as well. Warning: if Hive is not restarted, errors will be reported when accessing US3.
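For reference, the two parameters above correspond to the following entries in Hadoop's core-site.xml (a minimal sketch with placeholder values; on UHadoop they should be changed through the console's Parameter Configuration page rather than edited by hand):

```xml
<!-- US3 token credentials; placeholder values, actually set via the console -->
<property>
  <name>fs.us3.access.key</name>
  <value>YOUR_US3_TOKEN_PUBLIC_KEY</value>
</property>
<property>
  <name>fs.us3.secret.key</name>
  <value>YOUR_US3_TOKEN_PRIVATE_KEY</value>
</property>
```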
3. Usage Examples
3.1 HDFS
The format is `hadoop fs -ls us3://<bucket name>/<path>`. Usage examples:
- Upload a file:
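  A minimal example, assuming a local file test.txt and the us3-uhadoop bucket used in the later examples:

  ```shell
  hadoop fs -put test.txt us3://us3-uhadoop/
  ```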
- View the file list:
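  For example, listing the bucket path written above:

  ```shell
  hadoop fs -ls us3://us3-uhadoop/
  ```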
3.2 Hive
You can use either of the following two methods to store data on US3.
- Create an entire database on US3:

  `hive (default)> create database hive_us3 location "us3://us3-uhadoop/hive_us3";`
- Create a specific table on US3:

  `hive (hive_db)> create table us3_test (name string) row format delimited fields terminated by '\t' location 'us3://sniper-s3-adapter/hive/us3-test';`
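Once the table exists, reads and writes work as usual while the data files land under the US3 location. A quick check (a sketch assuming Hive 0.14+ for INSERT ... VALUES, using the table from the example above):

```sql
-- Write a row, then read it back; the data file is stored under the table's US3 location
insert into table us3_test values ('alice');
select * from us3_test;
```

You can then confirm the data file exists with `hadoop fs -ls us3://sniper-s3-adapter/hive/us3-test`.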
3.3 Spark
Whether it is through spark-submit or pyspark, spark-shell, spark-sql and other read and write US3, you need to specify the file path format as: us3://<bucket name>/path
, as follows:
- Upload the test script:

  `hadoop fs -put $SPARK_HOME/examples/src/main/python/pi.py us3://us3-uhadoop/example`
- Run the test:

  `spark-submit --master yarn --deploy-mode client --num-executors 2 --executor-cores 1 --executor-memory 1G us3://us3-uhadoop/example/pi.py 100`
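Reading and writing US3 paths from application code works the same way. A minimal pyspark sketch (the bucket and script path follow the example above; the output path pi_lines is hypothetical):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("us3-demo").getOrCreate()

# Read the script uploaded in the previous step and count its lines
df = spark.read.text("us3://us3-uhadoop/example/pi.py")
print(df.count())

# Write the DataFrame back to US3 as Parquet
df.write.mode("overwrite").parquet("us3://us3-uhadoop/example/pi_lines")

spark.stop()
```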