
US3 Support

Note:

  • Only newly created instances of the uhadoop-2.2 version are supported.

1. Use Cases

  • Cold-data or low-frequency computation and analysis scenarios
  • Small-scale computation and analysis scenarios
  • Data backup

2. Access Steps

  1. UHadoop Resource Creation: Create a UHadoop instance of the uhadoop-2.2 version via the console.

  2. US3 Resource Creation:

    • Create the required Bucket on the US3 console. If it already exists, skip this step. For details, refer to: [US3-Create Storage Space](/docs/ufile/guide/space#Creating a Storage Space)
    • Create a token on the US3 console and associate it with the Bucket. If one already exists, skip this step. For details, refer to: US3-Token Management
  3. UHadoop Configuration Modification:

    • Select the created UHadoop instance, go to the Cluster Management -> Service Management page, and click the Hadoop -> Parameter Configuration button

    • Modify the US3 token information:

      • fs.us3.access.key: US3 token public key
      • fs.us3.secret.key: US3 token private key

      After modifying these two parameters, click the OK button, then check the restart reminder option to restart the Hadoop service (a sketch of the resulting configuration follows this list).

  4. Hive Adaptation: After the Hadoop service restarts, restart the Hive service as well. Warning: if Hive is not restarted, accessing US3 will report an error.
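
For reference, the two parameters from step 3 correspond to Hadoop configuration entries like the following (a sketch with placeholder values; on UHadoop they are managed through the console's Parameter Configuration page, typically landing in core-site.xml):

    <!-- US3 token credentials; replace the placeholder values with your own keys -->
    <property>
        <name>fs.us3.access.key</name>
        <value>YOUR_US3_PUBLIC_KEY</value>
    </property>
    <property>
        <name>fs.us3.secret.key</name>
        <value>YOUR_US3_PRIVATE_KEY</value>
    </property>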

3. Usage Examples

3.1 HDFS

Paths use the format us3://<bucket name>/<path>, for example: hadoop fs -ls us3://<bucket name>/<path>. Usage examples are as follows:

  • Upload a file:
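
    For example (test.txt is a placeholder local file; us3-uhadoop is the sample bucket used elsewhere on this page):

    hadoop fs -put test.txt us3://us3-uhadoop/tmp/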

  • View the file list:
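
    For example, listing the directory written above:

    hadoop fs -ls us3://us3-uhadoop/tmp/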

3.2 Hive

You can store data on US3 using either of the following two methods.

  1. Create the entire database on US3

    hive (default)> create database hive_us3 location "us3://us3-uhadoop/hive_us3";
    
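    Tables created in this database inherit the US3 location. As a quick check (a sketch; t1 is a throwaway table name, and the Location field in the output should point under us3://us3-uhadoop/hive_us3):

    hive (default)> use hive_us3;
    hive (hive_us3)> create table t1 (id int);
    hive (hive_us3)> describe formatted t1;
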
  2. Create only the specified table on US3

    hive (hive_db)> create table us3_test (name string) row format delimited fields terminated by '\t' location 'us3://sniper-s3-adapter/hive/us3-test';
    
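    To verify that the table is indeed backed by US3, you can write a row and list the bucket path (a sketch using the table above; INSERT ... VALUES requires a Hive version that supports it):

    hive (hive_db)> insert into us3_test values ('tom');
    hive (hive_db)> select * from us3_test;
    hadoop fs -ls us3://sniper-s3-adapter/hive/us3-test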

3.3 Spark

Whether you access US3 through spark-submit, pyspark, spark-shell, spark-sql, or other tools, you need to specify file paths in the format us3://<bucket name>/<path>, as follows:

  • Upload the test script:

    hadoop fs -put $SPARK_HOME/examples/src/main/python/pi.py us3://us3-uhadoop/example
    
  • Run the test:

    spark-submit --master yarn --deploy-mode client --num-executors 2 --executor-cores 1 --executor-memory 1G us3://us3-uhadoop/example/pi.py 100
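
  • Read the data back interactively, for example in pyspark (a sketch; it counts the lines of the script uploaded above):

    pyspark --master yarn
    >>> sc.textFile("us3://us3-uhadoop/example/pi.py").count()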