Index of cdh6/6.3.2/docs/hadoop-3.0.0-cdh6.3.2/hadoop-hdfs-httpfs/


Name                        Last Modified       Size
Parent Directory
apidocs/                    -                   -
css/                        -                   -
images/                     -                   -
dependency-analysis.html    2019-11-12 13:42    21.67KB
httpfs-default.html         2019-11-12 13:42    6.71KB
index.html                  2019-11-12 13:41    25.02KB
project-reports.html        2019-11-12 13:43    22.34KB
ServerSetup.html            2019-11-12 13:45    31.09KB
UsingHttpTools.html         2019-11-12 13:43    25.15KB

    Hadoop HDFS over HTTP - Documentation Sets

    HttpFS is a server that provides a REST HTTP gateway supporting all HDFS File System operations (read and write). It is interoperable with the webhdfs REST HTTP API.

    HttpFS can be used to transfer data between clusters running different versions of Hadoop (overcoming RPC versioning issues), for example using Hadoop DistCP.
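
    As a sketch of that pattern (the hostnames and paths here are hypothetical), a DistCp job run on the destination cluster can read the source cluster through its HttpFS gateway using the webhdfs:// scheme, since HttpFS exposes the same REST API:

      $ hadoop distcp webhdfs://httpfs-host:14000/user/foo/data hdfs://dest-nn:8020/user/foo/data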

    HttpFS can be used to access data in HDFS on a cluster behind a firewall (the HttpFS server acts as a gateway and is the only system that is allowed to cross the firewall into the cluster).

    HttpFS can be used to access data in HDFS using HTTP utilities (such as curl and wget) and HTTP client libraries from languages other than Java (for example, Perl).
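
    For instance, the OPEN operation can be issued directly with wget (a minimal sketch; the host, port, and file are hypothetical), saving the result locally:

      $ wget 'http://httpfs-host:14000/webhdfs/v1/user/foo/README.txt?op=OPEN&user.name=foo' -O README.txt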

    The webhdfs client FileSystem implementation can be used to access HttpFS using the Hadoop filesystem command line tool (hadoop fs) as well as from Java applications using the Hadoop FileSystem Java API.
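
    For example (hypothetical gateway host), pointing the command line tool at the HttpFS port with the webhdfs:// scheme is all that is needed:

      $ hadoop fs -ls webhdfs://httpfs-host:14000/user/foo
      $ hadoop fs -cat webhdfs://httpfs-host:14000/user/foo/README.txt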

    HttpFS has built-in security supporting Hadoop pseudo authentication, HTTP SPNEGO Kerberos, and other pluggable authentication mechanisms. It also provides Hadoop proxy user support.
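
    As a rough illustration on a Kerberos-enabled deployment (hypothetical host and users, and assuming a valid ticket obtained with kinit), curl can authenticate via SPNEGO, and a proxy user request can be made with the doas query parameter:

      $ curl --negotiate -u : 'http://httpfs-host:14000/webhdfs/v1/user/foo?op=LISTSTATUS'
      $ curl --negotiate -u : 'http://httpfs-host:14000/webhdfs/v1/user/bar?op=LISTSTATUS&doas=bar'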

    How Does HttpFS Work?

    HttpFS is a separate service from the Hadoop NameNode.

    HttpFS itself is a Java Jetty web application.

    HttpFS HTTP web-service API calls are HTTP REST calls that map to an HDFS file system operation. For example, using the curl Unix command:

    • $ curl 'http://httpfs-host:14000/webhdfs/v1/user/foo/README.txt?op=OPEN&user.name=foo' returns the contents of the HDFS /user/foo/README.txt file.

    • $ curl 'http://httpfs-host:14000/webhdfs/v1/user/foo?op=LISTSTATUS&user.name=foo' returns the contents of the HDFS /user/foo directory in JSON format.

    • $ curl 'http://httpfs-host:14000/webhdfs/v1/user/foo?op=GETTRASHROOT&user.name=foo' returns the path /user/foo/.Trash; if / is an encryption zone, it returns the path /.Trash/foo. See the HDFS Transparent Encryption documentation for more details about the trash path in an encryption zone.

    • $ curl -X POST 'http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS&user.name=foo' creates the HDFS /user/foo/bar directory.
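
    Writes work the same way but take two steps in the REST API: a first PUT with op=CREATE returns an HTTP 307 redirect whose Location header gives the URL to which the file bytes are then sent. A minimal sketch (hypothetical host, HDFS path, and local file; the second request typically needs the application/octet-stream content type):

      $ curl -i -X PUT 'http://httpfs-host:14000/webhdfs/v1/user/foo/data.txt?op=CREATE&user.name=foo'
      $ curl -i -X PUT -T data.txt -H 'Content-Type: application/octet-stream' '<Location header value from the previous response>'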

    User and Developer Documentation