Operating on Hadoop Files with the Java API

2019-03-28 14:19 | Source: Internet

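1. Overview

2. File operations

2.1 Uploading a local file to the Hadoop fs

2.2 Creating a new file in the Hadoop fs and writing to it

2.3 Deleting a file from the Hadoop fs

2.4 Reading a file

3. Directory operations

3.1 Creating a directory on the Hadoop fs

3.2 Deleting a directory

3.3 Listing all files in a directory

4. References and code download

<1>. Overview

Almost all of Hadoop's file-operation classes live in the org.apache.hadoop.fs package. The API supports operations such as opening, reading, writing, and deleting files.

The interface class that the Hadoop library ultimately exposes to users is FileSystem. It is an abstract class, so a concrete instance can only be obtained through the class's get method. get has several overloads; the most commonly used one is:

static FileSystem get(Configuration conf) throws IOException;

FileSystem wraps almost every file operation, such as mkdirs and delete. Putting this together, the basic skeleton of a file-operation program looks like this: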

operator()
{
    get the Configuration object
    get the FileSystem object
    perform the file operations
}
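The skeleton above says nothing about releasing the FileSystem handle. FileSystem implements java.io.Closeable, so one reasonable way to structure the three steps is a try-with-resources block, which calls close() even when the operation throws. The sketch below is only an illustration of that pattern: StandInFs is a hypothetical placeholder class, not part of Hadoop, used so the example runs without a cluster.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch of the "get handle -> operate -> release" pattern with try-with-resources.
// StandInFs is a hypothetical placeholder for org.apache.hadoop.fs.FileSystem.
class ResourcePatternSketch {

    /** Hypothetical stand-in that records what happened to it. */
    static class StandInFs implements Closeable {
        final List<String> log = new ArrayList<>();

        void operate(boolean fail) throws IOException {
            log.add("operate");
            if (fail) {
                throw new IOException("simulated failure");
            }
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    /** Runs one operation and returns the recorded steps. */
    static List<String> run(boolean fail) {
        StandInFs fs = new StandInFs();      // stands in for FileSystem.get(conf)
        try (StandInFs handle = fs) {        // closed automatically when the block ends
            handle.operate(fail);            // the actual file operation
        } catch (IOException e) {
            // the resource has already been closed when this handler runs
            fs.log.add("error: " + e.getMessage());
        }
        return fs.log;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [operate, close]
        System.out.println(run(true));  // [operate, close, error: simulated failure]
    }
}
```

In both runs "close" is recorded, and in the failing run it is recorded before the error handler executes.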

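Note that to run the programs below you need to package them into a jar and run it with hadoop jar. That is somewhat tedious; an alternative is to install the Hadoop plugin for Eclipse, which saves a lot of packaging time.

<2>. File operations

2.1 Uploading a local file to the file system

    // imports used by the snippets in this article
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    /*
     * Upload a local file to HDFS.
     * Note that both paths must be full paths, e.g. /tmp/test.c
     */
    public static void uploadLocalFile2HDFS(String s, String d)
        throws IOException
    {
        Configuration config = new Configuration();
        try (FileSystem hdfs = FileSystem.get(config))
        {
            Path src = new Path(s);
            Path dst = new Path(d);

            hdfs.copyFromLocalFile(src, dst);
        }
    }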
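The comment in the snippet above asks for full paths such as /tmp/test.c. A path without a scheme is resolved against whatever file system FileSystem.get(conf) returns, which is normally determined by the fs.defaultFS configuration property; a fully qualified location additionally names the scheme and the NameNode address. The sketch below uses only java.net.URI (Hadoop's Path exposes a comparable view through toUri()) to show how such a location breaks down. The host name namenode.example and port 8020 are made-up values for illustration.

```java
import java.net.URI;

// Sketch: how a fully qualified location breaks down into scheme, authority and path.
// The host name "namenode.example" and port 8020 are hypothetical.
class HdfsUriSketch {

    /** Resolves an absolute path against a base file system URI. */
    static URI qualify(String fileSystemUri, String absolutePath) {
        return URI.create(fileSystemUri).resolve(absolutePath);
    }

    public static void main(String[] args) {
        URI full = qualify("hdfs://namenode.example:8020", "/tmp/test.c");
        System.out.println(full);                // hdfs://namenode.example:8020/tmp/test.c
        System.out.println(full.getScheme());    // hdfs
        System.out.println(full.getAuthority()); // namenode.example:8020
        System.out.println(full.getPath());      // /tmp/test.c
    }
}
```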

2.2 Creating a new file and writing to it

    /*
     * Create a new file in HDFS and write the content to it.
     * Note that toCreateFilePath must be the full path.
     */
    public static void createNewHDFSFile(String toCreateFilePath, String content) throws IOException
    {
        Configuration config = new Configuration();

        // the stream is closed first, then the file system handle
        try (FileSystem hdfs = FileSystem.get(config);
             FSDataOutputStream os = hdfs.create(new Path(toCreateFilePath)))
        {
            // encode the text explicitly as UTF-8
            os.write(content.getBytes(StandardCharsets.UTF_8));
        }
    }
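The write call above works on bytes, not characters, so the number of bytes that end up in the file depends on the encoding. The following sketch shows this with plain java.io classes. A ByteArrayOutputStream stands in for the stream returned by FileSystem.create(), which is also a java.io.OutputStream; the class and method names are invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Sketch: encode text explicitly as UTF-8 before writing it to a byte stream.
// A ByteArrayOutputStream stands in for the stream returned by FileSystem.create().
class Utf8WriteSketch {

    /** Writes content as UTF-8 and returns the number of bytes written. */
    static int writeUtf8(OutputStream out, String content) throws IOException {
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        out.write(bytes);
        return bytes.length;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // "hello " followed by two Chinese characters (U+4F60, U+597D)
        int written = writeUtf8(sink, "hello \u4f60\u597d");
        System.out.println(written); // 12: six ASCII bytes plus two 3-byte characters
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Passing StandardCharsets.UTF_8 rather than a charset name string also avoids a lookup failure if the name is misspelled or padded with spaces.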

2.3 Deleting a file

    /*
     * Delete a file from HDFS.
     * Note that dst must be the full path name.
     */
    public static boolean deleteHDFSFile(String dst) throws IOException
    {
        Configuration config = new Configuration();
        try (FileSystem hdfs = FileSystem.get(config))
        {
            Path path = new Path(dst);

            // recursive = false: the target is a single file
            return hdfs.delete(path, false);
        }
    }

2.4 Reading a file

    /*
     * Read the content of an HDFS file.
     * Note that dst must be the full path name.
     */
    public static byte[] readHDFSFile(String dst) throws IOException
    {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf))
        {
            // check that the file exists
            Path path = new Path(dst);
            if (!fs.exists(path))
            {
                throw new FileNotFoundException("the file is not found: " + dst);
            }

            // use the file length to size the buffer;
            // toIntExact throws if the file is too large for a single array
            FileStatus stat = fs.getFileStatus(path);
            byte[] buffer = new byte[Math.toIntExact(stat.getLen())];

            try (FSDataInputStream is = fs.open(path))
            {
                // read the whole file, starting at offset 0
                is.readFully(0, buffer);
            }

            return buffer;
        }
    }
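Sizing one buffer from the reported file length only works while the length fits in an int. An alternative, sketched below with plain java.io classes, is to read the stream in fixed-size chunks; this applies to any InputStream, and the stream returned by FileSystem.open() is one. The class name, method names, and the 4 KiB chunk size are choices made for this example, not something taken from the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: read a whole stream in fixed-size chunks instead of sizing one
// buffer from the reported file length.
class ReadAllSketch {

    /** Converts a file length to an array size, rejecting lengths that do not fit in an int. */
    static int bufferSizeFor(long fileLength) {
        return Math.toIntExact(fileLength); // throws ArithmeticException on overflow
    }

    /** Copies everything from in into a byte array using a 4 KiB working buffer. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        byte[] copy = readAll(new ByteArrayInputStream(data));
        System.out.println(copy.length);            // 10000
        System.out.println(bufferSizeFor(10_000L)); // 10000
    }
}
```

The result still has to fit in memory, so for very large files it is usually better to process each chunk as it arrives instead of collecting them all.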

<3>. Directory operations

3.1 Creating a directory

    /*
     * Create a new directory in HDFS.
     * dir may look like '/tmp/testdir'; missing parent directories are created as well.
     */
    public static void mkdir(String dir) throws IOException
    {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf))
        {
            fs.mkdirs(new Path(dir));
        }
    }

3.2 Deleting a directory

    /*
     * Delete a directory in HDFS.
     * dir may look like '/tmp/testdir'
     */
    public static void deleteDir(String dir) throws IOException
    {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf))
        {
            // recursive = true: remove the directory together with its contents
            fs.delete(new Path(dir), true);
        }
    }

3.3 Listing all files in a directory

    public static void listAll(String dir) throws IOException
    {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf))
        {
            FileStatus[] stats = fs.listStatus(new Path(dir));

            for (FileStatus stat : stats)
            {
                if (stat.isFile())
                {
                    // regular file
                    System.out.println(stat.getPath().toString());
                }
                else if (stat.isDirectory())
                {
                    // directory
                    System.out.println(stat.getPath().toString());
                }
                else if (stat.isSymlink())
                {
                    // symbolic link
                    System.out.println(stat.getPath().toString());
                }
            }
        }
    }

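<4>. References and code download

The code can be downloaded free of charge from http://linux.linuxidc.com/

The username and password are both www.linuxidc.com

The download directory is /pub/2011/11/12/使用Java API操作Hadoop文件/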
