FileSystem Methods: Predicate Functions

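Today's topics

  • Obtaining the HDFS file system

  • Uploading a file to HDFS

  • Downloading a file from HDFS

  • Creating an HDFS directory

  • Deleting an HDFS folder

  • Renaming an HDFS file

  • Viewing HDFS file details

  • Reading a file from a given position

  • Studying the FileSystem class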

// Obtain the file system
@Test
public void initHDFS() throws Exception {
    // 1. Obtain the file system
    Configuration configuration = new Configuration();
    FileSystem fileSystem = FileSystem.get(configuration);
    // 2. Print the file system to the console
    System.out.println(fileSystem.toString());
}

// Upload a file
@Test
public void putFileToHdfs() throws Exception {
    Configuration conf = new Configuration();
    // Values set in code have the highest priority
    conf.set("dfs.replication", "2");
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
    // Upload the local file
    fileSystem.copyFromLocalFile(new Path("hdfs.txt"), new Path("/user/anna/hdfs/test.txt"));
    // Release resources
    fileSystem.close();
}

Parameter priority: values set in client code > user-defined configuration files on the classpath > the server's default configuration.
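The priority rule can be pictured as a lookup through layers, highest priority first. The sketch below is only an illustration in plain Java: the class ConfigPrioritySketch, the method resolve and the sample values for the classpath file and the server default are assumptions, and it does not use Hadoop's Configuration class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the three configuration layers; not Hadoop's Configuration class.
class ConfigPrioritySketch {

    /** Returns the value from the first (highest-priority) layer that defines the key, or null. */
    @SafeVarargs
    static String resolve(String key, Map<String, String>... layersHighToLow) {
        for (Map<String, String> layer : layersHighToLow) {
            String value = layer.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> clientCode = new LinkedHashMap<>();
        Map<String, String> classpathFile = new LinkedHashMap<>();
        Map<String, String> serverDefault = new LinkedHashMap<>();

        serverDefault.put("dfs.replication", "3");  // assumed server default
        classpathFile.put("dfs.replication", "1");  // assumed value in a user config file
        clientCode.put("dfs.replication", "2");     // value set in code, as in putFileToHdfs()

        // The client-code layer wins, so this prints 2
        System.out.println(resolve("dfs.replication", clientCode, classpathFile, serverDefault));
    }
}
```

Removing the entry from the client-code layer would make the classpath value win, and removing that as well would fall back to the server default.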

public void copyToLocalFile(boolean delSrc, Path src, Path dst, boolean useRawLocalFileSystem) throws IOException

  • delSrc - whether to delete the src

  • src - path

  • dst - path

  • useRawLocalFileSystem - whether to use RawLocalFileSystem as local file system or not.

// Download a file
@Test
public void testCopyToLocalFile() throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
    // Download the file; delSrc=false keeps the HDFS copy
    fileSystem.copyToLocalFile(false, new Path("/user/anna/hdfs/test.txt"), new Path("test.txt"), true);
    // Release resources
    fileSystem.close();
}

// Create a directory
@Test
public void testMakedir() throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
    // mkdirs also creates missing parent directories
    fileSystem.mkdirs(new Path("/user/anna/test/hahaha"));
    // Release resources
    fileSystem.close();
}

// Delete a folder
@Test
public void testDelete() throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
    // true means delete recursively
    fileSystem.delete(new Path("/user/anna/test/hahaha"), true);
    // Release resources
    fileSystem.close();
}

// Rename a file
@Test
public void testRename() throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
    // Rename the file
    fileSystem.rename(new Path("/user/anna/test/copy.txt"), new Path("/user/anna/test/copyRename.txt"));
    // Release resources
    fileSystem.close();
}

  • Today's topics

    • Studying the FileSystem class against the official documentation

    • Methods in FileSystem

       * boolean exists(Path p)
       * boolean isDirectory(Path p)
       * boolean isFile(Path p)
       * FileStatus getFileStatus(Path p)
       * Path getHomeDirectory()
       * FileStatus[] listStatus(Path path, PathFilter filter)
         FileStatus[] listStatus(Path path)
         FileStatus[] listStatus(Path[] paths, PathFilter filter)
         FileStatus[] listStatus(Path[] paths)
       * RemoteIterator[LocatedFileStatus] listLocatedStatus(Path path, PathFilter filter)
         RemoteIterator[LocatedFileStatus] listLocatedStatus(Path path)
         RemoteIterator[LocatedFileStatus] listFiles(Path path, boolean recursive)
       * BlockLocation[] getFileBlockLocations(FileStatus f, int s, int l)
         BlockLocation[] getFileBlockLocations(Path P, int S, int L)
       * long getDefaultBlockSize()
         long getDefaultBlockSize(Path p)
         long getBlockSize(Path p)
       * boolean mkdirs(Path p, FsPermission permission)
       * FSDataOutputStream create(Path, ...)
         FSDataOutputStream append(Path p, int bufferSize, Progressable progress)
         FSDataInputStream open(Path f, int bufferSize)
       * boolean delete(Path p, boolean recursive)
       * boolean rename(Path src, Path d)
       * void concat(Path p, Path sources[])
       * boolean truncate(Path p, long newLength)
       * interface RemoteIterator
         boolean hasNext()
         E next()
       * interface StreamCapabilities
         boolean hasCapability(capability)
      
  • Start the Hadoop cluster with start-dfs.sh

  • Access the HDFS file system from Eclipse

    • Import the required jar files
  • Create a connection to HDFS and obtain a FileSystem object

    • First way

       * public static FileSystem get(Configuration conf) throws IOException

         // Create the connection to HDFS
         Configuration conf = new Configuration();
         // IP address of the NameNode; the port is 9000
         conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
         // Obtain the FileSystem
         FileSystem fileSystem = FileSystem.get(conf);

    • Second way

       * public static FileSystem get(URI uri, Configuration conf, String user) throws IOException, InterruptedException

         URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
         // The user argument was lost in the source; "root" is inferred from the /user/root working directory noted below
         FileSystem fileSystem = FileSystem.get(new URI("hdfs://10.9.190.90:9000"), new Configuration(), "root");
         // The working directory now becomes /user/root
      
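    • Comparing the two ways

      • The second form declares InterruptedException in addition to IOException, so callers have to handle it. The official documentation also carries a caveat about get:

        • the static FileSystem get(URI uri, Configuration conf,String user) method MAY return a pre-existing instance of a filesystem client class, a class that may also be in use in other threads. The implementations of FileSystem shipped with Apache Hadoop do not make any attempt to synchronize access to the working directory field. (In other words, get may hand back a FileSystem object that already exists and is shared with other threads, so changing its working directory is not thread-safe. For this reason these notes prefer the first form for creating the FileSystem object.)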
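The caveat about pre-existing instances can be pictured with a small cache. The following is a plain-Java sketch under stated assumptions: the classes FsCacheSketch, Key and Client are hypothetical, and keying the cache by scheme, authority and user is an assumption made for illustration rather than a description of Hadoop's internals.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of a client cache; class and field names are illustrative, not Hadoop's.
class FsCacheSketch {

    /** Cache key: which cluster and which user (assumed for this sketch). */
    static final class Key {
        final String scheme;
        final String authority;
        final String user;

        Key(String scheme, String authority, String user) {
            this.scheme = scheme;
            this.authority = authority;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key k = (Key) o;
            return scheme.equals(k.scheme) && authority.equals(k.authority) && user.equals(k.user);
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, authority, user);
        }
    }

    /** Stand-in for a file system client that carries a mutable working directory. */
    static final class Client {
        volatile String workingDirectory;

        Client(String user) {
            this.workingDirectory = "/user/" + user;
        }
    }

    private static final Map<Key, Client> CACHE = new ConcurrentHashMap<>();

    /** Returns the cached client for the key, creating it on first use. */
    static Client get(String scheme, String authority, String user) {
        return CACHE.computeIfAbsent(new Key(scheme, authority, user), k -> new Client(k.user));
    }

    public static void main(String[] args) {
        Client a = get("hdfs", "10.9.190.90:9000", "root");
        Client b = get("hdfs", "10.9.190.90:9000", "root");
        Client c = get("hdfs", "10.9.190.90:9000", "anna");

        System.out.println(a == b);  // true: same key, same shared instance
        System.out.println(a == c);  // false: different user, different instance

        b.workingDirectory = "/tmp";
        System.out.println(a.workingDirectory);  // /tmp: the change made through b is visible through a
    }
}
```

The last line is the practical consequence: two callers holding the "same" client can silently change each other's working directory, which is why absolute paths are the safer choice in shared code.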


Working with HDFS through the Java API:

Introduction to org.apache.hadoop.fs.FileSystem

  • The abstract FileSystem class is the original class to access Hadoop filesystems; non-abstract subclasses exist for all Hadoop-supported filesystems. (The abstract base class FileSystem defines the operations available on a Hadoop file system.)

  • All operations that take a Path to this interface MUST support relative paths. In such a case, they must be resolved relative to the working directory defined by setWorkingDirectory(). (setWorkingDirectory() sets the working directory against which relative paths are resolved.)

    • getWorkingDirectory() in FileSystem returns the current working directory

    • Code

       // Connect to the HDFS file system
       Configuration conf = new Configuration();
       conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
       // Obtain the file system object
       FileSystem fileSystem = FileSystem.get(conf);
       // Print the current working directory
       System.out.println("=========Current working directory=============");
       System.out.println(fileSystem.getWorkingDirectory());
       // Set a new working directory; Path plays the same role in HDFS as File does locally: it represents a path
       fileSystem.setWorkingDirectory(new Path("hdfs://10.9.190.90:9000/user/anna"));
       System.out.println("=========Working directory after setting=============");
       System.out.println(fileSystem.getWorkingDirectory());

    • Result

       =========Current working directory=============
       hdfs://10.9.190.90:9000/user/root
       =========Working directory after setting=============
       hdfs://10.9.190.90:9000/user/anna
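How a relative path is resolved against the working directory can be illustrated without a cluster. This is a sketch using java.net.URI from the JDK as a stand-in for Hadoop's Path; the class WorkingDirSketch, its resolve helper and the sample file names are assumptions made for illustration.

```java
import java.net.URI;

// Illustration only: java.net.URI stands in for Hadoop's Path resolution.
class WorkingDirSketch {

    /** Resolves path against workingDir; an absolute path ignores the working directory's path part. */
    static URI resolve(URI workingDir, String path) {
        // URI.resolve only appends to the base when the base ends with "/", so normalise it first.
        String base = workingDir.toString();
        URI dir = URI.create(base.endsWith("/") ? base : base + "/");
        return dir.resolve(path);
    }

    public static void main(String[] args) {
        URI workingDir = URI.create("hdfs://10.9.190.90:9000/user/anna");

        // Relative path: appended to the working directory
        // prints hdfs://10.9.190.90:9000/user/anna/notes/a.txt
        System.out.println(resolve(workingDir, "notes/a.txt"));

        // Absolute path: only scheme and authority are taken from the working directory
        // prints hdfs://10.9.190.90:9000/test/hello.txt
        System.out.println(resolve(workingDir, "/test/hello.txt"));
    }
}
```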
      


package hdfs;

import static org.junit.Assert.fail;

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.junit.Test;

public class TestHdfs {

    @Test
    public void test() {
        fail("Not yet implemented");
    }

    // Upload a local file to HDFS
    @Test
    public void testUpload() throws Exception {
        Configuration conf = new Configuration();
        // Backslashes in Windows paths must be escaped inside Java string literals
        conf.addResource(new Path("D:\\myeclipse\\Hadoop\\hadoopEx\\src\\conf\\hadoop.xml"));

        FileSystem hdfs = FileSystem.get(conf);
        Path src = new Path("F:\\lzp\\T.txt");
        Path dst = new Path("/");
        hdfs.copyFromLocalFile(src, dst);

        // fs.default.name is the older, deprecated name of fs.defaultFS
        System.out.println("Upload to " + conf.get("fs.default.name"));
        FileStatus[] files = hdfs.listStatus(dst);
        for (FileStatus file : files) {
            System.out.println(file.getPath());
        }
    }

    // Create a file on HDFS
    @Test
    public void testCreate() throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("D:\\myeclipse\\Hadoop\\hadoopEx\\src\\conf\\hadoop.xml"));

        byte[] buff = "hello world!".getBytes();

        FileSystem hdfs = FileSystem.get(conf);
        Path dst = new Path("/test");
        FSDataOutputStream outputStream = null;
        try {
            outputStream = hdfs.create(dst);
            outputStream.write(buff, 0, buff.length);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (outputStream != null) {
                outputStream.close();
            }
        }

        FileStatus[] files = hdfs.listStatus(dst);
        for (FileStatus file : files) {
            System.out.println(file.getPath());
        }
    }

    // Rename a file on HDFS
    @Test
    public void testRename() throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("D:\\myeclipse\\Hadoop\\hadoopEx\\src\\conf\\hadoop.xml"));

        FileSystem hdfs = FileSystem.get(conf);
        Path dst = new Path("/");

        Path frpath = new Path("/test");
        Path topath = new Path("/test1");

        hdfs.rename(frpath, topath);

        FileStatus[] files = hdfs.listStatus(dst);
        for (FileStatus file : files) {
            System.out.println(file.getPath());
        }
    }

    // Delete a file on HDFS
    @Test
    public void testDel() throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("D:\\myeclipse\\Hadoop\\hadoopEx\\src\\conf\\hadoop.xml"));

        FileSystem hdfs = FileSystem.get(conf);
        Path dst = new Path("/");

        Path topath = new Path("/test1");

        // false: do not delete recursively
        boolean ok = hdfs.delete(topath, false);
        System.out.println(ok ? "Delete succeeded" : "Delete failed");

        FileStatus[] files = hdfs.listStatus(dst);
        for (FileStatus file : files) {
            System.out.println(file.getPath());
        }
    }

    // Show the last modification time of files on HDFS
    @Test
    public void testgetModifyTime() throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("D:\\myeclipse\\Hadoop\\hadoopEx\\src\\conf\\hadoop.xml"));

        FileSystem hdfs = FileSystem.get(conf);
        Path dst = new Path("/");

        FileStatus[] files = hdfs.listStatus(dst);
        for (FileStatus file : files) {
            System.out.println(file.getPath() + "\t" + file.getModificationTime());
        }
    }

    // Check whether a file exists on HDFS
    @Test
    public void testExists() throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("D:\\myeclipse\\Hadoop\\hadoopEx\\src\\conf\\hadoop.xml"));

        FileSystem hdfs = FileSystem.get(conf);
        Path dst = new Path("/T.txt");

        boolean ok = hdfs.exists(dst);
        System.out.println(ok ? "File exists" : "File does not exist");
    }

    // Show where a file's blocks are located in the HDFS cluster
    @Test
    public void testFileBlockLocation() throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("D:\\myeclipse\\Hadoop\\hadoopEx\\src\\conf\\hadoop.xml"));

        FileSystem hdfs = FileSystem.get(conf);
        Path dst = new Path("/T.txt");

        FileStatus fileStatus = hdfs.getFileStatus(dst);
        BlockLocation[] blockLocations = hdfs.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
        for (BlockLocation block : blockLocations) {
            System.out.println(Arrays.toString(block.getHosts()) + "\t" + Arrays.toString(block.getNames()));
        }
    }

    // List the names of all DataNodes in the HDFS cluster
    @Test
    public void testGetHostName() throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("D:\\myeclipse\\Hadoop\\hadoopEx\\src\\conf\\hadoop.xml"));

        // The cast only succeeds when the configured default file system is HDFS
        DistributedFileSystem hdfs = (DistributedFileSystem) FileSystem.get(conf);
        DatanodeInfo[] dataNodeStats = hdfs.getDataNodeStats();

        for (DatanodeInfo dataNode : dataNodeStats) {
            System.out.println(dataNode.getHostName() + "\t" + dataNode.getName());
        }
    }

}

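FileSystem methods: predicate functions

  • Background

    import org.apache.hadoop.fs.Path; Path is analogous to java.io.File and represents a file path in HDFS.

  1. Methods

    • public boolean exists(Path f) throws IOException

      • Checks whether the path exists
    • public boolean isDirectory(Path f) throws IOException

      • Checks whether the path is a directory
    • public boolean isFile(Path f) throws IOException

      • Checks whether the path is a file
  2. Exercise

     try {
         // Connect to the HDFS file system
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
         // Obtain the file system object
         FileSystem fileSystem = FileSystem.get(conf);
         // The path argument was lost in the source; /test is assumed here
         // (it must be a directory for the results noted below to hold)
         Path path = new Path("/test");
         // Does the path exist?
         System.out.println(fileSystem.exists(path));       // true
         // Is it a directory?
         System.out.println(fileSystem.isDirectory(path));  // true
         // Is it a file?
         System.out.println(fileSystem.isFile(path));       // false
     } catch (Exception e) {
         e.printStackTrace();
     }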
    


FileSystem methods: retrieval (file information)

  1. Methods

    • public abstract FileStatus getFileStatus(Path f) throws IOException

      • Return a file status object that represents the path.

      • The return type is FileStatus

    • public Path getHomeDirectory()

      • Return the current user's home directory in this FileSystem. The default implementation returns "/user/$USER/".

      • Returns the current user's home directory

  2. Exercise

     try {
         // Connect to the HDFS file system
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
         // Obtain the file system object
         FileSystem fileSystem = FileSystem.get(conf);
         // Get the current user's home directory
         System.out.println("========Current user's home directory============");
         Path path = fileSystem.getHomeDirectory();
         System.out.println(path);
         // Get the file status object
         System.out.println("============File information===============");
         FileStatus status = fileSystem.getFileStatus(new Path("/eclipse"));
         System.out.println("Path : " + status.getPath());
         System.out.println("isFile ? " + status.isFile());
         System.out.println("Block size : " + status.getBlockSize());
         System.out.println("Permissions : " + status.getPermission());
         System.out.println("Replication : " + status.getReplication());
         System.out.println("isSymlink : " + status.isSymlink());
     } catch (Exception e) {
         e.printStackTrace();
     }
     /*
      * Output on JDK 1.8:
      * ------------------------------------------------
      * ========Current user's home directory============
      * hdfs://10.9.190.90:9000/user/anna
      * ============File information===============
      * Path : hdfs://10.9.190.90:9000/eclipse
      * isFile ? true
      * Block size : 134217728
      * Permissions : rw-r--r--
      * Replication : 3
      * isSymlink : false
      * ------------------------------------------------
      */

  3. Commonly used FileStatus methods

    • public Path getPath()

    • public boolean isFile()

    • public boolean isSymlink()

    • public long getBlockSize()

    • public short getReplication()

    • public FsPermission getPermission()
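The permission printed by getPermission() appears in symbolic form such as rw-r--r--. As a reading aid, the sketch below converts that nine-character form to the familiar octal notation. It is plain Java written for illustration; the class PermissionSketch and the method toOctal are hypothetical helpers, not part of FsPermission.

```java
// Hypothetical helper for reading symbolic permissions; not part of Hadoop's FsPermission.
class PermissionSketch {

    /** Converts a 9-character symbolic permission such as "rw-r--r--" to octal digits such as "644". */
    static String toOctal(String symbolic) {
        if (symbolic.length() != 9) {
            throw new IllegalArgumentException("expected 9 characters, got: " + symbolic);
        }
        StringBuilder octal = new StringBuilder();
        // Three groups of three characters: owner, group, others
        for (int group = 0; group < 3; group++) {
            String bits = symbolic.substring(group * 3, group * 3 + 3);
            int value = 0;
            if (bits.charAt(0) == 'r') value += 4;
            if (bits.charAt(1) == 'w') value += 2;
            if (bits.charAt(2) == 'x') value += 1;
            octal.append(value);
        }
        return octal.toString();
    }

    public static void main(String[] args) {
        System.out.println(toOctal("rw-r--r--"));  // 644
        System.out.println(toOctal("rwxr-xr-x"));  // 755
    }
}
```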


FileSystem methods: retrieval (directory traversal 1)

  1. Methods

    • public abstract FileStatus[] listStatus(Path f) throws FileNotFoundException, IOException

      • Returns an array of FileStatus
    • public FileStatus[] listStatus(Path f, PathFilter filter) throws FileNotFoundException, IOException

    • public FileStatus[] listStatus(Path[] files, PathFilter filter) throws FileNotFoundException, IOException

      • PathFilter is an interface with a single method, accept, which decides whether a given path is included in the result

      • Enumerate all files found in the list of directories passed in, calling listStatus(path, filter) on each one.

    • Note: in the runs shown here the entries came back ordered by name. The FileSystem API itself does not promise a sorted result, so code should not depend on the order.

  2. Exercise 1: using FileStatus[] listStatus(Path f)

     // Using FileStatus[] listStatus(Path f)
     try {
         // Connect to HDFS
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
         // Obtain the FileSystem
         FileSystem fileSystem = FileSystem.get(conf);
         // List the contents of the /test directory
         FileStatus[] fileStatuses = fileSystem.listStatus(new Path("/test"));
         // Print each entry; the yyyy-MM-dd dates in the output below suggest Date is java.sql.Date
         for (FileStatus fileStatus : fileStatuses) {
             System.out.println(fileStatus.getPath() + " " + new Date(fileStatus.getAccessTime()) + " "
                     + fileStatus.getBlockSize() + " " + fileStatus.getPermission());
         }
     } catch (Exception e) {
         e.printStackTrace();
     }
     /*
      * Output on JDK 1.8:
      * ----------------------------------------------------------------------------
      * hdfs://10.9.190.90:9000/test/hadoop-2.7.3.tar.gz 2012-07-26 134217728 rw-r--r--
      * hdfs://10.9.190.90:9000/test/hello.txt 2012-07-26 134217728 rw-r--r--
      * hdfs://10.9.190.90:9000/test/test2 1970-01-01 0 rwxr-xr-x
      * ----------------------------------------------------------------------------
      */

  3. Exercise 2: using FileStatus[] listStatus(Path f, PathFilter filter)

    • Task: list the files under /test/test2 whose names end in .md

    • Code:

       try {
           // Connect to HDFS
           Configuration conf = new Configuration();
           conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
           // Obtain the FileSystem
           FileSystem fileSystem = FileSystem.get(conf);
           // List the entries in the directory whose names end in .md
           FileStatus[] statuses = fileSystem.listStatus(new Path("/test/test2"), new PathFilter() {
               @Override
               public boolean accept(Path path) {
                   return path.toString().endsWith(".md");
               }
           });
           // Print the file information
           for (FileStatus status : statuses) {
               System.out.println("Path : " + status.getPath() + " Permission : " + status.getPermission()
                       + " Replication : " + status.getReplication());
           }
       } catch (Exception e) {
           e.printStackTrace();
       }

  4. Caveat

    • By the time the listStatus() operation returns to the caller, there is no guarantee that the information contained in the response is current. The details MAY be out of date, including the contents of any directory, the attributes of any files, and the existence of the path supplied.
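The filtering role of PathFilter described in this section can be tried without a cluster. The sketch below is plain Java: NameFilter is a local stand-in with the same single-method shape as PathFilter, and the class PathFilterSketch, the method list and the in-memory entries are assumptions made for illustration (the file names are taken from the listings in these notes).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a local stand-in with the same single-method shape as Hadoop's PathFilter.
class PathFilterSketch {

    interface NameFilter {
        boolean accept(String path);
    }

    /** Returns the entries accepted by the filter, preserving their original order. */
    static List<String> list(List<String> entries, NameFilter filter) {
        List<String> accepted = new ArrayList<>();
        for (String entry : entries) {
            if (filter.accept(entry)) {
                accepted.add(entry);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
                "/test/test2/Map.md", "/test/test2/biji.md", "/test/test2/haha.txt");

        // Keep only the entries whose names end in .md
        // prints [/test/test2/Map.md, /test/test2/biji.md]
        System.out.println(list(entries, path -> path.endsWith(".md")));
    }
}
```

Because the filter is a single-method interface, a lambda can replace the anonymous class on Java 8 and later.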


FileSystem methods: retrieval (directory traversal 2)

  1. Methods

    • public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f) throws FileNotFoundException, IOException

    • protected org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f, PathFilter filter) throws FileNotFoundException, IOException

      • Note: this method is protected, so it is accessible only from the class itself, from the same package, and from subclasses in other packages
    • Note: LocatedFileStatus is a subclass of FileStatus

  2. Usage

     try {
         // Connect to HDFS
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
         // Obtain the FileSystem
         FileSystem fileSystem = FileSystem.get(conf);
         // List every entry under /test/test2 (no filter is applied here)
         RemoteIterator<LocatedFileStatus> iterator = fileSystem.listLocatedStatus(new Path("/test/test2"));
         while (iterator.hasNext()) {
             LocatedFileStatus status = iterator.next();
             System.out.println("Path : " + status.getPath() + " Permission : " + status.getPermission()
                     + " Replication : " + status.getReplication());
         }
     } catch (Exception e) {
         e.printStackTrace();
     }
     /*
      * Output on JDK 1.8:
      * ---------------------------------------------------------------------------------------------
      * Path : hdfs://10.9.190.90:9000/test/test2/Map.md Permission : rw-r--r-- Replication : 3
      * Path : hdfs://10.9.190.90:9000/test/test2/biji.md Permission : rw-r--r-- Replication : 3
      * Path : hdfs://10.9.190.90:9000/test/test2/haha.txt Permission : rw-r--r-- Replication : 3
      * ---------------------------------------------------------------------------------------------
      */

  3. Differences from listStatus

    • listStatus returns a FileStatus[] array, which can be traversed with a for-each loop

    • listLocatedStatus returns a RemoteIterator<LocatedFileStatus>, which is traversed with hasNext() and next()

    • Note, however, that the default listLocatedStatus() implementation in the base FileSystem class is still built on listStatus internally


FileSystem methods: retrieval (directory traversal 3)

  1. Methods

    • public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, boolean recursive) throws FileNotFoundException, IOException
      • With recursive set to true, lists the files in the directory and in all of its subdirectories; only files are returned, not the directories themselves
  2. Usage

     try {
         // Connect to HDFS
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
         // Obtain the FileSystem
         FileSystem fileSystem = FileSystem.get(conf);
         // Recursively list every file under /test
         RemoteIterator<LocatedFileStatus> iterator = fileSystem.listFiles(new Path("/test"), true);
         while (iterator.hasNext()) {
             LocatedFileStatus status = iterator.next();
             System.out.println("Path : " + status.getPath() + " Permission : " + status.getPermission()
                     + " Replication : " + status.getReplication());
         }
     } catch (Exception e) {
         e.printStackTrace();
     }
     /*
      * Output on JDK 1.8:
      * ---------------------------------------------------------------------------------------------------
      * Path : hdfs://10.9.190.90:9000/test/hadoop-2.7.3.tar.gz Permission : rw-r--r-- Replication : 3
      * Path : hdfs://10.9.190.90:9000/test/hello.txt Permission : rw-r--r-- Replication : 3
      * Path : hdfs://10.9.190.90:9000/test/test2/Map.md Permission : rw-r--r-- Replication : 3
      * Path : hdfs://10.9.190.90:9000/test/test2/biji.md Permission : rw-r--r-- Replication : 3
      * Path : hdfs://10.9.190.90:9000/test/test2/haha.txt Permission : rw-r--r-- Replication : 3
      * ---------------------------------------------------------------------------------------------------
      */
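The effect of the recursive flag can be reproduced on an in-memory tree. The sketch below is plain Java and only models the behaviour: the class ListFilesSketch and its TREE map are hypothetical, and the directory contents are copied from the listings shown in these notes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model of a recursive file listing; not Hadoop code.
class ListFilesSketch {

    /** Directory path -> child entries. Any entry that is not itself a key is treated as a file. */
    static final Map<String, List<String>> TREE = new LinkedHashMap<>();

    static {
        TREE.put("/test", Arrays.asList("/test/hadoop-2.7.3.tar.gz", "/test/hello.txt", "/test/test2"));
        TREE.put("/test/test2", Arrays.asList(
                "/test/test2/Map.md", "/test/test2/biji.md", "/test/test2/haha.txt"));
    }

    /** Returns files only; subdirectories are descended into when recursive is true, skipped otherwise. */
    static List<String> listFiles(String dir, boolean recursive) {
        List<String> files = new ArrayList<>();
        for (String child : TREE.getOrDefault(dir, Collections.emptyList())) {
            if (TREE.containsKey(child)) {
                if (recursive) {
                    files.addAll(listFiles(child, true));
                }
            } else {
                files.add(child);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(listFiles("/test", true).size());   // 5: files from /test and /test/test2
        System.out.println(listFiles("/test", false).size());  // 2: only the files directly in /test
    }
}
```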
    


FileSystem methods: retrieval (block locations of a file)

  1. Methods

    • public BlockLocation[] getFileBlockLocations(Path p, long start, long len) throws IOException

    • public BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len) throws IOException

  2. Usage

     // Show where the blocks of /test/hadoop are stored
     try {
         // Connect to HDFS
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
         // Obtain the FileSystem
         FileSystem fileSystem = FileSystem.get(conf);
         FileStatus status = fileSystem.getFileStatus(new Path("/test/hadoop"));
         BlockLocation[] locations = fileSystem.getFileBlockLocations(status, 0, status.getLen());
         for (BlockLocation location : locations) {
             // getHosts() and getNames() return String[]; Arrays.toString prints their contents
             System.out.println("host : " + Arrays.toString(location.getHosts())
                     + " name : " + Arrays.toString(location.getNames())
                     + " length : " + location.getLength());
         }
     } catch (Exception e) {
         e.printStackTrace();
     }
     /*
      * Output on JDK 1.8 from the original run, which concatenated the arrays directly and
      * therefore printed array identity strings instead of the host and name lists:
      * ------------------------------------------------------------------------------
      * host : [Ljava.lang.String;@18ece7f4 name : [Ljava.lang.String;@3cce57c7 length : 134217728
      * host : [Ljava.lang.String;@1cf56a1c name : [Ljava.lang.String;@33f676f6 length : 79874467
      * ------------------------------------------------------------------------------
      */
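The two length values in the output follow from simple arithmetic: a file is cut into full blocks plus one shorter final block. The sketch below is plain Java; the class BlockSplitSketch is hypothetical, and the file length 214092195 is an assumption obtained by adding the two block lengths printed above (134217728 + 79874467).

```java
import java.util.ArrayList;
import java.util.List;

// Arithmetic illustration of how a file length maps onto fixed-size blocks; not Hadoop code.
class BlockSplitSketch {

    /** Returns the length of each block for a file of fileLength bytes and the given block size. */
    static List<Long> blockLengths(long fileLength, long blockSize) {
        List<Long> lengths = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // Every block is full except possibly the last one
            lengths.add(Math.min(blockSize, fileLength - offset));
        }
        return lengths;
    }

    public static void main(String[] args) {
        long blockSize = 134217728L;   // 128 MB, the block size shown in the listings
        long fileLength = 214092195L;  // assumed: sum of the two block lengths in the output

        // prints [134217728, 79874467]
        System.out.println(blockLengths(fileLength, blockSize));
    }
}
```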
    


FileSystem methods: retrieval (output stream to a file)

  1. Methods

    • public FSDataOutputStream create(Path f) throws IOException

    • public FSDataOutputStream create(Path f, boolean overwrite) throws IOException

      • overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an exception will be thrown.
    • public FSDataOutputStream create(Path f, Progressable progress) throws IOException

      • Create an FSDataOutputStream at the indicated Path with write-progress reporting. Files are overwritten by default.
    • public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize) throws IOException

    • public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, Progressable progress) throws IOException

    • FSDataOutputStream append(Path p, int bufferSize, Progressable progress)

  2. Usage: upload the local file E:/hzy.jpg to /1.jpg on HDFS

     public static void main(String[] args) {
         // Connect to HDFS
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
         File file = new File("E:/hzy.jpg");
         final long fileSize = file.length();
         // Bytes written so far; updated by the copy loop and read by the progress callback
         final AtomicLong copied = new AtomicLong();
         // try-with-resources closes each resource exactly once, even when an exception is thrown
         try (FileSystem fileSystem = FileSystem.get(conf);
              // Input stream for the local file
              BufferedInputStream in = new BufferedInputStream(new FileInputStream(file));
              // Output stream for /1.jpg on HDFS
              FSDataOutputStream out = fileSystem.create(new Path("/1.jpg"), new Progressable() {
                  @Override
                  public void progress() {
                      // Multiply before dividing: (copied / fileSize) * 100 in long arithmetic stays 0 until the end
                      long percent = fileSize == 0 ? 100 : copied.get() * 100 / fileSize;
                      System.out.println("Overall progress: " + percent + " %");
                  }
              })) {
             // Copy; IOUtils.copyBytes(in, out, conf) could be used here instead
             byte[] buffer = new byte[1024 * 3];
             int len;
             while ((len = in.read(buffer)) != -1) {
                 out.write(buffer, 0, len);
                 copied.addAndGet(len);
             }
         } catch (IOException e) {
             e.printStackTrace();
         }
     }
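When progress is reported as a percentage from two long counters, the order of multiplication and division matters. The sketch below is plain Java with hypothetical names (ProgressSketch, naivePercent, percent) and shows why (done / total) * 100 reports 0 for almost the whole transfer.

```java
// Illustration of an integer-division pitfall in percentage reporting; names are hypothetical.
class ProgressSketch {

    /** Divides first: long division truncates to 0 until done reaches total. */
    static long naivePercent(long done, long total) {
        return (done / total) * 100;
    }

    /** Multiplies first, so intermediate progress is preserved; an empty file counts as complete. */
    static long percent(long done, long total) {
        return total == 0 ? 100 : done * 100 / total;
    }

    public static void main(String[] args) {
        System.out.println(naivePercent(50, 200));  // 0
        System.out.println(percent(50, 200));       // 25
        System.out.println(percent(200, 200));      // 100
    }
}
```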


FileSystem methods: retrieval (input stream from a file, reading a file)

  1. Methods

    • public FSDataInputStream open(Path f) throws IOException

    • public abstract FSDataInputStream open(Path f, int bufferSize) throws IOException

  2. Usage: copy 1.jpg from HDFS to the local file E:/hzyCopy.jpg

     // Connect to HDFS
     Configuration conf = new Configuration();
     conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
     // try-with-resources closes each resource exactly once, even when an exception is thrown
     try (FileSystem fileSystem = FileSystem.get(conf);
          // Input stream for the HDFS file
          FSDataInputStream in = fileSystem.open(new Path("/1.jpg"));
          // Output stream for the local file
          BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(new File("E:/hzyCopy.jpg")))) {
         int len;
         byte[] bArr = new byte[1024 * 3];
         // read returns -1 at end of stream
         while ((len = in.read(bArr)) != -1) {
             out.write(bArr, 0, len);
         }
     } catch (Exception e) {
         e.printStackTrace();
     }
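The read/write loop used for the download is the standard stream-copy pattern and can be exercised with in-memory streams. The sketch below is plain Java; the class CopyLoopSketch and its copy method are hypothetical, and ByteArrayInputStream / ByteArrayOutputStream stand in for the HDFS and local file streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

// Stream-copy pattern with in-memory streams standing in for the HDFS and local file streams.
class CopyLoopSketch {

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int len;
        // read returns the number of bytes placed in the buffer, or -1 at end of stream
        while ((len = in.read(buffer)) != -1) {
            // Write only the bytes actually read, not the whole buffer
            out.write(buffer, 0, len);
            total += len;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        Arrays.fill(data, (byte) 7);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out, 1024 * 3);

        System.out.println(copied);                                  // 10000
        System.out.println(Arrays.equals(data, out.toByteArray()));  // true
    }
}
```

Writing buffer, 0, len rather than the whole buffer matters on the last iteration, when the buffer is usually only partly filled.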

FileSystem methods: creation

  • public boolean mkdirs(Path f) throws IOException

FileSystem methods: deletion

  • public abstract boolean delete(Path f, boolean recursive) throws IOException

  • Involves thread-synchronization concerns

FileSystem methods: renaming

  • public abstract boolean rename(Path src, Path dst) throws IOException

Other FileSystem methods

  • public void concat(Path trg, Path[] psrcs) throws IOException

    • Concat existing files together.
  • public boolean truncate(Path f, long newLength) throws IOException


interface RemoteIterator

  1. Definition

     public interface RemoteIterator<E> {
         boolean hasNext() throws IOException;
         E next() throws IOException;
     }

    • The primary use of RemoteIterator in the filesystem APIs is to list files on (possibly remote) filesystems.
  2. Usage

     // listLocatedStatus(Path f)
     public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f) throws FileNotFoundException, IOException

     // listLocatedStatus(Path f, PathFilter filter)
     protected org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f, PathFilter filter) throws FileNotFoundException, IOException

     // listStatusIterator(Path p)
     public org.apache.hadoop.fs.RemoteIterator<FileStatus> listStatusIterator(Path p) throws FileNotFoundException, IOException

     // listFiles(Path f, boolean recursive)
     public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, boolean recursive) throws FileNotFoundException, IOException
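Unlike java.util.Iterator, both RemoteIterator methods may throw IOException, so it cannot be used in a for-each loop and callers drain it explicitly. The sketch below is plain Java: it declares a local interface with the same shape instead of importing Hadoop's, and the class RemoteIteratorSketch with its helpers fromList and drain is hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustration only: a local interface with the same shape as RemoteIterator, not the Hadoop type.
class RemoteIteratorSketch {

    interface RemoteIterator<E> {
        boolean hasNext() throws IOException;

        E next() throws IOException;
    }

    /** Wraps an in-memory list so that it can be consumed through the RemoteIterator shape. */
    static <E> RemoteIterator<E> fromList(List<E> items) {
        final Iterator<E> it = items.iterator();
        return new RemoteIterator<E>() {
            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public E next() {
                return it.next();
            }
        };
    }

    /** Drains the iterator into a list; on a real remote listing each call could fail with IOException. */
    static <E> List<E> drain(RemoteIterator<E> iterator) throws IOException {
        List<E> result = new ArrayList<>();
        while (iterator.hasNext()) {
            result.add(iterator.next());
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        RemoteIterator<String> iterator = fromList(Arrays.asList("/test/hello.txt", "/test/test2/Map.md"));
        // prints [/test/hello.txt, /test/test2/Map.md]
        System.out.println(drain(iterator));
    }
}
```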
    


interface StreamCapabilities

  1. Methods

     public interface StreamCapabilities {
         boolean hasCapability(String capability);
     }

  2. Usage

     This interface is not present in Hadoop 2.7.3; it is available in 2.9.1.
    

static final String DIR = "/d1";

static final String FILE = "/d1/hello";

public static void main(String[] args) throws IOException, URISyntaxException {

// When user code operates on HDFS, it does so by calling a concrete subclass of FileSystem directly

FileSystem fileSystem = getFileSystem();

// Create a directory: hadoop fs -mkdir
