Java NIO vs Non NIO Performance

I've spent considerable time attempting to optimize a file hashing algorithm to eke out every last drop of performance possible.

See my previous SO threads:

Get File Hash Performance/Optimization

FileChannel ByteBuffer and Hashing Files

Determining Appropriate Buffer Size

It was recommended several times that I use Java NIO to gain native performance increases (by keeping the buffers in the system instead of bringing them into the JVM). However, my NIO code runs considerably slower in benchmarks (hashing the same files over and over with each algorithm to negate any OS/drive "magic" that could be skewing results).
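For reference, the kind of repeated-run timing described above can be sketched roughly as follows. This is a minimal sketch, not my actual benchmark: the class name `HashBenchSketch`, the method `meanNanos`, and the warm-up/run counts are all hypothetical, and it only uses `System.nanoTime()` rather than a proper harness such as JMH.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not from the original post): times a task over repeated runs.
final class HashBenchSketch {

    /**
     * Calls the task `warmup` times without timing (to let the JIT and the
     * OS file cache settle), then returns the mean wall-clock time in
     * nanoseconds over `runs` timed calls.
     */
    static long meanNanos(Callable<?> task, int warmup, int runs) throws Exception {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.call();
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            total += System.nanoTime() - start;
        }
        return total / runs;
    }
}
```

Each hashing method would be wrapped in a lambda and passed in with the same file, algorithm, and buffer size, so that only the I/O strategy differs between the two measurements.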

I now have two methods that do the same thing:

This one runs faster almost every time:

/**
 * Gets Hash of file.
 * 
 * @param file String path + filename of file to get hash.
 * @param hashAlgo Hash algorithm to use. <br/>
 *     Supported algorithms are: <br/>
 *     MD2, MD5 <br/>
 *     SHA-1 <br/>
 *     SHA-256, SHA-384, SHA-512
 * @param BUFFER Buffer size in bytes. Recommended to stay in<br/>
 *          powers of 2 such as 1024, 2048, <br/>
 *          4096, 8192, 16384, 32768, 65536, etc.
 * @return String value of hash. (Variable length dependent on hash algorithm used)
 * @throws IOException If file is invalid.
 * @throws HasherException If no supported or valid hash algorithm was found.
 */
public String getHash(String file, String hashAlgo, int BUFFER) throws IOException, HasherException {
    try {
        MessageDigest md = MessageDigest.getInstance(validateHashType(hashAlgo));
        byte[] dataBytes = new byte[BUFFER];

        // try-with-resources closes the stream even if a read fails
        try (FileInputStream fis = new FileInputStream(file)) {
            int nread;
            while ((nread = fis.read(dataBytes)) != -1) {
                md.update(dataBytes, 0, nread);
            }
        }
        byte[] mdbytes = md.digest();

        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < mdbytes.length; i++) {
            // Zero-pad each byte to two hex digits; Integer.toHexString alone drops leading zeros
            hexString.append(String.format("%02x", mdbytes[i]));
        }

        return hexString.toString();

    } catch (NoSuchAlgorithmException | HasherException e) {
        throw new HasherException("Unsupported Hash Algorithm.", e);
    }
}

My Java NIO method that runs considerably slower most of the time:

/**
 * Gets Hash of file using java.nio File Channels and ByteBuffer 
 * <br/>for native system calls where possible. This may improve <br/>
 * performance in some circumstances.
 * 
 * @param fileStr String path + filename of file to get hash.
 * @param hashAlgo Hash algorithm to use. <br/>
 *     Supported algorithms are: <br/>
 *     MD2, MD5 <br/>
 *     SHA-1 <br/>
 *     SHA-256, SHA-384, SHA-512
 * @param BUFFER Buffer size in bytes. Recommended to stay in<br/>
 *          powers of 2 such as 1024, 2048, <br/>
 *          4096, 8192, 16384, 32768, 65536, etc.
 * @return String value of hash. (Variable length dependent on hash algorithm used)
 * @throws IOException If file is invalid.
 * @throws HasherException If no supported or valid hash algorithm was found.
 */
public String getHashNIO(String fileStr, String hashAlgo, int BUFFER) throws IOException, HasherException {

    File file = new File(fileStr);

    try {
        MessageDigest md = MessageDigest.getInstance(hashAlgo);
        ByteBuffer bbf = ByteBuffer.allocateDirect(BUFFER); // allocation in bytes - 1024, 2048, 4096, 8192

        // try-with-resources closes the stream (and its channel) even if a read fails
        try (FileInputStream fis = new FileInputStream(file);
             FileChannel fc = fis.getChannel()) {

            int b = fc.read(bbf);

            while ((b != -1) && (b != 0)) {
                bbf.flip();

                // Copy this chunk out of the direct buffer into a new heap array
                byte[] bytes = new byte[b];
                bbf.get(bytes);

                md.update(bytes, 0, b);

                bbf.clear();
                b = fc.read(bbf);
            }
        }

        byte[] mdbytes = md.digest();

        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < mdbytes.length; i++) {
            // Zero-pad each byte to two hex digits; Integer.toHexString alone drops leading zeros
            hexString.append(String.format("%02x", mdbytes[i]));
        }

        return hexString.toString();

    } catch (NoSuchAlgorithmException e) {
        throw new HasherException("Unsupported Hash Algorithm.", e);
    }
}

My thinking is that Java NIO uses native system calls to keep processing and storage (buffers) in the system and out of the JVM, which should (in theory) save the program from constantly shuffling data back and forth between the JVM and the system. In theory this should be faster... but perhaps my MessageDigest forces the JVM to bring the buffer in, negating any performance improvement the native buffers/system calls can bring? Am I correct in this logic, or am I way off?

Please help me understand why Java NIO is not better in this scenario.
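One way to probe the hypothesis above might be to hand the `ByteBuffer` straight to `MessageDigest.update(ByteBuffer)` instead of copying each chunk into a freshly allocated `byte[]`. The following is a minimal sketch under assumptions: the class name `NioDigestSketch` and the method `hashFile` are hypothetical, it throws `NoSuchAlgorithmException` directly rather than my `HasherException`, and whether the digest provider still copies a direct buffer into heap memory internally depends on the implementation, so it may or may not be faster.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical variant (not from the original post) that avoids the per-chunk byte[] allocation.
final class NioDigestSketch {

    static String hashFile(Path file, String hashAlgo, int bufferSize)
            throws IOException, NoSuchAlgorithmException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        MessageDigest md = MessageDigest.getInstance(hashAlgo);
        ByteBuffer buf = ByteBuffer.allocateDirect(bufferSize);

        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            // read() returns -1 once the end of the file has been reached
            while (fc.read(buf) != -1) {
                buf.flip();      // switch the buffer from filling to draining
                md.update(buf);  // digests the remaining bytes of the buffer
                buf.clear();     // make the whole buffer writable again
            }
        }

        // Zero-padded lowercase hex, two digits per digest byte
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Timing this against both methods above, with the same files and buffer sizes, would at least show how much of the gap comes from the extra allocation and copy in my loop.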


Source: https://stackoverflow.com/questions/16321299
Updated: 2023-06-26 18:06

Best answer

Please change the FilterExpression as mentioned below.

FilterExpression="attribute_not_exists(age) AND attribute_not_exists(address)",
