Closed
Description
We encountered a strange problem: we could not look up a value in a Hash with a key returned by Zstd.decompress, even though that key compares equal to the multibyte string literal used as the Hash key.
I found that the problem reproduces only when the compressed data has no Frame_Content_Size information, that is, when decompress_buffered is used.
I'm not sure whether this is a bug in Ruby or in zstd-ruby.
Here is the code to reproduce it:
require 'zstd-ruby'
# This constant was generated by the following Java code
# and the compressed data doesn't have Frame_Content_Size:
#
# import com.github.luben.zstd.RecyclingBufferPool;
# import com.github.luben.zstd.ZstdOutputStreamNoFinalizer;
# import org.apache.kafka.common.utils.ByteBufferOutputStream;
# import javax.xml.bind.DatatypeConverter;
#
# import java.io.BufferedOutputStream;
# import java.io.DataOutputStream;
# import java.io.IOException;
# import java.nio.charset.StandardCharsets;
#
# class Main {
#     public static void main(String[] args) throws IOException {
#         ByteBufferOutputStream buffer = new ByteBufferOutputStream(10);
#         DataOutputStream stream = new DataOutputStream(new BufferedOutputStream(new ZstdOutputStreamNoFinalizer(buffer, RecyclingBufferPool.INSTANCE), 16 * 1024));
#         stream.write("あ".getBytes(StandardCharsets.UTF_8));
#         stream.close();
#
#         System.out.println(DatatypeConverter.printHexBinary(buffer.buffer().array()));
#     }
# }
COMPRESSED_DATA_HEX = '28B52FFD0058180000E38182010000'
data = Zstd.decompress([COMPRESSED_DATA_HEX].pack('H*')).force_encoding('UTF-8')
expected_data = 'あ'
puts <<~MSG
  RUBY_VERSION: #{RUBY_VERSION}
  data: #{data}
  data == expected_data: #{data == expected_data}
  data.equal?(expected_data): #{data.equal?(expected_data)}
  data.hash: #{data.hash}
  expected_datadata.hash: #{expected_data.hash}
  { expected_data => 1 }.has_key?(data): #{{ expected_data => 1 }.has_key?(data)}
MSG
Here is the output:
RUBY_VERSION: 3.3.1
data: あ
data == expected_data: true
data.equal?(expected_data): false
data.hash: 3328309050837243483
expected_datadata.hash: 3486244608461787623
{ expected_data => 1 }.has_key?(data): false
As you can see, { expected_data => 1 }.has_key?(data) is false even though data == expected_data is true. The two strings also report different hash values.
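For context, a Hash lookup relies on `hash` and `eql?` rather than on `==` alone, so two keys that compare equal but return different `hash` values will not find each other. The sketch below illustrates only that general contract. `BrokenKey` is a hypothetical class made up for this example and has nothing to do with zstd-ruby or with the actual cause of this issue.

```ruby
# Hypothetical class for illustration: equality is overridden, but `hash`
# is left at the default identity-based Object#hash implementation.
class BrokenKey
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def ==(other)
    other.is_a?(BrokenKey) && value == other.value
  end
  alias eql? ==
end

a = BrokenKey.new('あ')
b = BrokenKey.new('あ')

puts "a == b: #{a == b}"
puts "a.hash == b.hash: #{a.hash == b.hash}"
puts "{ a => 1 }.key?(b): #{{ a => 1 }.key?(b)}"
```

This mirrors the symptom above: equality holds, but the lookup fails because the hash values disagree. For String, both `==` and `hash` are built in, so a mismatch between them is unexpected.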
In Ruby 3.2.2, the same script produces the expected result:
RUBY_VERSION: 3.2.2
data: あ
data == expected_data: true
data.equal?(expected_data): false
data.hash: 3278076437348888334
expected_datadata.hash: 3278076437348888334
{ expected_data => 1 }.has_key?(data): true
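One way to narrow this down further might be to compare the two strings on more than `==` and `hash`. The helper below is a minimal sketch, not part of zstd-ruby or the original report. `string_diagnostics` is a hypothetical name. The `rebuilt` copy assumes that constructing a fresh string from the raw bytes gives Ruby a chance to recompute any per-string metadata, which is an assumption and not a confirmed explanation of the problem.

```ruby
# Hypothetical diagnostic helper (assumption: not from the original report).
# Compares a string against an expected value on several properties that
# could plausibly influence String#hash.
def string_diagnostics(actual, expected)
  # Fresh copy built from the raw bytes, tagged with the same encoding.
  rebuilt = actual.bytes.pack('C*').force_encoding(actual.encoding)
  {
    bytes_equal: actual.bytes == expected.bytes,
    encoding_equal: actual.encoding == expected.encoding,
    valid_encoding: actual.valid_encoding?,
    ascii_only: actual.ascii_only?,
    hash_equal: actual.hash == expected.hash,
    rebuilt_hash_equal: rebuilt.hash == expected.hash
  }
end

# Demonstration with an ordinary string; in the repro script, pass
# `data` and `expected_data` instead.
sample = 'あ'.dup
string_diagnostics(sample, 'あ').each { |key, value| puts "#{key}: #{value}" }
```

In the repro script, calling `string_diagnostics(data, expected_data)` on both Ruby versions would show whether the bytes and encoding match while only the hash differs, and whether a rebuilt copy behaves differently from the decompressed string.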