为什么“STRING".getBytes() 的工作方式因操作系统而异

时间：2022-11-25

本文介绍了为什么“STRING".getBytes() 的工作方式因操作系统而异的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着跟版网的小编来一起学习吧！

问题描述

I am running the code below and I am getting different outcome from "some_string".getBytes() depending if I am in Windows or Unix. The issue happens with any string (I tried a very simple ABC and same problem.

See the differences below printed in console.

The code below is well-tested using Java 7. If you copy it entirely it will run.

Additionally, see the difference in Hexadecimal in the two images below. The first two images shows the file created in Windows. You can see the hexadecimal values with ANSI and EBCDIC respectively. The third image, the black one, is from Unix. You can see the hexadecimal (-c option) and the character readable in which I believe it is EBCDIC.

So, my straight question is: why does such code work different since I am just using Java 7 in both case? Should I check any especific property in somewhere? Maybe, Java in Windows get certain default format and in Unix it get another. If so, which property must I check or settup?

Unix Console:

$ ./java -cp /usr/test.jar test.mainframe.read.test.TestGetBytes
H = 76 - L
< wasn't found

Windows Console:

H = 60 - <
H1 = 69 - E
H2 = 79 - O
H3 = 77 - M
H4 = 62 - >
End of Message found

The entire code:

package test.mainframe.read.test;

import java.util.ArrayList;

public class TestGetBytes {

       public static void main(String[] args) {
              try {
                     ArrayList ipmMessage = new ArrayList();
                     ipmMessage.add(newLine());

                     //Windows Path
                     writeMessage("C:/temp/test_bytes.ipm", ipmMessage);
                     reformatFile("C:/temp/test_bytes.ipm");
                     //Unix Path
                     //writeMessage("/usr/temp/test_bytes.ipm", ipmMessage);
                     //reformatFile("/usr/temp/test_bytes.ipm");
              } catch (Exception e) {

                     System.out.println(e.getMessage());
              }
       }

       public static byte[] newLine() {
              return "<EOM>".getBytes();
       }

       public static void writeMessage(String fileName, ArrayList ipmMessage)
                     throws java.io.FileNotFoundException, java.io.IOException {

              java.io.DataOutputStream dos = new java.io.DataOutputStream(
                           new java.io.FileOutputStream(fileName, true));
              for (int i = 0; i < ipmMessage.size(); i++) {
                     try {
                           int[] intValues = (int[]) ipmMessage.get(i);
                           for (int j = 0; j < intValues.length; j++) {
                                  dos.write(intValues[j]);
                           }
                     } catch (ClassCastException e) {
                           byte[] byteValues = (byte[]) ipmMessage.get(i);
                           dos.write(byteValues);
                     }
              }
              dos.flush();
              dos.close();

       }

       // reformat to U1014
       public static void reformatFile(String filename)
                     throws java.io.FileNotFoundException, java.io.IOException {
              java.io.FileInputStream fis = new java.io.FileInputStream(filename);
              java.io.DataInputStream br = new java.io.DataInputStream(fis);

              int h = br.read();
              System.out.println("H = " + h + " - " + (char)h);

              if ((char) h == '<') {// Check for <EOM>

                     int h1 = br.read();
                     System.out.println("H1 = " + h1 + " - " + (char)h1);
                     int h2 = br.read();
                     System.out.println("H2 = " + h2 + " - " + (char)h2);
                     int h3 = br.read();
                     System.out.println("H3 = " + h3 + " - " + (char)h3);
                     int h4 = br.read();
                     System.out.println("H4 = " + h4 + " - " + (char)h4);
                     if ((char) h1 == 'E' && (char) h2 == 'O' && (char) h3 == 'M'
                                  && (char) h4 == '>') {
                           System.out.println("End of Message found");
                     }
                     else{
                           System.out.println("EOM not found but < was found");
                     }
              }
              else{
                     System.out.println("< wasn't found");
              }
       }
}

解决方案

You are not specifying a charset when calling getBytes(), so it uses the default charset of the underlying platform (or of Java itself if specified when Java is started). This is stated in the String documentation:

public byte[] getBytes()

Encodes this String into a sequence of bytes using the platform's default charset, storing the result into a new byte array.

getBytes() has an overloaded version that lets you specify a charset in your code.

public byte[] getBytes(Charset charset)

Encodes this String into a sequence of bytes using the given charset, storing the result into a new byte array.

这篇关于为什么“STRING".getBytes() 的工作方式因操作系统而异的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持跟版网！

上一篇：在Java中将字节转换为二进制 下一篇：在 shell 脚本中嵌入可执行二进制文件

为什么“STRING".getBytes() 的工作方式因操作系统而异

问题描述

相关文章

最新文章