Class Files and Bytecode Instructions
Chapter Introduction
The foundation of RASP implementation lies in inserting detection code into target methods, which inevitably involves bytecode modification. In recent years, the identification and detection of various memory shells also involve bytecode files. Understanding the structure of class files is the basis for implementing bytecode detection and modification. This chapter will detail the structure of bytecode files, with a focus on the parts relevant to RASP.
Structure of Class Files
A Class file is a binary stream composed of 8-bit bytes as basic units, storing data in a fixed format specified by the Java Virtual Machine Specification.
Two data types are used for storing data: unsigned numbers and tables.
- Unsigned numbers: Unsigned numbers are the basic data types, written u1, u2, u4, and u8 for 1-byte, 2-byte, 4-byte, and 8-byte unsigned numbers, respectively. They can describe numbers, index references, quantity values, or string values encoded in UTF-8.
- Tables: Composite data types made up of multiple unsigned numbers or other tables as data items. Table names conventionally end with "_info", and tables are used to describe hierarchical composite data structures.
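As a minimal sketch of how the unsigned types are read in practice, the snippet below pulls one u1, one u2, and one u4 out of a byte array. It relies on the fact that multi-byte items in a class file are stored big-endian, which is also the default byte order of `java.nio.ByteBuffer`. The class name `UnsignedReadDemo` and the sample bytes are illustrative assumptions, not part of any RASP codebase.

```java
import java.nio.ByteBuffer;

final class UnsignedReadDemo {
    /** Reads a u1, a u2 and a u4 from the given bytes, in that order. */
    static long[] readU1U2U4(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);   // big-endian by default
        long u1 = buf.get() & 0xFF;               // 1 byte, masked to 0..255
        long u2 = buf.getShort() & 0xFFFF;        // 2 bytes, masked to 0..65535
        long u4 = buf.getInt() & 0xFFFFFFFFL;     // 4 bytes, widened so it stays unsigned
        return new long[] {u1, u2, u4};
    }

    public static void main(String[] args) {
        // Sample bytes chosen for illustration: 0xFF, then 0x0034, then 0xCAFEBABE.
        byte[] data = {(byte) 0xFF, 0x00, 0x34, (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        long[] v = readU1U2U4(data);
        System.out.printf("u1=%d u2=%d u4=0x%X%n", v[0], v[1], v[2]);
    }
}
```

The masking matters because Java has no unsigned primitive types: without it, a byte such as 0xFF would be read as -1.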
To understand bytecode files, the following two basic concepts are also required.
- Fully qualified name: The fully qualified name of a class is the full class name with every "." replaced by "/", e.g., java.lang.String becomes java/lang/String. When several fully qualified names appear in sequence, a ";" is appended to each one to mark where it ends.
- Descriptor: Descriptors describe the data type of fields, the parameter list of methods, and return values. Each symbol corresponds to a different data type, as shown in Table 2-1.
Table 2-1 Java Types and Descriptor Symbols
Descriptor | Type |
---|---|
B | byte |
C | char |
D | double |
F | float |
I | int |
J | long |
S | short |
Z | boolean |
V | void |
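L | object type, written as L, the fully qualified name, then ";" (e.g., Ljava/lang/Object;) |
[ | array, one "[" per dimension placed before the element type (e.g., [Ljava/lang/String;) |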
Generally, the descriptor symbol of a primitive type is the first letter of its name in uppercase. Three symbols do not follow this rule and must be memorized: J represents long, L represents object types, and Z represents boolean.
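The two concepts above can be checked against the JDK itself. The following sketch derives a fully qualified (internal) name from a `Class` object and builds method descriptors with the standard `java.lang.invoke.MethodType` API. The class and helper names (`DescriptorDemo`, `internalName`, `methodDescriptor`) are hypothetical and exist only for this illustration.

```java
import java.lang.invoke.MethodType;

final class DescriptorDemo {
    /** Converts a class name such as java.lang.String to the form used in class files. */
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    /** Builds the method descriptor for the given return type and parameter types. */
    static String methodDescriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // prints java/lang/String
        System.out.println(internalName(String.class));
        // descriptor of main(String[]): prints ([Ljava/lang/String;)V
        System.out.println(methodDescriptor(void.class, String[].class));
        // a method taking (int, boolean, Object) and returning long: prints (IZLjava/lang/Object;)J
        System.out.println(methodDescriptor(long.class, int.class, boolean.class, Object.class));
    }
}
```

In a method descriptor the parameter types appear inside the parentheses with no separators, and the return type follows the closing parenthesis.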
Table 2-2 shows the fixed format for Class files as specified by the Java Virtual Machine Specification. All Class files store content in this format. (Note: Each class file’s content is composed in the order listed below. If certain types are not involved, they can be empty.)
Table 2-2 Format of Class Files
Type | Name | Count | Bytes Occupied | Meaning |
---|---|---|---|---|
u4 | magic | 1 | 4 | Magic Number |
u2 | minor_version | 1 | 2 | Minor Version |
u2 | major_version | 1 | 2 | Major Version |
u2 | constant_pool_count | 1 | 2 | Constant Pool Count |
cp_info | constant_pool | constant_pool_count-1 | Table Structure | Constant Pool Table |
u2 | access_flags | 1 | 2 | Class Access Flags |
u2 | this_class | 1 | 2 | Class Index |
u2 | super_class | 1 | 2 | Superclass Index |
u2 | interfaces_count | 1 | 2 | Interface Count |
u2 | interfaces | interfaces_count | Table Structure | Interface Structure Table |
u2 | fields_count | 1 | 2 | Field Count |
field_info | fields | fields_count | Table Structure | Field Structure Table |
u2 | methods_count | 1 | 2 | Method Count |
method_info | methods | methods_count | Table Structure | Method Structure Table |
u2 | attributes_count | 1 | 2 | Class Attribute Array Length |
attribute_info | attributes | attributes_count | Table Structure | Attribute Structure Table |
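To make the ordering in Table 2-2 concrete, here is a rough sketch that walks a class file from the magic number through the constant pool and on to this_class and super_class, then resolves both indexes to names. It is an illustration under stated assumptions rather than a complete parser: the names `ClassLayoutDemo`, `thisAndSuper`, and `loadClassBytes` are hypothetical, only Utf8 and Class constants are kept, and strings are decoded as plain UTF-8 (class files use a modified UTF-8, which is identical for ordinary ASCII class names).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ClassLayoutDemo {
    /** Returns {this_class name, super_class name} read from raw class-file bytes. */
    static String[] thisAndSuper(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes);   // big-endian by default
        buf.getInt();                                   // u4 magic
        buf.getShort();                                 // u2 minor_version
        buf.getShort();                                 // u2 major_version
        int count = buf.getShort() & 0xFFFF;            // u2 constant_pool_count
        String[] utf8 = new String[count];              // Utf8 entries, by pool index
        int[] classNameIndex = new int[count];          // Class entries -> index of their Utf8 name
        for (int i = 1; i < count; i++) {               // valid indexes are 1..count-1
            int tag = buf.get() & 0xFF;
            switch (tag) {
                case 1: {                               // Utf8: u2 length + bytes
                    byte[] raw = new byte[buf.getShort() & 0xFFFF];
                    buf.get(raw);
                    utf8[i] = new String(raw, StandardCharsets.UTF_8);
                    break;
                }
                case 7:                                 // Class: u2 name_index
                    classNameIndex[i] = buf.getShort() & 0xFFFF;
                    break;
                case 5: case 6:                         // Long, Double: 8 bytes, occupy two indexes
                    buf.position(buf.position() + 8);
                    i++;
                    break;
                case 15:                                // MethodHandle: u1 + u2
                    buf.position(buf.position() + 3);
                    break;
                case 8: case 16: case 19: case 20:      // String, MethodType, Module, Package: u2
                    buf.position(buf.position() + 2);
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    buf.position(buf.position() + 4);   // Integer, Float, *ref, NameAndType, (Invoke)Dynamic
                    break;
                default:
                    throw new IllegalArgumentException("unknown constant pool tag " + tag);
            }
        }
        buf.getShort();                                 // u2 access_flags
        int thisClass = buf.getShort() & 0xFFFF;        // u2 this_class
        int superClass = buf.getShort() & 0xFFFF;       // u2 super_class (0 only for java/lang/Object)
        return new String[] {
            utf8[classNameIndex[thisClass]],
            superClass == 0 ? null : utf8[classNameIndex[superClass]]
        };
    }

    /** Loads the class-file bytes of a top-level class; assumes the resource can be found. */
    static byte[] loadClassBytes(Class<?> type) {
        try (InputStream in = type.getResourceAsStream(type.getSimpleName() + ".class")) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = thisAndSuper(loadClassBytes(String.class));
        System.out.println("this_class  = " + names[0]);
        System.out.println("super_class = " + names[1]);
    }
}
```

The loop shows why the constant pool must be parsed entry by entry: entries have different sizes, so the position of access_flags is only known once every preceding entry has been consumed.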
The JVM specification requires every bytecode file to consist of these ten parts in a fixed order. The overall structure is shown in Figure 2-1:
This section will use the following code to illustrate the class structure. The Foo class below contains only a main method.
    public class Foo {
        public static void main(String[] args) {
            System.out.println("hello word!");
        }
    }
Figure 2-1 Foo.class File Data
Class files are complex and require analysis tools for inspection.
The first eight bytes of Foo.class are CA FE BA BE 00 00 00 34. The first four bytes are the magic number of the class file, fixed as CAFEBABE, which determines whether the file can be accepted by the JVM: when the class loader loads a class file into memory, a file whose first four bytes are not CAFEBABE is rejected. The next four bytes hold the minor and major version numbers, two bytes each. The major version here is 0x0034 (52), corresponding to JDK 8, while the minor version is generally 0. Class file version numbers start from 45, and after JDK 1.1 each major JDK release typically increments the major version number by one. A newer JDK remains backward compatible with class files produced by older versions, but a JVM must refuse to execute a class file whose version number exceeds its own, even if the file format has not changed at all; loading such a class throws java.lang.UnsupportedClassVersionError. The major version numbers for released versions are shown in Table 2-3.
Table 2-3 Relationship Between Java Versions and Major Versions
JDK Version | Major Version Number (Major) |
---|---|
Java1.1 | 45 |
Java1.2 | 46 |
Java1.3 | 47 |
Java1.4 | 48 |
Java5 | 49 |
Java6 | 50 |
Java7 | 51 |
Java8 | 52 |
Java9 | 53 |
Java10 | 54 |
Java11 | 55 |
Java17 | 61 |
Java18 | 62 |
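The magic number check and the version mapping in Table 2-3 can be expressed in a few lines. The sketch below is an illustration only: `ClassVersionDemo`, `readVersion`, and `releaseName` are hypothetical names, and the "major minus 44" rule is simply the pattern visible in the table. `java.class.version` is a standard system property that reports the highest class file version the running JVM supports.

```java
import java.nio.ByteBuffer;

final class ClassVersionDemo {
    /** Maps a major version to a release name, following the pattern in Table 2-3. */
    static String releaseName(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("major version below 45: " + major);
        }
        // 45..48 are Java 1.1..1.4; from 49 (Java 5) on, the release is major - 44.
        return major < 49 ? "Java 1." + (major - 44) : "Java " + (major - 44);
    }

    /** Validates the magic number and returns {minor, major}. */
    static int[] readVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF;
        int major = buf.getShort() & 0xFFFF;
        return new int[] {minor, major};
    }

    public static void main(String[] args) {
        // The first eight bytes of Foo.class as quoted in the text.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34};
        int[] v = readVersion(header);
        System.out.println(v[1] + "." + v[0] + " -> " + releaseName(v[1]));   // 52.0 -> Java 8

        // Compare against what the running JVM accepts (for example "52.0" on Java 8).
        double supported = Double.parseDouble(System.getProperty("java.class.version"));
        System.out.println("loadable by this JVM: " + (v[1] <= supported));
    }
}
```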
Decompiling Class Files
javap Tool
javap is a Java class file disassembler that can decompile or view bytecode generated by the Java compiler. The javap help manual is shown below.
Figure 2-2 javap Help Manual
Without any parameters:

    ~ javap Foo
    Compiled from "Foo.java"
    public class Foo {
      public Foo();
      public static void main(java.lang.String[]);
    }
By default, javap displays methods with public, protected, and default access levels.
To display private methods and fields, use the -p option.
-s outputs type descriptor signature information:

    ~ javap -s Foo
    Compiled from "Foo.java"
    public class Foo {
      public Foo();
        descriptor: ()V

      public static void main(java.lang.String[]);
        descriptor: ([Ljava/lang/String;)V
    }
-c disassembles the code:

    Compiled from "Foo.java"
    public class Foo {
      public Foo();
        Code:
           0: aload_0
           1: invokespecial #1    // Method java/lang/Object."<init>":()V
           4: return

      public static void main(java.lang.String[]);
        Code:
           0: getstatic     #2    // Field java/lang/System.out:Ljava/io/PrintStream;
           3: ldc           #3    // String hello word!
           5: invokevirtual #4    // Method java/io/PrintStream.println:(Ljava/lang/String;)V
           8: return
    }
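The numbers in front of each instruction are byte offsets into the method's code array, which is why they jump from 0 to 3 to 5 to 8: each instruction occupies one opcode byte plus its operand bytes. The toy disassembler below illustrates this for just the six opcodes that appear in Foo. The byte array in `main` is reconstructed by hand from the listing and the standard opcode values (getstatic 0xB2, ldc 0x12, invokevirtual 0xB6, return 0xB1), not read from the actual file, and `MiniDisassembler` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

final class MiniDisassembler {
    /** Decodes only the six opcodes used by Foo; any other opcode is rejected. */
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;                                     // byte offset of the current instruction
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x2A:                              // aload_0: no operands
                    lines.add(pc + ": aload_0");
                    pc += 1;
                    break;
                case 0xB1:                              // return: no operands
                    lines.add(pc + ": return");
                    pc += 1;
                    break;
                case 0x12:                              // ldc: u1 constant pool index
                    lines.add(pc + ": ldc #" + (code[pc + 1] & 0xFF));
                    pc += 2;
                    break;
                case 0xB2:                              // getstatic: u2 constant pool index
                    lines.add(pc + ": getstatic #" + u2(code, pc + 1));
                    pc += 3;
                    break;
                case 0xB6:                              // invokevirtual: u2 constant pool index
                    lines.add(pc + ": invokevirtual #" + u2(code, pc + 1));
                    pc += 3;
                    break;
                case 0xB7:                              // invokespecial: u2 constant pool index
                    lines.add(pc + ": invokespecial #" + u2(code, pc + 1));
                    pc += 3;
                    break;
                default:
                    throw new IllegalArgumentException("opcode not handled by this sketch: 0x"
                            + Integer.toHexString(opcode));
            }
        }
        return lines;
    }

    /** Reads a big-endian u2 starting at the given offset. */
    private static int u2(byte[] b, int at) {
        return ((b[at] & 0xFF) << 8) | (b[at + 1] & 0xFF);
    }

    public static void main(String[] args) {
        // Hand-assembled bytes for Foo.main, derived from the javap -c listing.
        byte[] mainCode = {(byte) 0xB2, 0x00, 0x02, 0x12, 0x03, (byte) 0xB6, 0x00, 0x04, (byte) 0xB1};
        disassemble(mainCode).forEach(System.out::println);
    }
}
```

Running it on the nine bytes above reproduces the offsets 0, 3, 5, and 8 from the javap listing.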
-l outputs line numbers and local variable tables:

    Compiled from "Foo.java"
    public class Foo {
      public Foo();
        LineNumberTable:
          line 1: 0
        LocalVariableTable:
          Start  Length  Slot  Name   Signature
              0       5     0  this   LFoo;

      public static void main(java.lang.String[]);
        LineNumberTable:
          line 3: 0
          line 4: 8
        LocalVariableTable:
          Start  Length  Slot  Name   Signature
              0       9     0  args   [Ljava/lang/String;
    }
-v displays detailed class information, including version numbers, access flags, constant pools, method descriptors, etc., and is the most frequently used option.

    Classfile /Users/xxx/Foo.class
      Last modified Sep 5, 2024; size 542 bytes
      MD5 checksum 4c43623c887cd026d5e99e26eaf13c3b
      Compiled from "Foo.java"
    public class Foo
      minor version: 0
      major version: 52
      flags: (0x0021) ACC_PUBLIC, ACC_SUPER
      this_class: #5                          // Foo
      super_class: #6                         // java/lang/Object
      interfaces: 0, fields: 0, methods: 2, attributes: 1
    Constant pool:
       #1 = Methodref          #6.#21         // java/lang/Object."<init>":()V
       #2 = Fieldref           #22.#23        // java/lang/System.out:Ljava/io/PrintStream;
       #3 = String             #24            // hello word!
       #4 = Methodref          #25.#26        // java/io/PrintStream.println:(Ljava/lang/String;)V
       #5 = Class              #27            // Foo
       #6 = Class              #28            // java/lang/Object
       #7 = Utf8               <init>
       #8 = Utf8               ()V
       #9 = Utf8               Code
      #10 = Utf8               LineNumberTable
      #11 = Utf8               LocalVariableTable
      #12 = Utf8               this
      #13 = Utf8               LFoo;
      #14 = Utf8               main
      #15 = Utf8               ([Ljava/lang/String;)V
      #16 = Utf8               args
      #17 = Utf8               [Ljava/lang/String;
      #18 = Utf8               MethodParameters
      #19 = Utf8               SourceFile
      #20 = Utf8               Foo.java
      #21 = NameAndType        #7:#8          // "<init>":()V
      #22 = Class              #29            // java/lang/System
      #23 = NameAndType        #30:#31        // out:Ljava/io/PrintStream;
      #24 = Utf8               hello word!
      #25 = Class              #32            // java/io/PrintStream
      #26 = NameAndType        #33:#34        // println:(Ljava/lang/String;)V
      #27 = Utf8               Foo
      #28 = Utf8               java/lang/Object
      #29 = Utf8               java/lang/System
      #30 = Utf8               out
      #31 = Utf8               Ljava/io/PrintStream;
      #32 = Utf8               java/io/PrintStream
      #33 = Utf8               println
      #34 = Utf8               (Ljava/lang/String;)V
    {
      public Foo();
        descriptor: ()V
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokespecial #1    // Method java/lang/Object."<init>":()V
             4: return
          LineNumberTable:
            line 1: 0
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0       5     0  this   LFoo;

      public static void main(java.lang.String[]);
        descriptor: ([Ljava/lang/String;)V
        flags: (0x0009) ACC_PUBLIC, ACC_STATIC
        Code:
          stack=2, locals=1, args_size=1
             0: getstatic     #2    // Field java/lang/System.out:Ljava/io/PrintStream;
             3: ldc           #3    // String hello word!
             5: invokevirtual #4    // Method java/io/PrintStream.println:(Ljava/lang/String;)V
             8: return
          LineNumberTable:
            line 3: 0
            line 4: 8
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0       9     0  args   [Ljava/lang/String;
        MethodParameters:
          Name                           Flags
          args
    }
Because javap is a command-line tool that ships with the JDK, it is the usual choice for inspecting class files on servers.
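The flags values printed by javap -v, such as (0x0021) for the class and (0x0009) for main, are bit masks. As a small sketch, the code below decodes them using the access flag values defined for classes and methods in the Java Virtual Machine Specification. `AccessFlagsDemo` and its helpers are hypothetical names used only for this example, and the class table omits ACC_MODULE for brevity.

```java
import java.util.ArrayList;
import java.util.List;

final class AccessFlagsDemo {
    private static final int[] CLASS_MASKS =
            {0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000};
    private static final String[] CLASS_NAMES =
            {"ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE", "ACC_ABSTRACT",
             "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"};

    private static final int[] METHOD_MASKS =
            {0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0020,
             0x0040, 0x0080, 0x0100, 0x0400, 0x0800, 0x1000};
    private static final String[] METHOD_NAMES =
            {"ACC_PUBLIC", "ACC_PRIVATE", "ACC_PROTECTED", "ACC_STATIC", "ACC_FINAL", "ACC_SYNCHRONIZED",
             "ACC_BRIDGE", "ACC_VARARGS", "ACC_NATIVE", "ACC_ABSTRACT", "ACC_STRICT", "ACC_SYNTHETIC"};

    /** Decodes the access_flags field of a class. */
    static String classFlags(int flags) {
        return decode(flags, CLASS_MASKS, CLASS_NAMES);
    }

    /** Decodes the access_flags field of a method. */
    static String methodFlags(int flags) {
        return decode(flags, METHOD_MASKS, METHOD_NAMES);
    }

    /** Collects the name of every mask whose bit is set in flags. */
    private static String decode(int flags, int[] masks, String[] names) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < masks.length; i++) {
            if ((flags & masks[i]) != 0) {
                set.add(names[i]);
            }
        }
        return String.join(", ", set);
    }

    public static void main(String[] args) {
        System.out.println(classFlags(0x0021));    // ACC_PUBLIC, ACC_SUPER
        System.out.println(methodFlags(0x0009));   // ACC_PUBLIC, ACC_STATIC
    }
}
```

Two separate tables are needed because the same bit means different things depending on where it appears: 0x0020 is ACC_SUPER on a class but ACC_SYNCHRONIZED on a method.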
jclasslib Tool
jclasslib Bytecode Editor is a tool that visualizes class files and modifies bytecode. Project address: https://github.com/ingokegel/jclasslib.git
The jclasslib main interface is shown in Figure 2-3.
Figure 2-3 jclasslib Main Interface
- Editing the constant pool:
Figure 2-4 jclasslib Editing Constant Pool
- Editing operation instructions:
Figure 2-5 jclasslib Editing Bytecode Instructions
The tool also provides an IDEA plugin. Search for “jclasslib” in IDEA Plugins to install it (Figure 2-6).
Figure 2-6 Installing jclasslib IDEA Plugin
Usage is illustrated below. After compiling the code, select "Show Bytecode With jclasslib" from the "View" menu to inspect the current class file's general information, constant pool, methods, and more.
Figure 2-7 Using IDEA Plugin
Bytecode Instructions
Chapter Summary
This chapter mainly introduced the composition and structure of class files, which form the foundation for bytecode modification and detection in later chapters. It also covered how to decompile compiled bytecode and introduced Java’s official command-line tool javap and the open-source visualization tool jclasslib.