
Class Files and Bytecode Instructions

Chapter Introduction

RASP is built on inserting detection code into target methods, which inevitably requires modifying bytecode. In recent years, identifying and detecting the various kinds of memory shells has also come to depend on inspecting bytecode files. Understanding the structure of class files is therefore the basis for both bytecode detection and bytecode modification. This chapter details the structure of bytecode files, focusing on the parts relevant to RASP.

Structure of Class Files

A Class file is a binary stream composed of 8-bit bytes as basic units, storing data in a fixed format specified by the Java Virtual Machine Specification.
Two data types are used for storing data: unsigned numbers and tables.

  • Unsigned numbers: Unsigned numbers are the basic data types. u1, u2, u4, and u8 denote 1-byte, 2-byte, 4-byte, and 8-byte unsigned numbers, respectively. They can describe numbers, index references, quantity values, or string values encoded in UTF-8.
  • Tables: Tables are composite data types made up of multiple unsigned numbers or other tables as data items. Their names conventionally end with “_info”, and they describe hierarchical composite data structures.
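To make the unsigned types concrete, the following minimal sketch reads u1, u2, and u4 values from a byte array. It assumes the big-endian (high-order byte first) layout that class files use; the class name UnsignedReads and its method names are hypothetical helpers chosen for illustration, not part of any library.

```java
final class UnsignedReads {
    /** u1: one byte, widened to int so that values 128..255 stay positive. */
    static int u1(byte[] data, int offset) {
        return data[offset] & 0xFF;
    }

    /** u2: two bytes, high-order byte first. */
    static int u2(byte[] data, int offset) {
        return (u1(data, offset) << 8) | u1(data, offset + 1);
    }

    /** u4: four bytes; returned as long because Java's int is signed. */
    static long u4(byte[] data, int offset) {
        return ((long) u2(data, offset) << 16) | u2(data, offset + 2);
    }

    public static void main(String[] args) {
        // Sample bytes: a u4 followed by two u2 values.
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34};
        System.out.printf("u4=%X u2=%d u2=%d%n", u4(sample, 0), u2(sample, 4), u2(sample, 6));
    }
}
```

Java has no unsigned integer types, which is why each value is widened to the next larger signed type before use.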

To understand bytecode files, the following two basic concepts are also required.

  • Fully qualified name: The fully qualified name of a class is the full class name with every ”.” replaced by ”/”, e.g., java.lang.String becomes java/lang/String. When several fully qualified names appear in sequence, a ”;” is appended to each one to mark where it ends.
  • Descriptor: Descriptors describe the data type of fields, the parameter list of methods, and return values. Each symbol corresponds to a different data type, as shown in Table 2-1.

Table 2-1 Java Types and Descriptor Symbols

Descriptor   Type
B            byte
C            char
D            double
F            float
I            int
J            long
S            short
Z            boolean
V            void
L            object type, e.g. Ljava/lang/Object;

Generally, the descriptor symbol is the uppercase first letter of the basic type's name. Three exceptions need to be memorized: J represents long, L represents an object type, and Z represents boolean. Each array dimension is written as a leading ”[”, so String[] becomes [Ljava/lang/String;.
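The mapping in Table 2-1 can be sketched in a few lines of Java. The class Descriptors and its methods are hypothetical helpers written for this illustration; only java.lang.invoke.MethodType, used in main as a cross-check, is a standard JDK class.

```java
import java.lang.invoke.MethodType;

final class Descriptors {
    /** Internal form of a class name: every '.' replaced by '/'. */
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    /** Descriptor of a single type, following Table 2-1. */
    static String descriptor(Class<?> type) {
        if (type.isArray()) return "[" + descriptor(type.getComponentType());
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == double.class) return "D";
        if (type == float.class) return "F";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == short.class) return "S";
        if (type == boolean.class) return "Z";
        if (type == void.class) return "V";
        return "L" + internalName(type) + ";";
    }

    /** Method descriptor: parameter descriptors in parentheses, then the return type. */
    static String methodDescriptor(Class<?> returnType, Class<?>... params) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> p : params) sb.append(descriptor(p));
        return sb.append(')').append(descriptor(returnType)).toString();
    }

    public static void main(String[] args) {
        // Descriptor of "void main(String[] args)".
        System.out.println(methodDescriptor(void.class, String[].class));
        // The JDK can produce the same string, which serves as a cross-check.
        System.out.println(MethodType.methodType(void.class, String[].class).toMethodDescriptorString());
    }
}
```

Both lines printed by main should be identical, since the hand-written mapping and the JDK method follow the same descriptor grammar.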

Table 2-2 shows the fixed format for Class files as specified by the Java Virtual Machine Specification. All Class files store content in this format. (Note: Each class file’s content is composed in the order listed below. If certain types are not involved, they can be empty.)

Table 2-2 Format of Class Files

Type            Name                 Count                  Bytes Occupied   Meaning
u4              magic                1                      4                Magic Number
u2              minor_version        1                      2                Minor Version
u2              major_version        1                      2                Major Version
u2              constant_pool_count  1                      2                Constant Pool Count
cp_info         constant_pool        constant_pool_count-1  Table Structure  Constant Pool Table
u2              access_flags         1                      2                Class Access Flags
u2              this_class           1                      2                Class Index
u2              super_class          1                      2                Superclass Index
u2              interfaces_count     1                      2                Interface Count
u2              interfaces           interfaces_count       Table Structure  Interface Structure Table
u2              fields_count         1                      2                Field Count
field_info      fields               fields_count           Table Structure  Field Structure Table
u2              methods_count        1                      2                Method Count
method_info     methods              methods_count          Table Structure  Method Structure Table
u2              attributes_count     1                      2                Class Attribute Array Length
attribute_info  attributes           attributes_count       Table Structure  Attribute Structure Table

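The JVM specification requires every class file to consist of these ten parts (magic number, version, constant pool, access flags, class index, superclass index, interfaces, fields, methods, and attributes) in a fixed order. The overall structure is shown in Figure 2-1.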
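As a rough illustration of the fixed order in Table 2-2, the sketch below reads a class file sequentially up to interfaces_count. The class ClassHeaderReader and its Header type are hypothetical names for this example. It skips most constant pool entries by their fixed sizes and keeps only the Utf8 and Class entries needed to resolve this_class and super_class. As an assumption for the demo, the input is java/lang/Object.class, loaded as a resource from the running JDK.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassHeaderReader {
    /** Items read from the start of the file; names follow Table 2-2. */
    static final class Header {
        int magic, minorVersion, majorVersion, constantPoolCount, accessFlags, interfacesCount;
        String thisClass, superClass;
    }

    static Header read(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        Header h = new Header();
        h.magic = in.readInt();                      // u4
        h.minorVersion = in.readUnsignedShort();     // u2
        h.majorVersion = in.readUnsignedShort();     // u2
        h.constantPoolCount = in.readUnsignedShort();
        String[] utf8 = new String[h.constantPoolCount];
        int[] classNameIndex = new int[h.constantPoolCount];
        // Valid constant pool indexes run from 1 to constant_pool_count-1.
        for (int i = 1; i < h.constantPoolCount; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                       // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break;   // Class
                case 8: case 16: case 19: case 20: skip(in, 2); break;
                case 15: skip(in, 3); break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    skip(in, 4); break;
                case 5: case 6: skip(in, 8); i++; break;  // Long and Double use two slots
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        h.accessFlags = in.readUnsignedShort();
        int thisIndex = in.readUnsignedShort();
        int superIndex = in.readUnsignedShort();
        h.thisClass = utf8[classNameIndex[thisIndex]];
        // A super_class index of 0 means "no superclass" (only java/lang/Object).
        h.superClass = superIndex == 0 ? null : utf8[classNameIndex[superIndex]];
        h.interfacesCount = in.readUnsignedShort();
        return h;
    }

    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]);
    }

    /** Convenience wrapper: loads a class file visible to the system class loader. */
    static Header readSystemClass(String internalName) {
        String resource = internalName + ".class";
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) throw new IllegalArgumentException("not found: " + resource);
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Header h = readSystemClass("java/lang/Object");
        System.out.printf("magic=%X major=%d this=%s super=%s%n",
                h.magic, h.majorVersion, h.thisClass, h.superClass);
    }
}
```

The reader stops before the interfaces, fields, methods, and attributes tables; parsing those follows the same count-then-entries pattern but needs the per-table layouts, which are beyond this sketch.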

This section will use the following code to illustrate the class structure. The Foo class below contains only a main method.

public class Foo {
    public static void main(String[] args) {
        System.out.println("hello word!");
    }
}

Figure 2-1 Foo.class File Data


Class files are complex and require analysis tools for inspection.

The first eight bytes are CA FE BA BE 00 00 00 34. The first four bytes are the magic number of the class file, fixed as CAFEBABE, which lets the JVM decide whether the file can be accepted as a class file: when the class loader loads a class file into memory, a file whose first four bytes are not “CAFEBABE” is rejected. The next four bytes hold the minor and major version numbers. The major version here is 0x0034 (52), corresponding to JDK 8, while the minor version is generally 0.

Java version numbers start from 45, and since JDK 1.1 each major JDK release has typically incremented the major version number by one. A newer JDK remains backward compatible with class files from older versions but cannot run class files whose version number is higher than its own. Even if the file format has not changed at all, the virtual machine must refuse to execute such files, and loading one throws java.lang.UnsupportedClassVersionError. The major version numbers for released versions are shown in Table 2-3.
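The two checks described above can be sketched as follows. The class VersionCheck and its methods are hypothetical, and the compatibility rule is deliberately simplified to a comparison of major versions; a real JVM's verification involves more than this.

```java
import java.nio.ByteBuffer;

final class VersionCheck {
    static final int MAGIC = 0xCAFEBABE;

    /** Returns the major version, or throws if the magic number is wrong. */
    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes);  // big-endian by default
        if (buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        buf.getShort();                                 // minor_version, ignored here
        return buf.getShort() & 0xFFFF;                 // major_version as an unsigned u2
    }

    /** Simplified rule from the text: a JVM rejects class files newer than itself. */
    static boolean supportedBy(int classMajor, int jvmMajor) {
        return classMajor <= jvmMajor;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34};
        int major = majorVersion(header);
        System.out.println("major=" + major + ", loadable on a JDK 8 JVM: " + supportedBy(major, 52));
    }
}
```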

Table 2-3 Relationship Between Java Versions and Major Versions

JDK Version   Major Version Number (Major)
Java 1.1      45
Java 1.2      46
Java 1.3      47
Java 1.4      48
Java 5        49
Java 6        50
Java 7        51
Java 8        52
Java 9        53
Java 10       54
Java 11       55
Java 17       61
Java 18       62
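Because the major version grows by one per release, the table reduces to a simple offset of 44. The sketch below encodes that rule; JavaRelease is a hypothetical helper, and the naming switch from "1.x" to a plain number at Java 5 follows the table. The main method additionally prints the class file version supported by the running JVM, read from the standard java.class.version system property.

```java
final class JavaRelease {
    /** Maps a class file major version to the release name used in Table 2-3. */
    static String forMajor(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("major version below 45: " + major);
        }
        int n = major - 44;
        // Releases up to 1.4 used the "1.x" naming scheme; from Java 5 on, just the number.
        return n <= 4 ? "Java 1." + n : "Java " + n;
    }

    public static void main(String[] args) {
        System.out.println(forMajor(52));
        // The running JVM reports its own class file version, for example "52.0".
        String current = System.getProperty("java.class.version");
        int major = Integer.parseInt(current.split("\\.")[0]);
        System.out.println("this JVM: " + forMajor(major));
    }
}
```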

Decompiling Class Files

javap Tool

javap is a Java class file disassembler that can decompile or view bytecode generated by the Java compiler. The javap help manual is shown below.

Figure 2-2 javap Help Manual


  • Without any parameters:
~ javap Foo
Compiled from "Foo.java"
public class Foo {
  public Foo();
  public static void main(java.lang.String[]);
}

By default, javap displays methods with public, protected, and default access levels. To display private methods and fields, use the -p option.

  • -s outputs type descriptor (signature) information:
~ javap -s Foo
Compiled from "Foo.java"
public class Foo {
  public Foo();
    descriptor: ()V
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
}
  • -c disassembles the code:
~ javap -c Foo
Compiled from "Foo.java"
public class Foo {
  public Foo();
    Code:
       0: aload_0
       1: invokespecial #1    // Method java/lang/Object."<init>":()V
       4: return
  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2    // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #3    // String hello word!
       5: invokevirtual #4    // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
}
  • -l outputs line numbers and local variable tables:
~ javap -l Foo
Compiled from "Foo.java"
public class Foo {
  public Foo();
    LineNumberTable:
      line 1: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0       5     0  this   LFoo;
  public static void main(java.lang.String[]);
    LineNumberTable:
      line 3: 0
      line 4: 8
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0       9     0  args   [Ljava/lang/String;
}
  • -v displays detailed class information, including version numbers, access flags, the constant pool, and method descriptors. It is the most frequently used option.
~ javap -v Foo
Classfile /Users/xxx/Foo.class
  Last modified Sep 5, 2024; size 542 bytes
  MD5 checksum 4c43623c887cd026d5e99e26eaf13c3b
  Compiled from "Foo.java"
public class Foo
  minor version: 0
  major version: 52
  flags: (0x0021) ACC_PUBLIC, ACC_SUPER
  this_class: #5                          // Foo
  super_class: #6                         // java/lang/Object
  interfaces: 0, fields: 0, methods: 2, attributes: 1
Constant pool:
   #1 = Methodref          #6.#21         // java/lang/Object."<init>":()V
   #2 = Fieldref           #22.#23        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = String             #24            // hello word!
   #4 = Methodref          #25.#26        // java/io/PrintStream.println:(Ljava/lang/String;)V
   #5 = Class              #27            // Foo
   #6 = Class              #28            // java/lang/Object
   #7 = Utf8               <init>
   #8 = Utf8               ()V
   #9 = Utf8               Code
  #10 = Utf8               LineNumberTable
  #11 = Utf8               LocalVariableTable
  #12 = Utf8               this
  #13 = Utf8               LFoo;
  #14 = Utf8               main
  #15 = Utf8               ([Ljava/lang/String;)V
  #16 = Utf8               args
  #17 = Utf8               [Ljava/lang/String;
  #18 = Utf8               MethodParameters
  #19 = Utf8               SourceFile
  #20 = Utf8               Foo.java
  #21 = NameAndType        #7:#8          // "<init>":()V
  #22 = Class              #29            // java/lang/System
  #23 = NameAndType        #30:#31        // out:Ljava/io/PrintStream;
  #24 = Utf8               hello word!
  #25 = Class              #32            // java/io/PrintStream
  #26 = NameAndType        #33:#34        // println:(Ljava/lang/String;)V
  #27 = Utf8               Foo
  #28 = Utf8               java/lang/Object
  #29 = Utf8               java/lang/System
  #30 = Utf8               out
  #31 = Utf8               Ljava/io/PrintStream;
  #32 = Utf8               java/io/PrintStream
  #33 = Utf8               println
  #34 = Utf8               (Ljava/lang/String;)V
{
  public Foo();
    descriptor: ()V
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1    // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   LFoo;
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=1, args_size=1
         0: getstatic     #2    // Field java/lang/System.out:Ljava/io/PrintStream;
         3: ldc           #3    // String hello word!
         5: invokevirtual #4    // Method java/io/PrintStream.println:(Ljava/lang/String;)V
         8: return
      LineNumberTable:
        line 3: 0
        line 4: 8
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       9     0  args   [Ljava/lang/String;
    MethodParameters:
      Name   Flags
      args
}
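The flags values in the -v output are bit masks. The following sketch decodes them into the names javap prints. AccessFlags is a hypothetical helper, and the two tables list only the class-level and method-level flags defined by the JVM specification. Note that the same bit can mean different things depending on context: 0x0020 is ACC_SUPER on a class but ACC_SYNCHRONIZED on a method.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class AccessFlags {
    static final Map<Integer, String> CLASS_FLAGS = new LinkedHashMap<>();
    static final Map<Integer, String> METHOD_FLAGS = new LinkedHashMap<>();

    static {
        CLASS_FLAGS.put(0x0001, "ACC_PUBLIC");
        CLASS_FLAGS.put(0x0010, "ACC_FINAL");
        CLASS_FLAGS.put(0x0020, "ACC_SUPER");
        CLASS_FLAGS.put(0x0200, "ACC_INTERFACE");
        CLASS_FLAGS.put(0x0400, "ACC_ABSTRACT");
        CLASS_FLAGS.put(0x1000, "ACC_SYNTHETIC");
        CLASS_FLAGS.put(0x2000, "ACC_ANNOTATION");
        CLASS_FLAGS.put(0x4000, "ACC_ENUM");
        CLASS_FLAGS.put(0x8000, "ACC_MODULE");

        METHOD_FLAGS.put(0x0001, "ACC_PUBLIC");
        METHOD_FLAGS.put(0x0002, "ACC_PRIVATE");
        METHOD_FLAGS.put(0x0004, "ACC_PROTECTED");
        METHOD_FLAGS.put(0x0008, "ACC_STATIC");
        METHOD_FLAGS.put(0x0010, "ACC_FINAL");
        METHOD_FLAGS.put(0x0020, "ACC_SYNCHRONIZED");
        METHOD_FLAGS.put(0x0040, "ACC_BRIDGE");
        METHOD_FLAGS.put(0x0080, "ACC_VARARGS");
        METHOD_FLAGS.put(0x0100, "ACC_NATIVE");
        METHOD_FLAGS.put(0x0400, "ACC_ABSTRACT");
        METHOD_FLAGS.put(0x0800, "ACC_STRICT");
        METHOD_FLAGS.put(0x1000, "ACC_SYNTHETIC");
    }

    /** Lists the names of all bits set in flags, in ascending bit order. */
    static String decode(int flags, Map<Integer, String> table) {
        StringJoiner out = new StringJoiner(", ");
        for (Map.Entry<Integer, String> e : table.entrySet()) {
            if ((flags & e.getKey()) != 0) out.add(e.getValue());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021, CLASS_FLAGS));   // flags of class Foo
        System.out.println(decode(0x0009, METHOD_FLAGS));  // flags of Foo.main
    }
}
```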

Because javap ships with the JDK and needs no graphical interface, it is the tool most often used on servers.
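The offsets that javap -c prints (0, 3, 5, 8 for main) follow from the size of each instruction: a one-byte opcode plus its operands. The toy disassembler below makes this concrete. MiniDisassembler is a hypothetical class that handles only the six opcodes appearing in Foo, and the byte arrays in main are the instruction bytes these opcodes and constant pool indexes would produce; they were written out by hand for this example rather than extracted from a file.

```java
import java.util.ArrayList;
import java.util.List;

final class MiniDisassembler {
    /** Decodes a Code array, supporting only the opcodes used by Foo. */
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x2A: lines.add(pc + ": aload_0"); pc += 1; break;
                case 0xB1: lines.add(pc + ": return"); pc += 1; break;
                case 0x12: // ldc takes a one-byte constant pool index
                    lines.add(pc + ": ldc #" + (code[pc + 1] & 0xFF)); pc += 2; break;
                case 0xB2: lines.add(pc + ": getstatic #" + u2(code, pc + 1)); pc += 3; break;
                case 0xB6: lines.add(pc + ": invokevirtual #" + u2(code, pc + 1)); pc += 3; break;
                case 0xB7: lines.add(pc + ": invokespecial #" + u2(code, pc + 1)); pc += 3; break;
                default:
                    throw new IllegalArgumentException("unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
        return lines;
    }

    /** Two-byte operand, high-order byte first. */
    private static int u2(byte[] code, int offset) {
        return ((code[offset] & 0xFF) << 8) | (code[offset + 1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] mainCode = {(byte) 0xB2, 0x00, 0x02, 0x12, 0x03, (byte) 0xB6, 0x00, 0x04, (byte) 0xB1};
        byte[] initCode = {0x2A, (byte) 0xB7, 0x00, 0x01, (byte) 0xB1};
        disassemble(initCode).forEach(System.out::println);
        disassemble(mainCode).forEach(System.out::println);
    }
}
```

The printed offsets and operands should line up with the javap -c listing for Foo, minus the comments that javap resolves from the constant pool.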

jclasslib Tool

jclasslib Bytecode Editor is a tool that visualizes class files and modifies bytecode. Project address: https://github.com/ingokegel/jclasslib.git

The jclasslib main interface is shown in Figure 2-3.

Figure 2-3 jclasslib Main Interface

  • Editing the constant pool:

Figure 2-4 jclasslib Editing Constant Pool

  • Editing operation instructions:

Figure 2-5 jclasslib Editing Bytecode Instructions

The tool also provides an IDEA plugin. Search for “jclasslib” in IDEA Plugins to install it (Figure 2-6).

Figure 2-6 Installing jclasslib IDEA Plugin

Usage is illustrated below. After compiling the code, select “Show Bytecode With jclasslib” from the “View” menu to browse the current class file's general information, constant pool, methods, and so on.

Figure 2-7 Using IDEA Plugin

Bytecode Instructions

Chapter Summary

This chapter introduced the composition and structure of class files, which form the foundation for bytecode modification and detection in later chapters. It also covered how to inspect compiled bytecode with two tools: javap, the command-line disassembler shipped with the JDK, and jclasslib, an open-source visual bytecode viewer and editor.