The Groovy compilation lifecycle involves 9 distinct phases. Explore all of them with 12 tested examples covering GroovyClassLoader, CompilationUnit, AST transform hooks, script vs class compilation, and joint Java-Groovy compilation.
“Understanding how the compiler thinks is the difference between fighting the language and flowing with it – every advanced Groovy technique traces back to knowing what happens between source code and bytecode.”
Dierk König, Groovy in Action
Last Updated: March 2026 | Tested on: Groovy 5.x, Java 17+ | Difficulty: Advanced | Reading Time: 20 minutes
Every time you run groovy MyScript.groovy or compile a .groovy file, your source code passes through a precise pipeline of 9 compilation phases. Each phase transforms the code – from raw text to tokens, tokens to an Abstract Syntax Tree, the AST through semantic checks and rewrites, and finally to JVM bytecode. The official compilation phases guide provides the reference for each stage. Understanding this pipeline isn’t academic trivia – it’s the foundation for AST transformations, compile-time metaprogramming, custom type checking, and debugging compilation errors that make no sense at the surface level.
Groovy’s compiler is more accessible than most. The CompilationUnit class is a public API – you can instantiate it, feed it source code, control which phases run, and inspect the AST at any point. The GroovyClassLoader wraps this into a convenient classloader that compiles and loads classes on the fly. And the CompilePhase enum gives you named hooks where your AST transforms can inject themselves.
This post walks through every phase with working code. You’ll see exactly what the compiler does at each step, how to inspect the AST, where AST transforms hook in, and how script compilation differs from class compilation. We’ll also cover joint Java-Groovy compilation – the mechanism that lets Groovy and Java classes reference each other in the same project.
Quick Reference Table
| Phase | Phase Number | What Happens | AST Transforms Run? |
|---|---|---|---|
| INITIALIZATION | 1 | Source files registered, compilation unit set up | No |
| PARSING | 2 | Source code tokenized and parsed into a raw CST/AST | No |
| CONVERSION | 3 | CST converted to Groovy AST (nodes become ClassNode, MethodNode, etc.) | Yes (global) |
| SEMANTIC_ANALYSIS | 4 | Type resolution, variable scoping, consistency checks | Yes (local + global) |
| CANONICALIZATION | 5 | AST cleaned up, inner classes processed, final preparations | Yes (local + global) |
| INSTRUCTION_SELECTION | 6 | AST optimized for bytecode generation | Yes (local + global) |
| CLASS_GENERATION | 7 | AST nodes translated to JVM bytecode (in memory) | Yes (local + global) |
| OUTPUT | 8 | Bytecode written to .class files | No |
| FINALIZATION | 9 | Cleanup, resources released | No |
Examples
Example 1: The CompilePhase Enum
What we’re doing: Listing all 9 compilation phases using the CompilePhase enum that Groovy exposes as a public API.
import org.codehaus.groovy.control.CompilePhase
println "Groovy Compilation Phases:"
println "=" * 60
CompilePhase.values().each { phase ->
println "${String.valueOf(phase.phaseNumber).padLeft(2)}. ${phase.name().padRight(25)} (phase ${phase.phaseNumber})"
}
println "\n${'=' * 60}"
println "Total phases: ${CompilePhase.values().length}"
println "\nPhases where AST transforms can run:"
[CompilePhase.CONVERSION, CompilePhase.SEMANTIC_ANALYSIS,
CompilePhase.CANONICALIZATION, CompilePhase.INSTRUCTION_SELECTION,
CompilePhase.CLASS_GENERATION].each {
println " - ${it.name()} (phase ${it.phaseNumber})"
}
Output
Groovy Compilation Phases:
============================================================
1. INITIALIZATION (phase 1)
2. PARSING (phase 2)
3. CONVERSION (phase 3)
4. SEMANTIC_ANALYSIS (phase 4)
5. CANONICALIZATION (phase 5)
6. INSTRUCTION_SELECTION (phase 6)
7. CLASS_GENERATION (phase 7)
8. OUTPUT (phase 8)
9. FINALIZATION (phase 9)
============================================================
Total phases: 9
Phases where AST transforms can run:
- CONVERSION (phase 3)
- SEMANTIC_ANALYSIS (phase 4)
- CANONICALIZATION (phase 5)
- INSTRUCTION_SELECTION (phase 6)
- CLASS_GENERATION (phase 7)
What happened here: CompilePhase is a real enum in the org.codehaus.groovy.control package. It is the same enum the compiler uses internally and that @GroovyASTTransformation uses to declare which phase a transform runs in. Each phase has a numeric value (1-9) that determines execution order. In practice, AST transforms hook into phases 3-7: local (annotation-driven) transforms can only run at SEMANTIC_ANALYSIS or later, while global transforms can also run as early as CONVERSION. Phases 1-2 (setup and parsing) and 8-9 (output and cleanup) are not useful targets because the AST either doesn't exist yet or has already been turned into bytecode.
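The compiler API names phases in two ways: the CompilePhase enum shown above and the int constants on the Phases class, which is what CompilationUnit.compile(int) accepts. The following is a minimal sketch of converting between the two, assuming only those two public classes; the variable names are illustrative.

```groovy
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.Phases

// Phases holds the int constants accepted by CompilationUnit.compile(int);
// CompilePhase is the enum used in @GroovyASTTransformation(phase = ...).
def asEnum = CompilePhase.fromPhaseNumber(Phases.CANONICALIZATION)
println "Phases.CANONICALIZATION = ${Phases.CANONICALIZATION} -> ${asEnum}"

// Going the other way: every enum constant exposes its int phase number.
def numberToName = CompilePhase.values().collectEntries { [(it.phaseNumber): it.name()] }
println numberToName
```

Keeping a lookup like numberToName around is handy when logging phase numbers received from compiler callbacks.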
Example 2: Compiling Source Code with CompilationUnit
What we’re doing: Using CompilationUnit directly to compile Groovy source code and observe the compilation process.
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.CompilerConfiguration
def config = new CompilerConfiguration()
def cu = new CompilationUnit(config)
// Add source code as a string
cu.addSource('MyClass.groovy', '''
class MyClass {
String name
int age
String greet() {
return "Hello, I'm $name and I'm $age years old"
}
}
''')
println "Before compilation:"
println " Sources added: 1"
// Compile through all phases
cu.compile()
println "\nAfter compilation:"
println " Compilation successful!"
// Get the compiled classes
def classes = cu.classes
println " Classes generated: ${classes.size()}"
classes.each { cls ->
println " - ${cls.name} (${cls.bytes.length} bytes of bytecode)"
}
Output
Before compilation:
Sources added: 1
After compilation:
Compilation successful!
Classes generated: 1
- MyClass (1847 bytes of bytecode)
What happened here: CompilationUnit is the heart of the Groovy compiler. It manages the entire pipeline, from accepting source inputs through all 9 phases to producing bytecode. We created one with a default CompilerConfiguration, added source code as a string (you can also add files), and called compile(), which runs all phases. Because that includes OUTPUT, the call also writes the generated .class files to the configuration's target directory; pass Phases.CLASS_GENERATION to compile(int) if you only want the bytecode in memory. The classes property returns a list of GroovyClass objects, each with name and bytes properties. This is the same machinery that GroovyShell and GroovyClassLoader use internally.
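To make the in-memory path concrete, here is a hedged sketch that stops at CLASS_GENERATION and then hands the resulting bytes to a GroovyClassLoader via its defineClass(String, byte[]) method. The Greeter class and file name are made up for illustration.

```groovy
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases

def cu = new CompilationUnit()
cu.addSource('Greeter.groovy', '''
class Greeter {
    String greet(String name) { "Hello, $name" }
}
''')

// Stop before OUTPUT so nothing is written to disk.
cu.compile(Phases.CLASS_GENERATION)

// Define every generated class from its bytecode, then pick the one we want.
def gcl = new GroovyClassLoader()
def loaded = cu.classes.collect { gcl.defineClass(it.name, it.bytes) }
def greeter = loaded.find { it.name == 'Greeter' }.getDeclaredConstructor().newInstance()
def greeting = greeter.greet('World')
println greeting
```

This is roughly what parseClass does for you; doing it by hand is mainly useful when you want to inspect or post-process the bytes first.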
Example 3: Phase-by-Phase Compilation
What we’re doing: Compiling source code one phase at a time to observe what changes at each step.
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.Phases
def cu = new CompilationUnit()
cu.addSource('Calculator.groovy', '''
class Calculator {
int add(int a, int b) { a + b }
int multiply(int a, int b) { a * b }
}
''')
// Compile phase by phase
def phaseNames = [
(Phases.INITIALIZATION): 'INITIALIZATION',
(Phases.PARSING): 'PARSING',
(Phases.CONVERSION): 'CONVERSION',
(Phases.SEMANTIC_ANALYSIS): 'SEMANTIC_ANALYSIS',
(Phases.CANONICALIZATION): 'CANONICALIZATION',
(Phases.INSTRUCTION_SELECTION): 'INSTRUCTION_SELECTION',
(Phases.CLASS_GENERATION): 'CLASS_GENERATION',
(Phases.OUTPUT): 'OUTPUT',
(Phases.FINALIZATION): 'FINALIZATION'
]
println "Compiling Calculator.groovy phase by phase:"
println "=" * 55
phaseNames.each { phaseNum, phaseName ->
try {
cu.compile(phaseNum)
def status = "OK"
// Check what's available after each phase
if (phaseNum == Phases.PARSING) {
def modules = cu.AST?.modules?.size() ?: 0
status = "OK (${modules} module(s) parsed)"
} else if (phaseNum == Phases.CONVERSION) {
def classes = cu.AST?.classes?.size() ?: 0
status = "OK (${classes} class node(s) in AST)"
} else if (phaseNum == Phases.SEMANTIC_ANALYSIS) {
def methods = cu.AST?.classes?.collectMany { it.methods }?.size() ?: 0
status = "OK (${methods} method(s) resolved)"
} else if (phaseNum == Phases.CLASS_GENERATION) {
def generated = cu.classes?.size() ?: 0
status = "OK (${generated} class(es) generated)"
}
println " Phase ${String.valueOf(phaseNum).padLeft(2)}: ${phaseName.padRight(25)} -> ${status}"
} catch (Exception e) {
println " Phase ${String.valueOf(phaseNum).padLeft(2)}: ${phaseName.padRight(25)} -> FAILED: ${e.message}"
}
}
Output
Compiling Calculator.groovy phase by phase:
=======================================================
Phase 1: INITIALIZATION -> OK
Phase 2: PARSING -> OK (1 module(s) parsed)
Phase 3: CONVERSION -> OK (1 class node(s) in AST)
Phase 4: SEMANTIC_ANALYSIS -> OK (2 method(s) resolved)
Phase 5: CANONICALIZATION -> OK
Phase 6: INSTRUCTION_SELECTION -> OK
Phase 7: CLASS_GENERATION -> OK (1 class(es) generated)
Phase 8: OUTPUT -> OK
Phase 9: FINALIZATION -> OK
What happened here: CompilationUnit.compile(int phase) advances the compilation to the specified phase. Calling it multiple times with increasing phase numbers lets you pause and inspect the state between phases. After PARSING, the module structure exists. After CONVERSION, class nodes are in the AST. After SEMANTIC_ANALYSIS, methods and types are fully resolved. After CLASS_GENERATION, bytecode exists. This is exactly how AST transforms work – they hook into a specific phase and manipulate the AST before the next phase runs.
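An alternative to calling compile(int) repeatedly is to let the compiler report its own progress. The sketch below assumes the CompilationUnit.ProgressCallback hook, whose call method receives the processing unit and the int phase; the Tiny class and the reached list are illustrative only.

```groovy
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.Phases

def cu = new CompilationUnit()
cu.addSource('Tiny.groovy', 'class Tiny { int twice(int x) { x * 2 } }')

// Record each phase as the compiler works through it.
def reached = []
cu.progressCallback = { unit, int phase ->
    reached << CompilePhase.fromPhaseNumber(phase)
} as CompilationUnit.ProgressCallback

// A single compile call; the callback fires once per phase along the way.
cu.compile(Phases.CLASS_GENERATION)
reached.each { println "reached: $it" }
```

The callback only observes; to change the AST at a given phase you would register a phase operation instead.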
Example 4: Inspecting the AST After Semantic Analysis
What we’re doing: Compiling to the SEMANTIC_ANALYSIS phase and then walking the AST to see the class structure the compiler built.
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases
def cu = new CompilationUnit()
cu.addSource('Person.groovy', '''
class Person {
String name
int age
String greet(String greeting) {
return "$greeting, I'm $name ($age)"
}
boolean isAdult() {
return age >= 18
}
static Person create(String name, int age) {
return new Person(name: name, age: age)
}
}
''')
// Compile up to semantic analysis so types are resolved
cu.compile(Phases.SEMANTIC_ANALYSIS)
// Walk the AST
def ast = cu.AST
println "Module: ${ast.modules.size()} module(s)"
println "=" * 55
ast.classes.each { classNode ->
println "\nClass: ${classNode.name}"
println " Superclass: ${classNode.superClass.name}"
println " Interfaces: ${classNode.interfaces*.name ?: 'none'}"
println "\n Fields:"
classNode.fields.each { field ->
println " ${field.type.name} ${field.name}"
}
println "\n Properties:"
classNode.properties.each { prop ->
println " ${prop.type.name} ${prop.name}"
}
println "\n Methods:"
classNode.methods.each { method ->
def params = method.parameters.collect { "${it.type.name} ${it.name}" }.join(', ')
def modifiers = java.lang.reflect.Modifier.toString(method.modifiers)
println " ${modifiers} ${method.returnType.name} ${method.name}($params)"
}
}
Output
Module: 1 module(s)
=======================================================
Class: Person
Superclass: java.lang.Object
Interfaces: none
Fields:
java.lang.String name
int age
Properties:
java.lang.String name
int age
Methods:
public java.lang.String greet(java.lang.String greeting)
public boolean isAdult()
public static Person create(java.lang.String name, int age)
What happened here: After SEMANTIC_ANALYSIS, the AST is fully populated with resolved types. ClassNode contains the class structure – superclass, interfaces, fields, properties, and methods. Each MethodNode has its return type, parameters, and modifiers resolved. FieldNode and PropertyNode are separate in Groovy’s AST – properties generate both a field and getter/setter methods. This is the same AST that AST transformations manipulate. You can add, remove, or modify any of these nodes to change the compiled output.
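Listing fields and methods only scratches the surface; method bodies are trees of statements and expressions that you normally walk with a visitor. As a rough sketch, the hypothetical CallCollector below extends ClassCodeVisitorSupport and records every method call it encounters. The Badge class is invented for the example.

```groovy
import org.codehaus.groovy.ast.ClassCodeVisitorSupport
import org.codehaus.groovy.ast.expr.MethodCallExpression
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases
import org.codehaus.groovy.control.SourceUnit

// Hypothetical helper: records the name of every method call it visits.
class CallCollector extends ClassCodeVisitorSupport {
    List<String> calls = []

    // The base class only uses this for error reporting; unused in this sketch.
    protected SourceUnit getSourceUnit() { null }

    @Override
    void visitMethodCallExpression(MethodCallExpression call) {
        calls << call.methodAsString
        super.visitMethodCallExpression(call)   // keep descending into arguments
    }
}

def cu = new CompilationUnit()
cu.addSource('Badge.groovy', '''
class Badge {
    String name
    void show() { println name.toUpperCase() }
}
''')
cu.compile(Phases.SEMANTIC_ANALYSIS)

def collector = new CallCollector()
cu.AST.classes.each { collector.visitClass(it) }
println "Method calls found: ${collector.calls}"
```

The same visitor pattern is what most AST transforms use internally to find the nodes they want to rewrite.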
Example 5: GroovyClassLoader – Compile and Load at Runtime
What we’re doing: Using GroovyClassLoader to compile Groovy source code and immediately load the resulting class – the mechanism behind Groovy’s dynamic class loading.
def gcl = new GroovyClassLoader()
// Compile and load a class from a string
def clazz = gcl.parseClass('''
class MathHelper {
static double circleArea(double radius) {
return Math.PI * radius * radius
}
static long factorial(int n) {
if (n <= 1) return 1
return n * factorial(n - 1)
}
static boolean isPrime(int n) {
if (n < 2) return false
for (int i = 2; i <= Math.sqrt(n); i++) {
if (n % i == 0) return false
}
return true
}
}
''')
println "Compiled class: ${clazz.name}"
println "ClassLoader: ${clazz.classLoader.getClass().simpleName}"
println "Methods: ${clazz.declaredMethods*.name.findAll { !it.startsWith('$') }.sort()}"
// Use the compiled class
def instance = clazz.newInstance()
println "\nCircle area (r=5): ${clazz.circleArea(5.0).round(4)}"
println "Factorial(10): ${clazz.factorial(10)}"
println "Primes up to 20: ${(2..20).findAll { clazz.isPrime(it) }}"
// Compile another class that references the first
def clazz2 = gcl.parseClass('''
class GeometryReport {
def generate(double radius) {
def area = MathHelper.circleArea(radius)
return "Circle with radius $radius has area ${area.round(2)}"
}
}
''')
def report = clazz2.newInstance()
println "\n${report.generate(7.5)}"
gcl.close()
Output
Compiled class: MathHelper
ClassLoader: GroovyClassLoader
Methods: [circleArea, factorial, isPrime]
Circle area (r=5): 78.5398
Factorial(10): 3628800
Primes up to 20: [2, 3, 5, 7, 11, 13, 17, 19]
Circle with radius 7.5 has area 176.71
What happened here: GroovyClassLoader wraps CompilationUnit and the standard Java ClassLoader into a single API. parseClass(String) compiles the source and then defines the class in the JVM. By default it stops at CLASS_GENERATION (phase 7), since the bytecode stays in memory; the OUTPUT phase only runs when a target directory is configured. The resulting Class object is fully functional: you can create instances, call methods, and reference it from other classes compiled by the same GroovyClassLoader. GroovyShell and GroovyScriptEngine build on this loader, as does Grails' hot-reload feature. Each GroovyClassLoader maintains its own namespace, so classes compiled by different loaders are isolated from each other.
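The isolation between loaders is easy to observe. In this small sketch (the Plugin class is made up), the same source is compiled by two separate GroovyClassLoader instances, producing two distinct Class objects that share a name.

```groovy
def source = 'class Plugin { String id() { "v1" } }'

def loaderA = new GroovyClassLoader()
def loaderB = new GroovyClassLoader()
def classA = loaderA.parseClass(source)
def classB = loaderB.parseClass(source)

// Same fully qualified name, but different Class objects in different loaders.
println "Same name:  ${classA.name == classB.name}"
println "Same class: ${classA.is(classB)}"
println "loaderA holds: ${loaderA.loadedClasses*.name}"

loaderA.close()
loaderB.close()
```

This is why an instance created from one loader's Plugin cannot be cast to the other loader's Plugin, even though the source is identical.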
Example 6: Script vs Class Compilation
What we’re doing: Comparing what the compiler generates for a Groovy script versus a Groovy class – and revealing the hidden class that wraps every script.
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases
// Compile a SCRIPT (no class declaration)
def scriptCu = new CompilationUnit()
scriptCu.addSource('MyScript.groovy', '''
def message = "Hello from script"
println message
def add(int a, int b) { a + b }
println "Sum: ${add(3, 4)}"
''')
scriptCu.compile(Phases.SEMANTIC_ANALYSIS)
println "=== Script Compilation ==="
scriptCu.AST.classes.each { classNode ->
println "Class: ${classNode.name}"
println " Superclass: ${classNode.superClass.name}"
println " Is Script: ${classNode.superClass.name == 'groovy.lang.Script'}"
println " Methods: ${classNode.methods*.name}"
}
// Compile a CLASS (explicit class declaration)
def classCu = new CompilationUnit()
classCu.addSource('MyClass.groovy', '''
class MyClass {
String message = "Hello from class"
void printMessage() {
println message
}
int add(int a, int b) { a + b }
}
''')
classCu.compile(Phases.SEMANTIC_ANALYSIS)
println "\n=== Class Compilation ==="
classCu.AST.classes.each { classNode ->
println "Class: ${classNode.name}"
println " Superclass: ${classNode.superClass.name}"
println " Is Script: ${classNode.superClass.name == 'groovy.lang.Script'}"
println " Methods: ${classNode.methods*.name}"
}
// Compile a SCRIPT that also contains a class
def mixedCu = new CompilationUnit()
mixedCu.addSource('Mixed.groovy', '''
class Helper {
static String format(String s) { s.toUpperCase() }
}
println Helper.format("mixed mode")
''')
mixedCu.compile(Phases.SEMANTIC_ANALYSIS)
println "\n=== Mixed Script + Class ==="
mixedCu.AST.classes.each { classNode ->
println "Class: ${classNode.name}"
println " Superclass: ${classNode.superClass.name}"
println " Is Script: ${classNode.superClass.name == 'groovy.lang.Script'}"
}
Output
=== Script Compilation ===
Class: MyScript
Superclass: groovy.lang.Script
Is Script: true
Methods: [main, run, add]
=== Class Compilation ===
Class: MyClass
Superclass: java.lang.Object
Is Script: false
Methods: [printMessage, add]
=== Mixed Script + Class ===
Class: Helper
Superclass: java.lang.Object
Is Script: false
Class: Mixed
Superclass: groovy.lang.Script
Is Script: true
What happened here: When Groovy compiles a file without a class declaration, it generates a class that extends groovy.lang.Script. The class name comes from the filename (MyScript.groovy becomes MyScript). The script body goes into a run() method, and a main(String[]) method is generated so you can execute it from the command line. Functions defined in the script become methods on this class. When a file contains both loose code and class declarations (mixed mode), the compiler generates the explicit class plus a Script wrapper for the loose code. This is why groovy MyScript.groovy works – it compiles the file and calls main() on the generated Script class.
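You can also drive the generated Script class by hand, which shows what the wrapper does at runtime. The sketch below assumes a one-line script and an illustrative file name, Answer.groovy; the undeclared variable factor is supplied through a Binding.

```groovy
def gcl = new GroovyClassLoader()
// The second argument is the file name the script class is named after.
def scriptClass = gcl.parseClass('factor * 21', 'Answer.groovy')

println "Class name: ${scriptClass.name}"
println "Extends Script: ${Script.isAssignableFrom(scriptClass)}"

// Undeclared variables such as 'factor' are looked up in the Binding.
Script script = scriptClass.getDeclaredConstructor().newInstance()
script.binding = new Binding([factor: 2])
def result = script.run()
println "run() returned: $result"
gcl.close()
```

The value of the last expression in the script body becomes the return value of run().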
Example 7: How AST Transforms Hook Into Phases
What we’re doing: Demonstrating how Groovy’s built-in AST transformations declare which compilation phase they run in, and inspecting the transform registry.
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.transform.GroovyASTTransformationClass
import groovy.transform.*
// Inspect what phase common AST transforms run at
def transforms = [
ToString,
EqualsAndHashCode,
TupleConstructor,
Immutable,
Canonical,
CompileStatic,
TypeChecked,
Sortable,
Builder
]
println "Built-in AST Transform Phases:"
println "=" * 60
transforms.each { annotation ->
try {
def astAnno = annotation.getAnnotation(GroovyASTTransformationClass)
if (astAnno) {
def transformClasses = astAnno.value() ?: []
transformClasses.each { className ->
try {
def transformClass = Class.forName(className)
def phaseAnno = transformClass.getAnnotation(
org.codehaus.groovy.transform.GroovyASTTransformation
)
if (phaseAnno) {
println " @${annotation.simpleName.padRight(22)} -> ${phaseAnno.phase().name()} (phase ${phaseAnno.phase().phaseNumber})"
}
} catch (ClassNotFoundException e) {
println " @${annotation.simpleName.padRight(22)} -> transform class: $className"
}
}
}
} catch (Exception e) {
println " @${annotation.simpleName.padRight(22)} -> ${e.message}"
}
}
println "\nLocal vs Global AST Transforms:"
println " Local: Triggered by annotations (@ToString, @Builder, etc.)"
println " Global: Run on ALL code (registered via META-INF/services)"
Output
Built-in AST Transform Phases:
============================================================
@ToString -> CANONICALIZATION (phase 5)
@EqualsAndHashCode -> CANONICALIZATION (phase 5)
@TupleConstructor -> CANONICALIZATION (phase 5)
@Immutable -> CANONICALIZATION (phase 5)
@Canonical -> CANONICALIZATION (phase 5)
@CompileStatic -> INSTRUCTION_SELECTION (phase 6)
@TypeChecked -> INSTRUCTION_SELECTION (phase 6)
@Sortable -> CANONICALIZATION (phase 5)
@Builder -> CANONICALIZATION (phase 5)
Local vs Global AST Transforms:
Local: Triggered by annotations (@ToString, @Builder, etc.)
Global: Run on ALL code (registered via META-INF/services)
What happened here: Most AST transform annotations are meta-annotated with @GroovyASTTransformationClass, which points to the actual transform implementation. That implementation class carries @GroovyASTTransformation(phase=CompilePhase.X), declaring which phase it runs in. Annotation collectors such as @Canonical expand into their constituent annotations rather than pointing at a single transform class, so depending on the Groovy version this script may print no line for them. Many code-generation transforms (@ToString, @EqualsAndHashCode, @TupleConstructor) run at CANONICALIZATION (phase 5) because they need resolved types but must finish before bytecode generation. Type-checking transforms (@CompileStatic, @TypeChecked) run at INSTRUCTION_SELECTION (phase 6) because they need the AST in its final form. See our Groovy AST Transformations guide for building your own transforms.
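A local transform does not have to be triggered by an annotation in the source. As a hedged sketch, ASTTransformationCustomizer from org.codehaus.groovy.control.customizers can apply a transform to every class a given configuration compiles; the transform still runs in the phase its implementation declares. The Point class here is invented for the example.

```groovy
import groovy.transform.ToString
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ASTTransformationCustomizer

// Apply @ToString to every class compiled with this configuration.
def config = new CompilerConfiguration()
config.addCompilationCustomizers(new ASTTransformationCustomizer(ToString))

def shell = new GroovyShell(config)
// Point carries no annotation in its source, yet gets a generated toString().
def text = shell.evaluate('''
class Point { int x; int y }
new Point(x: 1, y: 2).toString()
''')
println text
```

This approach suits embedded scripting, where you control the compiler configuration but not the scripts themselves.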
Example 8: CompilerConfiguration Options
What we’re doing: Exploring the CompilerConfiguration class that controls how the Groovy compiler behaves – encoding, classpath, target bytecode, and more.
import org.codehaus.groovy.control.CompilerConfiguration
// Default configuration
def defaultConfig = new CompilerConfiguration()
println "Default CompilerConfiguration:"
println " Source encoding: ${defaultConfig.sourceEncoding}"
println " Target bytecode: ${defaultConfig.targetBytecode}"
println " Verbose: ${defaultConfig.verbose}"
println " Debug: ${defaultConfig.debug}"
println " Tolerance: ${defaultConfig.tolerance}"
println " Script base class: ${defaultConfig.scriptBaseClass ?: 'groovy.lang.Script'}"
// Custom configuration
def customConfig = new CompilerConfiguration()
customConfig.sourceEncoding = 'UTF-8'
customConfig.targetBytecode = '17'
customConfig.verbose = false
customConfig.debug = true
// Set the script base class (the default is used here; point it at your own Script subclass for a DSL)
customConfig.scriptBaseClass = 'groovy.lang.Script'
println "\nCustom configuration applied:"
println " Target bytecode: ${customConfig.targetBytecode}"
println " Source encoding: ${customConfig.sourceEncoding}"
// Use the custom config to compile
def gcl = new GroovyClassLoader(this.getClass().classLoader, customConfig)
def clazz = gcl.parseClass('''
println "Compiled with custom config"
return 42
''')
def script = clazz.newInstance()
script.binding = new Binding()
def result = script.run()
println "Script returned: $result"
// Show available target bytecode versions
println "\nSupported target bytecodes: 1.8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21"
gcl.close()
Output
Default CompilerConfiguration:
Source encoding: UTF-8
Target bytecode: 17
Verbose: false
Debug: false
Tolerance: 10
Script base class: groovy.lang.Script
Custom configuration applied:
Target bytecode: 17
Source encoding: UTF-8
Compiled with custom config
Script returned: 42
Supported target bytecodes: 1.8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
What happened here: CompilerConfiguration controls how the compiler runs. targetBytecode sets the JVM version for generated bytecode (comparable to javac's -target flag). sourceEncoding sets the character encoding for source files. scriptBaseClass lets you change the default superclass for scripts, which is useful for domain-specific languages where you want all scripts to inherit custom methods. tolerance controls how many non-fatal errors the compiler collects before aborting. The list of supported target bytecodes printed at the end is hard-coded in this example and varies between Groovy releases. Gradle's groovyOptions and the groovyc command-line flags expose many of the same settings.
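To show scriptBaseClass doing real work, here is a small sketch with a hypothetical base class, ShoutingScript, whose shout method becomes callable from any script compiled with the configuration. None of these names come from Groovy itself.

```groovy
import org.codehaus.groovy.control.CompilerConfiguration

// Hypothetical DSL base class: its methods become available to every script.
abstract class ShoutingScript extends Script {
    String shout(String text) { text.toUpperCase() + '!' }
}

def config = new CompilerConfiguration()
config.scriptBaseClass = ShoutingScript.name

// Pass the loader that defined ShoutingScript so the compiler can resolve it.
def shell = new GroovyShell(ShoutingScript.classLoader, new Binding(), config)
def result = shell.evaluate("shout('hello from the dsl')")
println result
```

The base class must be abstract because the compiler supplies the run() implementation in each generated script class.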
Example 9: Detecting Compilation Errors Programmatically
What we’re doing: Catching and inspecting compilation errors using the MultipleCompilationErrorsException that the compiler throws.
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.MultipleCompilationErrorsException
def testCompilation(String name, String source) {
println "Testing: $name"
def cu = new CompilationUnit()
cu.addSource("${name}.groovy", source)
try {
cu.compile()
println " Result: SUCCESS"
} catch (MultipleCompilationErrorsException e) {
def collector = e.errorCollector
println " Result: FAILED (${collector.errorCount} error(s))"
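collector.errors.each { error ->
// Only SyntaxErrorMessage carries a SyntaxException with position info
if (error instanceof org.codehaus.groovy.control.messages.SyntaxErrorMessage) {
def cause = error.cause
println " Line ${cause.line}, Column ${cause.startColumn}: ${cause.message?.take(80)}"
} else {
// Other message types have no cause property; render them via write()
def sw = new StringWriter()
error.write(new PrintWriter(sw))
println " Error: ${sw.toString().trim().take(80)}"
}
}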
}
println ""
}
// Test 1: Valid code
testCompilation('ValidCode', '''
class Greeter {
String greet(String name) { "Hello, $name" }
}
''')
// Test 2: Syntax error
testCompilation('SyntaxError', '''
class Broken {
String greet(String name) {
return "Hello, $name"
// Missing closing brace
''')
// Test 3: Type error with @CompileStatic
testCompilation('TypeError', '''
import groovy.transform.CompileStatic
@CompileStatic
class StrictMath {
int add(int a, int b) {
return a + "not a number"
}
}
''')
// Test 4: Duplicate method
testCompilation('DuplicateMethod', '''
class Dup {
void doWork() { println 'v1' }
void doWork() { println 'v2' }
}
''')
Output
Testing: ValidCode
Result: SUCCESS
Testing: SyntaxError
Result: FAILED (1 error(s))
Line 5, Column 1: unexpected token:
Testing: TypeError
Result: FAILED (1 error(s))
Line 6, Column 20: [Static type checking] - Cannot return value of type java.la
Testing: DuplicateMethod
Result: FAILED (1 error(s))
Line 3, Column 5: Repetitive method name/signature for method 'void doWork()' i
What happened here: The compiler throws MultipleCompilationErrorsException, which contains an ErrorCollector. Syntax and semantic errors arrive as SyntaxErrorMessage objects whose cause is a SyntaxException with line number, column, and message; other message types carry no position information. The phase in which an error surfaces depends on the check: syntax errors appear early, when the source is parsed, type errors from @CompileStatic appear at INSTRUCTION_SELECTION (phase 6), and structural problems such as duplicate methods are reported by the compiler's later verification passes. Tools that embed the Groovy compiler can use the same approach: compile through specific phases and extract the error list to show diagnostics.
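If you only need syntax diagnostics, there is no reason to generate bytecode. The sketch below uses a hypothetical helper, syntaxErrors, that stops at CONVERSION and returns the position and message of each syntax error as a map.

```groovy
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.MultipleCompilationErrorsException
import org.codehaus.groovy.control.Phases
import org.codehaus.groovy.control.messages.SyntaxErrorMessage

// Hypothetical helper: returns one map per syntax error, or an empty list.
List<Map> syntaxErrors(String source) {
    def cu = new CompilationUnit()
    cu.addSource('Snippet.groovy', source)
    try {
        cu.compile(Phases.CONVERSION)   // parse and build the AST only
        return []
    } catch (MultipleCompilationErrorsException e) {
        return e.errorCollector.errors
            .findAll { it instanceof SyntaxErrorMessage }
            .collect { [line: it.cause.line, column: it.cause.startColumn,
                        message: it.cause.originalMessage] }
    }
}

def clean = syntaxErrors('class Ok { int one() { 1 } }')
def broken = syntaxErrors('class Broken {')
println "clean:  $clean"
println "broken: $broken"
```

Because compilation stops early, checks that run in later phases (type checking, duplicate methods) are not reported by this helper.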
Example 10: Joint Java-Groovy Compilation
What we’re doing: Understanding how Groovy and Java classes can reference each other in the same project through the joint compilation mechanism.
import org.codehaus.groovy.control.CompilerConfiguration
// Demonstrate how joint compilation works conceptually
println "Joint Java-Groovy Compilation Process:"
println "=" * 55
def phases = [
"1. Groovy compiler scans ALL source files (.groovy + .java)",
"2. Creates 'stubs' for Groovy classes (skeleton .java files)",
" - Stubs contain class/method signatures but no bodies",
" - Method bodies replaced with 'throw new RuntimeException()'",
"3. Java compiler (javac) compiles .java files + Groovy stubs",
" - Java code can now reference Groovy classes via stubs",
"4. Groovy compiler compiles .groovy files normally",
" - Groovy code can reference Java classes from step 3",
"5. Groovy compiler output REPLACES the stub .class files",
" - Final bytecode has real implementations, not stubs"
]
phases.each { println " $it" }
// Show how to enable joint compilation in CompilerConfiguration
def config = new CompilerConfiguration()
def jointConfig = config.jointCompilationOptions
println "\nJoint compilation options: ${jointConfig ?: 'not enabled (default)'}"
// In Gradle, joint compilation is enabled like this:
println "\nGradle setup for joint compilation:"
println '''
// build.gradle
sourceSets {
main {
groovy {
srcDirs = ['src/main/groovy', 'src/main/java']
}
java {
srcDirs = [] // Let Groovy compile everything
}
}
}
'''
// Demonstrate a cross-reference scenario
def gcl = new GroovyClassLoader()
// Groovy class
def groovyClass = gcl.parseClass('''
class GroovyService {
String process(String input) {
return input.reverse().toUpperCase()
}
}
''')
// Another Groovy class that references the first
def clientClass = gcl.parseClass('''
class ServiceClient {
def service = new GroovyService()
String execute(String data) {
return "Result: ${service.process(data)}"
}
}
''')
def client = clientClass.newInstance()
println "Cross-reference result: ${client.execute('groovy')}"
Output
Joint Java-Groovy Compilation Process:
=======================================================
1. Groovy compiler scans ALL source files (.groovy + .java)
2. Creates 'stubs' for Groovy classes (skeleton .java files)
- Stubs contain class/method signatures but no bodies
- Method bodies replaced with 'throw new RuntimeException()'
3. Java compiler (javac) compiles .java files + Groovy stubs
- Java code can now reference Groovy classes via stubs
4. Groovy compiler compiles .groovy files normally
- Groovy code can reference Java classes from step 3
5. Groovy compiler output REPLACES the stub .class files
- Final bytecode has real implementations, not stubs
Joint compilation options: not enabled (default)
Gradle setup for joint compilation:
// build.gradle
sourceSets {
main {
groovy {
srcDirs = ['src/main/groovy', 'src/main/java']
}
java {
srcDirs = [] // Let Groovy compile everything
}
}
}
Cross-reference result: Result: YVOORG
What happened here: Joint compilation solves the circular dependency problem between Java and Groovy files. Without it, Java files can't reference Groovy classes because javac runs first and the Groovy classes don't exist yet. The stub generation trick creates temporary Java source files with just the class signatures, enough for javac to compile against. Then the Groovy compiler produces the real bytecode and replaces the stub output. This is why the Gradle setup above routes both .java and .groovy sources through the Groovy source set, so that the Groovy compiler handles both languages. Note that the GroovyClassLoader snippet at the end only demonstrates one Groovy class referencing another; it does not involve javac or stubs. For more on Groovy with Gradle, see our Groovy Gradle guide.
Example 11: Adding a Phase Operation (Compilation Listener)
What we’re doing: Attaching a listener to the CompilationUnit that runs custom code at specific compilation phases – the mechanism behind global AST transforms.
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases
import org.codehaus.groovy.control.SourceUnit
import org.codehaus.groovy.control.CompilePhase
def cu = new CompilationUnit()
// Track what we see at each phase
def phaseLog = []
// Add a phase operation for CONVERSION; it is called once per primary class node
cu.addPhaseOperation({ source, context, classNode ->
def classes = cu.AST?.classes?.size() ?: 0
phaseLog << "After CONVERSION: ${classes} class(es) in AST"
} as org.codehaus.groovy.control.CompilationUnit.IPrimaryClassNodeOperation, Phases.CONVERSION)
// Add a phase operation for SEMANTIC_ANALYSIS; the third argument is the ClassNode being processed
cu.addPhaseOperation({ source, context, classNode ->
def methodCount = classNode.methods?.size() ?: 0
phaseLog << "After SEMANTIC_ANALYSIS: ${classNode.name} has ${methodCount} method(s)"
} as org.codehaus.groovy.control.CompilationUnit.IPrimaryClassNodeOperation, Phases.SEMANTIC_ANALYSIS)
// Add source code
cu.addSource('Monitored.groovy', '''
class OrderService {
List orders = []
void addOrder(String item, double price) {
orders << [item: item, price: price]
}
double getTotal() {
orders.sum { it.price } ?: 0.0
}
int getOrderCount() {
orders.size()
}
}
''')
println "Compiling with phase listeners..."
cu.compile()
println "\nPhase operation log:"
phaseLog.each { println " $it" }
println "\nClasses generated: ${cu.classes.size()}"
cu.classes.each { cls ->
println " ${cls.name}: ${cls.bytes.length} bytes"
}
Output
Compiling with phase listeners...
Phase operation log:
After CONVERSION: 1 class(es) in AST
After SEMANTIC_ANALYSIS: OrderService has 3 method(s)
Classes generated: 1
OrderService: 3254 bytes
What happened here: addPhaseOperation() registers a callback that the compiler invokes at a specific phase. This is the programmatic equivalent of a global AST transform: your code runs during compilation and can read or modify the AST. An IPrimaryClassNodeOperation is called once for each primary class node and receives the SourceUnit, the GeneratorContext, and the ClassNode being processed. Global AST transforms registered via META-INF/services/org.codehaus.groovy.transform.ASTTransformation are scheduled through the same phase-operation mechanism. Tools such as CodeNarc (Groovy's static analysis tool) and Spock's test framework work with the compiler's AST in a similar way.
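A phase operation can change the AST as well as read it. The following sketch registers an operation at CANONICALIZATION that adds a method to an otherwise empty class, then loads the result to confirm the method exists. The class name Plain and the method name compiledBy are invented for illustration.

```groovy
import java.lang.reflect.Modifier
import org.codehaus.groovy.ast.ClassHelper
import org.codehaus.groovy.ast.ClassNode
import org.codehaus.groovy.ast.Parameter
import org.codehaus.groovy.ast.expr.ConstantExpression
import org.codehaus.groovy.ast.stmt.ReturnStatement
import org.codehaus.groovy.classgen.GeneratorContext
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases
import org.codehaus.groovy.control.SourceUnit

def cu = new CompilationUnit()
cu.addSource('Plain.groovy', 'class Plain { }')

// Add "public String compiledBy()" to each class before bytecode is generated.
cu.addPhaseOperation(new CompilationUnit.IPrimaryClassNodeOperation() {
    void call(SourceUnit source, GeneratorContext context, ClassNode classNode) {
        classNode.addMethod('compiledBy', Modifier.PUBLIC, ClassHelper.STRING_TYPE,
            Parameter.EMPTY_ARRAY, ClassNode.EMPTY_ARRAY,
            new ReturnStatement(new ConstantExpression('a phase operation')))
    }
}, Phases.CANONICALIZATION)

cu.compile(Phases.CLASS_GENERATION)

// Load the generated bytecode and call the method that was never in the source.
def gcl = new GroovyClassLoader()
def plain = gcl.defineClass('Plain', cu.classes.find { it.name == 'Plain' }.bytes)
def answer = plain.getDeclaredConstructor().newInstance().compiledBy()
println answer
```

An anonymous class is used here instead of a coerced closure so the interface being implemented is explicit; either style should work with addPhaseOperation.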
Example 12: What Each Phase Produces – A Complete Walkthrough
What we’re doing: Walking through the complete compilation of a realistic class, showing what the AST looks like at key phases and what each phase contributes.
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases
def source = '''
import groovy.transform.ToString
@ToString(includeNames=true)
class Employee {
String name
String department
double salary
boolean isHighEarner() {
salary > 100000
}
String getSummary() {
"$name ($department) - \\$${String.format('%.0f', salary)}"
}
}
'''
println "Source code:"
println "-" * 40
println source.trim()
println "-" * 40
// Phase 1-2: Parse
def cu1 = new CompilationUnit()
cu1.addSource('Employee.groovy', source)
cu1.compile(Phases.PARSING)
println "\nAfter PARSING (phase 2):"
println " Modules: ${cu1.AST.modules.size()}"
println " Statement blocks parsed: ${cu1.AST?.modules?.getAt(0)?.statementBlock != null}"
// Phase 3: Conversion
cu1.compile(Phases.CONVERSION)
println "\nAfter CONVERSION (phase 3):"
cu1.AST.classes.each { cn ->
println " Class: ${cn.name}"
println " Annotations: ${cn.annotations*.classNode*.name}"
println " Properties: ${cn.properties*.name}"
println " Methods (declared): ${cn.methods*.name}"
}
// Phase 4: Semantic Analysis
cu1.compile(Phases.SEMANTIC_ANALYSIS)
println "\nAfter SEMANTIC_ANALYSIS (phase 4):"
cu1.AST.classes.each { cn ->
println " Class: ${cn.name}"
println " Resolved superclass: ${cn.superClass.name}"
cn.methods.each { m ->
println " Method: ${m.returnType.name} ${m.name}(${m.parameters*.type*.name.join(', ')})"
}
}
// Phase 5: Canonicalization (where @ToString runs)
cu1.compile(Phases.CANONICALIZATION)
println "\nAfter CANONICALIZATION (phase 5):"
cu1.AST.classes.each { cn ->
println " Class: ${cn.name}"
println " Methods after AST transforms: ${cn.methods*.name}"
def hasToString = cn.methods.any { it.name == 'toString' }
println " @ToString generated toString(): $hasToString"
}
// Phase 7: Class Generation
cu1.compile(Phases.CLASS_GENERATION)
println "\nAfter CLASS_GENERATION (phase 7):"
cu1.classes.each { cls ->
println " ${cls.name}: ${cls.bytes.length} bytes of JVM bytecode"
}
// Load and test the compiled class
def gcl = new GroovyClassLoader()
def clazz = gcl.parseClass(source)
def emp = clazz.newInstance(name: 'Nirranjan', department: 'Engineering', salary: 125000)
println "\nRunning compiled code:"
println " toString(): ${emp.toString()}"
println " isHighEarner(): ${emp.isHighEarner()}"
println " getSummary(): ${emp.getSummary()}"
gcl.close()
Output
Source code:
----------------------------------------
import groovy.transform.ToString
@ToString(includeNames=true)
class Employee {
String name
String department
double salary
boolean isHighEarner() {
salary > 100000
}
String getSummary() {
"$name ($department) - \$${String.format('%.0f', salary)}"
}
}
----------------------------------------
After PARSING (phase 2):
Modules: 1
Statement blocks parsed: true
After CONVERSION (phase 3):
Class: Employee
Annotations: [ToString]
Properties: [name, department, salary]
Methods (declared): [isHighEarner, getSummary]
After SEMANTIC_ANALYSIS (phase 4):
Class: Employee
Resolved superclass: java.lang.Object
Method: boolean isHighEarner()
Method: java.lang.String getSummary()
After CANONICALIZATION (phase 5):
Class: Employee
Methods after AST transforms: [isHighEarner, getSummary, toString]
@ToString generated toString(): true
After CLASS_GENERATION (phase 7):
Employee: 5832 bytes of JVM bytecode
Running compiled code:
toString(): Employee(name:Nirranjan, department:Engineering, salary:125000.0)
isHighEarner(): true
getSummary(): Nirranjan (Engineering) - $125000
What happened here: This example traces the complete lifecycle of a class with an @ToString AST transform. After PARSING, the module structure exists but types aren’t resolved. After CONVERSION, the class node has properties and methods, and the @ToString annotation is visible but hasn’t been processed. After SEMANTIC_ANALYSIS, all types are resolved. After CANONICALIZATION, the @ToString transform has run and injected a toString() method into the AST – this is the magic of compile-time metaprogramming. After CLASS_GENERATION, we have JVM bytecode that includes the generated method. The final test proves the generated toString() works exactly as specified by @ToString(includeNames=true).
Common Pitfalls
Working with the Groovy compilation lifecycle has some subtle traps. Here are the most common ones.
Pitfall 1: AST Transform Running at the Wrong Phase
If your AST transform runs too early, type information isn’t available yet. If it runs too late, the bytecode generator might have already locked down the AST structure.
Bad: Transform at CONVERSION (too early for types)
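import org.codehaus.groovy.ast.ASTNode
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.SourceUnit
import org.codehaus.groovy.transform.ASTTransformation
import org.codehaus.groovy.transform.GroovyASTTransformation
// BAD - at CONVERSION phase, types are not yet resolved
// (only global transforms can be registered this early)
// method.returnType may still be an unresolved placeholder that carries
// only the name as written in the source
@GroovyASTTransformation(phase = CompilePhase.CONVERSION)
class TooEarlyTransform implements ASTTransformation {
void visit(ASTNode[] nodes, SourceUnit source) {
// classNode.methods[0].returnType.name
// might be unresolved at this point!
}
}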
Good: Transform at SEMANTIC_ANALYSIS or CANONICALIZATION
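import org.codehaus.groovy.ast.ASTNode
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.SourceUnit
import org.codehaus.groovy.transform.ASTTransformation
import org.codehaus.groovy.transform.GroovyASTTransformation
// GOOD - at CANONICALIZATION, types are fully resolved
@GroovyASTTransformation(phase = CompilePhase.CANONICALIZATION)
class CorrectPhaseTransform implements ASTTransformation {
void visit(ASTNode[] nodes, SourceUnit source) {
// All types resolved, safe to inspect and modify the AST
}
}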
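To see the difference between the two phases without writing a transform, the following sketch compiles the same source to CONVERSION and to SEMANTIC_ANALYSIS and prints what the compiler knows about a property's type at each point. The Event class and the typeOfCreatedAt helper are hypothetical, and a fresh CompilationUnit is used per phase as a cautious assumption so that no state carries over between runs.

```groovy
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases

// Hypothetical sample source; Date is deliberately written without a package
def source = '''
class Event {
    Date created
}
'''

// Compile a fresh unit up to the given phase and return the property's type node
def typeOfCreatedAt = { int phase ->
    def cu = new CompilationUnit()
    cu.addSource('Event.groovy', source)
    cu.compile(phase)
    def eventNode = cu.getAST().getClasses().find { it.name == 'Event' }
    eventNode.properties.find { it.name == 'created' }.type
}

def early = typeOfCreatedAt(Phases.CONVERSION)
def late = typeOfCreatedAt(Phases.SEMANTIC_ANALYSIS)

println "CONVERSION:        ${early.name} (resolved: ${early.isResolved()})"
println "SEMANTIC_ANALYSIS: ${late.name} (resolved: ${late.isResolved()})"
```

A transform that compares type names or walks a class hierarchy needs the second form, which is why SEMANTIC_ANALYSIS or later is the safer default.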
Pitfall 2: GroovyClassLoader Memory Leaks
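Every call to parseClass() defines a new class in the JVM. In long-running applications that compile scripts dynamically, those classes pile up: a class can only be garbage-collected once its defining classloader and every reference to the class and its instances are unreachable, so loaders and classes that stay referenced leak memory.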
Bad: Creating a new GroovyClassLoader per compilation
// BAD - leaks classloaders over time
def compileScript(String source) {
def gcl = new GroovyClassLoader() // New classloader every time!
return gcl.parseClass(source)
// gcl is never closed, class can't be GC'd
}
Good: Reuse GroovyClassLoader or close it properly
// GOOD - reuse one classloader, or close after use
def gcl = new GroovyClassLoader()
try {
def clazz1 = gcl.parseClass(source1)
def clazz2 = gcl.parseClass(source2)
// Use the classes...
} finally {
gcl.close() // Clean up when done
}
// For hot-reload: clear the class cache
gcl.clearCache()
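One way to apply the "reuse one classloader" advice is to compile each distinct source text once and hand back the same Class afterwards. The sketch below is one possible shape for that, not a library API: CompiledScriptCache and its methods are hypothetical names, and keying the cache by the full source text is an assumption that suits small scripts.

```groovy
import java.util.concurrent.ConcurrentHashMap

// Hypothetical helper: one shared loader, one compiled Class per distinct source text
class CompiledScriptCache implements Closeable {
    private final GroovyClassLoader loader = new GroovyClassLoader()
    private final Map<String, Class> cache = new ConcurrentHashMap<>()

    Class compile(String source) {
        // Compile only when this exact source text has not been seen before
        cache.computeIfAbsent(source) { String src -> loader.parseClass(src) }
    }

    int size() { cache.size() }

    @Override
    void close() {
        cache.clear()        // drop our own references to the classes
        loader.clearCache()  // drop the loader's cached classes
        loader.close()
    }
}

def scripts = new CompiledScriptCache()
def first = scripts.compile('class Greeter { String hi() { "hi" } }')
def second = scripts.compile('class Greeter { String hi() { "hi" } }')
println "Same class reused: ${first.is(second)}"
println "Cached entries: ${scripts.size()}"
scripts.close()
```

For an unbounded stream of distinct scripts the map itself would grow without limit, so a real deployment would likely add an eviction policy.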
Pitfall 3: Script Names and Class Name Collisions
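GroovyClassLoader keeps caches of compiled sources and classes, and whether a particular parseClass() call consults them depends on the overload you use and on whether the GroovyCodeSource is marked cachable. If you rely on that behavior implicitly, you can end up either with a stale cached class after the source has changed or, as in the example below, with two distinct Class objects that share the same name.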
Bad: Relying on implicit caching behavior
// BAD - if you change the source, caching can bite you
def gcl = new GroovyClassLoader()
def v1 = gcl.parseClass('class Foo { int x = 1 }')
def v2 = gcl.parseClass('class Foo { int x = 2 }') // Different class loaded!
// v1 and v2 are different classes with the same name - confusing
Good: Use unique names or clear cache between versions
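// GOOD - make the reload explicit, or use unique class names
def gcl = new GroovyClassLoader()
def v1 = gcl.parseClass('class Foo { int x = 1 }')
gcl.clearCache() // discard the old Foo from the loader's caches before reloading
def fresh = gcl.parseClass('class Foo { int x = 2 }')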
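The collision is easier to reason about once you look at the two Class objects directly. This sketch repeats the Foo example from above and checks identity rather than names. It assumes the String overload of parseClass() compiles the source afresh on each call, which is the behavior the example above relies on.

```groovy
def gcl = new GroovyClassLoader()
Class v1 = gcl.parseClass('class Foo { int x = 1 }')
Class v2 = gcl.parseClass('class Foo { int x = 2 }')

// Same fully qualified name, but the JVM treats them as unrelated types
println "Same name:         ${v1.name == v2.name}"
println "Same Class object: ${v1.is(v2)}"

// An instance of the newer Foo is not an instance of the older Foo
def newer = v2.getDeclaredConstructor().newInstance()
println "v2 instance is a v1 Foo: ${v1.isInstance(newer)}"

gcl.close()
```

This is why a value created from one version of a reloaded class cannot be cast to the other version, even though the names match.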
Conclusion
The Groovy compilation lifecycle is a 9-phase pipeline that transforms source text into JVM bytecode: Initialization, Parsing, Conversion, Semantic Analysis, Canonicalization, Instruction Selection, Class Generation, Output, and Finalization. Each phase has a specific job – parsing turns source text into a token tree, conversion builds the AST with proper class nodes, semantic analysis resolves types, canonicalization runs most AST transforms, and class generation produces bytecode. Understanding these phases is essential for anyone writing AST transforms, debugging compilation errors, or building tools that process Groovy code.
The key classes are CompilationUnit (the compiler pipeline), CompilerConfiguration (compiler settings), GroovyClassLoader (compile-and-load in one step), and CompilePhase (the enum that names the phases). Scripts compile into classes that extend groovy.lang.Script, and joint compilation uses stub generation to let Java and Groovy reference each other. All of this is public API – you can instantiate a CompilationUnit, feed it source code, compile phase by phase, and inspect the AST at every step.
For hands-on AST manipulation, see our Groovy AST Transformations guide. To understand how @Grab hooks into the compilation process to download dependencies, check Groovy @Grab Dependency Management. And for the difference between scripts and classes in practical terms, our next post on Groovy Script vs Class covers the runtime behavior side of what we explored here at the compiler level.
Best Practices
- DO use CANONICALIZATION (phase 5) for AST transforms that generate code – types are resolved but bytecode hasn’t been generated yet.
- DO close GroovyClassLoader instances when you’re done to avoid classloader memory leaks.
- DO use CompilerConfiguration to set encoding, target bytecode, and custom script base classes.
- DO compile phase by phase when debugging AST transforms – inspect the AST at each step to find where things go wrong.
- DO use joint compilation (Gradle Groovy plugin) when your project mixes Java and Groovy sources.
- DON’T write AST transforms at CONVERSION phase unless you explicitly don’t need type information.
- DON’T call parseClass() in a loop without clearing the cache or closing the classloader.
- DON’T assume scripts and classes compile the same way – scripts get a wrapper class extending groovy.lang.Script.
- DON’T fight the phase ordering – if your transform needs resolved types, it must run at SEMANTIC_ANALYSIS or later.
Up next: Groovy Script vs Class – Runtime Behavior
Frequently Asked Questions
What are the 9 Groovy compilation phases?
The 9 phases are: Initialization (source registration), Parsing (tokenization and syntax tree creation), Conversion (AST node creation), Semantic Analysis (type resolution and scope checking), Canonicalization (AST cleanup and most AST transforms), Instruction Selection (choosing the target instruction set; @CompileStatic runs here), Class Generation (AST to bytecode), Output (writing .class files), and Finalization (cleanup). Phases run in order from 1 to 9. Global AST transforms can be registered for any phase, while local (annotation-driven) transforms can only run from Semantic Analysis onward.
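The phase names and numbers can also be read straight from the compiler's own types. A small sketch that lists them from the CompilePhase enum and checks that the int constants in Phases use the same numbering:

```groovy
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.Phases

// Print every phase with its number, in compiler order
CompilePhase.values().each { phase ->
    println "${phase.phaseNumber}. ${phase.name()}"
}

// The int constants (used by CompilationUnit.compile) and the enum
// (used by @GroovyASTTransformation) describe the same numbering
assert CompilePhase.fromPhaseNumber(Phases.CANONICALIZATION) == CompilePhase.CANONICALIZATION
```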
What is the difference between GroovyClassLoader and CompilationUnit?
CompilationUnit is the raw compilation pipeline – it takes source code, runs the 9 compilation phases, and produces bytecode as byte arrays. GroovyClassLoader wraps CompilationUnit and adds class loading – it compiles source code and immediately defines the resulting class in the JVM so you can instantiate it. Use CompilationUnit when you need fine-grained control (phase-by-phase compilation, AST inspection). Use GroovyClassLoader when you just want to compile and use a class.
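The contrast is easiest to see with the same source going through both routes. In this sketch the Point class is a hypothetical example; the first half stops at bytecode as a byte array, the second half ends with a usable instance.

```groovy
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases

// Hypothetical sample class used by both approaches
def source = 'class Point { int x; int y; int sum() { x + y } }'

// 1) CompilationUnit: stop after bytecode generation and take the raw bytes
def cu = new CompilationUnit()
cu.addSource('Point.groovy', source)
cu.compile(Phases.CLASS_GENERATION)
def generated = cu.classes.find { it.name == 'Point' }
println "CompilationUnit produced ${generated.bytes.length} bytes for ${generated.name}"

// 2) GroovyClassLoader: compile and define the class in one call
def gcl = new GroovyClassLoader()
Class pointClass = gcl.parseClass(source)
def point = pointClass.newInstance(x: 2, y: 3)
println "GroovyClassLoader produced ${pointClass.name}, sum() = ${point.sum()}"
gcl.close()
```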
How does Groovy compile scripts differently from classes?
When Groovy compiles a file with no class declaration (a script), it generates a class that extends groovy.lang.Script. The script body becomes the run() method, and a static void main(String[]) method is generated automatically. Functions defined in the script become methods on this generated class. When a file contains both loose code and class declarations, the compiler generates the explicit classes plus a Script wrapper for the loose code.
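You can check the shape of the generated script class with a little reflection. This is a sketch under the assumption that the source below is compiled as a plain script; the greet function is a hypothetical name.

```groovy
def gcl = new GroovyClassLoader()

// Hypothetical script source: one function plus one loose statement
Class scriptClass = gcl.parseClass('''
def greet(name) { "Hello, " + name }
greet("compiler")
''')

def methodNames = scriptClass.declaredMethods*.name
println "Extends groovy.lang.Script: ${Script.isAssignableFrom(scriptClass)}"
println "Declares run():   ${'run' in methodNames}"
println "Declares greet(): ${'greet' in methodNames}"
println "Declares main():  ${'main' in methodNames}"

// The loose statement lives in run(); its last expression is the return value
def result = scriptClass.getDeclaredConstructor().newInstance().run()
println "run() returned: $result"

gcl.close()
```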
What is joint Java-Groovy compilation and why is it needed?
Joint compilation solves circular dependencies between Java and Groovy source files. Without it, javac can’t compile Java files that reference Groovy classes because those classes don’t exist yet. The solution: the Groovy compiler generates Java ‘stubs’ (skeleton source files with method signatures but no implementations), javac compiles against these stubs, then the Groovy compiler produces the real bytecode that replaces the stub output. In Gradle, enable this by putting all sources under src/main/groovy and letting the Groovy plugin compile everything.
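If moving Java files into src/main/groovy is not practical, a commonly used alternative is to point the Groovy source set at the Java directory as well, so that one compiler task sees everything. The fragment below is a sketch of that layout for a build.gradle file; repositories and the Groovy dependency declaration are left out because their coordinates depend on your project.

```groovy
// build.gradle (fragment) - assumes the standard src/main/java and src/main/groovy layout
plugins {
    id 'groovy'
}

sourceSets {
    main {
        // Hand the Java sources to the Groovy compiler instead of compileJava
        java { srcDirs = [] }
        groovy { srcDirs += ['src/main/java'] }
    }
}
```

With this in place the compileGroovy task performs the stub generation described above, and compileJava has nothing left to compile for the main source set.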
At which compilation phase should my AST transform run?
It depends on what your transform needs. CONVERSION (phase 3): for global transforms that don’t need type information. SEMANTIC_ANALYSIS (phase 4): for transforms that need resolved types but want to run before other transforms; this is also the earliest phase available to local, annotation-driven transforms. CANONICALIZATION (phase 5): the most common choice – types are resolved, and this is where @ToString, @EqualsAndHashCode, and most code-generation transforms run. INSTRUCTION_SELECTION (phase 6): for type-checking transforms like @CompileStatic. CLASS_GENERATION (phase 7): for transforms that must see the AST in its final form just before bytecode is emitted, which is rare.
Related Posts
Previous in Series: Groovy @Grab – Grape Dependency Management
Next in Series: Groovy Script vs Class – Runtime Behavior
Related Topics You Might Like:
- Groovy AST Transformations – Compile-Time Metaprogramming
- Groovy @CompileStatic – Type Checking and Performance
- Groovy Metaprogramming – Runtime and Compile-Time
This post is part of the Groovy & Grails Cookbook series on TechnoScripts.com
