2026/03/22
Understanding Kotlin Compilation: A Practical Internal Tour
A walkthrough of lexing, parsing, FIR, IR, output generation, and compiler plugins.
Sometimes compilation can feel mysterious. You write some Kotlin code like this:
fun greet(name: String) {
    println("Hello, $name")
}
and somehow it turns into something the machine can run.
This post is about that journey.
Start With The Big Picture
Kotlin compilation is not one giant step. It is a pipeline where each phase has one job.
Compilation Pipeline
Turn text into tokens
The compiler splits raw source code into small pieces such as keywords, names, operators, and punctuation.
Build structure
Those tokens are organized into a syntax tree so the compiler can understand the shape of the program.
Understand meaning
Kotlin resolves names, checks types, infers missing information, and reports many errors here.
Prepare transformations
The program is converted into a lower-level form that is easier for compiler passes and backends to transform.
Generate output
For the JVM target, Kotlin emits class files that the Java Virtual Machine can execute.
At a beginner level, you can think about the pipeline like this:
- Read the text.
- Understand the structure.
- Understand the meaning.
- Transform the program into a more compiler-friendly form.
- Generate the final output.
First Phases: Lexical Analysis And Parsing
The earliest compiler phases are not specific to Kotlin. Most compilers do something like this.
Lexical analysis
Lexical analysis, often called lexing, takes raw text and breaks it into tokens.
For example, the compiler can split this:
fun greet(name: String)
into pieces such as:
fun | greet | ( | name | : | String | )
At this point, the compiler has not understood the whole program yet. It only knows the building blocks.
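To make the lexing step concrete, here is a minimal sketch of a toy tokenizer. It is not the Kotlin compiler's lexer: the `Token` and `TokenKind` types, the `tokenize` function, and the tiny keyword set are all assumptions made up for this illustration.

```kotlin
// Toy lexer for illustration only; the real Kotlin lexer is far more detailed.
enum class TokenKind { KEYWORD, IDENTIFIER, PUNCTUATION }

data class Token(val kind: TokenKind, val text: String)

// Hypothetical, deliberately tiny keyword set.
private val keywords = setOf("fun", "val", "var", "return")

fun tokenize(source: String): List<Token> {
    val tokens = mutableListOf<Token>()
    var i = 0
    while (i < source.length) {
        val c = source[i]
        when {
            // Whitespace only separates tokens; it produces none.
            c.isWhitespace() -> i++
            // A word: read until the first character that cannot be part of a name.
            c.isLetter() || c == '_' -> {
                val start = i
                while (i < source.length && (source[i].isLetterOrDigit() || source[i] == '_')) i++
                val word = source.substring(start, i)
                val kind = if (word in keywords) TokenKind.KEYWORD else TokenKind.IDENTIFIER
                tokens.add(Token(kind, word))
            }
            // Anything else is treated as a single-character punctuation token.
            else -> {
                tokens.add(Token(TokenKind.PUNCTUATION, c.toString()))
                i++
            }
        }
    }
    return tokens
}

fun main() {
    val tokens = tokenize("fun greet(name: String)")
    println(tokens.joinToString(" | ") { it.text })
}
```

Note that the tokenizer classifies `fun` as a keyword and `greet` as an identifier without knowing what a function is. That knowledge only arrives in later phases.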
Parsing
Parsing takes those tokens and organizes them into a structure.
In Kotlin tooling, one of the first structured forms built from the source is the PSI.
PSI means Program Structure Interface.
You can think of PSI as a tree representation of the source file that is very close to the code you wrote. It keeps the code organized in a way that tools and the compiler frontend can navigate.
Now the compiler can say things like:
- this is a function declaration
- the function is named greet
- it has one parameter called name
- that parameter has type String
This structure is represented as a tree.
Another term you will often hear here is AST, which means Abstract Syntax Tree.
The easiest way to understand the difference is this:
- PSI is a source-oriented tree used heavily by the Kotlin tooling and frontend infrastructure
- AST is the more general compiler concept of a syntax tree that represents the program structure
At a beginner level, it is enough to think of both as "tree-shaped representations of your code", with PSI being the concrete structure Kotlin tooling works with early on.
[Figure: PSI / AST Visual. Two panels: Source Code (the greet function shown earlier) and Parsed Tree.]
So the transformation looks roughly like this:
- Raw text
- Tokens from lexing
- PSI / syntax tree from parsing
After that, the compiler can move from just understanding structure to understanding meaning.
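The raw text, tokens, tree progression can be sketched with a tiny recursive-descent parser. This is an illustration under stated assumptions: `FunctionNode`, `ParameterNode`, `lex`, and `Parser` are hypothetical names, not real PSI classes, and the grammar covers only a function header.

```kotlin
// Toy syntax tree; the node names are hypothetical and are not real PSI classes.
data class ParameterNode(val name: String, val type: String)
data class FunctionNode(val name: String, val parameters: List<ParameterNode>)

// Splits source into word tokens and single-character punctuation tokens.
fun lex(source: String): List<String> =
    Regex("""\w+|[^\s\w]""").findAll(source).map { it.value }.toList()

class Parser(private val tokens: List<String>) {
    private var pos = 0

    // Consumes and returns the next token, failing if the input has run out.
    private fun next(): String =
        tokens.getOrNull(pos++) ?: error("Unexpected end of input")

    // Consumes the next token and verifies that it is the expected one.
    private fun expect(text: String) {
        val token = next()
        check(token == text) { "Expected '$text' but found '$token'" }
    }

    // Grammar: "fun" NAME "(" [ NAME ":" TYPE { "," NAME ":" TYPE } ] ")"
    fun parseFunctionHeader(): FunctionNode {
        expect("fun")
        val name = next()
        expect("(")
        val parameters = mutableListOf<ParameterNode>()
        while (tokens.getOrNull(pos) != ")") {
            if (parameters.isNotEmpty()) expect(",")
            val paramName = next()
            expect(":")
            parameters.add(ParameterNode(paramName, next()))
        }
        expect(")")
        return FunctionNode(name, parameters)
    }
}

fun main() {
    val tree = Parser(lex("fun greet(name: String)")).parseFunctionHeader()
    println(tree)
}
```

The result is a structured value rather than a flat list: the parameter belongs to the function, which is exactly the kind of relationship a syntax tree records.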
See One Small Function Move Through The Pipeline
This is the easiest way to build intuition:
Code Walkthrough
Source Code
fun greet(name: String) {
    println("Hello, $name")
}
Lexing
Transformation
fun | greet | ( | name | : | String | )
The compiler reads characters and groups them into tokens. It still does not know the program meaning, only the pieces that exist.
Parsing
Transformation
KtFile
  KtNamedFunction name=greet
    KtParameterList
      KtParameter name=name typeReference=String
    KtBlockExpression
      KtCallExpression callee=println
        KtValueArgumentList
          KtStringTemplateExpression
            KtLiteralStringTemplateEntry("Hello, ")
            KtSimpleNameStringTemplateEntry(name)
Now the compiler knows there is a function, it has one parameter, and its body contains a single call to println. The structure is clear, but names and types have not been resolved yet.
FIR
Transformation
FirSimpleFunction name=greet returnType=Unit
  FirValueParameter name=name type=String
  FirBlock
    FirFunctionCall callee=println returnType=Unit
      argument[0]: FirStringConcatenation type=String
        FirConstExpression("Hello, ")
        FirQualifiedAccessExpression(name) type=String
FIR keeps a tree shape but enriches it with resolved symbols and inferred/checked types.
IR
Transformation
IrFunction name=greet returnType=kotlin.Unit
  IrValueParameter name=name type=kotlin.String
  IrBlockBody
    IrCall symbol=println returnType=kotlin.Unit
      valueArgument[0]: IrStringConcatenation type=kotlin.String
        IrConst("Hello, ")
        IrGetValue(name) type=kotlin.String
IR represents a lowered, backend-friendly form that compiler passes can transform before bytecode generation.
Bytecode
Transformation
public static final void greet(java.lang.String);
   0: aload_0
   1: ldc           #10  // String name
   3: invokestatic  #16  // Method kotlin/jvm/internal/Intrinsics.checkNotNullParameter:(Ljava/lang/Object;Ljava/lang/String;)V
   6: getstatic     #22  // Field java/lang/System.out:Ljava/io/PrintStream;
   9: new           #24  // class java/lang/StringBuilder
  12: dup
  13: invokespecial #27  // Method java/lang/StringBuilder.<init>:()V
  16: ldc           #29  // String Hello, 
  18: invokevirtual #33  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
  21: aload_0
  22: invokevirtual #33  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
  25: invokevirtual #37  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
  28: invokevirtual #43  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
  31: return
The backend emits JVM instructions in the generated .class file, including parameter null-checks and the println call sequence.
What FIR Actually Is
FIR means Front-end Intermediate Representation.
That name sounds intimidating, but the core idea is simple:
- Kotlin has already read the code
- Kotlin has already built structure from it
- Now Kotlin wants to understand what the program means
Inside FIR, the compiler answers questions like:
- what does this name refer to?
- what type does this expression have?
- is this function call valid?
- is there a type mismatch here?
So if someone asks, "what is FIR?", an answer could be:
FIR is the compiler's internal model for understanding your Kotlin program at the semantic level.
But FIR is not only an "idea" of meaning. It is also a concrete tree-like structure.
For this code:
fun greet(name: String) {
    println("Hello, $name")
}
you can imagine a simplified FIR shape like:
FirSimpleFunction name=greet returnType=Unit
  FirValueParameter name=name type=String
  FirBlock
    FirFunctionCall callee=println returnType=Unit
      argument[0]: FirStringConcatenation type=String
        FirConstExpression("Hello, ")
        FirQualifiedAccessExpression(name) type=String
The key point is: FIR nodes carry semantic info.
- each declaration has resolved symbols
- each expression has a computed type
- each call can be checked against argument and parameter types
This is where many diagnostics come from. For example:
val x: Int = "hello"
FIR is where the compiler can understand that the right side is a String, the left side expects an Int, and therefore a type error should be reported.
So FIR is both:
- A tree representation of your program
- A semantically enriched tree used for validation and diagnostics
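The questions FIR answers (what a name refers to, what type an expression has, whether there is a mismatch) can be sketched with a toy checker. This is only an illustration: `Expr`, `ValDeclaration`, and `Checker` are hypothetical types, not real FIR classes, and types are modeled as plain strings.

```kotlin
// Toy semantic model; hypothetical types, not the real FIR classes.
sealed interface Expr
data class StringLiteral(val value: String) : Expr
data class IntLiteral(val value: Int) : Expr
data class NameRef(val name: String) : Expr

// Models a declaration such as: val x: Int = "hello"
data class ValDeclaration(val name: String, val declaredType: String, val initializer: Expr)

class Checker {
    private val scope = mutableMapOf<String, String>() // name -> resolved type
    val diagnostics = mutableListOf<String>()

    // Answers "what type does this expression have?", resolving names via the scope.
    private fun typeOf(expr: Expr): String? = when (expr) {
        is StringLiteral -> "String"
        is IntLiteral -> "Int"
        is NameRef -> scope[expr.name] ?: run {
            diagnostics.add("Unresolved reference: ${expr.name}")
            null
        }
    }

    // Checks one declaration, records any diagnostic, then makes the name visible.
    fun analyze(declaration: ValDeclaration) {
        val actual = typeOf(declaration.initializer)
        if (actual != null && actual != declaration.declaredType) {
            diagnostics.add(
                "Type mismatch in '${declaration.name}': expected ${declaration.declaredType}, found $actual"
            )
        }
        scope[declaration.name] = declaration.declaredType
    }
}

fun main() {
    val checker = Checker()
    checker.analyze(ValDeclaration("x", "Int", StringLiteral("hello"))) // val x: Int = "hello"
    checker.analyze(ValDeclaration("y", "Int", NameRef("x")))           // val y: Int = x
    checker.diagnostics.forEach { println(it) }
}
```

Only the first declaration produces a diagnostic. The second one resolves `x` through the scope, finds its type to be Int, and passes.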
What IR Actually Is
IR means Intermediate Representation.
IR comes after the compiler already understands the meaning of the code.
The goal of IR is different from FIR:
- FIR is mainly about understanding and validating the program
- IR is mainly about transforming the program into a lower-level, more uniform representation
Why does this help?
Because compilers often need a form that is easier to rewrite, optimize, and send to different backends.
So if someone asks, "what is IR?", a good answer could be:
IR is the compiler's internal model for transforming a valid program before generating final output.
And like FIR, IR is also a tree structure with concrete nodes.
A simplified IR shape for the same greet function might look like:
IrFunction name=greet returnType=kotlin.Unit
  IrValueParameter name=name type=kotlin.String
  IrBlockBody
    IrCall symbol=println
      valueArgument[0]: IrStringConcatenation
        IrConst("Hello, ")
        IrGetValue(name)
Then later compiler passes can lower or rewrite that tree into forms that are easier for backends.
For example, a lowering pass might conceptually turn string concatenation into a sequence of lower-level operations:
IrCall println(
  IrCall StringBuilder.append("Hello, ")
  IrCall StringBuilder.append(name)
  IrCall StringBuilder.toString()
)
Exact shapes vary by pass and Kotlin version, but the idea is stable: IR is a transform-friendly tree.
In practice, IR is where the compiler can do many lowering and transformation steps before generating JVM bytecode.
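A lowering pass of the kind described above can be sketched as a function from tree to tree. The node types here (`Const`, `GetValue`, `StringConcat`, `Call`) and the `lower` and `render` functions are assumptions for illustration; they are not the real Kotlin IR classes or a real compiler pass.

```kotlin
// Toy IR; hypothetical node types, not the real Kotlin IR classes.
sealed interface IrNode
data class Const(val value: String) : IrNode
data class GetValue(val name: String) : IrNode
data class StringConcat(val parts: List<IrNode>) : IrNode
data class Call(val function: String, val receiver: IrNode?, val args: List<IrNode>) : IrNode

// Lowering pass: rewrites every StringConcat into chained StringBuilder calls.
fun lower(node: IrNode): IrNode = when (node) {
    is Const, is GetValue -> node
    is Call -> Call(node.function, node.receiver?.let { lower(it) }, node.args.map { lower(it) })
    is StringConcat -> {
        val builder: IrNode = Call("StringBuilder", null, emptyList())
        // Each part becomes one append call whose receiver is the previous call.
        val appended = node.parts.fold(builder) { receiver, part ->
            Call("append", receiver, listOf(lower(part)))
        }
        Call("toString", appended, emptyList())
    }
}

// Renders the tree as source-like text so the rewrite is easy to read.
fun render(node: IrNode): String = when (node) {
    is Const -> "\"${node.value}\""
    is GetValue -> node.name
    is StringConcat -> node.parts.joinToString(" + ") { render(it) }
    is Call -> {
        val args = node.args.joinToString(", ") { render(it) }
        val receiver = node.receiver
        if (receiver == null) "${node.function}($args)" else "${render(receiver)}.${node.function}($args)"
    }
}

fun main() {
    val before = Call("println", null, listOf(StringConcat(listOf(Const("Hello, "), GetValue("name")))))
    println(render(before))
    println(render(lower(before)))
}
```

The pass never looks at source text. It pattern-matches on node types and returns a new tree, which is what makes a uniform IR convenient to rewrite.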
What Happens At The End
Once the program has gone through those analysis and transformation stages, the backend can generate the final output.
For Kotlin, this is where multiplatform support becomes important: the compiler can target different runtimes.
Common targets include:
- JVM: emits .class bytecode executed by the Java Virtual Machine.
- JavaScript: emits JavaScript output for browser or Node.js environments.
- Native: emits native binaries for platforms like iOS, macOS, Linux, and Windows.
- Wasm: emits WebAssembly for modern web and runtime scenarios.
In this post, most concrete output examples are shown with a JVM-oriented lens (for example FIR/IR-to-bytecode style illustrations), because JVM output is the most familiar reference point for many Kotlin developers.
In Kotlin Multiplatform projects, shared code can pass through a common frontend pipeline, and then each backend emits target-specific output for its platform.
What Compiler Plugins Are
Compiler plugins are pieces of code that extend what the compiler can do.
They are not regular application code. They participate in the compilation process itself.
You can think of them as extra tools that can:
- Inspect code during compilation
- Report custom diagnostics
- Generate extra code
- Transform internal compiler representations
They can join the compilation in different places depending on what they need.
Where Plugins Join
- Before deeper meaning is finished (frontend / FIR-side extensions): useful when a plugin wants to inspect declarations, validate rules, or produce diagnostics while the compiler is still understanding code.
- When Kotlin is moving into IR (FIR -> IR bridge): useful when a plugin needs to generate extra program pieces before the lower-level transformation stage begins.
- After the program is already in IR (IR transformations): useful when a plugin wants to rewrite behavior, inject calls, or reshape code before final output generation.
Examples of what plugins might do:
- Validate rules in a codebase
- Generate extra declarations from annotations
- Transform IR before final output generation
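The idea of plugins joining a compilation can be sketched with a toy model. Everything here is hypothetical (`ToyPlugin`, `ToyCompiler`, `FunctionDecl`, the `Logged` annotation name); real plugins use the Kotlin compiler's own extension points, which this sketch does not attempt to reproduce.

```kotlin
// Toy model of a compiler with plugin hooks; all names are hypothetical.
data class FunctionDecl(val name: String, val annotations: List<String> = emptyList())

interface ToyPlugin {
    // Frontend-style hook: inspect a declaration and report diagnostics.
    fun inspect(declaration: FunctionDecl): List<String> = emptyList()

    // Generation-style hook: return extra declarations to add to the program.
    fun generate(declaration: FunctionDecl): List<FunctionDecl> = emptyList()
}

// Validates a codebase rule: function names start with a lowercase letter.
object NamingRulePlugin : ToyPlugin {
    override fun inspect(declaration: FunctionDecl): List<String> =
        if (declaration.name.first().isUpperCase())
            listOf("Function '${declaration.name}' should start with a lowercase letter")
        else emptyList()
}

// Generates an extra declaration for every function annotated with "Logged".
object LoggedPlugin : ToyPlugin {
    override fun generate(declaration: FunctionDecl): List<FunctionDecl> =
        if ("Logged" in declaration.annotations) listOf(FunctionDecl("${declaration.name}WithLogging"))
        else emptyList()
}

class ToyCompiler(private val plugins: List<ToyPlugin>) {
    // Returns the collected diagnostics and the program plus generated declarations.
    fun compile(program: List<FunctionDecl>): Pair<List<String>, List<FunctionDecl>> {
        val diagnostics = program.flatMap { decl -> plugins.flatMap { it.inspect(decl) } }
        val generated = program.flatMap { decl -> plugins.flatMap { it.generate(decl) } }
        return diagnostics to (program + generated)
    }
}

fun main() {
    val compiler = ToyCompiler(listOf(NamingRulePlugin, LoggedPlugin))
    val (diagnostics, output) = compiler.compile(
        listOf(FunctionDecl("Greet"), FunctionDecl("save", listOf("Logged")))
    )
    diagnostics.forEach { println(it) }
    println(output.map { it.name })
}
```

The two hooks mirror the split in the text: one plugin only reports, the other only adds declarations, and the compiler decides when each hook runs.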
K1 vs K2 (Quick Context)
You will often see two names in Kotlin compiler discussions:
- K1: the older compiler frontend architecture.
- K2: the newer frontend architecture built around FIR, designed for better consistency, performance, and future evolution.
In practical terms, K2 makes the frontend pipeline more unified, which helps diagnostics quality and long-term compiler extensibility.
This post explains the K2 compilation pipeline and terminology.
Real-World Examples
The compiler pipeline becomes easier to trust when you look at real features that rely on internal rewrites.
Suspend functions (language feature)
At source level, you write something simple:
suspend fun loadUser(id: String): User {
    val profile = api.fetchProfile(id)
    return User(profile)
}
Internally, Kotlin lowers this into a state-machine style form (conceptually):
- An extra continuation parameter is introduced
- Local state is stored across suspension points
- Execution resumes by jumping to the correct state label
That transformation is what lets suspend code look sequential while running asynchronously.
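The state-machine idea can be written out by hand to see how it works. This sketch is not the code the compiler generates: `LoadUserStateMachine`, the callback-based `fetchProfile` stand-in, and the `completion` callback are assumptions chosen so the example runs without any coroutine library.

```kotlin
data class User(val profile: String)

// Hand-written illustration; the compiler's generated code looks different.
class LoadUserStateMachine(
    private val id: String, // local state kept across the suspension point
    private val fetchProfile: (String, (String) -> Unit) -> Unit,
    private val completion: (User) -> Unit // plays the role of the extra continuation parameter
) {
    private var label = 0 // which part of the original function body runs next

    fun resume(result: String?) {
        when (label) {
            0 -> {
                label = 1
                // Suspension point: start the call and ask to be resumed with its result.
                fetchProfile(id, { profile -> resume(profile) })
            }
            1 -> {
                label = 2
                // Code after the suspension point: val profile = ...; return User(profile)
                val profile = checkNotNull(result)
                completion(User(profile))
            }
            else -> error("Already completed")
        }
    }
}

fun main() {
    // Fake asynchronous API: results are delivered later, not during the call.
    val pendingCallbacks = mutableListOf<() -> Unit>()
    val fakeFetch: (String, (String) -> Unit) -> Unit = { id, callback ->
        pendingCallbacks.add { callback("profile-of-$id") }
    }

    var loaded: User? = null
    val machine = LoadUserStateMachine("42", fakeFetch, { user -> loaded = user })
    machine.resume(null)
    println(loaded) // nothing loaded yet: the function is "suspended" at label 1

    pendingCallbacks.forEach { it() } // a stand-in event loop delivers the result
    println(loaded)
}
```

The first `resume` call returns before any user exists, and the second one picks up at the stored label. That is the sense in which suspend code looks sequential but does not block while waiting.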
Jetpack Compose (compiler plugin)
At source level, UI looks declarative:
@Composable
fun Greeting(name: String) {
    Text("Hello $name")
}
The Compose compiler plugin rewrites this so the runtime can track recomposition efficiently (conceptually):
- Hidden composition parameters are threaded through calls
- Stability/change tracking data is propagated
- Function bodies are split so only invalidated parts re-run
So Compose is a strong real-world example of plugin-driven IR transformations that directly affect runtime behavior and performance.
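The hidden-parameter idea can be illustrated with a very rough sketch. `ToyComposer` and the rewritten signatures below are hypothetical stand-ins, not the Compose runtime or the plugin's actual output. The sketch only shows the general shape: an extra parameter is threaded through calls, and the body is skipped when inputs are unchanged.

```kotlin
// Conceptual sketch only; not the Compose runtime or the plugin's real output.
class ToyComposer {
    private val lastArguments = mutableMapOf<String, Any?>()
    val log = mutableListOf<String>()

    // Returns true when this call site already ran with an equal argument.
    fun canSkip(callSite: String, argument: Any?): Boolean {
        val seenBefore = callSite in lastArguments
        val unchanged = seenBefore && lastArguments[callSite] == argument
        lastArguments[callSite] = argument
        return unchanged
    }
}

// Stand-in for a UI primitive: it just records that it was emitted.
fun Text(value: String, composer: ToyComposer) {
    composer.log.add("Text($value)")
}

// Source form:   @Composable fun Greeting(name: String) { Text("Hello $name") }
// Sketched form: a hidden composer parameter is added and passed along,
// and the body is skipped when the input did not change.
fun Greeting(name: String, composer: ToyComposer) {
    if (composer.canSkip("Greeting", name)) return
    Text("Hello $name", composer)
}

fun main() {
    val composer = ToyComposer()
    Greeting("Ada", composer)   // first composition: the body runs
    Greeting("Ada", composer)   // same input: the body is skipped
    Greeting("Grace", composer) // changed input: the body runs again
    println(composer.log)
}
```

The caller never writes the `composer` argument in real Compose code. In this sketch it is explicit only because there is no plugin to add it.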
One more common plugin example: kotlinx.serialization
You write:
@Serializable
data class User(val id: String, val age: Int)
The serialization plugin generates serializer machinery during compilation, so encoding/decoding can work without handwritten boilerplate.
This is another case where compiler plugins do meaningful internal code generation/transformation before final backend output.
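To get a feel for the boilerplate being avoided, here is a hand-written toy serializer for the same class. `ToySerializer` and `UserToySerializer` are hypothetical and are not the kotlinx.serialization API. They only illustrate the kind of per-class code that can be derived mechanically from the class shape.

```kotlin
data class User(val id: String, val age: Int)

// Hypothetical interface for illustration; not the kotlinx.serialization API.
interface ToySerializer<T> {
    fun encode(value: T): Map<String, Any>
    fun decode(fields: Map<String, Any>): T
}

// Per-class boilerplate: one entry per property, in both directions.
object UserToySerializer : ToySerializer<User> {
    override fun encode(value: User): Map<String, Any> =
        mapOf<String, Any>("id" to value.id, "age" to value.age)

    override fun decode(fields: Map<String, Any>): User =
        User(id = fields.getValue("id") as String, age = fields.getValue("age") as Int)
}

fun main() {
    val encoded = UserToySerializer.encode(User("u1", 30))
    println(encoded)
    println(UserToySerializer.decode(encoded))
}
```

Every line of `UserToySerializer` follows directly from the property list of `User`, which is why this kind of code is a natural fit for generation at compile time.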
Conclusion
Kotlin compilation is easier to reason about when you stop thinking of it as one opaque step.
The compiler reads text, builds structure, enriches that structure with meaning, transforms it into backend-friendly representations, and finally emits output for a specific target.
With K2, that pipeline becomes especially useful to understand because FIR and the newer frontend architecture make the flow more consistent and easier to extend over time.
You do not need to memorize every internal node type to benefit from this model. What matters is understanding the roles of PSI, FIR, IR, and the backend, and recognizing that many Kotlin features and plugins work by rewriting code internally before final output is generated.
Once that clicks, features like suspend, Jetpack Compose, and serialization stop feeling magical and start feeling like well-structured compiler transformations.