feat: lambda expression support (invokedynamic) #3

Merged
Eatgrapes merged 12 commits into Eatgrapes:master from Neamyoo-dev:feat/lambda-invokedynamic on May 24, 2026

@Neamyoo-dev commented on May 24, 2026

Summary

Implements full compiler support for Java lambda expressions, including CST parsing, HIR lowering, and invokedynamic bytecode generation.

In fact, I had simply forgotten to commit this.

Changes

  • CST Parser: adds LambdaExpr and LambdaParam nodes, supporting both the () -> body and ident -> body forms
  • HIR Lowering: adds lowering for Expr::Lambda { params, body }
  • Classfile Writer: adds the visit_invokedynamic_insn method
  • Bytecode Gen: scan_and_gen_lambdas scans method bodies and generates synthetic methods named lambda$method$N; emit_lambda emits the invokedynamic instruction plus the BootstrapMethods attribute
  • Test: adds the LambdaTest.java fixture

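Key fixes

  • is_cast misclassified input starting with () -> as a cast expression → now skips whitespace and checks for Arrow
  • The invokedynamic descriptor must return the SAM interface type rather than Object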
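
The descriptor fix above can be illustrated with a minimal sketch. The function names `indy_call_site_descriptor` and `synthetic_lambda_name` are hypothetical and not taken from this PR; the sketch only assumes that captured values are passed as field descriptors and that the target interface is given by its internal name.

```rust
/// Builds the call-site descriptor for a lambda `invokedynamic` instruction.
/// `captured` holds the field descriptors of captured values (hypothetical input);
/// `sam_internal_name` is the internal name of the target functional interface.
/// The return type is the interface itself, not java/lang/Object.
pub fn indy_call_site_descriptor(captured: &[&str], sam_internal_name: &str) -> String {
    let mut desc = String::from("(");
    for c in captured {
        desc.push_str(c);
    }
    desc.push_str(")L");
    desc.push_str(sam_internal_name);
    desc.push(';');
    desc
}

/// Name of a synthetic method following the lambda$method$N pattern
/// described in the change list.
pub fn synthetic_lambda_name(enclosing_method: &str, index: usize) -> String {
    format!("lambda${}${}", enclosing_method, index)
}
```

For a non-capturing `Supplier` lambda in `main`, this yields the descriptor `()Ljava/util/function/Supplier;` and the synthetic name `lambda$main$0`.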

Verification

  • parse_all_java_fixtures
  • parser_builds_green_tree_for_all_java_fixtures
  • javac_accepts_all_java_fixtures
  • cargo fmt / cargo clippy report no warnings
  • java -cp target/lambda-out LambdaTest prints hello from lambda

@Neamyoo-dev force-pushed the feat/lambda-invokedynamic branch from a6a0860 to 6da74f0 on May 24, 2026 11:13
devin-ai-integration[bot]

This comment was marked as resolved.

@Eatgrapes
Owner

crates/javac-bytecode/src/class_gen.rs:128
SAM interface selection is a hardcoded heuristic based on parameter count only
The SAM interface mapping at crates/javac-bytecode/src/class_gen.rs:128-140 selects the functional interface solely based on lambda parameter count (0→Supplier, 1→Function, 2+→BiFunction). This means:

  1. Lambda assigned to Consumer<T> (1 param, void return) will incorrectly be wired to Function.apply instead of Consumer.accept.
  2. Lambdas with 3+ params mapped to BiFunction will produce incorrect SAM method types.
  3. Predicate<T>, Runnable, Comparator<T>, and other common functional interfaces are not handled.

This is a known design limitation of the initial implementation, but it will silently generate incorrect bytecode for any functional interface that doesn't match the hardcoded mapping. A proper implementation would need type inference to determine the target functional interface from context (e.g., method parameter type, variable declaration type).
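
One possible shape for context-driven selection is sketched below. It is not the PR's `resolve_sam_interface`; the `SamInfo` struct and `resolve_sam` function are hypothetical, and the table covers only a handful of JDK interfaces with their erased SAM descriptors. The sketch prefers an explicit target type and refuses to guess when the arity fallback has no sensible answer.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SamInfo {
    pub interface: &'static str,
    pub method: &'static str,
    pub descriptor: &'static str, // erased descriptor of the SAM method
}

fn sam(interface: &'static str, method: &'static str, descriptor: &'static str) -> Option<SamInfo> {
    Some(SamInfo { interface, method, descriptor })
}

/// Picks the functional interface from the target type when one is known,
/// and only falls back to the parameter-count heuristic otherwise.
pub fn resolve_sam(target_ty: Option<&str>, param_count: usize) -> Option<SamInfo> {
    if let Some(name) = target_ty {
        return match name {
            "java/lang/Runnable" => sam("java/lang/Runnable", "run", "()V"),
            "java/util/function/Supplier" => sam("java/util/function/Supplier", "get", "()Ljava/lang/Object;"),
            "java/util/function/Consumer" => sam("java/util/function/Consumer", "accept", "(Ljava/lang/Object;)V"),
            "java/util/function/Function" => sam("java/util/function/Function", "apply", "(Ljava/lang/Object;)Ljava/lang/Object;"),
            "java/util/function/Predicate" => sam("java/util/function/Predicate", "test", "(Ljava/lang/Object;)Z"),
            "java/util/function/BiFunction" => sam("java/util/function/BiFunction", "apply", "(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;"),
            "java/util/Comparator" => sam("java/util/Comparator", "compare", "(Ljava/lang/Object;Ljava/lang/Object;)I"),
            _ => None, // unknown target: report an error rather than guess
        };
    }
    // No target type available: arity heuristic, rejecting 3+ parameters.
    match param_count {
        0 => resolve_sam(Some("java/util/function/Supplier"), 0),
        1 => resolve_sam(Some("java/util/function/Function"), 1),
        2 => resolve_sam(Some("java/util/function/BiFunction"), 2),
        _ => None,
    }
}
```

Returning `None` for unknown targets and for three or more parameters lets the caller emit a diagnostic instead of silently wiring the lambda to the wrong SAM method.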

@Neamyoo-dev reopened this on May 24, 2026
@Neamyoo-dev
Contributor Author

And what is going on with "add a comment" that cannot be canceled?

@Neamyoo-dev marked this pull request as draft on May 24, 2026 11:58
devin-ai-integration[bot]

This comment was marked as resolved.

@Neamyoo-dev marked this pull request as ready for review on May 24, 2026 12:44
@Eatgrapes merged commit 7df2b8a into Eatgrapes:master on May 24, 2026
1 check passed

@devin-ai-integration (bot) left a comment

Devin Review found 6 new potential issues.

Comment on lines +236 to +238
expr_gen::gen_expr(&mut mw, &mut ctx, &method.body, *body_expr_id);
let body_ty = expr_gen::expr_ty(&ctx, &method.body, *body_expr_id);
mw.visit_insn(return_opcode(&body_ty));

🔴 Lambda synthetic method uses expression type for return opcode instead of declared SAM return type

In scan_and_gen_lambdas, for LambdaBody::Expr, the return opcode is computed from body_ty (the expression's inferred type) rather than ctx.return_ty / sam_info.return_ty (the declared return type of the synthetic method). When the expression type differs from the SAM return type, this produces invalid bytecode. For example, a Consumer<String> lambda whose body is list.add(x) (returns boolean) would emit IRETURN in a method declared as (Ljava/lang/Object;)V, causing a JVM VerifyError. Similarly, a Supplier<Integer> lambda like () -> 42 would emit IRETURN instead of ARETURN, since the expression type is int but the erased impl descriptor returns Ljava/lang/Object;. No coercion between body_ty and the method's return type is applied either.

Prompt for agents
In scan_and_gen_lambdas (class_gen.rs), the LambdaBody::Expr arm at lines 235-239 computes return_opcode from body_ty (the expression type) instead of ctx.return_ty (the SAM return type set at line 218). This causes a type mismatch when the expression type differs from the SAM's declared return type.

For example:
- Consumer lambda with boolean-returning body: emits IRETURN in a void method
- Supplier lambda with int-returning body: emits IRETURN in an Object-returning method

The fix should:
1. Use ctx.return_ty (or sam_info.return_ty) to determine the return opcode
2. If ctx.return_ty is Void but body_ty is not Void, emit a pop instruction (pop_ty) to discard the expression value before RETURN
3. If both are non-void but differ, apply coercion (crate::expr_gen::coerce) from body_ty to ctx.return_ty before the return instruction

The relevant function is crate::expr_gen::coerce for type coercion and crate::expr_gen::pop_ty for discarding values. The return_opcode function is in crate::local_var::return_opcode.
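
The three-step fix can be sketched in isolation. This is a simplified model, not the crate's code: `Ty`, `Step`, `return_mnemonic`, and `lambda_expr_epilogue` are hypothetical stand-ins for the real type representation, `pop_ty`, `coerce`, and `return_opcode`, and the coercion is recorded as a placeholder rather than expanded into boxing instructions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty { Void, Int, Long, Float, Double, Boolean, Ref }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    Pop,                         // discard a category-1 value
    Pop2,                        // discard a category-2 value (long or double)
    Coerce { from: Ty, to: Ty }, // placeholder for a coercion such as boxing
    Return(&'static str),        // mnemonic of the return opcode
}

/// The return opcode depends only on the declared return type.
pub fn return_mnemonic(ty: Ty) -> &'static str {
    match ty {
        Ty::Void => "return",
        Ty::Int | Ty::Boolean => "ireturn",
        Ty::Long => "lreturn",
        Ty::Float => "freturn",
        Ty::Double => "dreturn",
        Ty::Ref => "areturn",
    }
}

/// Steps emitted after the body expression of an expression lambda.
/// A void body with a non-void declared type is not handled here; a real
/// compiler would be expected to reject that case before code generation.
pub fn lambda_expr_epilogue(body_ty: Ty, declared_ret: Ty) -> Vec<Step> {
    let mut steps = Vec::new();
    if declared_ret == Ty::Void {
        match body_ty {
            Ty::Void => {}
            Ty::Long | Ty::Double => steps.push(Step::Pop2),
            _ => steps.push(Step::Pop),
        }
    } else if body_ty != declared_ret {
        steps.push(Step::Coerce { from: body_ty, to: declared_ret });
    }
    steps.push(Step::Return(return_mnemonic(declared_ret)));
    steps
}
```

In this model, the `Consumer` case from the review (boolean body, void method) produces a pop followed by `return`, and the `Supplier` case (int body, reference return) produces a coercion followed by `areturn`.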

Comment on lines +603 to +605
let body = if self.at_lambda_block() {
    self.skip_block_tokens();
    LambdaBody::Block(Block { stmts: vec![] })

🔴 Block lambda bodies are silently discarded, producing empty synthetic methods

When a lambda uses block syntax (e.g., x -> { System.out.println(x); }), the HIR lowering calls skip_block_tokens() which skips all tokens inside the braces and then creates LambdaBody::Block(Block { stmts: vec![] }) — an empty block. All statements in the block body are silently lost. The generated synthetic method will contain only a default return instruction. This occurs in both lambda forms: the single-parameter Ident -> { ... } path (line 604) and the parenthesized (params) -> { ... } path (line 652). The result is that block lambdas compile without errors but produce no-op methods at runtime.

Prompt for agents
In crates/javac-hir/src/lowering/expr.rs, both lambda parsing paths (single-ident at lines 603-605 and paren-form at lines 651-653) call skip_block_tokens() and create LambdaBody::Block(Block { stmts: vec![] }) when the lambda body is a block. This discards all the statements in the block.

The problem is that skip_block_tokens() advances pos past all the block tokens without lowering them. The ExprLowerer operates on a flat token stream (ExprToken[]) and doesn't have access to the CST nodes needed to call lower_block().

Possible approaches:
1. Instead of operating on the flat token stream, restructure the lambda lowering to use the CST nodes from the parser. The parser (parser/expr.rs) already correctly parses block bodies via stmt::block(p), so the CST contains the full block structure. The HIR lowering (stmt.rs lower_block) can lower CST block nodes.
2. As a simpler interim fix, if the block body cannot be lowered from the token stream, emit an error instead of silently producing an empty block. This would at least prevent silent incorrect behavior.
3. Parse the block tokens into statements within the ExprLowerer by extracting statement-level tokens and recursively lowering them (complex but keeps the current architecture).
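
Approach 2 (fail loudly instead of producing an empty block) might look roughly like the sketch below. The types are deliberately simplified and hypothetical: tokens are plain strings rather than `ExprToken`, and `LowerError` and `lower_lambda_body` do not exist in the reviewed code.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum LambdaBody {
    Expr(String),       // expression body, kept as text in this sketch
    Block(Vec<String>), // block body; only the empty block is supported here
}

#[derive(Debug, PartialEq, Eq)]
pub struct LowerError {
    pub message: String,
}

/// Lowers a lambda body from a flat token stream. A non-empty block body is
/// reported as an error instead of being silently replaced by an empty block.
pub fn lower_lambda_body(tokens: &[&str]) -> Result<LambdaBody, LowerError> {
    match tokens.first().copied() {
        None => Err(LowerError { message: "lambda body is missing".to_string() }),
        Some("{") => {
            // `{ }` really is an empty block, so it can be lowered faithfully.
            if tokens.len() == 2 && tokens[1] == "}" {
                Ok(LambdaBody::Block(Vec::new()))
            } else {
                Err(LowerError {
                    message: "block lambda bodies are not supported yet".to_string(),
                })
            }
        }
        Some(_) => Ok(LambdaBody::Expr(tokens.join(" "))),
    }
}
```

The design choice is that an unsupported construct becomes a compile-time diagnostic, so a block lambda can no longer turn into a no-op method at runtime without anyone noticing.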

Comment on lines +188 to +189
let ret_ty = signature.return_type.clone();
body_builder.resolve_lambda_target_types(&ret_ty);

🚩 resolve_lambda_target_types not called for constructors

In crates/javac-hir/src/lowering/member.rs:188-189, resolve_lambda_target_types is called only in lower_method_decl, not in lower_constructor_decl (lines 115-153). Any lambda in a constructor body would have target_ty: None, causing resolve_sam_interface in crates/javac-bytecode/src/class_gen.rs:113-148 to fall back to param_count-based heuristics. For example, Consumer<String> c = x -> ... in a constructor would be treated as Function (1-param fallback) instead of Consumer. Field initializers with lambdas would similarly be affected since lower_field_decl also doesn't call resolve_lambda_target_types.

Comment on lines +191 to +210
pub fn functional_interface_method(&self, internal_name: &str) -> Option<MethodRef> {
    if !self.interfaces.contains(internal_name) {
        return None;
    }

    let mut sam: Option<MethodRef> = None;
    for ((owner, _), methods) in &self.methods {
        if owner == internal_name {
            for m in methods {
                if m.is_interface {
                    if sam.is_some() {
                        return None;
                    }
                    sam = Some(m.clone());
                }
            }
        }
    }
    sam
}

📝 Info: functional_interface_method doesn't distinguish abstract from default/static methods

The new functional_interface_method in crates/javac-call-resolver/src/catalog.rs:191-210 filters by m.is_interface to find the SAM method, but is_interface only means the method's owner is an interface — it doesn't distinguish abstract methods from default or static methods. This works today because the platform catalog (crates/javac-call-resolver/src/platform/java_util_function.rs) only registers the abstract SAM methods. If default methods (like Consumer.andThen) or static methods were added to the catalog, functional_interface_method would incorrectly count them and return None (thinking the interface has multiple abstract methods).
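
A more robust lookup would count only abstract methods. The sketch below assumes a hypothetical `is_abstract` flag on a simplified `MethodInfo` record; the reviewed `MethodRef` is only known to carry `is_interface`, so this is an illustration of the idea rather than a drop-in change.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MethodInfo {
    pub name: String,
    pub is_abstract: bool, // hypothetical flag, not present on the reviewed MethodRef
}

/// Returns the single abstract method of an interface, ignoring default and
/// static methods. Returns None when there are zero or several abstract methods.
pub fn single_abstract_method(methods: &[MethodInfo]) -> Option<&MethodInfo> {
    let mut abstract_methods = methods.iter().filter(|m| m.is_abstract);
    let first = abstract_methods.next()?;
    if abstract_methods.next().is_some() {
        None
    } else {
        Some(first)
    }
}
```

With this filter, registering `Consumer.andThen` as a non-abstract method would still leave `accept` as the SAM. A complete implementation would also need to ignore abstract redeclarations of public `Object` methods, such as `Comparator.equals`, which this sketch does not model.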

Comment on lines +70 to +80
pub(super) fn resolve_lambda_target_types(&mut self, method_return_ty: &Ty) {
    let mut targets: Vec<(ExprId, Ty)> = Vec::new();
    for (_, stmt) in self.body.stmts.iter() {
        self.collect_lambda_targets(stmt, method_return_ty, &mut targets);
    }
    for (expr_id, ty) in targets {
        if let Expr::Lambda { target_ty: t, .. } = &mut self.body.exprs[expr_id] {
            *t = Some(ty);
        }
    }
}

📝 Info: resolve_lambda_target_types visits statements multiple times due to arena iteration

resolve_lambda_target_types at crates/javac-hir/src/lowering/expr.rs:72 iterates over ALL statements in the body arena via self.body.stmts.iter(). Then collect_lambda_targets recursively descends into child statements. Since the arena contains both parent and child statements, children are visited twice: once from the top-level iter() and once from their parent's recursive descent. This means push_lambda_target may be called multiple times for the same lambda. This is only an inefficiency, not a correctness issue — the target_ty is set idempotently to the same value each time.
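
The double visit is easy to reproduce on a toy arena. The `Stmt` enum and the helper functions below are hypothetical and far simpler than the HIR body; the sketch only contrasts iterating every arena slot with starting from root statements.

```rust
pub enum Stmt {
    Leaf(u32),         // statement carrying a lambda id (hypothetical payload)
    Block(Vec<usize>), // indices of child statements in the arena
}

/// Recursively collects lambda ids reachable from the statement at `idx`.
fn collect(arena: &[Stmt], idx: usize, out: &mut Vec<u32>) {
    match &arena[idx] {
        Stmt::Leaf(id) => out.push(*id),
        Stmt::Block(children) => {
            for &child in children {
                collect(arena, child, out);
            }
        }
    }
}

/// Mirrors the reviewed pattern: every arena slot is used as a starting point,
/// so nested statements are reached both directly and through their parent.
pub fn collect_from_all(arena: &[Stmt]) -> Vec<u32> {
    let mut out = Vec::new();
    for idx in 0..arena.len() {
        collect(arena, idx, &mut out);
    }
    out
}

/// Alternative: start only from the top-level statements.
pub fn collect_from_roots(arena: &[Stmt], roots: &[usize]) -> Vec<u32> {
    let mut out = Vec::new();
    for &root in roots {
        collect(arena, root, &mut out);
    }
    out
}
```

For an arena holding one leaf nested in one block, `collect_from_all` reports the leaf twice while `collect_from_roots` reports it once, which matches the review's point that the duplication is redundant work rather than a wrong result.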

Comment on lines +181 to +185
Expr::MethodCall {
    target: _,
    method: _,
    args: _,
} => {}

🚩 collect_expr_lambda_targets deliberately ignores method call arguments

In crates/javac-hir/src/lowering/expr.rs:181-185, Expr::MethodCall is handled as a no-op in collect_expr_lambda_targets. This means lambdas passed as method arguments (e.g., list.forEach(x -> ...)) will NOT have their target_ty resolved. They'll fall through to param_count-based heuristics in resolve_sam_interface. Resolving target types for method arguments would require method overload resolution at the HIR level, which is significantly more complex. This is a known feature gap in the initial lambda implementation.
