Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
110 changes: 110 additions & 0 deletions dev/design/BARE_BLOCK_RETURN_FIX.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
# Bare Block Return Value Fix

## Problem Statement

Test 5 in `context_semantics.t` fails: `sub foo { { 99 } }` should return 99 but returns undef.

When attempting to fix this by adding `blockIsSubroutine` to the bare block handling in `EmitBlock.java`, Test2 breaks with "context() was called to retrieve an existing context" errors.

## Root Cause Analysis

### Discovery

Through debugging, we found THREE separate issues:

1. **Indirect object syntax bug** - `(release $ctx, "V")` was dropping the "V" (FIXED)
2. **Hash literal detection bug** - `{ %{$_} }` was parsed as bare block instead of hash literal (FIXED)
3. **Bare block return value** - `sub foo { { 99 } }` returns undef (FIXED)

### The Indirect Object Syntax Bug (FIXED)

Perl's indirect object syntax `release $ctx, expr` should:
1. Call `$ctx->release()`
2. Evaluate `expr`
3. Return a list containing both results

**Fix applied:** Modified `SubroutineParser.java` to only consume the object argument when detecting indirect object syntax, leaving trailing arguments for the outer list context.

### The Hash Literal Detection Bug (FIXED)

The parser's `isHashLiteral()` function in `StatementResolver.java` was incorrectly treating `{ %{$_} }` as a bare block because it didn't find a `=>` fat comma indicator.

**Fix applied:** Added `firstTokenIsSigil` check - if first token is `%` or `@` and no block indicators are found, treat as hash literal. This correctly handles patterns like `map {{ %{$_} }} @data`.

### The Bare Block Return Value Issue (FIXED)

Adding `blockIsSubroutine` to the EmitBlock.java condition was the correct approach, but it only worked after fixing the hash literal detection bug above.

**Why the fix now works:**
- `sub foo { { 99 } }` - inner `{ 99 }` is correctly identified as bare block, returns value
- `map {{ %{$_} }} @data` - inner `{ %{$_} }` is correctly identified as hash literal, not affected

## Fix Strategy

### Phase 1: Fix Indirect Object Syntax ✓ COMPLETED

**Files modified:**
- `src/main/java/org/perlonjava/frontend/parser/SubroutineParser.java` - Fixed to only consume the object, not trailing args

**Test added:**
- Added test cases to `src/test/resources/unit/indirect_object_syntax.t`

### Phase 2: Fix Hash Literal Detection ✓ COMPLETED

**Files modified:**
- `src/main/java/org/perlonjava/frontend/parser/StatementResolver.java` - Added `firstTokenIsSigil` check in `isHashLiteral()` function

### Phase 3: Fix Bare Block Return Value ✓ COMPLETED

**Files modified:**
- `src/main/java/org/perlonjava/backend/jvm/EmitBlock.java` - Added `blockIsSubroutine` to the condition at line 271

## Verification

### All Tests Pass ✓

```bash
# Bare block return value
./jperl -e 'sub foo { { 99 } } print foo(), "\n"'
# Output: 99 ✓

# Hash literal in map (Test2 pattern)
./jperl -e 'my @data = ({a=>1}, {b=>2}); my @result = map {{ %{$_} }} @data; print scalar(@result), " refs\n"'
# Output: 2 refs ✓

# Test2 based tests
./jperl src/test/resources/unit/transliterate.t
# All tests pass without "context() was called" errors ✓

# Context semantics unit test
./jperl src/test/resources/unit/context_semantics.t
# All 12 tests pass ✓
```

## Current State - ALL FIXED ✓

- **Branch:** `feature/cpan-client-phase11`
- **Indirect object syntax:** ✓ FIXED
- **Hash literal detection:** ✓ FIXED
- **File-level bare blocks:** ✓ FIXED
- **SCALAR/LIST context bare blocks:** ✓ FIXED
- **Subroutine bare blocks:** ✓ FIXED

## Files Modified

1. `src/main/java/org/perlonjava/frontend/parser/Parser.java` - Added `isFileLevelBlock` annotation
2. `src/main/java/org/perlonjava/backend/jvm/EmitBlock.java` - Handle file-level and subroutine bare blocks
3. `src/main/java/org/perlonjava/backend/jvm/EmitStatement.java` - Register spilling for SCALAR/LIST bare blocks
4. `src/main/java/org/perlonjava/frontend/parser/SubroutineParser.java` - Fixed indirect object syntax
5. `src/main/java/org/perlonjava/frontend/parser/StatementResolver.java` - Fixed hash literal detection
6. `src/test/resources/unit/indirect_object_syntax.t` - Added tests for comma-separated args
7. `src/test/resources/unit/bare_block_return.t` - Comprehensive test file
8. `src/test/resources/unit/context_semantics.t` - Removed TODO markers (all tests pass)

## Completed Steps

1. ~~Create unit test for indirect object syntax~~ ✓ DONE
2. ~~Find and fix the parser bug for `method OBJECT, ARGS`~~ ✓ DONE
3. ~~Fix hash literal detection for `{ %{$_} }` patterns~~ ✓ DONE
4. ~~Add `blockIsSubroutine` to EmitBlock.java condition~~ ✓ DONE
5. ~~Update context_semantics.t to remove TODO markers~~ ✓ DONE
46 changes: 43 additions & 3 deletions dev/design/cpan_client.md
Original file line number Diff line number Diff line change
Expand Up @@ -619,8 +619,48 @@ The JUnit parallel test failures ("ctx is null" errors) were caused by stale INI
**Files changed:**
- `src/test/resources/unit/bare_block_return.t` - TODO tests for bare block return values

### Proposed Approach: File-Level Annotation

The key insight is that RUNTIME context is used for both:
- File-level code (where bare blocks SHOULD return values)
- Subroutine bodies (where the existing behavior is correct)

**Solution:** Annotate only file-level bare blocks before compilation.

**Implementation:**
1. In `EmitterMethodCreator.createClassWithMethod()`, before visiting the AST:
- Check if the last statement is a For3Node with `isSimpleBlock=true`
- If so, add annotation `"fileLevelReturnValue" = true` to that node
2. In `EmitStatement.emitFor3()`:
- Check for `"fileLevelReturnValue"` annotation
- If present AND context is RUNTIME, use the register-spilling approach (same as SCALAR/LIST)

This approach:
- Only affects file-level bare blocks (targeted annotation)
- Doesn't change how subroutine bodies compile
- Uses existing register-spilling mechanism that's already proven to work

**Files to modify:**
- `EmitterMethodCreator.java` - Add annotation to file-level bare blocks
- `EmitStatement.java` - Check for annotation in RUNTIME context

### Failed Approaches (2026-03-19)

**Approach 1: Register spilling for RUNTIME context**
- Modified `EmitStatement.emitFor3()` to use resultRegister mechanism for RUNTIME (like SCALAR/LIST)
- Result: JVM verification errors in tests with complex control flow (tie_scalar.t, typeglob.t, etc.)
- Reason: The resultRegister mechanism changes local variable state, causing inconsistent stack frames at merge points

**Approach 2: Visit body in RUNTIME context directly**
- Modified `EmitStatement.emitFor3()` to call `node.body.accept(emitterVisitor)` for RUNTIME simple blocks
- Result: Same JVM verification errors
- Reason: Visiting body in non-VOID context changes bytecode generation throughout the block, affecting control flow

**Root cause:** The issue is not just about capturing the return value. It's that any change to how the bare block body is compiled (different context, register spilling) affects the entire bytecode structure, causing JVM verifier failures when there are complex control flow constructs like `local`, tied variables, or nested blocks with jumps.

### Next Steps

1. **Investigate bare block return value fix** - Need to capture last expression value without changing interior statement compilation
2. **Test Package::Stash::PP loading** - May need explicit `1;` at end of file
3. **Alternative**: Create bundled DateTime.pm wrapper that skips heavy dependencies
1. **Investigate alternative approaches** - File-level AST transformation before compilation
2. **Consider interpreter fallback** - For files ending with bare blocks, use interpreter backend
3. **Test Package::Stash::PP** - May need workaround (explicit `1;` at file end)
4. **Alternative**: Create bundled DateTime.pm wrapper that skips heavy dependencies
9 changes: 9 additions & 0 deletions src/main/java/org/perlonjava/backend/jvm/EmitBlock.java
Original file line number Diff line number Diff line change
Expand Up @@ -267,6 +267,15 @@ public static void emitBlock(EmitterVisitor emitterVisitor, BlockNode node) {
// Visit in SCALAR context to get a value, store it, then pop
element.accept(emitterVisitor.with(RuntimeContextType.SCALAR));
mv.visitVarInsn(Opcodes.ASTORE, resultReg);
} else if (emitterVisitor.ctx.contextType == RuntimeContextType.RUNTIME
&& (node.getBooleanAnnotation("isFileLevelBlock") || node.getBooleanAnnotation("blockIsSubroutine"))
&& element instanceof For3Node for3
&& for3.isSimpleBlock
&& for3.labelName == null) {
// Bare block (no label) as last statement in file-level RUNTIME context
// or inside a subroutine. This handles do "file", require, and sub { { 99 } }.
// Visit with SCALAR context to get the block's return value.
element.accept(emitterVisitor.with(RuntimeContextType.SCALAR));
} else {
element.accept(emitterVisitor);
}
Expand Down
43 changes: 25 additions & 18 deletions src/main/java/org/perlonjava/backend/jvm/EmitStatement.java
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,20 @@ public static void emitFor3(EmitterVisitor emitterVisitor, For3Node node) {
Label redoLabel = new Label();
mv.visitLabel(redoLabel);

// For simple blocks (bare blocks like { ... }) in non-void context,
// use register spilling to capture the result: allocate a local variable,
// tell the block to store its last element's value there, then load it after endLabel.
// This ensures consistent stack state across all code paths (including last/next jumps).
// Apply for SCALAR/LIST contexts - bare blocks always return their value in Perl.
// Note: Only apply to UNLABELED bare blocks. Labeled blocks like TODO: { ... } should
// not return their value (this would break Test::More's TODO handling).
// RUNTIME context is NOT included because it causes issues with Test2 context handling.
boolean needsReturnValue = node.isSimpleBlock
&& node.labelName == null // Only bare blocks, not labeled blocks
&& (emitterVisitor.ctx.contextType == RuntimeContextType.SCALAR
|| emitterVisitor.ctx.contextType == RuntimeContextType.LIST);
int resultReg = -1;

if (node.useNewScope) {
// Register next/redo/last labels
if (CompilerOptions.DEBUG_ENABLED) emitterVisitor.ctx.logDebug("FOR3 label: " + node.labelName);
Expand All @@ -181,26 +195,18 @@ public static void emitFor3(EmitterVisitor emitterVisitor, For3Node node) {
isUnlabeledTarget);

// Visit the loop body
// For simple blocks (bare blocks like { ... }) in non-void context,
// use register spilling to capture the result: allocate a local variable,
// tell the block to store its last element's value there, then load it after.
// This ensures consistent stack state across all code paths.
// Only apply for SCALAR/LIST contexts, not RUNTIME (which is for subroutine bodies).
boolean needsReturnValue = node.isSimpleBlock
&& (emitterVisitor.ctx.contextType == RuntimeContextType.SCALAR
|| emitterVisitor.ctx.contextType == RuntimeContextType.LIST);
if (needsReturnValue) {
// Allocate a local variable for the result
int resultReg = emitterVisitor.ctx.symbolTable.allocateLocalVariable();
// Initialize it to undef
resultReg = emitterVisitor.ctx.symbolTable.allocateLocalVariable();
// Initialize it to undef (in case last/next is called before last statement)
EmitOperator.emitUndef(mv);
mv.visitVarInsn(Opcodes.ASTORE, resultReg);
// Tell the block to store its last element's value in this register
node.body.setAnnotation("resultRegister", resultReg);
// Visit body in VOID context (consistent stack state)
node.body.accept(voidVisitor);
// Load the result
mv.visitVarInsn(Opcodes.ALOAD, resultReg);
// NOTE: Don't load the result here! We load it after endLabel so that
// all paths (normal, last, next) converge with empty stack, then load.
} else {
node.body.accept(voidVisitor);
}
Expand Down Expand Up @@ -249,12 +255,13 @@ public static void emitFor3(EmitterVisitor emitterVisitor, For3Node node) {
emitterVisitor.ctx.symbolTable.exitScope(scopeIndex);
}

// If the context is not VOID, push "undef" to the stack
// Exception: for simple blocks in SCALAR/LIST context, we already loaded the result from the register
boolean simpleBlockHandled = node.isSimpleBlock
&& (emitterVisitor.ctx.contextType == RuntimeContextType.SCALAR
|| emitterVisitor.ctx.contextType == RuntimeContextType.LIST);
if (emitterVisitor.ctx.contextType != RuntimeContextType.VOID && !simpleBlockHandled) {
// If the context is not VOID, push a value to the stack
// For simple blocks with resultReg, load the captured result
// Otherwise, push undef
if (needsReturnValue && resultReg >= 0) {
// Load the result from the register (all paths converge here with empty stack)
mv.visitVarInsn(Opcodes.ALOAD, resultReg);
} else if (emitterVisitor.ctx.contextType != RuntimeContextType.VOID) {
EmitOperator.emitUndef(emitterVisitor.ctx.mv);
}

Expand Down
7 changes: 7 additions & 0 deletions src/main/java/org/perlonjava/frontend/parser/Parser.java
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
import org.perlonjava.app.cli.CompilerOptions;

import org.perlonjava.backend.jvm.EmitterContext;
import org.perlonjava.frontend.astnode.AbstractNode;
import org.perlonjava.frontend.astnode.FormatNode;
import org.perlonjava.frontend.astnode.Node;
import org.perlonjava.frontend.astnode.OperatorNode;
Expand Down Expand Up @@ -126,6 +127,12 @@ public Node parse() {
tokens.addFirst(new LexerToken(LexerTokenType.NEWLINE, "\n"));
}
Node ast = ParseBlock.parseBlock(this);
// Mark the AST as a top-level file block for proper bare block return value handling
// This annotation is checked in EmitBlock to handle RUNTIME context bare blocks
if (!isTopLevelScript && ast instanceof AbstractNode) {
// For do "file" and require, mark the block so bare blocks return their value
((AbstractNode) ast).setAnnotation("isFileLevelBlock", true);
}
if (!getHeredocNodes().isEmpty()) {
ParseHeredoc.heredocError(this);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -89,8 +89,11 @@ static Node parseSpecialBlock(Parser parser) {
// Execute other special blocks normally
runSpecialBlock(parser, blockName, block);

// Return an undefined operator node
return new OperatorNode("undef", null, parser.tokenIndex);
// Return an undefined operator node marked as compile-time-only
// so it doesn't affect the file's return value
OperatorNode result = new OperatorNode("undef", null, parser.tokenIndex);
result.setAnnotation("compileTimeOnly", true);
return result;
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -739,9 +739,18 @@ public static boolean isHashLiteral(Parser parser) {
boolean hasHashIndicator = false; // Found =>, or comma in hash-like context
boolean hasBlockIndicator = false; // Found ;, or statement modifier
boolean hasContent = false; // Track if we've seen any content
boolean firstTokenIsSigil = false; // Track if first token is % or @ (hash/array)

if (CompilerOptions.DEBUG_ENABLED) parser.ctx.logDebug("isHashLiteral START - initial braceCount: " + braceCount);

// Check if the first token is % or @ - this strongly suggests a hash literal
// e.g., { %hash } or { @array } or { %{$_} }
LexerToken firstToken = TokenUtils.peek(parser);
if (firstToken.text.equals("%") || firstToken.text.equals("@")) {
firstTokenIsSigil = true;
if (CompilerOptions.DEBUG_ENABLED) parser.ctx.logDebug("isHashLiteral first token is sigil: " + firstToken.text);
}

while (braceCount > 0) {
LexerToken token = consume(parser);
if (CompilerOptions.DEBUG_ENABLED) parser.ctx.logDebug("isHashLiteral token: '" + token.text + "' type:" + token.type + " braceCount:" + braceCount);
Expand Down Expand Up @@ -869,9 +878,11 @@ public static boolean isHashLiteral(Parser parser) {
// - If we found => it's definitely a hash
// - If we found block indicators, it's a block
// - Empty {} is a hash ref
// - If first token is % or @ (sigil) and no block indicators, it's a hash
// e.g., { %hash }, { @array }, { %{$ref} }
// - Otherwise, default to block (safer when parsing is incomplete)
if (CompilerOptions.DEBUG_ENABLED) parser.ctx.logDebug("isHashLiteral FINAL DECISION - hasHashIndicator:" + hasHashIndicator +
" hasBlockIndicator:" + hasBlockIndicator + " hasContent:" + hasContent);
" hasBlockIndicator:" + hasBlockIndicator + " hasContent:" + hasContent + " firstTokenIsSigil:" + firstTokenIsSigil);

if (hasHashIndicator) {
if (CompilerOptions.DEBUG_ENABLED) parser.ctx.logDebug("isHashLiteral RESULT: TRUE - hash indicator found");
Expand All @@ -882,6 +893,10 @@ public static boolean isHashLiteral(Parser parser) {
} else if (!hasContent) {
if (CompilerOptions.DEBUG_ENABLED) parser.ctx.logDebug("isHashLiteral RESULT: TRUE - empty {} is hash ref");
return true; // Empty {} is a hash ref
} else if (firstTokenIsSigil) {
// { %hash } or { @array } or { %{$ref} } - treat as hash constructor
if (CompilerOptions.DEBUG_ENABLED) parser.ctx.logDebug("isHashLiteral RESULT: TRUE - starts with sigil (% or @)");
return true;
} else {
if (CompilerOptions.DEBUG_ENABLED) parser.ctx.logDebug("isHashLiteral RESULT: FALSE - default for ambiguous case (assuming block)");
return false; // Default: assume block when we can't determine
Expand Down
Loading
Loading