The "Scheduling Parallelism in Plans" problem (in English, from the source)
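The exact task list and constraints of the source problem are not reproduced in these notes. As an illustration of the problem class only, the sketch below uses hypothetical tasks and durations and computes the minimum total time of a plan when independent tasks are allowed to run in parallel, i.e. the critical-path length of the dependency graph.

```python
# Minimal sketch of the "scheduling parallelism in plans" problem class.
# The tasks, durations, and dependencies below are hypothetical, not the
# ones from the source problem.
plan = {
    "pack":             (30, set()),
    "book_hotel":       (15, set()),                      # independent of packing
    "drive_to_airport": (45, {"pack"}),
    "check_in":         (20, {"drive_to_airport", "book_hotel"}),
}

def earliest_finish_times(plan):
    """Earliest finish time of each task when independent tasks overlap.

    A task can start only after all of its prerequisites have finished;
    the plan's minimum makespan is the largest finish time (critical path).
    """
    finish = {}

    def finish_of(task):
        if task not in finish:
            duration, deps = plan[task]
            finish[task] = duration + max((finish_of(d) for d in deps), default=0)
        return finish[task]

    return {task: finish_of(task) for task in plan}

times = earliest_finish_times(plan)
print(times)                             # drive_to_airport finishes at 75, check_in at 95
print("makespan:", max(times.values()))  # 95 minutes, vs. 110 if everything ran sequentially
```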
Why a non-reasoning LLM fails (source)
Claude Sonnet 3.5's illustration of a solution (which happens to be the optimum)
ChatGPT o1 reasons to a feasible outcome, which is then optimized by a human
3.7 Extended, 41 s (misinterpreted)
3.7 Extended, optimum in 81 s
4 small experiments in a row (edits)
does not always work (can misinterpret), 85 s
3.7 Extended, 35 s (misinterpreted)
3.7 Extended, 7 s (misinterpretation), with the prompt "Use A*"
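"Use A*" here only tells the model which search algorithm to apply; the source does not show the resulting code. As a rough sketch of what such a search could look like, assuming schedule states can be enumerated as (next state, step cost) successors and scored with an admissible heuristic (both are placeholder callbacks here, not the source problem's encoding):

```python
import heapq
import itertools

def a_star(start, is_goal, successors, heuristic):
    """Generic A* search sketch.

    successors(state) yields (next_state, step_cost) pairs; heuristic(state)
    must not overestimate the remaining cost for the result to be optimal.
    How schedule states are encoded for the source problem is not specified,
    so both callbacks are left to the caller.
    """
    counter = itertools.count()            # tie-breaker so states are never compared
    frontier = [(heuristic(start), next(counter), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, _, cost, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return cost, path
        if cost > best_cost.get(state, float("inf")):
            continue                        # stale queue entry
        for nxt, step in successors(state):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier,
                               (new_cost + heuristic(nxt), next(counter),
                                new_cost, nxt, path + [nxt]))
    return None                             # no feasible schedule found
```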
What if Emily arrived at the airport at 4:30?
Sonnet 3.5 illustration (attention bias occurs; some constraints are forgotten)
ChatGPT o1 finds a feasible (and also optimum) solution on the first try; the solution space is tremendously limited.
What if Emily arrived at the airport at 2:30?
Sonnet 3.5 illustration (attention bias occurs; some constraints are forgotten)
ChatGPT o1 reasons to a feasible outcome, which is then optimized by a human