A novel LLM-based framework provides flexible evaluation of mathematical reasoning, addressing limitations of symbolic methods.
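To make the contrast concrete, here is a minimal sketch of the kind of rigidity that rule-based answer checking can show, next to an LLM-as-judge check. It is not the framework's actual implementation. `rule_based_match` is a simplified stand-in for a symbolic checker, and `build_judge_prompt`, `llm_judge_match`, and the `llm` callable are hypothetical names introduced only for illustration.

```python
from fractions import Fraction
from typing import Callable


def rule_based_match(reference: str, candidate: str) -> bool:
    """Rigid check: both answers must parse as plain numbers and be equal.

    Simplified stand-in for a symbolic checker (an assumption, not the
    method from the source). Anything that is not a bare number is rejected.
    """
    try:
        return Fraction(reference.strip()) == Fraction(candidate.strip())
    except (ValueError, ZeroDivisionError):
        # Unparseable input such as "x = 1/2" counts as a mismatch.
        return False


def build_judge_prompt(reference: str, candidate: str) -> str:
    """Build a hypothetical prompt asking a model to compare two answers."""
    return (
        "Are these two final answers mathematically equivalent?\n"
        f"Reference: {reference}\n"
        f"Candidate: {candidate}\n"
        "Reply with YES or NO."
    )


def llm_judge_match(reference: str, candidate: str,
                    llm: Callable[[str], str]) -> bool:
    """Flexible check: delegate the equivalence decision to a model.

    `llm` is any callable mapping a prompt string to a reply string; which
    model sits behind it is left open here (an assumption).
    """
    reply = llm(build_judge_prompt(reference, candidate))
    return reply.strip().upper().startswith("YES")
```

In this sketch, `rule_based_match("1/2", "0.5")` succeeds, but `rule_based_match("1/2", "x = 1/2")` fails because the candidate is not a bare number, even though a human grader would accept it. The judge variant only forwards both strings to the model, so its accuracy depends entirely on the model behind `llm`.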