The word "agentic" is doing a lot of work in 2024 and 2025 marketing copy. Every AI product that touches a workflow now describes itself as agentic. Chatbots that route queries to a human are called agents. RPA tools with a language model bolted on the front are called agents. Form-parsing APIs packaged in a UI are called agents. The result is a label that has become nearly useless for understanding what a system actually does.
This matters for enterprise procurement and compliance teams because they are being asked to evaluate AI solutions under time pressure, with a vocabulary that has been deliberately stretched. Understanding what genuine agentic execution looks like — and how to distinguish it from well-packaged simpler tools — is not just an academic exercise. The difference determines whether the system can handle your actual process or will break at the first edge case.
The Three Things a Real Agent Can Do That a Chatbot Cannot
The distinction between an AI agent and a chatbot or assistant comes down to three capabilities: multi-step execution without human steering, decision branching based on observed state, and the ability to take consequential actions with durable effects.
Multi-step execution without human steering means the system can complete a process that requires more than one operation in sequence. A chatbot answers a question. An agent receives a task, determines what steps are needed, executes them in order, handles intermediate results, and produces a final output — without requiring a human to confirm each step. This is not the same as a scripted workflow. A scripted workflow follows a fixed path. An agent determines the path based on what it finds at each step.
Decision branching based on observed state means the agent's path through a process depends on what it actually observes, not just what it was configured to expect. In procurement approval, this means: if the PO total exceeds the department budget ceiling, route to finance; if the vendor is on the preferred list, approve with reduced checks; if the item category triggers export control considerations, escalate to compliance. These branches need to be evaluated dynamically, against live data from your ERP and policy store, not just pattern-matched against a fixed rule table.
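A minimal sketch of the three branches described above, in Python. The names (`PurchaseOrder`, `decide_route`), the precedence order of the branches, and the fallback `approve_standard_checks` outcome are all assumptions made for illustration. The budget ceilings, preferred-vendor list, and export-controlled categories are passed in as plain arguments here; in a real deployment they would be fetched from the ERP and policy store at decision time.

```python
from dataclasses import dataclass


@dataclass
class PurchaseOrder:
    total: float
    vendor_id: str
    item_category: str
    department: str


def decide_route(po, budget_ceilings, preferred_vendors, export_controlled):
    """Pick a route from values observed at decision time (hypothetical names)."""
    # Assumed precedence: compliance concerns outrank budget and vendor tier.
    if po.item_category in export_controlled:
        return "escalate_to_compliance"
    if po.total > budget_ceilings[po.department]:
        return "route_to_finance"
    if po.vendor_id in preferred_vendors:
        return "approve_reduced_checks"
    # Assumed fallback, not named in the text above.
    return "approve_standard_checks"
```

The same PO can land on different routes as the observed values change. Raising the department ceiling between two calls, for example, moves the outcome from `route_to_finance` to one of the approval routes without touching the function.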
Consequential actions with durable effects means the agent can write to systems, not just read from them. Approving a PO in your ERP. Sending a notification to a vendor. Flagging a record in your compliance database. Triggering a downstream workflow. If the system can only produce a recommendation that a human must manually execute, it is an assistant, not an agent. The durability of the action — the fact that it changes system state — is what creates both the productivity value and the need for proper audit trails.
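The contrast between a recommendation and a durable action can be sketched as follows. `InMemoryERP` is a hypothetical stand-in for a real ERP, and the audit entry fields are assumptions. The point is only that the state change and its audit record are written together.

```python
from datetime import datetime, timezone


class InMemoryERP:
    """Hypothetical stand-in for an ERP: PO status plus an audit log."""

    def __init__(self):
        self.po_status = {}
        self.audit_log = []

    def approve_po(self, po_id, actor, reason):
        # Durable effect: the PO's recorded status changes.
        self.po_status[po_id] = "approved"
        # Each write is paired with an entry recording who acted and why.
        self.audit_log.append({
            "po_id": po_id,
            "action": "approve",
            "actor": actor,
            "reason": reason,
            "at": datetime.now(timezone.utc).isoformat(),
        })


def recommend(po_id):
    """Assistant-style output: a suggestion that changes no system state."""
    return f"Recommend approving {po_id}"
```

Calling `recommend("PO-1")` leaves the ERP untouched, while `erp.approve_po("PO-1", "agent", "within budget")` changes the stored status and leaves one audit entry behind.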
Where RPA Wrappers Fall Short
Robotic process automation has been a fixture in enterprise back-office automation for over a decade. RPA tools are good at high-volume, low-variation tasks: extracting fields from structured forms, populating standard templates, triggering API calls based on fixed conditions. They struggle when the input structure varies, when process logic needs to interpret rather than match, or when a step requires contextual judgment that doesn't reduce to a boolean.
When large language models became practically accessible in 2023, many RPA vendors added language model components to their existing tooling and repositioned as "AI agents." The combination is real and useful: language models handle the variable-input parsing that RPA couldn't, while RPA tooling handles the deterministic execution steps. But the resulting system is not an agent in the sense described above. It is a language-model-enhanced automation script. The process logic is still largely pre-scripted. The branching decisions are still largely rule-matched. The intelligence layer reads inputs; the execution layer follows a fixed program.
The practical gap shows up in three places. First, when a document arrives in an unexpected format — a non-standard PO template from a new vendor — the RPA layer cannot adapt and the process errors out. Second, when a policy update requires changing approval logic, the change requires script modification rather than policy-layer reconfiguration. Third, when an edge case requires contextual judgment — "this vendor is technically over the spend limit, but the exception policy for year-end inventory purchases probably applies" — the system cannot make the call and escalates to a human by default.
What Multi-Step Decision Chains Look Like in Practice
A genuine agentic system for procurement approval might handle a single PO through something like the following sequence:
1. Parse the PO from whatever format it arrives in (PDF, ERP export, email attachment, API payload) and extract structured fields
2. Query the vendor registry to confirm vendor status, preferred tier, and any open compliance flags
3. Pull the current budget period balance for the relevant cost center from the ERP (via a JDBC connector to the financial module)
4. Evaluate the PO against the applicable spend policy for that department, using the policy document as a rules source
5. If the item category triggers secondary checks (export controls, environmental compliance, vendor diversity requirements), branch to those checks and execute them in sequence
6. If all checks pass, write the approval record to the ERP, send confirmation to the requestor, and log the decision with the full reasoning trace
7. If any check fails or produces an ambiguous result, construct an escalation package containing the PO, the check results, and the specific policy clause at issue, and route it to the appropriate approver with context, not just a flag
This sequence involves seven distinct operations across at least three systems (vendor registry, ERP, policy store), multiple conditional branches, and two possible terminal states (approval or escalation). A chatbot cannot do this. An RPA script can approximate it but cannot handle format variation at step 1 or interpret policy ambiguity at step 4. A genuine agent can execute the full sequence, adapt at steps where inputs don't conform to expectations, and produce a documented decision trail.
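The control flow of that sequence can be sketched in Python under heavy simplification. It assumes the PO has already been parsed into a dict, replaces the vendor registry, ERP, and policy store with plain dicts, and leaves out the ERP write and notifications. All names (`Outcome`, `process_po`, the dict keys) are hypothetical. The sketch shows only the conditional branch into secondary checks and the two terminal states.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Outcome:
    state: str                          # "approved" or "escalated"
    trace: list = field(default_factory=list)
    escalation: Optional[dict] = None   # populated only when escalating


def process_po(po, vendors, budgets, spend_limits, secondary_checks):
    trace = []

    def check(step, passed, observed):
        # Record every check, pass or fail, with the data it looked at.
        trace.append({"step": step, "passed": passed, "observed": observed})
        return passed

    vendor = vendors.get(po["vendor_id"], {})
    balance = budgets[po["cost_center"]]
    limit = spend_limits[po["department"]]
    # All three core checks run even if an earlier one fails.
    results = [
        check("vendor_status", vendor.get("status") == "active", vendor),
        check("budget", po["total"] <= balance, {"balance": balance}),
        check("spend_policy", po["total"] <= limit, {"limit": limit}),
    ]
    # Branch: secondary checks run only for categories that trigger them.
    for step, fn in secondary_checks.get(po["category"], []):
        results.append(check(step, fn(po), {"category": po["category"]}))

    if all(results):
        return Outcome("approved", trace)
    failed = [t for t in trace if not t["passed"]]
    # Escalation carries the PO and the failed checks, not just a flag.
    return Outcome("escalated", trace, {"po": po, "failed_checks": failed})
```

Running every check instead of stopping at the first failure is a design choice in this sketch: it gives the approver the complete picture in one escalation package.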
The Audit Trail Requirement
Enterprise process automation that makes consequential decisions — approval, rejection, escalation — has a compliance requirement that goes beyond accuracy: the system must be able to explain its decisions. This is not optional for regulated industries, and it is increasingly relevant for internal audit purposes across sectors.
This is where many AI systems marketed as enterprise-ready fall short. They produce outputs. They don't produce reasoning records. An approval decision that says "approved" is not useful for a post-hoc audit query asking why a specific PO was processed without escalation. An approval decision that records which policy clauses were evaluated, what data was observed at each check, what conditions triggered each branch, and what the final determination was based on — that is something an auditor or a compliance officer can work with.
Genuine agentic systems maintain execution traces — structured logs of each step in a decision chain, including the inputs observed, the logic applied, and the output produced. This is not incidental to the system design. It has to be built in from the start. Systems that retrofit explainability onto an existing decision engine produce plausible-sounding post-hoc rationales, not genuine execution traces. The difference is verifiable: does the explanation match the actual data that was observed during execution, or is it generated after the fact from the final output?
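One way to make that verification concrete is to compare each factual claim in an explanation against what the trace recorded. The sketch below assumes a trace of `TraceStep` records and an explanation already broken into `(step, key, value)` claims. Both structures are hypothetical, and extracting claims from free-text explanations is out of scope here.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceStep:
    step: str        # name of the step in the decision chain
    observed: dict   # inputs observed at execution time
    rule: str        # logic applied
    output: str      # result of the step


def unsupported_claims(explanation_claims, trace):
    """Return claims whose (step, key, value) was never recorded in the trace."""
    recorded = {
        (s.step, key, json.dumps(value, sort_keys=True))
        for s in trace
        for key, value in s.observed.items()
    }
    return [
        c for c in explanation_claims
        if (c["step"], c["key"], json.dumps(c["value"], sort_keys=True)) not in recorded
    ]
```

An empty result means every claim in the explanation is backed by data observed during execution. Any returned claim is a statement the trace cannot support, which is the signature of a rationale generated after the fact.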
How to Evaluate Whether a System Is Actually Agentic
When evaluating AI systems for enterprise process automation, a few specific questions cut through the marketing language.
Ask to see the execution trace for a sample decision. A genuine agent can show you, step by step, what data it retrieved, what logic it evaluated, and what decision it made at each branch. A system that cannot produce this is not operating as a genuine agent; it is either a language model producing a final output with no intermediate logging, or a scripted automation producing structured outputs with no interpretive layer.
Ask what happens when the input doesn't match the expected format. Run a test with a non-standard document. A genuine agent adapts: it still extracts the relevant fields, possibly with lower confidence, and it should report that reduced confidence. An RPA wrapper either fails outright or silently produces incorrect field extractions.
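A small sketch of what communicating reduced confidence could look like on the output side. The `ExtractedField` shape and the 0.8 threshold are assumptions for illustration; how confidence is estimated in the first place is not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedField:
    name: str
    value: Optional[str]   # None when the field could not be found
    confidence: float      # 0.0 to 1.0, as reported by the extractor


def fields_needing_review(fields, threshold=0.8):
    """Names of fields that are missing or below the (assumed) threshold."""
    return [
        f.name for f in fields
        if f.value is None or f.confidence < threshold
    ]
```

During an evaluation, feeding a non-standard document through the vendor's extractor and checking whether anything like this list comes back is a quick test of whether degradation is surfaced or hidden.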
Ask how policy updates are made. In a genuine agent, the policy layer is separate from the execution engine. Updating a spending threshold or an escalation rule changes the policy store, and the agent reflects the change on the next execution. In a scripted system, a policy change requires a code change. This distinction tells you a lot about where the intelligence actually lives.
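The separation can be illustrated with a policy expressed as data and an engine that reads it on every execution. The JSON layout and the `evaluate` function are hypothetical, and a real policy store would hold far more than a table of spend limits.

```python
import json


def evaluate(po, policy_json):
    """Evaluate a PO against thresholds read from the policy at call time."""
    policy = json.loads(policy_json)
    limit = policy["spend_limits"][po["department"]]
    return "approve" if po["total"] <= limit else "escalate"


po = {"department": "it", "total": 7500.0}

# Same engine, same PO: only the policy document differs between calls.
before = evaluate(po, '{"spend_limits": {"it": 5000}}')
after = evaluate(po, '{"spend_limits": {"it": 10000}}')
```

Here `before` is `"escalate"` and `after` is `"approve"`, and no line of `evaluate` changed between the two calls. In a scripted system the threshold would live inside the function body, and the same update would be a code change.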
We built dodoAI around these properties because they are what the enterprise operations problem actually requires. The market will continue applying "agentic" to simpler tools, and many of those tools are useful for the right use cases. But when you are deploying a system to make real decisions about real money in your operations, the distinction between genuine multi-step agentic execution and a well-packaged automation script is not semantic. It is the difference between a system that handles your process and one that handles your process until it doesn't.