AI is getting better at execution, but you still have to point it at the right problem

One aspect of the coming AI agentic era I’ll be exploring is, which parts of knowledge are shifting to machines and which remain distinctly human.

AI increasingly gets to handle how humans do things, while we get to direct attention toward why anything gets done.

Today’s post is about the what in between the two.

I asked Lumo AI what capabilities for it changed this past week.

Here’s part of what it said:

…straight answer: I don’t have reliable information about any changes to my capabilities in the past week. My training data cuts off in April 2024, so anything happening in 2026 is outside what I can speak to with confidence.

I wouldn’t want to speculate or fabricate a changelog — that’d be worse than useless.

While Lumo AI didn’t answer my question, I still found the output interesting. (If Lumo AI training data cuts off in April 2024, that means there’s a WHOLE lot is doesn’t and hasn’t been considering as part of output production!)

I asked ChatGPT the same question (using the same prompt).

Here’s an abridged list of changes output:

• A roughly 52.5% reduction in hallucinated claims in high-stakes domains (law, medicine, finance)

• A roughly 37% reduction in hallucinated claims in difficult real-world interactions

• User-reported reduction in confidently wrong answers and better self-correction behavior

• Tries harder to infer intent instead of padding answers (e.g., with over-explanations)

• Improvements in brainstorming, coding help, Q&A, and strategy discussions

• Increased speed in answering the actual question

• Improved context pull from past chats, saved memory, uploaded files, and connected Gmail so users can see why the model personalized a response

• General improvements in when the model decides to search the web versus not, image understanding, and multi-step reasoning/tool use

• Increased willingness to verify facts and maintain context across longer workflows—especially for coding and research tasks

• Increased focus on persistent AI collaboration

• New memory transparency controls—viz., what gets remembered, how long it persists, and how transparent the system is

• Less sycophancy, less excessive validation, less emotional mirroring, and more restrained tone

I particularly appreciate this output:

That’s the stage AI systems seem to be entering now: less “look what I can generate,” more “can I actually help someone think and execute continuously?”

This ChatGPT output tracks. I’ve noticed an increased in my personal use of AI to do a lot of the heavy lifting for skills like collaboration, communication, creative thinking, emotional intelligence, ethical reasoning, information literacy, logical reasoning, monitoring change processes, and project management.

I’ve even experienced palpable moments of continuous improvement outputs thanks to these AI capabilities updates.

But.

AI genuinely falls short in foundational skills that foster both my as well as collective intelligence. I lose flow and productivity trying to get AI to understand skills like critiquing and enhancing my inputs, general problem solving and creativity, reflective thinking, cognitive flexibility, and right-problem identification.

I also find myself having to direct AI’s problem solving effort as well as point AI at the right dots to connect—which is an ideal tension, considering.

Case in point: I tested an input that essentially tells AI that it’s been making too many critical errors. I tell it (as part of the prompt) that part of this development isn’t its fault, since it can only create outputs, in part, based on given inputs as well as accessible information. However, the part that IS its fault (I tell it) revolves around it not doing nor sticking to its job. Then I tell it, “So let’s change things up.”

Depending on the conversation, I’ll then add a change-up portion of the prompt (e.g., audit directive, clarifying question instruction, etc.).

Each AI app I tested this with agreed with me. At first I thought, “Great—sycophancy again.” But the reason given in each conversation/project differed, which convinced me that between capabilities updates and my human-AI-skill improvement over time, this was progress.

That’s how my AI-generated 12-month $100k revenues strategy became more week-to-week agile.

That’s how my AI-generated travel itineraries became less presumptive and more feasible yet adventurous.

Overall, that’s how my AI-generated outputs became more connected to my why.

And that’s of course, what it’s all about…for me anyway. Forming a clear, actionable view of the reasons why to summon resources to do anything—and then forming that view with a deep sense of why to do that thing with AI and machines.

It’ll always be up to me, the human, to ask the right questions—including those that lead AI and other human as well as AI collaborators to ask better questions. (Important caveat here: For the change I wish to evince in the world, “better” = “specific” = “peculiar”.) As part of that process and skill building, that means keeping tabs on what AI can do and ways to work with it—and then sharing that experience with others.

Fun!

AI is getting better at execution, but you still have to point it at the right problem

Leave a Comment Cancel Reply

Contact us today to get a free consultation!

Quick Links

Company

Info