Google's Gemini and Pixel 10 natural language image editing is the future of Photoshop
Adobe is doing a good job staying on top of AI developments and integrating them into its product suite, but Google's latest could steal its prosumer base - and spawn new creative powerhouses, too
Google has made some very interesting product announcements over the past week around AI-powered image editing. There are actually two separate announcements, with no overt link between them, but it seems likely that they at least share some common DNA under the hood.
Today (Tuesday, August 26), Google announced that its DeepMind team has developed an extremely sophisticated and capable image editing model nicknamed ‘nano banana,’ which, aside from calling to mind Nintendo’s excellent new Switch 2-exclusive Donkey Kong game, also brings very powerful editing tools to its Gemini AI-powered assistant.
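To make the idea a little more concrete, here’s a rough sketch of what asking the model for an edit might look like through the Gemini API, using Google’s google-genai Python SDK. The model ID, the example filenames and the response handling below are my own assumptions for illustration, not anything Google has confirmed about how the Pixel or Google Photos integrations actually work.

```python
# A rough sketch, not Google's actual implementation: editing a photo with a
# natural-language instruction via the Gemini API (google-genai Python SDK).
# pip install google-genai pillow
from io import BytesIO

from google import genai
from PIL import Image

client = genai.Client()  # expects an API key in the GEMINI_API_KEY environment variable

source = Image.open("beach_photo.jpg")  # hypothetical input photo

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",  # assumed ID for the image-editing ('nano banana') model
    contents=[source, "Fix the lighting and remove the people in the background"],
)

# Edited images come back as inline data parts alongside any text the model returns.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        Image.open(BytesIO(part.inline_data.data)).save("edited_photo.png")
```

The interesting part isn’t the code, of course - it’s that the entire ‘edit’ is a single plain-English sentence.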
And then last week, Google unveiled its new Pixel device lineup, including the Pixel 10, Pixel 10 Pro and Pixel 10 Pro Fold smartphones, plus new Pixel Buds hardware and capabilities, and a new Pixel Watch 4. One of the highlight features of the new Pixel 10 line is the ability to edit photos taken with the devices in Google Photos using plain, natural language commands - requests as simple as ‘make this photo look better,’ as well as more specific instructions like ‘fix the lighting in this picture’ or ‘frame the subject in a more interesting composition.’
Google, like virtually every other smartphone maker and photo editing software provider, has long offered ‘auto’ adjustment features that make an algorithmically determined set of optimizations and changes to the pictures people take. And Google has used generative AI-powered editing for things like ‘magically’ erasing elements from pictures or improving specific parts of photos, but these new editing tools go beyond that, offering more comprehensive, powerful and flexible options with a user interface that’s accessible to even the most non-expert, non-technical user.
These features are not dissimilar to some of the recent releases from companies like Adobe, whose entire business is built on image editing. Adobe offers generative erase/removal and canvas expansion features, for instance, that either automatically try to intuit what you want to accomplish or take natural language instructions to steer their results.
Adobe’s tools are still aimed at and optimized for professionals who want to start using advanced AI features to lighten the load of highly repetitive, painstaking manual work during edits. Google’s tools are aimed squarely at non-expert users, many of whom have probably never opened a dedicated image editing app in their lives. They’re very different ideal customer profiles, but they also meet in the middle, and I think the bigger question is whether Google’s features and services start pushing up into Adobe’s market before Adobe’s tools start pushing down into Google’s.
Even Google’s own two tools are aimed at slightly different use cases and segments: the Google Photos editor is focused on naturalistic, small changes to existing photos that keep them looking mostly like the photos you actually took, while Gemini’s enhanced editing capabilities are designed for more imaginative uses, like changing your entire outfit or setting, or combining subjects from multiple photos into one generated image. But both sit on a spectrum of uses that act as entry points for many Photoshop-curious users who pick up Adobe’s suite for the first time and end up becoming lifelong users.
I wrote previously about how OpenAI’s latest image generation tools, powered by GPT-4o, represented a huge shift in AI-powered graphics and visuals, and Google’s newest tools are a continuation and significant evolution of that shift. It’s actually much easier to anticipate a future in which ‘vibe photo editing’ is a going professional concern and a paid opportunity than it is to envision a scenario where ‘vibe coder’ is a real and sustainable job. Rick Rubin famously told 60 Minutes that he accomplished everything he did without any technical ability, ascribing his success instead to taste, and photo editing has more in common with music production than it does with coding.
What I think might result from continued improvements like natural language editing in Google Photos and Gemini’s new ‘nano banana’ features is a changed creative industry, one where taste and style, rather than technical capability, determine who becomes most successful. I don’t actually think it represents a path to ‘democratization’ of creative skills and expression: instead, I think it means value will accrue more readily to those who actually deserve it, rather than to people who can decode the specific technical tools and advanced software incantations that separate ‘professionals’ from ‘amateurs’ today. An editor with mediocre creative vision but an adept-level grasp of Photoshop or Adobe Premiere is unlikely to get very far in this predicted world; a true visionary who can barely operate their iPhone, by contrast, stands a much better chance.
Speaking as someone who has spent 20-plus years developing proficiency with all kinds of ‘professional’-level creative software, I’d personally love to see more of those visionaries get a real chance.



