Apple Unveils Open-Source AI for Image Editing with Natural Language Commands
Apple's "MGIE" (MLLM-Guided Image Editing) uses multimodal large language models to interpret user requests and carry out pixel-level manipulations.
The model handles a wide range of image edits, from global adjustments such as brightness, contrast, and artistic effects to local modifications of specific regions. It also supports Photoshop-style edits, including cropping, resizing, rotating, adding filters, changing backgrounds, and blending images.
For instance, a user can ask the model to "make a pizza photo look healthier," and it will add vegetable toppings. Global optimizations such as "adding contrast for simulated light" and Photoshop-style modifications such as removing background elements show the model's versatility.
Apple developed MGIE in collaboration with University of California researchers and presented it at ICLR 2024. The code, data, and pre-trained models are available on GitHub.
This marks Apple's second recent breakthrough in AI research, following its technique for using flash memory to deploy large language models on memory-constrained devices, disclosed in late December.
Apple is developing a ChatGPT rival, dubbed "Apple GPT," according to Bloomberg's Mark Gurman, a sign of how heavily the company is prioritizing AI. It is also building "Ajax," a framework for large language models.
Reports from The Information and analyst Jeff Pu suggest that Apple aims to introduce a generative AI feature on the iPhone and iPad by late 2024, coinciding with the release of iOS 18. Gurman anticipates that iOS 18 could mark a significant milestone in the iPhone's history, featuring an enhanced Siri with ChatGPT-like generative AI capabilities.