Danish Kapoor

Artificial intelligence model from Apple that can edit pictures with written commands

Apple is not currently among the top players in the AI market. However, the company's new open-source artificial intelligence model for image editing shows how much the technology giant can contribute to the field.

The model in question is called MLLM-Guided Image Editing (MGIE). It interprets images and processes text-based commands through multimodal large language models (MLLMs). In other words, Apple's tool can edit photos based on the text users type.

The company developed MGIE with researchers from the University of California, Santa Barbara. MLLMs have the power to transform simple or vague text prompts into more detailed and clear instructions that the photo editor can follow. For example, if a user wants to edit a photo of pepperoni pizza to “make it healthier,” MLLMs might interpret this as “add veggie toppings” and edit the photo that way.
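The two-stage idea described above can be sketched in a few lines of Python. This is purely illustrative: the function names and the hard-coded lookup table that stands in for the MLLM are assumptions, not Apple's actual API.

```python
# Illustrative sketch of MGIE's two-stage flow: an MLLM first expands a
# vague prompt into an explicit instruction, then an editor applies it.
# All names and the lookup table below are hypothetical, not Apple's API.

def expand_instruction(vague_prompt: str) -> str:
    """Stand-in for the MLLM step: turn a vague prompt into an explicit edit."""
    # A real MLLM would generate this; a lookup keeps the sketch runnable.
    examples = {
        "make it healthier": "add vegetable toppings to the pizza",
        "brighten it": "increase brightness and contrast",
    }
    return examples.get(vague_prompt, vague_prompt)

def edit_image(image_path: str, vague_prompt: str) -> str:
    """Stand-in for the editing step: apply the expanded instruction."""
    instruction = expand_instruction(vague_prompt)
    return f"applied '{instruction}' to {image_path}"

print(edit_image("pizza.jpg", "make it healthier"))
```

In the pizza example from the article, the vague prompt "make it healthier" would first be expanded into a concrete instruction before any pixels are touched.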

In addition to making major changes to images, MGIE can crop, resize, and rotate photos via text prompts, as well as improve their brightness, contrast, and color balance. It can also edit specific areas of a photo, for example changing the hair, eyes, or clothing of a person in the image or removing elements from the background.

Apple released its new tool via GitHub

Apple released the model via GitHub, and those interested can also try a demo on Hugging Face Spaces. The company has not said how it plans to apply what it learned from this project.
