Predicting the Next Multimodal Token with Emu3, and a Python Relative Import Issue
I recently came across BAAI’s Emu3, an exciting model that aims to unify multimodal understanding and generation within a single architecture. It has the potential to revolutionize how we approach tasks involving text, images, and videos. In this article, I’ll share my experience exploring the Emu3 model, its setup, and how I overcame a common error encountered when using it.
Discovering Emu3
The Emu3 model immediately caught my attention due to its unique design based on the Transformer architecture. It’s built to handle multiple modalities — such as text, images, and videos — using a single model. Although the ultimate goal is to generate every modality using a single set of weights, the current implementation still requires separate sets of model weights for different tasks, which are planned to be unified in future updates.
Unfortunately, the model weights for video generation haven’t been open-sourced yet, but we can still explore the capabilities of the Gen and Chat models. The Chat model handles text generation, while the Gen model focuses on image generation.
Gen model: https://huggingface.co/BAAI/Emu3-Gen
Chat model: https://huggingface.co/BAAI/Emu3-Chat
Encountering a Problem
Eager to try out Emu3, I started with the code snippet provided in the official repository. However, I quickly ran into an issue with the following lines:
import sys
sys.path.append(PATH_TO_BAAI_Emu3-Gen_MODEL)
from processing_emu3 import Emu3Processor
Python threw a NameError because PATH_TO_BAAI_Emu3-Gen_MODEL is a placeholder for the local model path, not a defined name. To find the model path, I manually downloaded the model using the Hugging Face CLI command:
huggingface-cli download --resume-download BAAI/Emu3-Gen
The model was downloaded to the following path, as shown in the log:
/root/.cache/huggingface/hub/models--BAAI--Emu3-Gen/snapshots/a9da517b1a51b160360947020c4c24900529495f
However, simply replacing PATH_TO_BAAI_Emu3-Gen_MODEL with the downloaded path didn’t solve the problem. I encountered a new error:
ImportError: attempted relative import with no known parent package
The error message suggests two things:
- Relative Imports in the Emu3 Code: The Emu3 codebase uses relative imports. For example, in the model repository, modeling_emu3.py imports configuration_emu3.py from the same folder.
- Unknown Parent Package: Python couldn’t identify the parent package, making it impossible to resolve the relative imports.
Understanding Python’s Import Mechanism
To grasp why this error occurred, I revisited how Python handles imports. When Python encounters an import statement, it follows these steps:
- Module Cache Check: Python checks if the module is already in sys.modules. If found, it uses the cached module.
- Searching sys.path: If not cached, Python searches for the module in the directories listed in sys.path.
- Module Loading: Upon finding the module, Python loads it and creates a new module object.
- Resolving Relative Imports: If the module contains relative imports (e.g., from .util import ...), Python resolves them based on the module's __package__ attribute.
- Handling Import Errors: If Python can’t resolve a relative import due to an unknown parent package, it raises an ImportError.
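The first two steps can be observed directly from a Python session. The following is a minimal sketch using only the standard library; json is just a convenient stand-in module, and the made-up name surely_not_a_real_module_xyz is assumed not to exist on your system.

```python
import importlib
import importlib.util
import sys

# Step 1: a module that has been imported once is cached in sys.modules,
# so importing it again returns the very same object.
import json

cached = sys.modules["json"]
again = importlib.import_module("json")
print(again is cached)  # True

# Step 2: find_spec reports where a module would be loaded from.
spec = importlib.util.find_spec("json")
print(spec.name, spec.origin)

# A name that is neither cached nor found on sys.path has no spec at all.
missing = importlib.util.find_spec("surely_not_a_real_module_xyz")
print(missing)  # None
```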
While the CPython implementation can be complex and challenging to interpret, the importlib module offers a clearer example of how relative imports are handled. Although the error messages may differ slightly, the underlying logic remains consistent and aids our understanding. The relevant code for handling relative imports can be found in importlib/__init__.py:
def import_module(name, package=None):
    """Import a module.

    The 'package' argument is required when performing a relative import. It
    specifies the package to use as the anchor point from which to resolve the
    relative import to an absolute import.

    """
    level = 0
    if name.startswith('.'):
        if not package:
            raise TypeError("the 'package' argument is required to perform a "
                            f"relative import for {name!r}")
        for character in name:
            if character != '.':
                break
            level += 1
    return _bootstrap._gcd_import(name[level:], package, level)
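To see that anchor-package logic in action, here is a small sketch that calls the public importlib API rather than the internals. It uses the standard library's json package purely as an example anchor; nothing here is specific to Emu3.

```python
import importlib
import importlib.util

# resolve_name turns a relative name plus an anchor package into an absolute name.
absolute = importlib.util.resolve_name(".decoder", "json")
print(absolute)  # json.decoder

# import_module performs the same resolution and then imports the result.
decoder = importlib.import_module(".decoder", package="json")
print(decoder.__name__, decoder.__package__)  # json.decoder json

# Without an anchor package there is nothing to resolve the leading dot against.
error = None
try:
    importlib.import_module(".decoder")
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```

The leading dot only has meaning relative to some package, which is why the call without package fails before any file lookup happens.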
The Role of __package__
The __package__ attribute is crucial for resolving relative imports. It defines the package context for a module. When a module is imported as part of a package (e.g., import my_package.my_module), __package__ is set to 'my_package'. If a module is imported as a top-level module, __package__ is an empty string, and if it is run directly as a script, __package__ is None. In both cases there is no parent package to anchor a relative import.
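The effect of the import path on __package__ can be checked with a throwaway module. This is a minimal sketch: the names demo_pkg and pkgdemo_mod are made up for illustration, and the files are written to a temporary directory. The same file is imported twice, once through its package and once as a top-level module.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: <tmp>/demo_pkg/__init__.py and <tmp>/demo_pkg/pkgdemo_mod.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "pkgdemo_mod.py").write_text("PACKAGE = __package__\n")

# Make both the parent directory and the package directory itself importable.
sys.path.insert(0, str(root))
sys.path.insert(0, str(pkg_dir))
importlib.invalidate_caches()

as_submodule = importlib.import_module("demo_pkg.pkgdemo_mod")
as_top_level = importlib.import_module("pkgdemo_mod")

print(repr(as_submodule.PACKAGE))  # 'demo_pkg'
print(repr(as_top_level.PACKAGE))  # ''
```

The second import sees an empty package name, which is exactly the situation in which a relative import has nothing to resolve against.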
In our case, since we were adding the model directory directly to sys.path and importing modules from there, Python treated the modules as top-level, standalone modules. Consequently, __package__ was empty (there was no parent package), causing relative imports within those modules to fail at this import:
from processing_emu3 import Emu3Processor
Solving the Problem
To resolve the import issue, I needed to ensure that Python recognized the modules as part of a package, such as snapshot.processing_emu3. This involved organizing the modules within a proper package structure.
Creating a Symbolic Link
First, I created a symbolic link (symlink) to the snapshot directory in my current working directory:
ln -s /root/.cache/huggingface/hub/models--BAAI--Emu3-Gen/snapshots/a9da517b1a51b160360947020c4c24900529495f snapshot
This allowed me to reference the model files relative to my current directory.
Modifying sys.path and Importing Correctly
Next, I modified my script to adjust sys.path and import the Emu3Processor correctly:
import sys
sys.path.append(".")
from snapshot.processing_emu3 import Emu3Processor
By appending "." (the current directory) to sys.path, I ensured that Python would search the current directory for modules. Importing Emu3Processor from snapshot.processing_emu3 preserved the package structure, allowing relative imports within processing_emu3.py to resolve correctly.
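If creating a symlink is inconvenient, the same package context can in principle be set up from Python. The sketch below is an alternative I did not use above, not the method from the Emu3 repository: it registers an empty module whose __path__ points at a directory, so that files inside it are imported as submodules of an alias package. The helper import_from_directory, the alias snapshot_demo, and the two stand-in files are all hypothetical; a temporary directory plays the role of the downloaded snapshot.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path


def import_from_directory(directory, alias, module_name):
    """Import module_name from directory as if the directory were a package named alias."""
    if alias not in sys.modules:
        package = types.ModuleType(alias)
        # A module that has a __path__ attribute is treated as a package.
        package.__path__ = [str(directory)]
        sys.modules[alias] = package
    return importlib.import_module(f"{alias}.{module_name}")


# Stand-in for a downloaded snapshot: one file imports the other relatively.
snapshot_dir = Path(tempfile.mkdtemp())
(snapshot_dir / "configuration_demo.py").write_text("IMAGE_SIZE = 720\n")
(snapshot_dir / "processing_demo.py").write_text(
    "from .configuration_demo import IMAGE_SIZE\n"
    "class DemoProcessor:\n"
    "    image_size = IMAGE_SIZE\n"
)

processing = import_from_directory(snapshot_dir, "snapshot_demo", "processing_demo")
print(processing.DemoProcessor.image_size)  # 720
print(processing.__package__)  # snapshot_demo
```

Because the module is loaded as snapshot_demo.processing_demo, its relative import resolves to snapshot_demo.configuration_demo, which mirrors what the symlink achieves for snapshot.processing_emu3.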
With this setup in place, the rest of the code runs smoothly. It’s worth noting that the Gen model takes some time to generate an image, so patience is key.
Result
This is the 720x720 image that I generated — not bad, right?
That said, the model doesn’t support interleaving text and images in generation, which would be a fantastic feature. Hopefully, this will be added in a future release!
— — —
PS: As a fun exercise, try generating an image with the Chat model and text with the Gen model and see what you get :) Leave me a comment below with your results :)
Appendix: Reproduction of the import issue
To better understand the issue, let’s replicate it with a simple setup:
my_module/
├── lib.py
└── util.py
user/
├── main.py
└── test.ipynb
wrong_usage/
└── main.py
In this structure, the my_module package contains two modules: lib.py and util.py. The lib.py module uses a relative import to access functions from util.py:
from .util import add, subtract, multiply
Now, let’s consider two scenarios:
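- Running python main.py from inside the user/ directory (the relative path ".." is resolved against the current working directory):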
import sys
sys.path.append("..")
print(sys.path)
import my_module.lib
print(my_module.lib.add(1, 2))
print(my_module.lib.subtract(1, 2))
print(my_module.lib.multiply(1, 2))
This script runs successfully because it imports my_module.lib using an absolute import, and the parent directory is added to sys.path.
- Running python main.py from inside the wrong_usage/ directory:
import sys
sys.path.append("../my_module")
print(sys.path)
import lib
print(lib.add(1, 2))
print(lib.subtract(1, 2))
print(lib.multiply(1, 2))
This script raises the “ImportError: attempted relative import with no known parent package” because it tries to import lib directly as a top-level module, i.e. a standalone module, without specifying the parent package.
The result is the same whether the script is run as:
- python main.py
- python -m main
- or from inside a Jupyter notebook
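For convenience, the whole reproduction can be scripted. The following sketch builds a trimmed version of the layout above in a temporary directory (only add is defined, and the notebook is omitted) and runs each main.py from inside its own folder in a subprocess. The helper run_main is a name introduced here for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
for name in ("my_module", "user", "wrong_usage"):
    (root / name).mkdir()

(root / "my_module" / "util.py").write_text("def add(a, b):\n    return a + b\n")
(root / "my_module" / "lib.py").write_text("from .util import add\n")

# Imports lib through its package, so the relative import resolves.
(root / "user" / "main.py").write_text(
    "import sys\n"
    "sys.path.append('..')\n"
    "import my_module.lib\n"
    "print(my_module.lib.add(1, 2))\n"
)
# Imports lib as a top-level module, so the relative import fails.
(root / "wrong_usage" / "main.py").write_text(
    "import sys\n"
    "sys.path.append('../my_module')\n"
    "import lib\n"
    "print(lib.add(1, 2))\n"
)


def run_main(folder):
    """Run `python main.py` from inside the given folder and capture its output."""
    return subprocess.run(
        [sys.executable, "main.py"],
        cwd=root / folder,
        capture_output=True,
        text=True,
    )


good = run_main("user")
bad = run_main("wrong_usage")
print(good.stdout.strip())  # 3
print(bad.stderr.strip().splitlines()[-1])
```

The first run prints the sum, while the last line of the second run's traceback is the ImportError discussed above.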
(Co-written by Claude Opus + ChatGPT 4o)