Background
In an old system, there is a piece of code like this:
scope = dict(globals(), **locals())
exec(
"""
global_a = 123
def func_a():
print(global_a)
"""
, scope)
exec("func_a()", scope)
The first segment of user code defines a function, and the second segment executes the function (don’t ask why this is done, because the user is always right). After executing the first code block, func_a
and global_a
are added to the scope. Since the second code block uses the same scope, calling func_a
in the second block correctly outputs 123.
However, using exec
to execute user code is neither elegant nor safe, so the exec
function was encapsulated in a Python sandbox environment (simply put, another Python service that receives the code and scope, and executes exec(code, scope)
in the sandbox environment). Essentially, each call to exec
was replaced with an RPC request to the sandbox service.
The code then becomes:
scope = dict(globals(), **locals())
scope = call_sandbox(
"""
global_a = 123
def func_a():
print(global_a)
"""
, scope)
call_sandbox("func_a()", scope)
Issue with Passing Scope Across Services
Since multiple RPC calls need to use the same scope, the sandbox service returns a new scope to ensure the scope isn’t lost on subsequent calls. However, when executing the code, you’ll find that the second call_sandbox
call returns an error:
global name ‘global_a’ is not defined
Initially, it was suspected that the scope wasn’t updated after the first call. However, if the scope wasn’t updated, it should have reported that func_a
couldn’t be found. This error indicates that during the second call, func_a
exists in the scope, but func_a
can’t find the variable global_a
. By printing the scope before the second call_sandbox
, you’ll find that both global_a
and func_a
exist:
print(scope.keys())
# ['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__', '__cached__',
# '__builtins__', 'global_a', 'func_a']
call_sandbox("func_a()", scope)
This proves that during the second call_sandbox
, the scope was correctly passed in, and the absence of an error about func_a
confirms this. By retrieving and printing globals()
and locals()
inside func_a
:
def func_a():
inner_scope = dict(globals(), **locals())
print(inner_scope.keys())
# ['__builtins__']
You can see that the outer scope of func_a
is normal, but inside func_a
, the scope only contains builtins, effectively clearing the scope. It’s suspected that the function’s caller points to the sandbox environment’s scope, and when the scope is returned, the caller isn’t updated, so the function can’t find the outer scope. Checking Python function magic methods:
Reveals a globals variable pointing to the current scope, essentially the function’s caller. Verify whether the globals of func_a
in the scope returned by the sandbox service matches the current scope:
scope["func_a"].__globals__ == globals() # False
Indeed, they are different. Next, try setting scope["func_a"].__globals__
to globals()
, which should resolve the issue.
Optimizing Scope Update Logic
The root cause of the issue is now clear:
- The first
exec
statement and the secondexec
statement are executed in Python services A and B, respectively. The scope offunc_a
defined in the firstexec
statement is service A (func_a.__globals__ == A
). - When the scope is returned to service B,
global_a
andfunc_a
are copied to service B’s scope, butfunc_a.__globals__
still points to service A’s scope, resulting infunc_a
being callable but unable to findglobal_a
withinfunc_a
. - Setting
func_a.__globals__
to B allows the code to execute correctly in service B.
As documented, the function globals is a read-only variable, so it can’t be directly assigned. Instead, a function copying method is needed:
import copy
import types
import functools
def copy_func(f, globals=None, module=None):
if globals is None:
globals = f.__globals__
g = types.FunctionType(f.__code__, globals, name=f.__name__,
argdefs=f.__defaults__, closure=f.__closure__)
g = functools.update_wrapper(g, f)
if module is not None:
g.__module__ = module
return g
Update the scope returned by the sandbox call. If a value in the scope is a function, update its globals to the scope using the copying method:
scope = dict(globals(), **locals())
scope = call_sandbox(
"""
global_a = 123
def func_a():
print(global_a)
"""
, scope)
for k, v in scope.items():
if isinstance(v, types.FunctionType):
scope[k] = copy_func(v, scope, __name__)
call_sandbox("func_a()", scope)
Re-run the code, and both call_sandbox
calls execute correctly, resolving the issue.
References
https://docs.python.org/3/reference/datamodel.html
https://stackoverflow.com/questions/2904274/globals-and-locals-in-python-exec/2906198