Ansible Source Code Analysis: How the ansible Command Is Invoked

This series of articles walks through Ansible's internal implementation using a few common Ansible usage examples.

Ansible ships many commands: ansible, ansible-playbook, ansible-doc, ansible-galaxy, ansible-console, and so on. While reading Ansible's entry-point code, I noticed that it handles these different commands in an unusual way.

My environment is the stable-2.2 branch of the ansible/ansible repository on GitHub.

$ git branch

* (detached from origin/stable-2.2)
devel

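Start with the entry file, setup.py. Anyone familiar with setuptools will recognize the scripts parameter: put simply, the files listed here are copied to a directory on the system PATH so that users can invoke them.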

scripts=[
    'bin/ansible',
    'bin/ansible-playbook',
    'bin/ansible-pull',
    'bin/ansible-doc',
    'bin/ansible-galaxy',
    'bin/ansible-console',
    'bin/ansible-vault',
]

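Nothing unusual so far. The conventional approach would be for each command to have its own entry script that hands off to a particular CLI class, but Ansible does not do that. As the listing below shows, every other command is just a symbolic link to the ansible command. In other words, whether we call ansible-playbook, ansible-doc, or ansible-galaxy, what ultimately runs is the ansible script.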

$ ls bin -l

total 8
-rwxr-xr-x. 1 root root 4795 Jun 16 11:03 ansible
lrwxrwxrwx. 1 root root 7 Jun 16 10:17 ansible-console -> ansible
lrwxrwxrwx. 1 root root 7 Jun 16 10:17 ansible-doc -> ansible
lrwxrwxrwx. 1 root root 7 Jun 16 10:17 ansible-galaxy -> ansible
lrwxrwxrwx. 1 root root 7 Jun 16 10:17 ansible-playbook -> ansible
lrwxrwxrwx. 1 root root 7 Jun 16 10:17 ansible-pull -> ansible
lrwxrwxrwx. 1 root root 7 Jun 16 10:17 ansible-vault -> ansible

So how does Ansible tell which command was invoked?
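The underlying mechanism can be reproduced outside Ansible: a script started through a symbolic link sees the link's name, not the target's, in sys.argv[0]. The sketch below is not Ansible code. It writes a throwaway script, links a second name to it, and runs it through the link; the file names tool and tool-sub are made up for illustration, and it assumes a platform where os.symlink is permitted.

```python
import os
import subprocess
import sys
import tempfile


def name_seen_through_symlink(link_name="tool-sub"):
    """Run a script through a symlink and return the name it saw in argv[0]."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "tool")
        with open(script, "w") as f:
            # The script only reports the basename it was invoked as.
            f.write("import os, sys\nprint(os.path.basename(sys.argv[0]))\n")
        link = os.path.join(tmp, link_name)
        # Relative link target, in the same spirit as 'ansible-doc -> ansible'.
        os.symlink("tool", link)
        # Run the interpreter on the link path; argv[0] is the path as given.
        out = subprocess.check_output([sys.executable, link])
        return out.decode().strip()


if __name__ == "__main__":
    print(name_seen_through_symlink())
```

Because the interpreter does not resolve the link when it fills in sys.argv[0], a single script can branch on its own invoked name.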

Parsing the ansible Command

Code path: ansible/bin/ansible

Below is a simplified version of the ansible command's entry-point code:

"""
imports omitted
"""

if __name__ == '__main__':

    display = LastResort()
    cli = None
    me = os.path.basename(sys.argv[0])
    """
    me is the name of the invoked command: ansible-playbook if the
    ansible-playbook command was called, ansible-doc for ansible-doc, ...
    """

    try:
        display = Display()
        display.debug("starting run")

        sub = None
        target = me.split('-')
        """
        If the invoked command is ansible-playbook, target is ['ansible', 'playbook']
        """
        if target[-1][0].isdigit():
            # Remove any version or python version info as downstreams
            # sometimes add that
            target = target[:-1]

        if len(target) > 1:
            sub = target[1]
            myclass = "%sCLI" % sub.capitalize()
            """If the invoked command is ansible-playbook, myclass is PlaybookCLI"""
        elif target[0] == 'ansible':
            sub = 'adhoc'
            myclass = 'AdHocCLI'
        else:
            raise AnsibleError("Unknown Ansible alias: %s" % me)

        try:
            mycli = getattr(__import__("ansible.cli.%s" % sub, fromlist=[myclass]), myclass)
            """
            The built-in function __import__ loads the module, and getattr
            fetches the matching CLI class from it by name.

            If the invoked command is ansible-playbook, mycli is
            ansible.cli.playbook.PlaybookCLI
            """
        except ImportError as e:
            # ImportError members have changed in py3
            if 'msg' in dir(e):
                msg = e.msg
            else:
                msg = e.message
            if msg.endswith(' %s' % sub):
                raise AnsibleError("Ansible sub-program not implemented: %s" % me)
            else:
                raise

        try:
            args = [to_text(a, errors='surrogate_or_strict') for a in sys.argv]
        except UnicodeError:
            display.error('Command line args are not in utf-8, unable to continue. Ansible currently only understands utf-8')
            display.display(u"The full traceback was:\n\n%s" % to_text(traceback.format_exc()))
            exit_code = 6
        else:
            cli = mycli(args)
            cli.parse()
            exit_code = cli.run()
            """instantiate --> parse the command line --> call the run method"""

    except AnsibleOptionsError as e:
        cli.parser.print_help()
        display.error(to_text(e), wrap_text=False)
        exit_code = 5
    except AnsibleParserError as e:
        display.error(to_text(e), wrap_text=False)
        exit_code = 4
    # TQM takes care of these, but leaving comment to reserve the exit codes
    # except AnsibleHostUnreachable as e:
    #     display.error(str(e))
    #     exit_code = 3
    # except AnsibleHostFailed as e:
    #     display.error(str(e))
    #     exit_code = 2
    except AnsibleError as e:
        display.error(to_text(e), wrap_text=False)
        exit_code = 1
    except KeyboardInterrupt:
        display.error("User interrupted execution")
        exit_code = 99
    except Exception as e:
        have_cli_options = cli is not None and cli.options is not None
        display.error("Unexpected Exception, this is probably a bug: %s" % to_text(e), wrap_text=False)
        if not have_cli_options or have_cli_options and cli.options.verbosity > 2:
            log_only = False
        else:
            display.display("to see the full traceback, use -vvv")
            log_only = True
        display.display(u"the full traceback was:\n\n%s" % to_text(traceback.format_exc()), log_only=log_only)
        exit_code = 250
    finally:
        # Remove ansible tempdir
        shutil.rmtree(C.DEFAULT_LOCAL_TMP, True)

    sys.exit(exit_code)

Ansible uses the name of the invoked command to import the corresponding class dynamically.
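The name-to-class mapping can be isolated into a small pure function, which makes it easy to try different command names. This is a sketch that mirrors the branching in the entry script, not code from Ansible itself: the function name resolve_cli is made up, ValueError stands in for AnsibleError, and the versioned name in the comment is a hypothetical example of what a downstream package might install.

```python
def resolve_cli(me):
    """Map an invoked command name to (submodule name, class name)."""
    target = me.split('-')
    # Drop a trailing version segment, e.g. 'ansible-playbook-2.7' (hypothetical).
    if target[-1][0].isdigit():
        target = target[:-1]

    if len(target) > 1:
        # 'ansible-playbook' -> ('playbook', 'PlaybookCLI')
        sub = target[1]
        return sub, "%sCLI" % sub.capitalize()
    if target[0] == 'ansible':
        # The bare 'ansible' command maps to the ad-hoc CLI.
        return 'adhoc', 'AdHocCLI'
    # ValueError is a stand-in for AnsibleError in this sketch.
    raise ValueError("Unknown Ansible alias: %s" % me)


if __name__ == "__main__":
    for name in ("ansible", "ansible-playbook", "ansible-doc"):
        print(name, resolve_cli(name))
```

Note that the prefix is only checked when the name has a single segment; for a name with a dash, the second segment alone decides which module is imported, and a missing module surfaces later as an ImportError.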

__import__ is a Python built-in function, and the import statement itself is implemented by calling __import__. Here is a comparison of the two ways to import:

>>> from ansible.cli.playbook import PlaybookCLI as mycli

>>> mycli
ansible.cli.playbook.PlaybookCLI


>>> mycli = getattr(__import__('ansible.cli.playbook', fromlist=['PlaybookCLI']), 'PlaybookCLI')

>>> mycli
ansible.cli.playbook.PlaybookCLI

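__import__ takes the module name as a string, and getattr then fetches the class from that module by name, which makes this form better suited to dynamic imports.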
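The fromlist argument matters here. The following sketch uses only the standard library (json.decoder is chosen arbitrarily as a dotted module name) to show the pattern, alongside the equivalent importlib.import_module call; the helper names load_attr and load_attr_importlib are made up for illustration.

```python
import importlib


def load_attr(module_name, attr_name):
    """Import a module by its dotted name and fetch one attribute from it."""
    # A non-empty fromlist makes __import__ return the leaf module
    # ('json.decoder') rather than the top-level package ('json').
    module = __import__(module_name, fromlist=[attr_name])
    return getattr(module, attr_name)


def load_attr_importlib(module_name, attr_name):
    """Same lookup via importlib.import_module, which returns the leaf module."""
    return getattr(importlib.import_module(module_name), attr_name)


if __name__ == "__main__":
    print(load_attr("json.decoder", "JSONDecoder"))
    # Without fromlist, __import__ hands back the top-level package.
    print(__import__("json.decoder").__name__)
```

Without fromlist, getattr would be applied to the top-level package instead of the submodule, which is why the entry script passes fromlist=[myclass].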

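From this we know that calling the ansible command imports the ansible.cli.adhoc.AdHocCLI class, while calling ansible-playbook imports ansible.cli.playbook.PlaybookCLI. The script then instantiates the class, passes it the command-line arguments to parse, and finally calls the object's run method to complete the invocation.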
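The instantiate, parse, run sequence and its translation of errors into exit codes can be shown in miniature. Everything named here (OptionsError, EchoCLI, main) is a hypothetical stand-in written for this sketch; only the exit codes 5 and 250 are taken from the entry script.

```python
class OptionsError(Exception):
    """Hypothetical stand-in for AnsibleOptionsError."""


class EchoCLI(object):
    """Hypothetical CLI class that follows the instantiate/parse/run protocol."""

    def __init__(self, args):
        self.args = args
        self.targets = None

    def parse(self):
        # args[0] is the program name, as in sys.argv.
        self.targets = self.args[1:]
        if not self.targets:
            raise OptionsError("Missing target hosts")

    def run(self):
        # run() returns the process exit code.
        return 0


def main(cli_class, argv):
    """Instantiate, parse, run; translate known errors into exit codes."""
    try:
        cli = cli_class(argv)
        cli.parse()
        return cli.run()
    except OptionsError:
        return 5    # the code the entry script uses for AnsibleOptionsError
    except Exception:
        return 250  # the code the entry script uses for unexpected failures


if __name__ == "__main__":
    print(main(EchoCLI, ["echo", "web"]))
    print(main(EchoCLI, ["echo"]))
```

Keeping the dispatcher ignorant of the concrete class is what lets one entry script serve every command: any class that offers parse and run fits.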

The CLI Class

Now let's look at how the concrete CLI classes are implemented. The file ansible/cli/adhoc.py shows the pattern: Ansible's subcommand classes all inherit from ansible.cli.CLI (ansible/cli/__init__.py).

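CLI is an abstract class. For background, see the article "Python装饰器、metaclass、abc模块学习笔记" (study notes on Python decorators, metaclasses, and the abc module).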
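As a quick illustration of what an abstract class means in practice, here is a minimal sketch written for Python 3 with the standard abc module. The class names BaseCLI, IncompleteCLI, and PingCLI are hypothetical and this is not how Ansible declares its own base class; it only shows the two behaviors that matter for this article: a subclass that leaves an abstract method unimplemented cannot be instantiated, and an abstract method may still carry shared logic that subclasses reach through super().

```python
from abc import ABC, abstractmethod


class BaseCLI(ABC):
    """Hypothetical base class, loosely modeled on the pattern described here."""

    def __init__(self, args):
        self.args = args
        self.parsed = None

    @abstractmethod
    def parse(self):
        # Shared logic: subclasses must override parse(), but may call this.
        self.parsed = list(self.args[1:])

    @abstractmethod
    def run(self):
        """Subclasses must provide this."""


class IncompleteCLI(BaseCLI):
    def parse(self):
        super().parse()
    # run() is missing, so this class is still abstract.


class PingCLI(BaseCLI):
    def parse(self):
        super().parse()

    def run(self):
        return "ping %s" % ",".join(self.parsed)


if __name__ == "__main__":
    cli = PingCLI(["prog", "web", "db"])
    cli.parse()
    print(cli.run())
    try:
        IncompleteCLI(["prog"])
    except TypeError as exc:
        print("cannot instantiate:", exc)
```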

The core methods of ansible.cli.CLI are the following:

@staticmethod
def base_parser(usage="", output_opts=False, runas_opts=False, meta_opts=False, runtask_opts=False, vault_opts=False, module_opts=False,
    async_opts=False, connect_opts=False, subset_opts=False, check_opts=False, inventory_opts=False, epilog=None, fork_opts=False,
    runas_prompt_opts=False, desc=None):
    # (body omitted)

# A method decorated with abstractmethod must be implemented by every subclass;
# otherwise the subclass cannot be instantiated
@abstractmethod
def parse(self):
    # (body omitted)

@abstractmethod
def run(self):
    # (body omitted)

base_parser defines the command-line options for a set of basic categories and returns a SortedOptParser object.

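A subclass chooses which categories of options it needs through the arguments it passes. In its overridden parse method it calls CLI.base_parser, adds its own extra options, and then parses the incoming command-line arguments. AdHocCLI implements it as follows: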


def parse(self):
    ''' create an options parser for bin/ansible '''

    self.parser = CLI.base_parser(
        usage='%prog <host-pattern> [options]',
        runas_opts=True,
        inventory_opts=True,
        async_opts=True,
        output_opts=True,
        connect_opts=True,
        check_opts=True,
        runtask_opts=True,
        vault_opts=True,
        fork_opts=True,
        module_opts=True,
        desc="Define and run a single task 'playbook' against a set of hosts",
        epilog="Some modules do not make sense in Ad-Hoc (include, meta, etc)",
    )

    # options unique to ansible ad-hoc
    self.parser.add_option('-a', '--args', dest='module_args',
        help="module arguments", default=C.DEFAULT_MODULE_ARGS)
    self.parser.add_option('-m', '--module-name', dest='module_name',
        help="module name to execute (default=%s)" % C.DEFAULT_MODULE_NAME,
        default=C.DEFAULT_MODULE_NAME)

    super(AdHocCLI, self).parse()

    if len(self.args) < 1:
        raise AnsibleOptionsError("Missing target hosts")
    elif len(self.args) > 1:
        raise AnsibleOptionsError("Extraneous options or arguments")

    display.verbosity = self.options.verbosity
    self.validate_conflicts(runas_opts=True, vault_opts=True, fork_opts=True)

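The AdHocCLI subclass adds two new options, -a and -m, for the module arguments and the module name, and then calls validate_conflicts to check that the command-line arguments are valid.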
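The overall pattern (a shared parser factory switched on by boolean flags, command-specific options added on top, then validation) can be sketched with the standard library's optparse. This is a simplified illustration under stated assumptions, not Ansible's implementation: the option set, the default values ('command', 5), the function names, and the final forks check are placeholders chosen for the example, and the forks check is not a claim about what validate_conflicts does.

```python
from optparse import OptionParser


def base_parser(usage="", connect_opts=False, fork_opts=False):
    """Build a parser, adding only the option categories that were requested."""
    parser = OptionParser(usage=usage)
    parser.add_option('-v', '--verbose', dest='verbosity', action='count', default=0)
    if connect_opts:
        parser.add_option('-u', '--user', dest='remote_user', default=None)
    if fork_opts:
        parser.add_option('-f', '--forks', dest='forks', type='int', default=5)
    return parser


def parse_adhoc(argv):
    """Add command-specific options, parse, then validate the result."""
    parser = base_parser(usage='%prog <host-pattern> [options]',
                         connect_opts=True, fork_opts=True)
    # Options unique to this (hypothetical) command.
    parser.add_option('-a', '--args', dest='module_args', default='')
    parser.add_option('-m', '--module-name', dest='module_name', default='command')

    # argv[0] is the program name, so parsing starts at argv[1].
    options, args = parser.parse_args(argv[1:])
    if len(args) < 1:
        raise ValueError("Missing target hosts")
    if len(args) > 1:
        raise ValueError("Extraneous options or arguments")
    # Illustrative sanity check only.
    if options.forks < 1:
        raise ValueError("forks must be >= 1")
    return options, args


if __name__ == "__main__":
    opts, rest = parse_adhoc(['ansible', 'web', '-m', 'ping'])
    print(opts.module_name, rest)
```

The flags keep each command's help output limited to the options it actually supports, while the definitions of shared options live in one place.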

The run method carries out the actual task. In AdHocCLI it creates a play with a single task; that code is covered in the next article.

ansible.cli.adhoc.AdHocCLI.run

Summary

I drew a mind map of the flow. It is not very sharp, but it conveys the general picture.

ansible-cli