纳金网

标题: NUKE PYTHON 性能分析 [打印本页]

作者: 铁锹 时间: 2017-7-4 14:45
标题: NUKE PYTHON 性能分析

使用性能计时器

　　在nuke中打开性能计时器时，就可以通过python读取性能信息。就能知道每个节点的运算时间。

　　在调试运行较慢的脚本，找出耗时瓶颈上很有帮助。

　　注意，打开性能计时器会影响nuke的性能，因为会频繁打开、关闭计时器，同步线程。因此打开

　　此功能，nuke会更慢，但是能让你获取nuke脚本处理时间的快照。

　　从命令行加上“-p” 会打开此功能，或者调用python命令来打开计时器

　　 nuke.startperformancetimers()

　　检查计时器是否允许性能计时器：

　　 nuke.usingperformancetimers()

　　关闭计时器：

　　 nuke.resetperformancetimers()

　　nuke架构上的注意事项

　　当nuke内的节点要渲染时，有4个步骤：

　　store ╟ 第一件事就是存下用户在knobs上选择的数据

　　validate ╟ 节点告知nuke其输出的信息，例如处理的哪个通道，以及大小

　　request ╟ 在这节点弄明白了其需要哪些输入才能产生对应的输出(例如，需要通道，需要区域)

　　engine ╟ 在这节点干了大部分的工作，并输出。这也是花费时间最多的地方。

　　更多信息，请看nuke 开发文档

　　通过python获取能耗时间

　　nuke的性能计时器收集这四个处理阶段的信息，并可以通过python来获取。另外，性能计时器激活时，计时信息会显示在节点图中。节点也会根据耗时比重显示不同的颜色，从绿(最快)到红(最慢)。函数nuke.node.performanceinfo()会打印某个节点的时间信息。例如，下面的代码段就能打印当前节点树种每个节点的时间信息(包括在组内的节点)：

　　 for n in nuke.allnodes(recursegroups=true):

　　 print n.fullname()

　　 print n.performanceinfo()

　　对一个简单的只有 checkerboard-> blur -> defocus -> viewer 的节点树输出和下面的很像：

　　 defocus1

　　 {'callcount': 10228, 'timetakenwall': 28524348, 'timetakencpu': 624512794}

　　 blur1

　　 {'callcount': 9607, 'timetakenwall': 9906815, 'timetakencpu': 151406143}

　　 viewer1

　　 {'callcount': 0, 'timetakenwall': 0, 'timetakencpu': 0}

　　 checkerboard1

　　 {'callcount': 34396, 'timetakenwall': 3923322, 'timetakencpu': 29982254}

　　如上 nuke.node.performanceinfo()返回了一个包含下列性能统计信息的字典：

　　callcount 这个过程被调用的次数

　　timetakenwall 用墙上挂钟记录所花费的时间，用户实际要等待此处理要结束的时间。以毫秒计算。

　　timetakencpu

　　linux上是cpu执行代码的时间，也是毫秒计时。是所有cpu上的用时总和。例如，多线程engine的处理时间就要比实际用时长很多。如果平均cpu时间(timetakencpu初始使用的线程数)比每个线程执行时间短，说明cpu线程花了很长时间但没有执行代码。例如，等待锁，说明性能又问题。

　 mac和win上，cpu时间还不能用，mac上这个值和处理的总时间差不多。

　　在linux nuke开24线程上获取的时间信息，我们看下最耗时的两个节点blur和defocus：

　　 defocus1

　　 {'callcount': 10228, 'timetakenwall': 28524348, 'timetakencpu': 624512794}

　　 blur1

　　 {'callcount': 9607, 'timetakenwall': 9906815, 'timetakencpu': 151406143}

　　blur节点的cpu时间是 wall time的24倍，defocus节点的cpu时间是我们期望值的三分之二。说明engine的线程都被blur节点占用了，同事defoucs节点花费了相当长的时间来等待，同时我们也发现了以后优化nuke的一个方向!

　　其他性能统计

　　默认，nuke.node.performanceinfo()会给出engine处理部分,通常也是最好是部分的用时信息. 也可以通过传入下面的参数获取其他处理部分的用时信息:

　　 nuke.profile_store

　　 nuke.profile_validate

　　 nuke.profile_request

　　 nuke.profile_engine

　　例如,获取defocus节点在上面的数中的所有用时,可以用下面的代码:

　　 n = nuke.tonode("defocus1")

　　 print "defocus1"

　　 print "store"

　　 print n.performanceinfo(nuke.profile_store)

　　 print "validate"

　　 print n.performanceinfo(nuke.profile_validate)

　　 print "request"

　　 print n.performanceinfo(nuke.profile_request)

　　 print "engine"

　　 print n.performanceinfo(nuke.profile_engine)

　　结果如下:

　　 # result: defocus1

　　 store

　　 {'callcount': 108, 'timetakenwall': 6571, 'timetakencpu': 6563}

　　 validate

　　 {'callcount': 53, 'timetakenwall': 1451, 'timetakencpu': 1445}

　　 request

　　 {'callcount': 108, 'timetakenwall': 1017, 'timetakencpu': 1009}

　　 engine

　　 {'callcount': 10228, 'timetakenwall': 28524348, 'timetakencpu': 624512794}

　　正如预料的,defocus大部分时间花在了engine处理上,store,validate,request相对都很快.

　　如果并非如此,或者callcount显示store,validate,request调用次数很多,就说明这是一个影响性能的问题了.

　　注意某些节点就是要比例子中的defocus在store,validate,request阶段花费更多时间.例如,reader在validate阶段更长,因为需要打开文件,如果是网络文件就会更慢. scanlinereader节点是另一个例子,validate要更长,同事store节点比rotopaint更慢,因为需要bake曲线.带有耗时表达式的knbo也会在store节点花费更多时间.

　　性能信息写入xml

　　当使用 “-pf “参数运行nuke时,性能数据连带部分系统数据会自动写入xml文件.

　　此模式下,性能计时器会再渲染开始前启动,渲染结束后就讲数据写入xml文件. 当nuke在

　　此模式运行时,调用:nuke.performanceprofilefilename()会返回xml文件名。

via yiihuu

欢迎光临纳金网 (http://go.narkii.com/club/)