Investigating pypy frame drop
When using cpyext, pypy occasionally has spikes in the time a frame takes, otherwise known as pauses.
This is because it's deallocating lots of CPython objects all in one frame, rather than incrementally. This is likely to be addressed in a future release of pypy.
It's mainly a problem when creating and deleting lots of CPython objects (like pygame.Rect), not pure python objects.
To work around it for now:
- src/rect.c has been changed to keep a free list of python objects, so it doesn't alloc and dealloc them each time. Instead it maintains its own list of rect PyObject pointers and reuses them. This is done in PR #431.
- Set the PYPY_GC_NURSERY=1m environment variable (one of the tuning knobs of pypy's incminimark GC) when running pypy.
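To make the free list idea concrete, here is a minimal sketch in Python. It is only an illustration of the technique: the real change lives in C in src/rect.c, and the names Rect, RectFreeList, acquire, release and max_size are made up for this example.

```python
class Rect:
    """Stand-in for an extension object such as pygame.Rect (hypothetical)."""
    __slots__ = ("x", "y", "w", "h")

    def __init__(self, x=0, y=0, w=0, h=0):
        self.x, self.y, self.w, self.h = x, y, w, h


class RectFreeList:
    """Keeps released rects around and hands them back out instead of allocating."""

    def __init__(self, max_size=100):
        self._free = []
        self._max_size = max_size  # assumed cap so the list cannot grow forever

    def acquire(self, x, y, w, h):
        if self._free:
            rect = self._free.pop()  # reuse a previously released object
            rect.x, rect.y, rect.w, rect.h = x, y, w, h
            return rect
        return Rect(x, y, w, h)  # free list is empty: allocate a new one

    def release(self, rect):
        if len(self._free) < self._max_size:
            self._free.append(rect)  # keep it for reuse
        # otherwise drop it and let the garbage collector reclaim it


pool = RectFreeList()
first = pool.acquire(0, 0, 10, 10)
pool.release(first)
second = pool.acquire(5, 5, 20, 20)
print(second is first)  # the released object was handed out again
```

The point is that a released object is never deallocated while there is room in the list, so there is less for the collector to clean up in any single frame.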
hacked up benchmark.
The hacked up benchmark is in the branch pypy-hack-frame-bench of the pygame repo. It's based off examples.testsprite.
Because matplotlib doesn't currently compile on pypy, you need to run it in two steps. First do the run (let it run for 20+ seconds). Then load the data pickle and show the graph.
pypy examples/testsprite.py -plot -noupdate_rects -width 320 -height 200 100
python examples/testsprite.py -plotpickle
To run with the GC tweak...
PYPY_GC_NURSERY=1m pypy examples/testsprite.py -plot -noupdate_rects -width 320 -height 200 100
graph colors
- Blue is the first 1200 frames
- orange is the last 1200 frames of the run.
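The record-then-plot workflow and the two graph series can be sketched roughly like this. It is not the code from the benchmark branch: the helper names, the pickle file name and the do-nothing frame function are assumptions for illustration, and it assumes Python 3 (for time.perf_counter).

```python
import os
import pickle
import tempfile
import time


def record_frame_times(num_frames, draw_frame):
    """Call draw_frame num_frames times and return the seconds each call took."""
    times = []
    for _ in range(num_frames):
        start = time.perf_counter()
        draw_frame()
        times.append(time.perf_counter() - start)
    return times


def save_frame_times(path, times):
    with open(path, "wb") as f:
        pickle.dump(times, f)


def load_frame_times(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def first_and_last(times, count=1200):
    """Split a run into its first and last `count` frames (blue and orange)."""
    return times[:count], times[-count:]


# Step 1 (the part that would run under pypy): record and pickle.
path = os.path.join(tempfile.mkdtemp(), "frame_times.pickle")  # hypothetical file
save_frame_times(path, record_frame_times(3000, lambda: None))

# Step 2 (the part that would run where matplotlib is available): load and split.
blue, orange = first_and_last(load_frame_times(path))
print(len(blue), len(orange))
```

Plotting the two lists is left out so the sketch runs without matplotlib installed.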
Time per frame on pypy 5.10.0.
For this example we don't see any 'jit warm up' behavior.
Time per frame on pypy 5.10.0, rect freelist, standard GC.
Time per frame on pypy 5.10.0, rect freelist, PYPY_GC_NURSERY=1m.
This final result for pypy is 'ok' because we want under 0.0166 seconds used per frame (one frame at 60Hz). Future improvements inside pypy itself will hopefully remove many of these spikes.
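The 0.0166 second target is just 1/60 of a second. A small sketch of checking a run against that budget, with made-up sample times and a hypothetical summarize helper:

```python
FRAME_BUDGET = 1.0 / 60.0  # about 0.0166 seconds per frame at 60Hz


def summarize(times, budget=FRAME_BUDGET):
    """Return the average, the maximum, and how many frames went over budget."""
    over = [t for t in times if t > budget]
    return {
        "average": sum(times) / len(times),
        "maximum": max(times),
        "over_budget": len(over),
    }


sample = [0.004, 0.005, 0.021, 0.004, 0.017]  # made-up frame times in seconds
stats = summarize(sample)
print(stats["over_budget"])  # 0.021 and 0.017 are both above 1/60
```

A spike shows up as a high maximum and a non-zero over_budget count even when the average looks fine.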
Time per frame on python3.6
Here we see both the average time per frame and the maximums are smaller on python3.6.
Time per frame on python3.6, rect freelist
It seems the freelist for rects on python3.6 has a minimal effect.
Time per frame on python3.6, gc.disable()
Here we can see using gc.disable() seems to have no effect in this program (on python 3.6). I also tried this on pypy, and it had no effect either (on this program).
Another thing I tried was doing a gc.collect(0) just before display.flip(), however this didn't appear to help in this case. The reasoning was that we can know when we have 5-10ms free while waiting for the display VSYNC to flip (when running at 60Hz), and that idle time could be used for a collection. Perhaps this can be attempted again as the cpyext GC is improved.
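For reference, the collect-before-flip experiment looks roughly like the sketch below. The run_frame helper and its parameters are hypothetical, and the callables stand in for the real drawing code and pygame.display.flip so the sketch runs without a display.

```python
import gc
import time


def run_frame(update_and_draw, flip, collect_before_flip=True):
    """Run one frame and return how long it took in seconds."""
    start = time.perf_counter()
    update_and_draw()
    if collect_before_flip:
        # On CPython this collects only the youngest generation.
        # What pypy does with the argument is not covered here.
        gc.collect(0)
    flip()  # in a real program this would be pygame.display.flip
    return time.perf_counter() - start


calls = []
elapsed = run_frame(lambda: calls.append("draw"), lambda: calls.append("flip"))
print(calls)
```

Timing each frame this way with collect_before_flip on and off is one way to compare the two variants.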