Friday, March 31, 2017

Data aware jit/blit - drawing 1.25 to 1.45 times faster.

Drawing different types of pixels can be quicker if you know about the image you are drawing, and if you know that drawing parts of the image with specialised blitters is quicker.

A good example is if your image is 25% areas of large blocks of either white or black. Using a specialised fill routine to just draw those big blocks of color is lots quicker. This is because there is usually an optimised, and hardware accelerated fill routine.

See all this whitespace? (all the white parts on your screen) These can be drawn really quickly with ASIC fill hardware rather than a slow GPU running a general purpose image blitter.

Another example is like this Alien image. The edges of the image are transparent, but the middle has no transparency. Since drawing transparent images is slow, using a different drawing routine for the middle part than the edges turns out to be faster.

Alien graphic used in Pygame Zero teaching framework documentation.
Here is a proof of concept which draws an image used by pygame zero in 80% of the time it normally takes. That is about 1.25 times quicker.

Alien sectioned up, drawn with 5 different blitters, each perfect for the section.

The results vary dramatically depending on the image itself. But the 1.25 times faster is fairly representative of transparent images where the middle part isn't. If it finds sections where the image is a plain colour, that can be 1.42 times faster. Or more. Larger images give you different results as does different hardware. Obviously a platform with a fast path hardware accelerated image fills, or 16 bit image rendering but slow 32bit alpha transparency is going to get a lot bigger speedups with this technique.

Further work is to develop a range of image classifiers for common situations like this, which return custom blitters depending on the image data, and the hardware which it is running on.

(this is one of several techniques I'm working on for drawing things more quickly on slow computers)

Thursday, March 30, 2017

Four new pygame things for slow computers.

There's four things I'd like to work on for faster pygame apps on slower computers (like the Raspberry Pi/Orange Pi).
  • Dirty rect optimizations, to speed up naieve drawing.
  • SDL2, for better hardware support.
  • C/Cython versions of pygame.sprite
  • Surface.blits, a batching API for drawing many images at once.
The general idea with all of these is to take techniques which have proven to improve the performance of real apps on slower computers. Even though, for some of them there are still open questions, I think they are worth pursuing.  Even though more advanced techniques can be used by people to work around these issues, this should be fast even if people do things the easy way on slower computers.

Anyway... this is a summary of the discussions and research on what to do, and a declaration of intent.

Dirty Rect optimizations.

First a couple of definitions. "Dirty Rect" is a technique in graphics where you only update the parts that changed (the dirty parts), and rect is a rectangle encompassing the area that is drawn.

We already have plenty of code for doing this is pretty fantastic ways. LayeredDirty is a particularly good example, which includes overlapping layers and automatically deciding on rendering technique based on what sort of things you're drawing. However, when people just update the whole screen like in pygame zero (the project which has a mission to make things simple for newbies), then there can be performance issues. This change is aimed at making those apps work more quickly without them needing to do anything extra themselves.

So, on to the technique...

Because rectangles can overlap, it's possible to reduce the amount drawing done when using them for dirty rect updating. If there's two rectangles overlapping, then we don't need to over draw the overlapping area twice. Normally the dirty rect update algorithm used just makes a bigger rectangle which contains two of the smaller rectangles. But that's not optimal.

But, as with all over draw algorithms, can it be done fast enough to be worth it?

Here's an article on the topic:

jmm0 made some code here:

DR0ID, also did some with tests and faster code...

So far DR0ID says it's not really fast enough to be worth it in python. However, there are opportunities to improve it. Perhaps a more optimal algorithm, or one which uses C or Cython.

"worst case szenario with 2000 rects it takes ~0.31299996376 seconds"
If it's 20x faster in C, then that gets down to 0.016666. Still too slow perhaps, but maybe not.

For our use case, there might only be 2-10 things drawn. Which seems way under the worst case scenario for performance. Additionally we can use some heuristics to turn it off when it is not worth doing. Like when the whole screen is going to be updated anyway, we don't do it.

Below is one of the cases that benefits the most.
Currently the update method combines two rects like this:
|     |
|   __|____
|___|_|   |
    |     |
    |     |

Into a big rect like this:
|     ####|
|     ####|
|   XX    | # - updated, but no need.
|###      | X - over drawn.
|###      |

Note in these crude diagrams the # area is drawn even though it doesn't need to be, and the XX is over draw. When there are some overlapping rects like this that are large, that can be quite a saving of pixels not needed to be drawn. But what we want is three rects, which do not give us overdraw and do not needlessly update pixels which we do not need to.

|     |
|___|     |
    |     |
    |     |

This can easily save having to draw millions of pixels. But can it be done fast enough, with the right heuristics to be useful?

For a lot of people, and apps, this won't be useful. But I hope it will for those trying to draw things for the first time on slow computers.

There's some more discussion on the pygame mailing list about it, drawbacks, and links to old school Apple region rendering techniques.

SDL 2 hardware support

Version 2 of SDL contains hardware support which makes it faster on some platforms (for some types of apps). This includes the Raspberry PI.

There are actually a few different modules that already have implemented a pygame API subset using SDL2. Which shows me compatibility is important to some.

The approach I want to try going forward is to use a single source SDL1, and SDL2 code base with a compile time option. (like we have single source py2 and py3). There are new SDL2 APIs for hardware acceleration, which can be added later in a way which fits in nicely with our existing API.

Lennard Linstrom has made patches for pygame using SDL2 available here for some years:

The first step is to do an ifdef build for linux, and then do some more testing to confirm areas of compatibility and that the approach is ok.

Additionally we agreed that using Cython is a good idea.

There's quite a long discussion on the pygame mailing list, with more to it than this. But this is the general idea.

C/Cython versions of pygame.sprite

Sprites are high level objects which represent game objects. Graphics with logic.

I've already mentioned things like LayeredDirty, which does all sorts of back ground optimizations and scene management for you.

A lot of this code is in python, and could do with some speed ups. Things like collision detection, and drawing lots of objects are not always bottlenecked by blitting. We've known this ever since the psyco python jit came out. There are other techniques to work around these things, like using something like pymunk for physics, or using a tile based renderer, or using a fast particle renderer. However people still use sprites, for ease of use mainly.

So the plan is to compile them with Pyrex...  err... I mean Cython, and see what sort of improvements we can get that way. I expect naieve collision detection, and drawing will get speed ups.

Surface.blits, a batching API for drawing.

Another area that has been proven to speed up games is by creating a batching API for drawing images. Rather than draw each one at a time, draw them all at once. This way you can avoid the overhead of repeatedly locking the display surface, and you can avoid lots of expensive python function calls and memory allocations.

The proposed names over the years have included blit_list, blits, blit_mult, blit_many call...

I feel we should call it "Surface.blits". To go with drawline/drawlines.
It would take a sequence of tuples which match up with the blit arguments.

The original Surface.blit API.

  blit(source, dest, area=None, special_flags = 0) -> Rect

The new blits API.
    blits(args) -> [rects]
    args = [(source: Surface, dest: Rect, area: Rect = None, special_flags: int = 0), ...]
    Draws a sequence of Surfaces onto this Surface...

    >>> surf.blits([(source, dest),
(source, dest),
(source, dest, area),
(source, dest, area, BLEND_ADD)]
    [Rect(), Rect(), Rect(), Rect()]

One potential option...
  • Have a return_rects=False argument, where if you pass it, then it can return None instead of a list of rects. This way you avoid allocating a list, and all the rects inside it. I'll benchmark this to see if it's worth it -- but I have a feeling all those allocations will be significant. But some people don't track updates, so allocating the rects is not worth it for them. eg. the implementation from Leif doesn't return rects.

It can handle these use cases:
  • blit many different surfaces to one surface (like the screen)
  • blit one surface many times to one surface.
  • when you don't care about rects, it doesn't allocate them.
  • when you do care about update tracking, it can track them.
It can *not* handle (but would anyone care?):
  • blit many surfaces, to many other surfaces.
Areas not included in the scope of this:
  • This could be used by two sprite groups quite easily (Group, RenderUpdates). But I think it's worth trying to compile the sprite groups with Cython instead, as a separate piece of work.
  • Multi processing. It should be possible to use this API to build a multi process blitter. However, this is not addressed in this work. The Surface we are blitting onto could be split up into numberOfCore tiles, and rendered that way. This is classic tile rendering, and nothing in this API stops an implementation of this later.

There's an example implementation by Leif Theden here:

There is always a trade off between making the API fast, and simple enough to be used in lots of different use cases. I feel this API does that. But more benchmarking and research needs to be done first.

There's some discussion of the blits proposal on the pygame mailing list.

Other performance related things.

There was also some discussion of using the quad CPUs on the newer raspberry PI, which are faster than the video hardware on there for some tasks. Unfortunately that won't help the millions of older ones, and even newer ones like the new zero model. I've seen those CPUs outperform the hardware jpeg, and also when used for image classification. However, like taking advantage of METH_FASTCALL in python 3.6, that will have to wait for another time. Separately there's also work going on by other people on other optimization topics. An interesting one is the libjit based graphics blit accelerator.

Thursday, March 23, 2017

pip is broken


Since asking people to use pip to install things, I get a lot of feedback on pip not working. Feedback like this.

"Our fun packaging Jargon"

What is a pip? What's it for? It's not built into python?  It's the almost-default and almost-standard tool for installing python code. Pip almost works a lot of the time. You install things from pypi. I should download pypy? No, pee why, pee eye. The cheeseshop. You're weird. Just call it pee why pee eye. But why is it called pip? I don't know.

"Feedback like this."

pip is broken on the raspberian

pip3 doesn't exist on windows

People have an old pip. Old pip doesn't support wheels. What are wheels? It's a cute bit of jargon to mean a zip file with python code in it structured in a nice way. I heard about eggs... tell me about eggs? Well, eggs are another zip file with python code in it. Used mainly by easy_install. Easy install? Let's use that, this is all too much.

The pip executable or script is for python 2, and they are using python 3.

pip is for a system python, and they have another python installed. How did they install that python? Which of the several pythons did they install? Maybe if they install another python it will work this time.

It's not working one time and they think that sudo will fix things. And now certain files can't be updated without sudo. However, now they have forgotten that sudo exists.

"pip lets you run it with sudo, without warning."

pip doesn't tell them which python it is installing for. But I installed it! Yes you did. But which version of python, and into which virtualenv? Let's use these cryptic commands to try and find out...

pip doesn't install things atomically, so if there is a failed install, things break. If pip was a database (it is)...

Virtual environments work if you use python -m venv, but not virtualenv. Or some times it's the other way around. If you have the right packages installed on Debian, and Ubuntu... because they don't install virtualenv by default.

What do you mean I can't rename my virtualenv folder? I can't move it to another place on my Desktop?

pip installs things into global places by default.

"Globals by default."

Why are packages still installed globally by default?

"So what works currently most of the time?"

python3 -m venv anenv
. ./anenv/bin/activate
pip install pip --upgrade
pip install pygame

This is not ideal. It doesn't work on windows. It doesn't work on Ubuntu. It makes some text editors crash (because virtualenvs have so many files they get sick). It confuses test discovery (because for some reason they don't know about virtual environments still and try to test random packages you have installed). You have to know about virtualenv, about pip, about running things with modules, about environment variables, and system paths. You have to know that at the beginning. Before you know anything at all.

Is there even one set of instructions where people can have a new environment, and install something? Install something in a way that it might not break their other applications? In a way which won't cause them harm? Please let me know the magic words?

I just tell people `pip install pygame`. Even though I know it doesn't work. And can't work. By design. I tell them to do that, because it's probably the best we got. And pip keeps getting better. And one day it will be even better.

Help? Let's fix this.

Tuesday, March 14, 2017

Comments on community

Some notes about the current state of comments, and thoughts about future plans are below.

0) Spam.

So far there hasn't been comment spam on the new comment system(yet!)... but eventually some will get through the different layers of defense. Which are a web app firewall (through cloudflare) (which helps block bots and abusive proxy servers), user signups required, limits on the number of accounts and comments that can be posted per hour, making the spam useless for SEO(nofollow on links) and then a spam classifier.

The spam classifier is pretty basic. It only uses the message text so far, and not other features like 'how old is the poster account', or 'does the user account have another other accounts linked'. Despite that, and only having been trained on a few thousand messages it seems classification works pretty well. It scores 0.97 for ham, and 0.76 for spam when it is cross validated on the test set.

It's sort of weird having source code available for a website, and also trying to ward off spammers. Because if they looked, they can see exactly the measures taken to prevent spam. People who are dedicated to it will be able to easily spam, but casual and automated spam should be able to be stopped.

We used to have a 'grey listing' style account signup, where people could only sign up with a secret link. Whilst this worked ok, it also made it quite challenging. You needed to reach out to the community, or know someone who was already in it. This really did reduce the amount of spam though.

Disqus (a service) commenting was removed, and comments imported from there. This was because they added advertising without getting consent(which I received a lot of complaints about). Additionally we didn't have much control over managing the comments in a way which more suited our community (more on this below).
Gravatars are being used for avatars. There's no profile image for the website itself.

What's left to do with comments...

1) Doc comments

The old "doccomments" need to be moved into the new comment system. This is because the documentation lives in static files and is not produced by our website. Additionally, you can have multiple comments on a single page. Then they need quite some moderation for spam and abuse.

2) Better moderator tools

Adding spam/unspam links for moderators to quickly classify something. Also a list of recent comments that need moderation. The current system for this is really quite clunky.
The aim is to really reduce the work needed to be done by moderators.
Moderating internet comments is soul destroying work... so let's make robots do it.


3) Optionally disabling comments on projects

After some discussions I've decided to add an option for projects to disable comments. This way people don't have to deal with unwanted silly criticism if they don't want to. So, if someone is ready to get feedback for a project they can turn on comments. If they just want to show people what they've done (and perhaps get feedback from their own circle of friends) then they can leave the comments off. There's been a number of people who got some really weird demands, abuse, and other unsavoury comment behaviour... and just quit their projects. eg. this is one project which quit Another person removed more than 20 projects after getting some nasty comments from anonymous strangers on the internet.

4) Comments only from other makers

Additionally, I think it might be a good idea to only allow people who have posted a project to post comments on other peoples projects. This will stop the drive by trolls, and make comments more a discussion amongst peers. Gathering useful feedback, and having constructive criticism is a great thing. I guess this will be an optional thing as well, since whilst feedback from peers is often of a higher quality, hearing from others is also very useful.
[comments allowed options]
   - no comments on this project
   - comments from other project owners only
   - comments from everyone with an account.

This will also be a nice signal that the pygame community is about making things, and that we place importance on making things.
"If you want to comment on this project, you first have to share a project of your own".

5) Reactions, ratings, awards, and stars/favourites

The ludumdare, and pyweek systems have multiple ratings for different aspects of a project. They ask for feedback on particular things. Sound, fun, innovation, production... etc. So I'd like to store those for comments.

Again, this will be optional for projects. Each will have a [Seek feedback.] option.
Feedback like this will make giving more useful comments easier.

Additionally, a 'didn't work for me' option people can click can let people provide that feedback easily without polluting the comments too much.
[didn't work for me] [on which OS][stack trace]
Whilst pointing out defects is useful, it can also tend towards annoying nitpicking and turn into unwanted bikeshed arguments. It also can get in the way of more long form thoughtful discussion.
Awards are fun too(as used in ludumdare/pyweek), like "best duck main character".

Favourites, and stars are useful for keeping a list of ones you personally are interested in. They're useful for following projects. Also for "which projects do other people like".

6) Social auth logins

I've also added fields to projects and user profiles for linking up your twitter username, your bitbucket, and github urls. It's useful to know github/bitbucket links for projects. This allows downloading change information from there, and even releases. Much like how the pygame community dashboard brings information in from dozens of different social platforms, I want to allow projects to have that too.

More importantly, people who want to form teams or work on their projects with others will be able to ask for contributors (or even know where to find the project!)

Allowing people to just enter their github/bitbucket/twitter/etc user names means they don't need to link their accounts for signup. However, letting people use these will allow people to join more quickly. For those truly too lazy to enter in an email address ;)

7) Putting the Python Code of Conduct in front

Putting the Python Code of Conduct in front is another conscious decision. Which in short says to respect each other, and don't be mean. It says the whole python community, along with the pygame community expects to be able to participate in a friendly constructive manner. So it's right there on the front page.
"Leave a thoughtful comment"
The messaging, and branding also tries to suggest people to be thoughtful. Rather than have a "submit comment" button we have a "leave thoughtful comment" button. It's a little thing, but hopefully it signals to people that they should play nice.

Multi coloured branding

8) How to write good criticism?

I'd like to be able to point people to articles on how to do good criticism of both software, and of arts projects. What makes good feedback? What makes a good review?

Is the purpose of a review to nitpick? Is it to help energize people, to recognize people for their work?

Articles like On Giving Feedback I want to link to.
"When it was my work being critiqued, it made me excited to push my design and thinking forward."
I'd to point out reviews of a quality, as good examples. Writing reviews is an art form in itself. My time writing arts reviews really helped me working in creative fields, as much as receiving reviews. It really is a different thing to review a creative piece, compared to reviewing a purely functional piece.

Do you know any good articles on feedback and review we should share?

Monday, March 06, 2017

Pixel perfect collision detection in pygame with masks.

"BULLSHIT! That bullet didn't even hit me!" they cried as the space ship starts to play the destruction animation, and Player 1 life counter drops by one. Similar cries of BULLSHIT! are heard all over the world as thousands of people lose an imaginary life to imperfect collision detection every day.

Do you want random people on the internet to cry bullshit at your game? Well do ya punk?

Bounding boxes are used by many games to detect if two things collide. Either a rectangle, a circle, a box or a sphere are used as a crude way to check if two things collide. However for many games that just isn't enough. Players can see that something didn't collide, so they are going to be crying foul if you just use bounding boxes.

Pygame added fast and easy pixel perfect collision detection. So no more bullshit collisions ok?

Code to go along with this article can be found here ( ).

Why rectangles aren't good enough.

Here are some screen shots of a little balloon game I made modeled after an old commodore 64 game I typed in when I was eight.  Here you can see a balloon, and a cave.  The idea is you have to move the baloon through the cave without hitting the walls.  Now if you used just bounding rectangle collisions, you will see how it would not work, and how the game would be no fun - because the rectangle(drawn in green around the balloon) would hit the sides when the balloon didn't really hit the sides.

You can download the balloon mini game code to have a look at with this article.

How is pixel perfect collision detection done? Masks.

Instead of using 8-32bits per pixel, pygames masks use only 1 bit per pixel. This makes it very quick to check for collisions. As you can compare 32 pixels with one integer compare. Masks use bounding box collision first - to speed things up.
Even though bounding boxes are a crude approximation for collisions, they are faster than using bitmasks. So pygame first does a check to see if the rectangles collide - then if the rectangles do collide, only then does it check to see if the pixels collide.

How to use pixel perfect collision detection in pygame?

There are a couple of ways you can use pixel perfect collision detection with pygame.
  • Creating masks from surfaces.
  • Using the pygame.sprite classes.
You can create a mask from any surfaces with transparency.  So you load up your images normally, and then create the masks for them.
Or you can use the pygame.sprite classes, which handle some of the complexity for you.

Mask.from_surface with Alpha transparency.

By default pygame uses either color keys, or per pixel alpha values to see which parts of an image are converted into the mask.
Color keyed images have either 100% transparent or fully visible pixels. Where as per pixel alpha images have 255 levels of transparency. By default pygame uses 50% transparent pixels as on, or ones that are to collide with.
It's a good idea to pre-calcuate the mask, so you do not need to generate it every frame.

Checking if one mask overlaps another mask.

It is fairly simple to see if one mask overlaps another mask.
Say we have two masks (a and b), and also a rect for where each of the masks is.

#We calculate the offset of the second mask relative to the first mask.
offset_x = a_rect[0] - b_rect[0]
offset_y = a_rect[1] - b_rect[1]
# See if the two masks at the offset are overlapping.
overlap = a.overlap(b, (offset_x, offset_y))
if overlap:
    print "the two masks overlap!"

Pixel perfect collision detection with pygame.sprite classes.

The pygame.sprite classes are a high level way to display your images.  They provide things like collision detection, layers, groups and lots of other goodies.

Note: that comes with this article uses sprites with masks.

If you give your sprites a .mask attribute then they can use the built in collision detection functions that come with pygame.sprite.
class Balloon(pygame.sprite.Sprite):
    def __init__(self):
        pygame.sprite.Sprite.__init__(self) #call Sprite initializer
        self.image, self.rect = pygame.image.load("balloon.png")
        self.mask = pygame.mask.from_surface(self.image)

b1 = Balloon()
b2 = Balloon()

if pygame.sprite.spritecollide(b1, b2, False, pygame.sprite.collide_mask):
    print "sprites have collided!"

Collision response - approximate collision normal.

Once two things collide, what happens next?  Maybe one of these things...
One of the things blows up, disappears, or does a dying animation.
Both things disappear.
Both things bounce off each other.
One thing bounces, the other thing stays. If something going to be bouncing, and not just disappearing, then we need to figure out the direction the two masks collided.  This direction of collision we will call a collision normal.
Using just the masks, we can not find the exact collision normal, so we find an approximation.  Often times in games, we don't need to find an exact solution, just something that looks kind of right.
Using an offset in the x direction, and the y direction, we find the difference in overlapped areas between the two masks.  This gives us the vector (dx, dy), which we use as the collision normal.
If you understand vector maths, you can add this normal to the velocity of the first moving object, and subtract it from the other moving object.

def collision_normal(left_mask, right_mask, left_pos, right_pos):

    def vadd(x, y):
        return [x[0]+y[0],x[1]+y[1]]

    def vsub(x, y):
        return [x[0]-y[0],x[1]-y[1]]

    def vdot(x, y):
        return x[0]*y[0]+x[1]*y[1]

    offset = list(map(int, vsub(left_pos, right_pos)))
    overlap = left_mask.overlap_area(right_mask, offset)
    if overlap == 0:
    """Calculate collision normal"""
    nx = (left_mask.overlap_area(right_mask,(offset[0]+1,offset[1])) -
    ny = (left_mask.overlap_area(right_mask,(offset[0],offset[1]+1)) -
    if nx == 0 and ny == 0:
        """One sprite is inside another"""
    n = [nx, ny]
    return n

Fun uses for masks.

Here's a few fun ideas that you could implement with masks, and pixel perfect collision detection.
  • A balloon game, where the bit masks are created from nicely drawn levels - which are then turned into bitmasks for pixel perfect collision detection.  No need to worry about slicing the level up, or manually specifying the collision rectangles, just draw the level and create a mask out of it. Here's a screen shot from the balloon code that comes with this article:

  • A platform game where the ground is not made out of platforms, so much as pixels. So you could have curvy ground, or single pixel things the characters could stand on.
  • Mouse cursor hit detection. Turn your mouse cursor into something, and rather than have a single pixel hit, instead have the hit be any pixel under the mouse cursor.
  • "Worms" style exploding terrain.