Showing posts from September, 2009

Spam detection on websites?

Assume you have a user content site - or you're using software that can somehow get spam links inserted into it.

How do you find out if your website has spam put on it?

It seems a common enough problem these days... people putting spam links on websites. Surely there must be a service or piece of software to detect such a thing?

I can think of a few ways to write one fairly easily (using existing spam detection tools, but applying them to a spider's crawl of your website). It would be much nicer if a tool that does this already exists, though.
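As a rough illustration of the "crawl your own pages and check the links" idea, here is a minimal sketch. It does not use a real spam detection tool: it swaps in a simple allowlist of trusted hosts, and the names `LinkCollector`, `suspicious_links`, `own_host` and `trusted_hosts` are made up for this example. Fetching the pages (the spider part) is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def suspicious_links(html, own_host, trusted_hosts=()):
    """Return links pointing at hosts that are neither ours nor trusted."""
    collector = LinkCollector()
    collector.feed(html)
    allowed = {own_host, *trusted_hosts}
    flagged = []
    for link in collector.links:
        host = urlparse(link).hostname
        # Relative links have no host, so they stay on our own site.
        if host is not None and host not in allowed:
            flagged.append(link)
    return flagged
```

A real checker would probably hand the flagged links (or the surrounding text) to an existing spam filter rather than rely on an allowlist alone.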

Alsa midi, timidity, fluidsynth and jack.

If you don't have a midi output on linux (because your laptop has crappy audio hardware) you can use timidity or fluidsynth to emulate it.
timidity -iA -B2,8 -Os -EFreverb=0
Well, this piece of html has a bunch of incantations for using timidity on linux... and also gives insight into how to use alsa midi tools.

Like listing midi ports, and connecting midi ports, with these two commands:
$ pmidi -l
Port Client name Port name
14:0 Midi Through Midi Through Port-0
20:0 USB Axiom 25 USB Axiom 25 MIDI 1
128:0 TiMidity TiMidity port 0
128:1 TiMidity TiMidity port 1
128:2 TiMidity TiMidity port 2
128:3 TiMidity TiMidity port 3

To connect the midi input from my usb Axiom 25 keyboard to the timidity synth, aconnect is the command to use.
aconnect 20:0 128:0
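If you'd rather not read the port numbers by eye, the listing can be parsed. This is only a sketch, assuming the listing looks like the `pmidi -l` output above (a port id such as `20:0` at the start of each line); `find_port` and `aconnect_command` are hypothetical helper names, not part of any alsa tool.

```python
import re

# Port ids look like "20:0" at the start of a listing line.
PORT_LINE = re.compile(r"^(\d+:\d+)\s+(.*)$")


def find_port(listing, name_fragment):
    """Return the id of the first port whose description contains name_fragment."""
    for line in listing.splitlines():
        match = PORT_LINE.match(line.strip())
        if match and name_fragment.lower() in match.group(2).lower():
            return match.group(1)
    return None


def aconnect_command(listing, source_name, dest_name):
    """Build the argument list for `aconnect SOURCE DEST`, or None if a port is missing."""
    source = find_port(listing, source_name)
    dest = find_port(listing, dest_name)
    if source is None or dest is None:
        return None
    return ["aconnect", source, dest]
```

The returned list could then be passed to `subprocess.run` to make the connection.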

The AlsaMidiOverview has more information on things.


screen for ghetto servers and startup scripts.

GNU screen is a good little tool for server administration, or running things on your own remote machines. It's even good for running things locally.

I hope this is useful for people who want to run scripts every time they log in or reboot, and who need interactive access to those scripts. It should also be useful for people who are already using screen, but would like to make their setup a bit better:
scripting sessions, rather than doing them manually at each login or reboot
finding your screen sessions more easily
restarting scripts at reboot
monitoring
logging
resource control

Running things as daemons is cool... but if you'd also like interactive control occasionally, running things with screen is useful.
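A small sketch of what scripting a session might look like. It assumes screen's `-d -m` (start detached) and `-S` (name the session) options, and a cron that understands `@reboot`, which many do but not all. The helper names `screen_command` and `reboot_cron_line`, and the session name in the usage note, are made up for illustration.

```python
import shlex


def screen_command(session_name, command):
    """Argument list that starts `command` in a detached, named screen session."""
    # -d -m starts screen detached; -S names the session so it is easy to find.
    return ["screen", "-dmS", session_name] + shlex.split(command)


def reboot_cron_line(session_name, command):
    """A crontab line that restarts the session at boot (needs @reboot support)."""
    parts = screen_command(session_name, command)
    return "@reboot " + " ".join(shlex.quote(part) for part in parts)
```

For example, `reboot_cron_line("worker", "python server.py")` gives a line you could paste into `crontab -e`, and `screen -r worker` would reattach to the session for interactive control.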

Most servers have screen, watch and crontab (osx is lacking watch though), including most linux distros, *bsd, osx and windows (with cygwin). Most OSes also have their own style of init scripts (scripts to run things at boot or logon). So this screen, watch, crontab combination …

Linux sound is getting better.

No, I'm not talking about the free software song sung by Richard Stallman (very funny, but in a low quality .au format). Or the pronunciation of Linus and linux.

To start on this long-journey-of-a-rambling-diatribe-of-words, there are two good audio patches in the SDL bug tracker for the upcoming SDL 1.2.14 release.

One patch is for the pulse audio driver, and the other is for the alsa backend. These solve some of the high latency or scratchy sound issues some have.

That's right, a new SDL release very soon... it's over a year since the last 1.2.13 release, and it seems like forever since the SDL 1.3 series began. Most new development has been happening on the SDL 1.3 tree in the last year... so the 1.2 releases have slowed almost to a stop.

There's a good article on a cross-platform atomic operation API for SDL. That's one of the features that's been evolving over a few years, and is being implemented in svn.

In python ter…

Where did the 'new' module go in python 3?

Anyone know where the 'new' module went in python 3?

2to3 can't seem to find 'new', and I can't find it anywhere with my favourite search engine either... I filed a bug at: issue6964.

A complete 2to3 tool should at least know about all the modules that are missing. Ideally it would also know what to do with those modules, but it should at least be able to tell you which modules are missing. I'm not sure how to get a complete top level module list sanely... I guess by scanning the libs directory of python.

Or maybe there is a module to find all python modules?

Each platform would be slightly different of course... and there'd be differences based on configure. Also some modules have probably stopped importing or compiling at all these days.

Then you could just find the intersection and differences with the lovely set module :)
# find the modules in the 1.x/2.x series that are missing from the 3.x series.
top_level_modules_not_in_3 = set(top_level_modules1_2series) - set(top_level_modules3series)
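On the "is there a module to find all python modules?" question: the standard library's `pkgutil.iter_modules()` gets close. Here is a sketch, with the caveat that it lists whatever is importable from `sys.path` (including third party packages), so the result depends on the platform and install. The function names `top_level_modules` and `missing_in_new` are just for this example. You would run the first function under each interpreter, save the names, and then compare.

```python
import pkgutil
import sys


def top_level_modules():
    """Names of top-level modules visible to the running interpreter."""
    # Modules compiled into the interpreter itself (sys, etc.).
    names = set(sys.builtin_module_names)
    # iter_modules() with no arguments walks every entry on sys.path.
    for _finder, name, _is_pkg in pkgutil.iter_modules():
        names.add(name)
    return names


def missing_in_new(old_modules, new_modules):
    """Modules present in the old series but absent from the new one."""
    return set(old_modules) - set(new_modules)
```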

Well mayb…

py3k(python3) more than one year on - 0.96 % packages supporting py3k.

Python3 was released more than a year ago, and the release candidates and beta releases came out well before then.

How successful has the transition been to python3 so far?

One way to measure that is to look at how many python packages have been ported to py3k. One way to measure what packages are released is to look at the python package index (aka cheeseshop, aka pypi).

Not all packages ported to python3 are listed on pypi, and not all packages are listed on pypi. Also there are many packages which haven't been updated in a long time. However it is still a pretty good way to get a vague idea of how things are going.

73 packages are listed in the python3 section of pypi, and 7568 packages in total. That's 0.96% of python packages having been ported to python3.

Another large project index for python is the pygame website, where there are currently over 2000 projects which use pygame. I think there are 2 projects ported to python3 on there (but I can't find them at the mom…

The many JIT projects... Parrot plans to ditch its own JIT and move towards one using LLVM.

It seems LLVM has gained another user, in the parrot multi language VM project. They plan to ditch their current JIT implementation and start using LLVM.

Full details are on their jit wiki planning page. There is more discussion on the parrot developer Andrew Whitworth's blog here and here.

Parrot aligns very nicely with the LLVM project which itself is attempting to be used by many language projects.

Along with the unladen swallow project (python using LLVM for JIT), this brings other dynamic languages in contact with LLVM. This can only mean good things for dynamically typed languages on top of LLVM.

MacRuby is another project switching to LLVM. They have been working on it since March.

Rubinius seems to be another ruby implementation mostly written in ruby, and the rest in C++ with LLVM. It even supports C API modules written for the main ruby implementation. 'Rubinius is already faster than MRI on micro-benchmarks but is often slower than MRI running applications'.


Linux 2.6.31 released... the good bits.

The new linux kernel has been released. Here are the human readable changes.

Here's the cool stuff (the links in the original article were broken, so I've fixed the links here):
USB 3 support
CUSE (character devices in userspace) and OSS Proxy
Improve desktop interactivity under memory pressure
ATI Radeon Kernel Mode Setting support
Performance Counters
IEEE 802.15.4 Low-Rate Wireless Personal Area Networks support
Gcov support
Kmemcheck
Kmemleak
Fsnotify
Preliminary NFS 4.1 client support
Context Readahead algorithm and mmap readahead improvements

For me the performance counters will be the most useful thing. Also being able to use and write user space character devices is cool (especially for audio). USB3 support is awesome, but not useful right now... since there isn't even much hardware out yet!

More info on what that low power wireless support is can be found on wikipedia: IEEE_802.15.4-2006.

Dependency analysis, and a digression onto mock ducks.

Dependency analysis allows all sorts of fun things in software.

It can be used to reduce software defects. How? Say you have 10 components, and over time they may bit rot or change. Reducing your dependency on as many of those components as possible means you have less chance of encountering a bug. It also means you have far less code to update or refactor. Another reason is that combining multiple components requires more testing... exponentially more testing, since the number of combinations grows with every component you add (which is why unit tests are popular).

Performance can be improved with dependency analysis too, and not just by reducing the amount of code run. If code doesn't have dependencies, it can be run in isolation. This is where some object oriented designs are missing something. When objects have methods which change their internal state, those methods have a dependency on that state. That makes task and data level parallelism harder.

Compare these two calls:
map(o.meth, data)
map(f, data)
If you had…
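One possible way to make the difference between those two calls concrete (this is my own sketch with made up names `RunningTotal` and `f`, not necessarily where the original post was heading): a method that changes internal state gives results that depend on processing order, while a pure function does not, so only the second can be safely split across workers.

```python
class RunningTotal:
    """An object whose method depends on, and changes, internal state."""

    def __init__(self):
        self.total = 0

    def meth(self, x):
        self.total += x  # each call depends on every call before it
        return self.total


def f(x):
    """A pure function: the result depends only on the argument."""
    return x * 2


data = [1, 2, 3]

# Pure function: processing the items in any order gives the same per-item results.
forward = list(map(f, data))
backward = list(map(f, reversed(data)))[::-1]

# Stateful method: the per-item results change with the processing order.
stateful_forward = list(map(RunningTotal().meth, data))
stateful_backward = list(map(RunningTotal().meth, reversed(data)))[::-1]
```

Because `forward` and `backward` match, the `map(f, data)` call could be chopped into chunks and run anywhere. The two stateful lists differ, which is the hidden dependency.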

python build bots down... maybe they need a spectacularly adequate build page instead?

Seems the python build bot pages are down. Maybe something simpler is needed instead of build bots? Something that requires less maintenance.

update from svn
run tests if compile completes
upload results to a simple page
(configure(stdout, stderr),
build(stdout, stderr),
test results(stdout, stderr))
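The steps above can be sketched in a few lines. This is only an outline under assumptions: the actual commands (svn update, configure, make, the test runner) are passed in by the caller, the upload step is left out, and `run_step` and `run_build` are names invented for this example.

```python
import subprocess


def run_step(name, command):
    """Run one step and keep its exit status, stdout and stderr."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return {
        "step": name,
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


def run_build(steps):
    """Run (name, command) steps in order, stopping at the first failure."""
    results = []
    for name, command in steps:
        result = run_step(name, command)
        results.append(result)
        if not result["ok"]:
            break  # e.g. skip the tests when the compile fails
    return results
```

The list returned by `run_build` holds everything the simple results page would need to show for each step.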

The beauty of this is that it can be easily de-centralized.

Could even use pypi infrastructure for this now. Each buildbot has a new project setup, which then updates the pypi project each time it builds. Have a special pypi tag(category) for python build bots, so people can easily search for them. To reduce the spam on pypi, just make it mark the releases as hidden... so they don't show up. The results are added to the pypi listing.

Probably only needs one cmd added to the python setup script to do the upload(based on existing pypi code).

No central authority, and very simple. Anyone who wants to run a 'buildbot' can without authority.

Can probably m…