Arachni: XMLRPC service dispatcher — distributed design and implementation

In version 0.2.1 a nifty XMLRPC server interface was added, allowing the system to be deployed in a distributed fashion…however, there was a conscious limitation in the design.
There was a 1-to-1 relationship between XMLRPC clients and servers; meaning that the XMLRPC server wasn’t built to allow multiple clients to connect to a given XMLRPC server instance.
You see, I never meant to support such functionality because:

  1. The system is designed for individuals, not organisations
  2. If an organisation wanted to use Arachni to provide services they could just develop a simple dispatcher to overcome that limitation
  3. I was planning to write such a dispatcher for my MSc thesis (supporting server pools, automated deployment in the cloud, key management and lots of other really cool stuff) and then incorporate that into the OSS version

To those of you who are inclined to react to the 3rd point like so:
Look at that cheeky bastard…he’s holding out on the community for his own gain.
My answer is: Well, duh! I’ve given you a free kick-ass system; stop bitching.
(But that wasn’t the point, I just wanted to use my time to keep working on Arachni; however the previous response makes me sound more colourful, heh…)

Anyways, to the point: the good folks at NopSec needed that functionality now, so I obliged; mainly because it’s a lot of fun designing and developing that kind of stuff. Designing and building distributed systems makes your penis feel bigger.
And I already had the design in my head, it was just a matter of writing the code.

Here’s how the dispatcher works (there’s a rough sketch of the client’s side right after the list):

  1. a client issues a ‘dispatch’ call
  2. the dispatcher starts a new XMLRPC server on a random port
  3. the dispatcher returns the port of the XMLRPC server to the client
  4. the client connects to the XMLRPC server listening on that port and does his business
  5. once the client finishes using the XMLRPC server instance, it shuts that instance down.
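
To make that concrete, here’s roughly what the client’s side of that conversation looks like in Ruby. Mind you, this is just a sketch; the handler names (‘dispatcher.dispatch’, ‘arachni.run’, ‘arachni.shutdown’), the port and the target URL are made up for illustration and aren’t the actual Arachni API:

```ruby
require 'xmlrpc/client'

# 1. Issue a 'dispatch' call to the Dispatcher (hypothetical handler name/port).
dispatcher    = XMLRPC::Client.new( 'localhost', '/RPC2', 7331 )
instance_port = dispatcher.call( 'dispatcher.dispatch' )

# 2-4. The Dispatcher has spawned a fresh XMLRPC server on a random port and
#      returned that port; connect to it and do your business.
instance = XMLRPC::Client.new( 'localhost', '/RPC2', instance_port )
instance.call( 'arachni.run', 'http://example.com/' )

# 5. Once done with this particular instance, tell it to shut itself down.
instance.call( 'arachni.shutdown' )
```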

I know what you’re thinking about the last point: why shut it down? Why not reset the instances and reuse them as a pool?
There are some technical difficulties at the moment, plus I’ve detected memory leaks, so it’s prudent to shut the XMLRPC instances down. I wish I could do my own memory management so as to overcome that limitation but…we’ll see.

Here’s a diagram to help you better understand:

Allow me some time to criticise my own design now:

Questions:

  1. Why not use sessions to manage multiple clients instead of spawning 1-to-1 server instances?
  2. Why not use some other approach instead of hogging port numbers like that?
  3. I see no authentication in that design, what’s up with that?

Answers:

#1
I wanted the children (XMLRPC server processes) to be completely detached from each other and from the parent (the Dispatcher).
Meaning that should either a child or the dispatcher crash and burn, the rest won’t be affected in any way.
If I were to route everything through the Dispatcher in order to facilitate sessions, that would lead to total dependency; if the dispatcher crashed then all the children would die, or the clients wouldn’t be able to communicate with them.
This way there’s no single point of failure.
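
In practice, that sort of detachment can be as simple as spawning each instance as its own OS process and never waiting on it. A rough sketch of the idea (the ‘arachni_xmlrpc_instance’ script and its --port option are placeholders, not the real thing):

```ruby
port = 4567  # whatever free port the Dispatcher picked

# Spawn the XMLRPC server instance as a completely separate OS process...
pid = Process.spawn( 'arachni_xmlrpc_instance', '--port', port.to_s )

# ...and detach it; the Dispatcher never waits on it, so the child keeps
# serving its client even if the Dispatcher itself crashes and burns.
Process.detach( pid )
```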

#2
See #1 and:
Why not use port numbers? That’s what they’re there for: to facilitate communication.
Everybody nowadays wants to re-invent the wheel and do things their own way, etc… Granted, there are times when this is prudent, but not in this case.
I’d rather work with the system than go around it.
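
And the system already gives us a clean way to get a free random port: bind to port 0 and let the OS pick one. Something along these lines (just a sketch):

```ruby
require 'socket'

# Binding to port 0 makes the OS hand back a free ephemeral port; release it
# right away so the freshly spawned instance can claim it.
# (There's a tiny race window between close and re-bind but it's good enough here.)
probe = TCPServer.new( '127.0.0.1', 0 )
port  = probe.addr[1]
probe.close

puts "the next XMLRPC server instance will listen on port #{port}"
```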

#3
There is, actually…client authentication is implemented using SSL certificates, which provide the added bonus of encrypted traffic and a lot of other stuff.
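
For the curious, this is roughly what SSL-based client authentication looks like with Ruby’s stdlib (WEBrick plus the XMLRPC servlet). The certificate files, port and handler are placeholders; this isn’t the actual Arachni code:

```ruby
require 'webrick'
require 'webrick/https'
require 'openssl'
require 'xmlrpc/server'

# Only clients presenting a certificate signed by our CA get served,
# and all traffic is encrypted as a bonus.
server = WEBrick::HTTPServer.new(
    :Port                 => 7331,
    :SSLEnable            => true,
    :SSLCertificate       => OpenSSL::X509::Certificate.new( File.read( 'server.crt' ) ),
    :SSLPrivateKey        => OpenSSL::PKey::RSA.new( File.read( 'server.key' ) ),
    :SSLCACertificateFile => 'ca.crt',
    :SSLVerifyClient      => OpenSSL::SSL::VERIFY_PEER |
                             OpenSSL::SSL::VERIFY_FAIL_IF_NO_PEER_CERT
)

servlet = XMLRPC::WEBrickServlet.new
servlet.add_handler( 'arachni.version' ) { '0.2.2' }

server.mount( '/RPC2', servlet )
trap( 'INT' ) { server.shutdown }
server.start
```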

What’s next?

I’ve no idea…while writing this post a few things came to mind, though.

Management

The dispatcher in the experimental branch is very, very basic as I’m still working on it, so there’s no management or statistics gathering for the running XMLRPC server instances.

This is next on my list…

Load balancing

This is an obvious one. Such a system would need to handle lots of traffic gracefully and distribute it efficiently amongst a pool of servers.
This is not a big concern at the moment, since we’re just testing basic functionality, but I’ll need to implement it sooner or later.
Hm…I just had a lightbulb moment…

Peer-to-peer communication between dispatchers, in order to figure out which server has the most free resources at any given moment, would also decentralise the load-balancing system and avoid single points of failure. (There’s a rough sketch of that idea below.)
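
Something along these lines, purely hypothetical for now; the ‘dispatcher.load’ handler and the peer hostnames are made up:

```ruby
require 'xmlrpc/client'

# Each Dispatcher knows a few peers and (hypothetically) exposes a 'dispatcher.load'
# handler returning how many XMLRPC server instances it's currently running.
PEERS = [ 'node1.example.com', 'node2.example.com', 'node3.example.com' ]

# Ask every peer for its current load...
loads = PEERS.map do |host|
    [ host, XMLRPC::Client.new( host, '/RPC2', 7331 ).call( 'dispatcher.load' ) ]
end

# ...and forward the 'dispatch' call to the node with the most free resources.
least_loaded = loads.min_by { |_host, load| load }.first
puts "forwarding the 'dispatch' call to #{least_loaded}"
```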

Can’t think of anything else at the moment…

If anyone has any thoughts I’d really like to read them. :)
