Gauntlet
Phew. The last couple of months have been hectic, albeit manageably so. My primary goal was hitting the 30th July deadline for submitting a paper to INFOCOM 2011, and that absorbed most of my time. (I submitted my final copy roughly 16 hours before the deadline. Unheard of.)
Other interesting things included:
- The modelling and analysis of networks and distributed systems workshop, which took place mid-June in Stirling.
- The annual Cosener’s pilgrimage in mid-July.
- And, just last week, I got the happy news that a paper I helped co-author was accepted to IMC 2010.
Things may start to approach some form of normality now. That is, pending edits to the IMC paper, possibly working on another paper, starting serious work on the thesis, releasing code, releasing data, etc, etc.
But, before much of that, I’ll be taking some much-needed vacation time in mid-August. Where “vacation” means “walking the West Highland Way“!
August 2, 2010 No Comments
iPhone4, Facetime, and open standards
There’s been a lot of discussion on Apple’s Facetime product, and some questions asked over whether Facetime uses the traditional phone network at all. The answer appears to be: No, not at all. There is no traditional call setup if you initiate a Facetime session outright.
At Jobs’ most recent keynote, he popped up a slide which rattled off a few open standards: SIP, ICE (and by extension, STUN and TURN), RTP, and H.264. Now, I’ve used the Skype video calling integrated into my N900, and it works well, but Skype remains a closed protocol. And while Skype borrows lots of ideas from at least ICE and RTP, it doesn’t necessarily play by the standards and so it’s difficult for third-parties to inter-operate.
So, all dues to Apple on this one: They’re using open standards, and using open standards implies that anybody can play. Maybe Apple will need to standardise some additional glue: perhaps they have some custom SIP stuff for Facetime that they need to publish, or perhaps they’re doing something neat with the vertical handover from the GSM call to the IP call and back. I haven’t seen any new drafts from Apple float past, so I don’t know what they want to standardise here.
That aside, anybody should be able to use SIP to initiate an RTP session with another iPhone. All you need is the user’s registered identity within SIP. Obviously your device must understand these protocols, and be able to encode/decode the data streams. But that’s not such a big ask: The standards have been around for a long time, and lots of hardware either already supports them, or is perfectly capable of doing so. And since all this works over IP, you need is a connection to the Internet (and of course, this means that 3G networks are valid carriers for video calls, too, even though Facetime prefers Wifi).
But that was quite the acronym soup up there. What do any of those protocols actually do? How does this work if you don’t technically make a phone call, at least in the traditional sense?
To explain, let’s take Alice and Bob and go through step-by-step the sequence of events required to set up a call. When both Alice’s and Bob’s phones are running and connected to the network, they’ll register with a SIP proxy somewhere, presumably hosted by Apple. Let’s say Alice wants some Facetime with Bob:
- Alice’s phone must first learn about its environment: This is where STUN and TURN come in. STUN is a simple protocol for a host to bounce packets off a STUN server (again, in this case presumably hosted by Apple) to learn whether it’s behind a NAT. TURN is an extension to STUN to reserve an auxiliary address on the STUN server which can be used to relay data between endpoints. So, Alice’s phone uses STUN to learn the IP address on the public side of the NAT she’s behind, and uses TURN to request a relay address at the same time, as a worst-case, if-all-else-fails data path between the two phones.
- Alice’s phone then uses SIP to send an INVITE to Bob’s phone. SIP is a standard protocol used the world over to setup and teardown calls over IP; it doesn’t handle data, it only handles the control operations and negotiation for the call. This SIP invitation contains the learned information about Alice’s network using STUN and TURN, which Bob will use later. Bob’s phone will receive the INVITE from the SIP proxy, prompting it to also learn about its network environment. Once it’s done this, it’ll send a response (ACK) with this information via the SIP proxy back to Alice.
- So, now Alice and Bob both have information about the other’s network environment. At this point, each phone runs through the ICE algorithm. ICE systematically probes pairs of these addresses learned from Alice and Bob’s current networks to determine which pairs can be used to pass data packets between the phones. By this process, both phones can learn if they’re located behind the same NAT or not. If they are, they can communicate directly, and this is always the best outcome. If they aren’t they must try the public addresses discovered via STUN to determine if those addresses will allow traffic to flow or, in the worst case, will nominate to relay data via the relay address allocated in Step 1. Usually though, ICE can figure out a combination of packets to punch holes through the NATs to allow data to flow directly between the phones without the need for the relay. The relay really is the worst case.
- Assuming ICE finds a set of valid candidates, the phones finalise the ICE algorithm by choosing the pair which they will use, and coordinate to complete the interaction. Once complete, both phones have a clear path on which to send data using RTP. RTP is a packet format for handling real-time data over UDP.
If all goes well, all these interactions happen within the exchange of a few tens of packets and hopefully not too much delay to the user. This is a reasonably complex sequence of events primarily because of the presence of those dastardly NAT boxes. In an ideal Internet, we’d all have clear paths to each other and not have to deal with this sort of negotiation, but that’s another argument for another day. From the human perspective, Alice initiates a call and within a few seconds, Bob picks up. End of story.
What I don’t know about this story is how people are identified on the SIP backend. By their phone number? IMEI? Presumably more information will become available if the product takes off.
This is a really short summary of the sequence of events. If you’d like some pictures to go with that, I have some (slightly outdated) slides which run over the sequence of events that take place.
June 25, 2010 9 Comments
Scala 2.8 Remote Actors in Scala 2.7.7-final
After a lot of heavy use of the remote actors libraries in Scala, I noticed that something seemed to be leaking memory. Unable to see anything obvious which might have been leaking, I fired up a profiler to investigate.
The short version of this post is as follows:
- Most of the memory leaks I found are contained within the Remote Actors library classes, and are cleared up in the release candidates for Scala 2.8. If you can upgrade to 2.8 now then your Remote Actors will leak less, though the final leak mentioned in this post may still catch you out.
- If, like me, you have lots of Scala code dependent on the 2.7.7 collections classes and haven’t gotten around to refactoring for 2.8 but use remote actors heavily, then you’ll be pleased to hear that the 2.8 remote actor code can be used with the rest of the 2.7.7 codebase with few modifications.
I’m happy to provide a patched-up version of 2.7.7-final for you here. Extract this to where you’d normally extract your Scala libs, and link into it as usual.
The remainder of this post covers the three distinct, but related, memory leaks I found. At least the first and last are resolved by the patched code above.
Anatomy of the leak in 2.7.7
I traced the leak back to Proxy.scala. These Proxy objects basically acts as a go-between for incoming and outgoing messages on the Remote Actor network stack. A Proxy represents an Actor located on a remote host, and handles handles four main message-passing cases:
- HostA: New outgoing message.
- HostB: New incoming message
- HostB: Outgoing response
- HostA: Incoming response
In 2.8, Proxy.scala maintains and modifies two data structures (a ‘channelMap‘ and a ‘sessionMap‘) which are modified during each of these four use-cases for only synchronous messages and futures messages. In 2.7.7-final, these structures are modified for all messages.
Here’s a high-level example of how these structures are modified: For this example I’m using two machines, ‘lifou’ which sends messages, 192.168.1.5, and ‘ikeq’, 192.168.1.3, which receives and replies to messages. The sequence of events may be as follows:
- lifou calls select(), which creates a new Proxy object to represent ikeq (e.g., Sink@Node(ikeq,51236)).
- ikeq: A new Proxy object is created at ikeq to represent lifou (creator: remotesender0@Node(192.168.1.5,51234)).
- lifou: Sends a message to ikeq, which passes through the Proxy created in step 1. A Pair is added to this Proxy’s channelMap.
- ikeq: Receives message via Proxy created in step 2, and a pair is added to it’s sessionMap.
- ikeq: Responds, via the Proxy created in step 2. The Proxy correctly removes the appropriate entry from the sessionMap.
- lifou: Receives message via Proxy created in step 1. The Proxy correctly removes the appropriate entry from the channelMap.
So while it’s very reasonable to accept that synchronous and futures messages will receive a response which will clear existing state, this is not such an easy assumption with asynchronous messages. Without a matching response, the state retained by 2.7.7 will never be cleared for many asychronous messages. Indeed, lots of my message passing is not symmetric, in terms of symmetry of messages and responses, and thus my code made Proxy.scala leak badly.
Fortunately, the 2.8 Remote Actors code behaves pretty well. So if you’re stuck on the 2.7.7 collections classes for now, but don’t want your remote actors to leak memory, then consider using the patched version above.
Outline of one remaining leak in 2.8
On sending a, say, synchronous message, the Proxy creates a new, unique, Symbol for that transaction to put into the Maps at either side. All Symbols are interned such that duplicate Symbols actually become pointers to the same backing data, reducing wastage. These Symbols are stored in a WeakHashMap, which suggests that once the Symbol is no longer reachable via any other reference chain than via this WeakHashMap then it should be collected by the garbage collector. It seems, though, that interned Symbols are not collected very quickly, and do appear to constitute a leak.
If you run a long-lived process which uses many synchronous messages, you may run into this issue. I’m not aware of any fix to this, yet.
Outline of another remaining leak in 2.8
The Remote Actors library in 2.7.7 and 2.8 currently doesn’t match IP addresses to hostnames to identify an appropriate Proxy to handle a message. If you follow the 6-step sequence of events above when using hostnames, step 6 should be replaced by the following:
- lifou: Receives incoming message, and creates a new Proxy (creator:TestActorSink?@Node(192.168.1.3,51236)). A Pair is added to it’s sessionMap.
That is, a third Proxy object is created, because the library doesn’t understand that the hostname and the IP address are equivalent. Subsequent message exchanges continue to do the following:
- Outbound messages at lifou use the first proxy created in step 1.
- Inbound messages at lifou use the third proxy created in step 6.
Put differently: this state accumulates because one Proxy handles outgoing messages, one handles incoming messages. The memory leak only occurs at lifou in the example, not ikeq. In 2.8, asynchronous message passing will still create multiple Proxies, but the channelMap/sessionMap state will not be modified.
I have created a bug report with a minor patch which resolves this issue; the patch is included in the version of Scala available above.
Alternatively, if you do not with to change your Scala runtime, the problem is really easy to work around: If the developer remembers to resolve IP addresses, then there is no problem. But they need to remember to do it every time they might see a hostname to ensure no leaking.
Summary: What do we learn?
I found the first of these bugs only because my work involves fairly heavy use of Scala and Remote Actors. The others I discovered when rooting around to figure out where the problem was. The memory leak in the 2.7.7 case with asynchronous message passing simply “goes away” by using 2.8 Remote Actors, either in the upcoming 2.8 final release, or by grabbing the build above.
The two other problems, however, do not go away quite so easily. Of these, the latter is easily worked around, but I don’t know of a solution to the former. These aren’t showstoppers, but it’s good to be aware of how the libraries you’re using actually work.
It’s been really satisfying to be able to dig around an implementation of a language and contribute something back, even if in just a small way. Please try out the build above if it suits your requirements, and let me know if it improves your work.
June 18, 2010 No Comments
On Deriving AS Links from BGP Snapshots
BGP snapshots dating back to November 1997 are available for public download, and are intended for use by the networking research community to analyse the nature of the prefix space advertised using BGP, and the paths through which these advertisements have travelled.
Much work has been done using this data, and links derived from the BGP data are often used in research. There are, however, subtle complexities in the process of deriving a set of links which I’ll discuss here.
BGP Data
First, the format of the BGP data may differ, depending on what sort of routing hardware/software you’re accessing. The two formats I’m familiar with, which are stored by different Route Views routers, are Cisco’s and MRT. The latter is considerably easier to process easily in bulk.
The Cisco output is human-readable, with one line representing a prefix and a path, with variable numbers of characters possible in some fields, and with columns padded by spaces to produce a readable table. Unfortunately, in some revisions of the software it appears possible for columns to run together where the data is too long for its own field, or for one line to be split across two. To overcome this I built a short AWK script, which I’ll link to in future, which parses out all the prefixes and paths in a given BGP snapshot in this format in a, hopefully, easier to understand format.
BGP Data to Unique Paths
From these BGP snapshots, it is easy to derive a set of unique paths. With a large list of prefixes and a path to origin for each, we need only strip off the prefix field and derive a unique set of paths from the remaining fields. Depending on how you’ve handled your BGP data, this could be as easy as:
cat $bgpdata | cut -d ” ” -f 2- | scala com.sdstrowes.util.Uniq > $link
At this point, I leave all “details” in place. By that, I mean Autonomous System (AS) Sets, or 32-bit AS numbers in asdot/asdot+ notation. Each of these adds a little complexity to the mix in the next step.
Deriving Links from Paths
Of the “details” already mentioned, AS Sets are the trickiest. AS Sets are identifiable points in the BGP routing system where multiple ASes have been advertised with an associated aggregate prefix, subsets of which will apply to each of those ASes. Such aggregation is rare within BGP though. A network participating in BGP is likely to want to peer with, or buy connectivity from, multiple networks. Since BGP operates on a ‘longest-prefix then shortest-path’ principle, if one upstream network aggregates the more-specific prefix, then BGP will always choose one of the other paths.
The sequence of steps my scripts follow is as follows:
- Strip off the braces surrounding AS Sets, using sed to remove any of these brace characters: “{}()“. In doing so, AS Sets which contain only one AS are reduced to an ordinary AS in an AS path. I interpret AS Sets with one member as possibly a router misconfiguration, or a set still configured where other members have since stopped advertising, or perhaps where those other networks have been filtered elsewhere in the network. I prefer to keep these as they’re likely to indicate a direct customer/provider link between the aggregating AS and the AS in the set.
- Identify the remaining AS Sets as those with a comma-separated list of AS numbers. Remove the entities in each path which contain “,”.
- Some AS numbers in the remaining data may be in asdot or asdot+ format. I prefer that all my pairs are listed as plain integers (“asplain” format). This is a really easy transformation:
def deriveASNum(in: String) = { val asN = in.split('.') if (asN.size > 1) asN(0).toInt * 65536 + asN(1).toInt else asN(0).toInt }
This set of simple transformations leaves a “clean” set of paths, with AS Sets containing more than one member having been removed. (The rationale for removing sets with more than one member is that it’s difficult to say anything of value about the links between these members.) With these clean paths, it’s trivial to walk along the paths to match up pairs of AS numbers, before outputting a final list of pairs representing AS-level connectivity.
Additional techniques, such as using an N-day (say, 5) window to “smooth” links or AS numbers accidentally missing at the time of the snapshot, or accumulating links over time, may help increase the accuracy of the set of links observed. The steps above are the steps I take to derive links for one BGP snapshot only.
What I also deliberately don’t attempt to do here is annotate the links with link types (that is, determine which links are provider-customer, which links are peering links, and which links indicate sibling AS numbers). This is a difficult task which has attracted much research. The output of this research has shown varying degrees of accuracy. I’m just not convinced we can achieve total accuracy without more networks opening up on their network policies, and offering high-level information on their connectivity, and offering their (BGP) views of the network.
May 3, 2010 No Comments
Really simple Uniq in Scala
I don’t know about you, but I use the standard uniq tool available in *NIX systems all the time. It only removes duplicate adjacent lines from its input stream though, so to retain only unique lines throughout your entire input you’d usually pipe your input through sort first. On larger datasets, this can start to become time consuming.
It is, however, dirt simple to build a uniq tool in Scala which outputs the unique lines from an entire input without having to sort the input first. And I really do mean dirt-simple; it took only a few moments to write. Here it is:
package com.sdstrowes.util
import scala.io.Source
import scala.collection.immutable.HashSet
object Uniq {
def main(args: Array[String]) : Unit = {
val lines = Source.fromInputStream(System.in).getLines
var uniq = HashSet[String]()
while (lines.hasNext)
uniq = uniq + lines.next
uniq.foreach(l => print(l))
}
}
This accepts input on stdin, and outputs onto stdout. So insert it into your pipeline rather than “sort | uniq“. Rudimentary testing on my inputs suggests I get a x2 speedup. Of course, your mileage may vary.
April 22, 2010 1 Comment
Aluminium Apple keyboard with Linux
Or, specifically, X; I haven’t tried to make this work outwith X yet. But I did create a ~/.Xmodmap file to map the key bindings of my Apple keyboard to a more normal PC layout so I can continue to touchtype with ease. Note that I move around some of the punctuation, so if you look at the keyboard to type these xmodmap calls will confuse you!
This is based on the excellent information held within the Ubuntu documentation.
My ~/.Xmodmap now contains the following:
keycode 11 = 2 quotedbl
keycode 12 = 3 sterling EuroSign
keycode 48 = apostrophe at
keycode 49 = backslash bar
keycode 51 = numbersign asciitilde! bind Caps Lock key as an additional Control
keycode 66 = Control_L
! Bind Apple’s “Clear” key to be NumLock
keycode 77 = Num_Lock! This is the key directly beneath escape
keycode 94 = grave notsign brokenbar! Map the fn key to Insert
keycode 118 = Insert! Alt and AltGr
keycode 133 = Meta_L
keycode 134 = Mode_switch! F13: Insert key; F14: PrintScr; F15: ScrollLock; F16: Pause/Break.
keycode 191 = Insert
keycode 192 = Print Sys_Req
keycode 193 = Scroll_Lock
keycode 194 = Pause Break
keycode 195 = XF86AudioMute
keycode 196 = XF86AudioPrev
keycode 197 = XF86AudioNext! Finalise modifiers and control statuses
clear Shift
clear Lock
clear Control
clear Mod1
clear Mod2
clear Mod3
clear Mod4
clear Mod5
add Shift = Shift_L Shift_R
add Lock = Caps_Lock
add Control = Control_L Control_R
add Mod1 = Alt_L 0x007D
add Mod2 = Num_Lock
add Mod4 = Super_L Super_R
add Mod5 = Mode_switch ISO_Level3_Shift ISO_Level3_Shift ISO_Level3_Shift
March 2, 2010 No Comments
That was the year that was… 2009.
If 2008 flew past, then I’m not so sure 2009 felt the same. That’s not to say that it dragged, but I feel like I fit in the correct amount of activity. Looking back, the marathon I had decided to run never happened thanks to complaining limbs, and I’m typing this on the very same Samsung NC10 I mentioned at the time. I have continued to play with photography, culminating in the purchase of my Canon 450D to which I may be adding some new little toys soon.
In terms of places travelled, not such an interesting selection this time round:
- Australia: Sydney, Hunter Valley, Blue Mountains, Melbourne, Frankston.
- Rome: Again; I visited Rome two and a half years ago, but it never hurts to return.
- Belgium: Brussels, and Louvain-La-Neuve.
Is that really it? I must be forgetting something… I guess that does count for about 6 weeks of travel, only about half of which was time off work. I don’t have much travel lined up for 2010, apart from a short trip toward the end of January. More on that later, if it comes to fruition, and hopefully with awesome pictures.
Perhaps less travel allows me to read more; I completed a pathetic number of books in 2009, so pathetic that I’m not willing to announce the total count here. Perhaps having 7 books unfinished at the moment will help round things out for 2010…
For 2010, I intend to continue with the working, reading, running, and photography. In general, this is a heads-down kind of year. Being in my third year of my PhD, with some good analysis work behind me, allows for a certain type of clarity. The type of clarity I must harness, else panic will ensue.
I expect 2010 to be a busy year. Hopefully busy in a similar sort of way that 2009 was.
January 2, 2010 No Comments
Dreamhost Loading Issues
Dreamhost, where this site is hosted, seem to be having a bit of bother at the moment; I’ve heard of a some people walking away from this once-excellent webhost because their shared servers are heavily overloaded.
I’m not sure what Dreamhost are doing with these shared machines, but they certainly do seem to be overloaded. I wanted to know the high loads I’d observed on my shared machine were a blip, or if they were predictable in some way.
So, I noted the load values over the weekend. I grabbed the one-minute load values, and plotted them. I wasn’t smart about this, so I haven’t noted exact dates for points, etc. This was a simple loop which woke up each minute and asked for the machine’s load. Once I was done, I piped the results into gnuplot.
The full data collected looks like this (you can click for a larger version):
These load spikes look inherently unpredictable to me. Not much of a pattern there, but there are many points considerably over a load factor of 4. (This is a 4-core machine, so I’d expect to see a load of 4 or less.) Sorting them, you can see that there are many spikes in the first graph well over this preferred maximum:
In fact, it’s overloaded about one third of the time. Grabbing the final hour of output for comparison, it’s easy to see that there’s not even a useful trend at this scale:
Sorting them again, we see that this machine is overloaded on nearly half of those samples taken:
The numbers I’ve presented here are anecdotal, but are on the low side when compared to some others. I’m not sure if Dreamhost really acknowledge a problem here, but this seems to have affected a few people hosted over there.
November 9, 2009 1 Comment
Generic Object Interning
Whipped together an nice little object interning facility:
class RefStore[T<:AnyRef]
{
import scala.collection.mutable.HashSet
var store = new HashSet[T]()
def intern(ref: T): T = {
store.findEntry(ref) match {
case None => {
store += ref
// Added new shared ref, ref
ref
}
case Some(r) => {
// Returning shared ref, r
r
}
}
}
}
Dirt simple, generic, type safe.
October 28, 2009 No Comments
Object (de)serialisation: class loaders
Following up my previous post on object serlialisation and deserialisation, a minor gotcha: Passing ‘null’ as the second parameter to JavaSerializer may throw ClassNotFoundException when reading your own custom classes back in.
So, pass to JavaSerializer a class loader you know can load your custom classes. For me, the easiest to use is the class loader which has already created one of my own objects, like so:
val js = new JavaSerializer(null, classOf[com.sdstrowes.simulator.RIB].getClassLoader())
The rest remains the same.
September 24, 2009 No Comments



