youshottheinvisibleswordsman An blog. https://youshottheinvisibleswordsman.co.uk/ The year that was 2016. <p>Change. I made some.</p> <p>2015 was <a href="/2016/01/30/that-was-the-year-that-was-2015.html">pretty stable</a>. I changed that in 2016.</p> <p>In September I wrapped up three years of <a href="/2016/09/30/ipv6-at-yahoo.html">pushing IPv6 at Yahoo</a>; I took trips to <a href="https://www.flickr.com/photos/sdstrowes/30426468356/">Arizona</a>, New Orleans, and Panama City. I moved out of my apartment in <a href="https://www.flickr.com/photos/sdstrowes/29155521863/">San Francisco</a> on November 20th. I arrived in Amsterdam on November 29th; I started work at <a href="https://www.ripe.net/">RIPE NCC</a> on December 1st.</p> <p>Let’s talk about the other changes inherent in that: the role is research-focussed; the organisation is not-for-profit; the organisation is <em>much</em> smaller than Yahoo; the work is all network-oriented, not advertising-oriented; the research is pragmatic, operationally focussed; the work is much more public than corporate culture allowed. These are all good.</p> <p>As I write this, I’m about to move into a more permanent apartment. My daily commute will be no longer than a 20 minute walk to or from the office. London is an hour away, Scotland not much more. I double my vacation allowance. It’s been winter, with some days dipping comfortably below freezing. This week, my belongings turned up after their transit from California. These changes are all good, and things are now settling down.</p> <p>Last year I also managed trips to Berlin (IETF96), Portland (twice), Seattle (twice), Ottawa, and also back home to Scotland. Squeezing all that around limited vacation time on the West coast of the US is, to say the least, tiring. 
I’m pleased I’ve relocated somewhere more central.</p> <p>I also managed to beat my previous times for <a href="https://www.strava.com/activities/577042030">Bay to Breakers</a>, and my time for a <a href="https://www.strava.com/activities/660509608">half marathon</a>, leaving open the question of whether I could beat those again.</p> <p>And hopefully we’ll see! I’m planning on running Bay to Breakers again in 2017. This time, I’ll have gone through a winter and I’ll probably be running with some jet lag; should add to the challenge!</p> <p>So, hey, 2017. We’re almost two months in already, but it’s going well so far. Hoping for more.</p> Sun, 26 Feb 2017 17:00:00 +0000 https://youshottheinvisibleswordsman.co.uk//2017/02/26/that-was-the-year-that-was-2016.html https://youshottheinvisibleswordsman.co.uk//2017/02/26/that-was-the-year-that-was-2016.html Reviewing the 2016 Leap Second <blockquote> <p>Last month we covered the 2015 leap second ahead of the insertion of a leap second at the very end of 2016. As stated previously, leap seconds can trigger poorly-tested code paths; leap second handling always unearths bugs and issues. This one was no exception!</p> </blockquote> <p><a href="https://labs.ripe.net/Members/stephen_strowes/reviewing-the-2016-leap-second">Continues at labs.ripe.net</a>.</p> Mon, 16 Jan 2017 12:00:00 +0000 https://youshottheinvisibleswordsman.co.uk//2017/01/16/reviewing-the-2016-leap-second.html https://youshottheinvisibleswordsman.co.uk//2017/01/16/reviewing-the-2016-leap-second.html Preparing for the 2016 Leap Second <blockquote> <p>On 31 December this year, we’re scheduled for another leap second. There are many stories about what leap seconds can do to infrastructure and applications, and rituals are built up around them. Such rituals stem from reality: leap seconds trigger poorly-tested code paths and run contrary to assumptions that system time always runs in one direction. 
It’s useful to be aware of how your infrastructure handles leap seconds and how NTP servers handle them, so you can plan around the event. Here, we look at some of the NTP measurements the RIPE Atlas platform took around the last leap second, and approaches for handling them.</p> </blockquote> <p><a href="https://labs.ripe.net/Members/stephen_strowes/preparing-for-the-2016-leap-second">Continues at labs.ripe.net</a>.</p> Thu, 15 Dec 2016 12:00:00 +0000 https://youshottheinvisibleswordsman.co.uk//2016/12/15/preparing-for-the-2016-leap-second.html https://youshottheinvisibleswordsman.co.uk//2016/12/15/preparing-for-the-2016-leap-second.html Three years of IPv6 at Yahoo <p>Today marks my last day at Yahoo, almost three years since I started. It’s been a fun ride!</p> <h3 id="the-job">The job</h3> <p>The job took three hats:</p> <ul> <li><strong>Steering IPv6 adoption within Yahoo</strong>: arguing for IPv6 on services, prioritising work to get to IPv6 at some point rather than never, writing internal standards to guide work prioritisation. Technical work, testing, debugging, and throwing myself under the bus as the domain expert all came in here, to make deployment as painless as possible.</li> <li><strong>Teaching and training</strong>: I gave very regular internal and external talks to various groups, including undergraduate classes, research groups, engineering teams, operations teams, management groups, security groups, and C-level individuals. All of these various talks were much like teaching; the core material was the same, but the positioning was different depending on who was listening.</li> <li><strong>Measuring IPv6 adoption</strong>: setting up the pipelines to pull access logs and count IPv6 requests, categorised by ASNs, ISPs, countries, regions (all measures of external deployment), and according to Yahoo’s edges (a measure of internal deployment and traffic steering). 
These brought the IPv6 message in-house, but we also <a href="http://www.worldipv6launch.org/measurements/">publish some of our stats via the Internet Society</a>, and <a href="https://sdstrowes.co.uk/publications/sdstrowes-2016-yahoo-ipv6.pdf">put out some more detail in a short paper</a> presented at <a href="https://irtf.org/anrw/2016/">ANRW 2016</a>.</li> </ul> <p>It was a great trade-off, and watching IPv6 deployment grow in one of the largest content networks in the world (<a href="http://www.alexa.com/topsites">the 5th largest</a> in September 2016) was a lot of fun.</p> <p>Of course, the measurement work was important because a lot of the interesting stuff was happening elsewhere. Measuring deployment of IPv6, as observed on Yahoo’s CDN services, was critical.</p> <h3 id="the-measurements">The Measurements</h3> <p>Three years ago, 3.4% of our traffic was carried over IPv6. Today, our global metrics show that we’re regularly handling 15% of requests over IPv6 at weekends. So we’re talking about an almost five-fold increase in traffic in consumer-facing ISPs, and an equivalent reduction in IPv4.</p> <p>Back then, there were far fewer ISPs carrying significant IPv6 traffic. Verizon Wireless, Comcast, Deutsche, Free, and Swisscom all had significant deployments (over a third of their traffic was IPv6 at the time), and other networks such as T-Mobile (US) (hovering around 7% IPv6) were picking up. This is the post-<a href="http://www.worldipv6launch.org/">IPv6-Launch event</a> world, where ISPs were getting comfortable with IPv6 for all customers.</p> <p>We’ve seen so many ISPs ramp up their IPv6 deployments since then. Comcast is now consistently 50% IPv6, approaching 60% at the weekends (work-week hours drive IPv4 traffic in many fixed-line networks, something we identified in the <a href="https://sdstrowes.co.uk/publications/sdstrowes-2016-yahoo-ipv6.pdf">ANRW paper</a>). 
Some networks are pushing harder: T-Mobile (US) is hitting the 80% mark, as is Verizon Wireless. These networks in particular help boost our recent assertion that <a href="http://www.worldipv6launch.org/major-mobile-us-networks-pass-50-ipv6-threshold/">mobile traffic in the US is now majority IPv6</a>. Another big jump was Sky Broadband in the UK, who successfully ramped up their IPv6 deployment to level out around 78% IPv6. In a domestic, fixed-line ISP, that’s strong work, and it’s a huge reduction in our IPv4 load.</p> <p>Many ISPs operate within a given jurisdiction or market, and therefore countries are a natural way to aggregate the traffic stats. We have so many countries with traffic today that had almost none in 2013.</p> <p>At the country level, in 2013 traffic originating from ASNs registered in the US was close to 7% IPv6; Germany was around 8%, Romania was around 11%, and Switzerland was around 16%, but many were near-zero. So small compared to today. Now, the US is consistently around 30% IPv6 to Yahoo. Sky’s light-up brings the UK to around 15% IPv6. 
Belgium these days is over 40% IPv6, and Germany’s a comfortable 20%.</p> <p>Building out a list of countries where we see more than 5% of traffic carried over IPv6, here’s what we see, now and then:</p> <p>Europe:</p> <ul> <li>Belgium (50-58%, then 3%)</li> <li>Austria (12%, then &lt;0.1%)</li> <li>Czech Republic (10%, then 1%)</li> <li>Germany (20%, then 10%)</li> <li>Estonia (20%, then &lt;0.1%)</li> <li>Finland (16-20%, then 0.5%)</li> <li>France (10%, then 8.3%)</li> <li>Greece (20-25%, then 0.3%)</li> <li>Luxembourg (12-22%, then more consistently 12%)</li> <li>The Netherlands (10%, then 0.9%)</li> <li>Portugal (20%, then 0.9%)</li> <li>Romania (5-9%, similar then)</li> <li>Switzerland (20-30%, then 9.9%)</li> <li>The UK (13-18%, then 0.2%)</li> </ul> <p>Asia/Pacific:</p> <ul> <li>Australia (6%, then &lt;0.1%)</li> <li>India (8%, consistently &lt;0.1% until four months ago)</li> <li>Japan (4%, then &lt;0.1%)</li> <li>Malaysia (10%, then around 1%)</li> <li>Singapore (5%, then around 2-3%)</li> </ul> <p>Americas:</p> <ul> <li>Brazil (8%, then &lt;0.1%)</li> <li>Canada (13%, then 0.4%)</li> <li>Ecuador (12%, then &lt;0.1%)</li> <li>Trinidad and Tobago (9%, then &lt;0.1%)</li> <li>The US (28-33%, then 8%)</li> </ul> <p>For the sake of clarity, country stats are derived by taking the origin ASN for each request and mapping it to the <a href="https://en.wikipedia.org/wiki/ISO_3166-1">country code</a> the origin ASN is registered to (according to data published by the registries: <a href="ftp://ftp.afrinic.net/pub/stats/afrinic/delegated-afrinic-extended-latest">AfriNIC</a>, <a href="https://ftp.apnic.net/stats/apnic/delegated-apnic-extended-latest">APNIC</a>, <a href="http://ftp.arin.net/pub/stats/arin/delegated-arin-extended-latest">ARIN</a>, <a href="http://ftp.lacnic.net/pub/stats/lacnic/delegated-lacnic-extended-latest">LACNIC</a>, <a href="http://ftp.ripe.net/pub/stats/ripencc/delegated-ripencc-extended-latest">RIPE NCC</a>); this is a rough but simple measure 
of where traffic is coming from.</p> <p>Obviously, countries vary wildly by population, and India’s recent uptick in IPv6 traffic is more significant – in terms of traffic, and where you handle it – than most of the European countries listed, but most of these are strong showings. IPv6 is on the way up.</p> <h3 id="in-summary">In Summary</h3> <p>The key points from some of our measurement are basically:</p> <ul> <li>We have ISPs today that are majority-IPv6. Consider this when thinking about how or where you serve your content.</li> <li>ISPs can light up quickly, when they want to. Dominant ISPs will tip a country’s or region’s IPv6 load.</li> <li>Given the above, a country’s IPv6 share may appear to stagnate; some countries have not seen significant additional deployment in the last three years, while others have seen concerted efforts (for example, <a href="http://www.internetsociety.org/deploy360/blog/2014/12/finland-planning-national-ipv6-launch-day-on-9-june-2015/">in Finland</a>).</li> <li>The set of ISPs deploying IPv6 is clearly <em>not</em> stagnant, given the number of countries now bringing non-trivial volumes of IPv6 traffic to content networks.</li> </ul> <p>In other spaces, key moves are encouraging. <a href="https://developer.apple.com/news/?id=05042016a">Apple’s IPv6 requirements for app developers</a>, and <a href="https://aws.amazon.com/blogs/aws/now-available-ipv6-support-for-amazon-s3/">Amazon’s announcement of IPv6 for S3</a> are big moves forward, reducing the number of IPv4-only services that engineers are exposed to, or writing code for.</p> <p>But from my perspective inside Yahoo, that’s that. In December, I start a new gig, measuring the network from different perspectives. 
Including, of course, the IPv6 perspective!</p> Fri, 30 Sep 2016 17:00:00 +0000 https://youshottheinvisibleswordsman.co.uk//2016/09/30/ipv6-at-yahoo.html https://youshottheinvisibleswordsman.co.uk//2016/09/30/ipv6-at-yahoo.html Reverse DNS Mapping IPv4 to IPv6 <p>We all know that scanning the IPv4 address space is almost trivially easy; exhaustively scanning IPv6 space is not so feasible if you expect to complete the job in any reasonable timeframe. Heuristics to reduce the space such as SLAAC addressing and common static IPv6 addressing schemes are <a href="https://www.si6networks.com/tools/">well-known</a>.</p> <p>One common approach I take to find my way around other people’s IPv6 infrastructure is to check out whether they have reverse DNS set up for a given IPv4 address. If they do, I’ll perform a AAAA lookup on the resulting domain name; for some networks, this can be surprisingly reliable. But sometimes it doesn’t work; it really depends on how the network is administered. Last year, I was curious to know how well this approach would work across the full IPv4 space.</p> <p>So I ran a measurement study that attempted to answer precisely that question. I went a step further and attempted to guesstimate when the resulting IP addresses belonged to the same host/router, and from there determine whether firewalls or services were configured differently in the IPv4 world versus the IPv6 world. There are all sorts of optimisations to this work for sure, but the exhaustive approach was useful to evaluate the approach, and was not time-consuming.</p> <p>In short, this approach found 965k IPv6 addresses (of varying quality) across 5.5k ASNs. 
The active scanning found that over half of those are responsive in some way, and there were many cases where IPv4 was more responsive than IPv6; in some cases (TCP ports 53 and 443 for example), it was more likely that IPv6 was quietly dropped by the network.</p> <p>This was submitted to <a href="http://conferences.sigcomm.org/imc/2016/">IMC</a>, admittedly a high bar. While the reviews were reasonably positive, the paper didn’t make the cut.</p> <p>Continued study of the IPv6 space is interesting. For openness, the full copy can be found <a href="https://sdstrowes.co.uk/publications/sdstrowes-rdns-aaaa.pdf">here</a>.</p> <p>The abstract is as follows:</p> <blockquote> <p>The IPv4 address space is small enough to allow exhaustive active measurement, permitting important insight into Internet growth, policy, and evolution. The IPv6 address space, on the other hand, presents the problem that we can no longer perform exhaustive measurements in the same way, inhibiting our ability to continue studying Internet growth. Access to private datasets (e.g., HTTP access logs on content servers, flow data in ISP networks, or passive DNS traces) solves some problems but may not be feasible or desirable. This paper describes IPv6 address collection by exhaustively sweeping the reverse DNS domain for the IPv4 address space and performing AAAA queries on the results. Subsequent ICMP and TCP measurements are conducted to measure the responsiveness of the resulting set. Key outcomes include: the PTR sweep discovers 965,304 unique, globally routable IPv6 addresses originating from 5,531 ASNs. 56% of the addresses are responsive, across 4,571 ASNs. 
Upon inferring pairs of IPv4 and IPv6 addresses that are likely associated with the same device, the data indicates a trend toward IPv4 addresses being more responsive than their IPv6 counterparts, with a higher incidence rate of TCP connections being refused, and wide disparity on where TCP connections or ICMP echo requests fail silently when comparing IPv4 and IPv6. The disparity in IPv4 and IPv6 responsiveness is highly variable, and indicative of distinct host configuration and network policies across the two networks, presenting potential policy or security gaps as the IPv6 network matures.</p> </blockquote> Sun, 31 Jul 2016 17:00:00 +0000 https://youshottheinvisibleswordsman.co.uk//2016/07/31/rdns.html https://youshottheinvisibleswordsman.co.uk//2016/07/31/rdns.html The year that was 2015. <p>2015, like 2014, was a stable year.</p> <p>I’ve been at Yahoo for over two years. I’ve been living not just in San Francisco, but in the same apartment, for over three. I’ve been working for bay area companies for almost four. This stability for so long makes me fidget.</p> <p>A lack of travel early in the year exacerbated that feeling. I had the shortest trip to LA in January, and made it to May before I had (<em>had</em>) to book a last minute nine-day trip to Mexico City. I might go back; the set of reachable, cost-effective destinations from SFO that work as a vacation for me (usually: big cities, culturally removed from my norm) is alarmingly small.</p> <p>Starting in August, I squeezed in London (for <a href="http://conferences.sigcomm.org/sigcomm/2015/">SIGCOMM</a> and a visa renewal), a short break to Montreal, then finally Tokyo then Yokohama for <a href="http://conferences2.sigcomm.org/imc/2015/">IMC</a> then <a href="https://irtf.org/raim-2015">RAIM</a> and <a href="https://www.ietf.org/meeting/94/">IETF 94</a> respectively in November. Upcoming travel? 
This year I’ve already passed through Portland, and I have another mini trip to Seattle planned. As for later in the year though? I’m not sure. IETF 96 is in Berlin, and SIGCOMM is in Salvador, Brazil (though I’m considering switching it out of my rota), and IMC is in the least exciting of the set so far, Santa Monica.</p> <p>On running: I logged 1,072km in 2015 (a reasonable 666 miles, if you’re into US measures). Two big things contributed to that total: first, being stationary at the start of the year gave me time to run; second, I set myself the challenge of running a marathon. On the way toward that, I <a href="https://www.strava.com/activities/306445661">completed Bay to Breakers</a> again, beating last year’s time by 4 minutes 41 seconds. My weekends were then spent gradually working up to distances over 30km in prep for the full 42km effort and, in the end, I came in with <a href="https://www.strava.com/activities/354620224">a reasonable time of 4 hours, 10 minutes, 23 seconds</a>. I had quietly hoped for a sub-4-hour effort, but this was good by me. Odds of running another marathon any time soon? Slim to none. But I plan to at least run Bay to Breakers again this year, and certainly complete a half marathon race. I’m also planning to ease into cycling after about a decade of regular running, and I’m reading everything about bikes at the moment.</p> <p>So, 2016, and what it’ll offer, is a bit of a mystery even to me. I have a feeling I’ll have to shake something loose, change something up. What that turns out to be might only be visible in hindsight. 
Let’s see, shall we?</p> Sat, 30 Jan 2016 23:30:00 +0000 https://youshottheinvisibleswordsman.co.uk//2016/01/30/that-was-the-year-that-was-2015.html https://youshottheinvisibleswordsman.co.uk//2016/01/30/that-was-the-year-that-was-2015.html IPv4 Occupancy, May 2015 <p>Following on from <a href="/2014/05/30/ipv4-occupancy.html">last year’s post</a> on how much of the IPv4 space is advertised in BGP now, an update for 2015.</p> <p>This time around, I’ve pulled more data; rather than one table per year, I’ve used one table per month. Otherwise, I’m calculating space in the same way as last year: pulling out the prefixes advertised over BGP, counting how many unique addresses are advertised, and tying them to either a RIR or an outright legacy allocation. I’m using the same RIR allocations as last year which is almost completely accurate. Good enough for here, and for comparison.</p> <p>As before, the way the address space is carved up means there’s a potential maximum of around 3.7 billion IPv4 addresses available. 
Occupancy as of May 31st 2015 looks like:</p> <table class="text-right table"> <tr> <th class="text-right">RIR</th> <th class="text-right">/8s available</th> <th class="text-right">/8s advertised</th> <th class="text-right">/8s free</th> <th class="text-right">% advertised</th> <th class="text-right">delta% 2014</th> </tr> <tr> <td><strong>ARIN</strong></td> <td>79.67</td> <td>56.93</td> <td>22.74</td> <td>71.16%</td> <td>+1.94%</td> </tr> <tr> <td><strong>APNIC</strong></td> <td>51.00</td> <td>43.63</td> <td>7.37</td> <td>85.56%</td> <td>+2.03%</td> </tr> <tr> <td><strong>RIPE NCC</strong></td> <td>39</td> <td>35.59</td> <td>3.41</td> <td>91.25%</td> <td>+0.42%</td> </tr> <tr> <td><strong>LACNIC</strong></td> <td>10</td> <td>9.30</td> <td>0.70</td> <td>93.00%</td> <td>+3.29%</td> </tr> <tr> <td><strong>AfriNIC</strong></td> <td>6</td> <td>3.46</td> <td>2.54</td> <td>57.79%</td> <td>+4.95%</td> </tr> <tr> <td><strong>Legacy</strong></td> <td>35</td> <td>15.95</td> <td>19.05</td> <td>45.59%</td> <td>+3.65%</td> </tr> <tr> <td><strong>Total</strong></td> <td>220.67</td> <td>164.87</td> <td>55.80</td> <td>74.71%</td> <td>+2.21%</td> </tr> </table> <p>That 2.21% increase in advertised space is approximately equivalent to five additional /8s being introduced in the last year, and it’s pretty close to the same rate of change we’ve seen since the middle of 2011 when APNIC started to level out.</p> <p>Here’s the time series of the actual space advertised by total and by region:</p> <p><img src="https://youshottheinvisibleswordsman.co.uk//images/2015-05-31-rir-occupancy-absolute.png" alt="absolute growth of RIR space between May 2002 and May 2015" /></p> <p>Here’s the time series of the proportion of space advertised by total, and by region:</p> <p><img src="https://youshottheinvisibleswordsman.co.uk//images/2015-05-31-rir-occupancy-relative.png" alt="relative growth of RIR space between May 2002 and May 2015" /></p> <p>For these, I’ve filtered some obviously-bad 
advertisements; the sort that are probably RIRs checking space prior to putting it into their pool.</p> <p>There’s a lot these images don’t show: I’m missing a lot of detail in those legacy blocks that might now be better allocated to RIRs. I’m not saying anything about how many of these addresses are or are not being advertised from the same source as previous samples. But really, the basic question I want to answer is: how much space is in use, and where.</p> <p>What’s interesting is how advertisements in the APNIC region have been less aggressive since around the point APNIC ran out of space to allocate freely, and started rationing allocations. Advertisements in Europe have also been pretty stationary since RIPE ran out in 2012. Same deal with LACNIC space. The suggestion in each case is that addresses are allocated by a RIR and are almost immediately put into use.</p> <p>I’m curious about who’s hanging onto space and not using it, or if it’s just gone missing; there’ll be address space that folks have forgotten about, possibly as companies have been merged or acquired. Even so, 90%+ in the RIPE and LACNIC regions is extremely full.</p> <p>ARIN still has a ton of space not in use, and as I write, <a href="https://www.arin.net/resources/request/ipv4_countdown.html">they’re listing 0.00978 aggregate /8s available</a> to distribute from the last of their rationed space. Given they were at 0.7 only three days ago, I imagine there’ll be an announcement in the next day or two that they’re out.</p> <p>Once the RIRs run out of free space to allocate, they’ll be able to issue space only from space returned to them or by managing transfers. 
I’d imagine that the removal of the top of the food chain will encourage the nascent v4 address market; one less supply source as demand continues to increase.</p> <p>There’ll be enough spare in that address space to shake out for a while yet: fragmenting of larger blocks into /24s as companies put up real money just to buy space and stay connected to the old network will see us through for a time, but it won’t be cheap or pretty. IPv6 growth is still increasing rapidly, but I’m still not willing to guess which year we’ll see advertised IPv4 space start to go back down.</p> Mon, 08 Jun 2015 22:00:00 +0000 https://youshottheinvisibleswordsman.co.uk//2015/06/08/ipv4-occupancy.html https://youshottheinvisibleswordsman.co.uk//2015/06/08/ipv4-occupancy.html Who wins? <p>I sat up and watched the UK general election last night. That’s easy when you sit 8 hours behind the UK. In fact, leisurely compared to coffee or whisky fuelled all-nighters.</p> <p>There were two big takeaways, one surprising and one not: the Conservative party mustered an outright seat majority (surprising, if you follow the polls), and the Scottish National Party almost took all of the Scottish seats (not surprising, if you follow the polls).</p> <p>The map looks different. But the map always looks weird because population density isn’t uniform. Let’s look at the numbers. I trawled numbers back to the 70s because the franchise hasn’t changed significantly since that election. Here’s the UK pattern, showing the share of the votes (upper plot) and the share of the seats (lower plot):</p> <p><img src="https://youshottheinvisibleswordsman.co.uk//images/2015-05-08-ge2015-uk.png" alt="" /></p> <p>I’ve stuck the SNP in here because they’re a current feature of last night’s outcome and attracting a lot of attention right now. I’ll focus on them in a bit.</p> <p>A primer: the electoral system is simple. 
In a UK general election, people vote in their constituency to elect their member of parliament (MP) who they think will best represent their interests. Candidates can be independent, but usually represent a party. Constituencies are won on a first-past-the-post, a.k.a. winner takes all, basis, and the party blocs that form then go on to form a government. Being constituency based, it’s an inherently local way to determine national government. Because constituency winners take everything, the system has no real notion of a broader proportional representation in the final seat distribution. Local, rather than global, optimisation on the selection process.</p> <p>So. Going back to the 70s, none of the parties attain a majority of the popular vote in any of these elections, but their seat share almost always produces a majority. The two largest parties, combined, squeezed only around two thirds of the vote in 2005, 2010, and 2015. This time around, the Conservatives have squeezed a majority seat share just over the line, much like Major in ‘92. The Lib Dems have consistently polled well, and have never taken as many seats as their vote would imply. It’s as non-proportional as we all know it is. The “feature” of the process, in the eyes of some, is that it produces “strong government”, which is kind of a pain if you’re in the majority that didn’t vote for the current party. And there’s always a majority that didn’t vote for the current party.</p> <p>Looking specifically at the disparity between vote share and seat share:</p> <p><img src="https://youshottheinvisibleswordsman.co.uk//images/2015-05-08-ge2015-uk-diff.png" alt="" /></p> <p>What these plots show is quite simple: %-seats minus %-vote, as an indication of how far from proportional the seat distribution is for each party. A positive value indicates the number of seats is higher than the vote, and vice-versa for a negative value.</p> <p>It’s pretty clear that Labour wins out of this system, every single time. 
Even last night, where the consensus is that they had a terrible night. The Conservatives win out of this system most of the time. The success of these two parties under this system is, usually, to the detriment of the other parties. The Lib Dems lose, every time. The SNP have finally won out from this setup.</p> <p>Each of the 650 constituencies represents, approximately, the same number of people. In aggregate that gives Scotland 59 of the current 650 seats, or a 9% seat share. Looking at those seats over the same time period:</p> <p><img src="https://youshottheinvisibleswordsman.co.uk//images/2015-05-08-ge2015-sc.png" alt="" /></p> <p>From 1970 and until last night, no party attained a majority of the vote. The SNP managed to grab one across the Scottish seats last night. For clarity: the SNP only run in Scottish seats, so they don’t poll at all elsewhere.</p> <p>Traditionally Labour have polled consistently well. But post-<a href="https://en.wikipedia.org/wiki/Scottish_independence_referendum,_2014">referendum</a> and post-<a href="https://en.wikipedia.org/wiki/First_Cameron_ministry">coalition</a>, the table has been flipped. It’s led to a very different vote share and, because it’s not a proportional system, an extremely different seat distribution: only three seats aren’t SNP, one each for the three other parties shown.</p> <p>The way the electoral system behaves means the difference between vote share and seat share in Scotland is pretty dramatic:</p> <p><img src="https://youshottheinvisibleswordsman.co.uk//images/2015-05-08-ge2015-sc-diff.png" alt="" /></p> <p>Labour, again, have almost always won out of this system, and by very clear margins. And they’ve lost the most heavily. The Conservatives have the same number of seats as 2010, largely due to the overwhelming dominance of the Labour party. 
This time around it’s the dominance of the SNP.</p> <p>Commentators during the count were pretty fixated on the SNP surge, but because the SNP only fields candidates in Scotland it’s worth bearing that first plot in mind. It’s still a parliament filled with Conservatives and Labour and while the Conservatives didn’t gain any seats in Scotland, they managed to soak up Lib Dem seats elsewhere. And while attention is being paid to the SNP result, the Conservatives are also the party least interested in electoral reform. That’ll be hard for them if they feel compelled to react but, given they have an overall majority, they could actually do whatever they like, including ignoring the result. A more proportional voting system would see the SNP take approximately half the seats they took last night, but would have cut the Conservatives to around 240, and handed UKIP over 80 seats. (Assuming the voting patterns don’t change, which is a bad assumption.) In this sense, perhaps the Conservatives aren’t interested in committing electoral suicide. I guess the broader question is whether Cameron <a href="http://blogs.spectator.co.uk/coffeehouse/2015/05/today-britain-has-changed-changed-utterly-a-terrible-beauty-is-born/">is equipped to handle the UK’s political landscape</a>, or whether he cares.</p> Fri, 08 May 2015 21:30:00 +0000 https://youshottheinvisibleswordsman.co.uk//2015/05/08/general-election.html https://youshottheinvisibleswordsman.co.uk//2015/05/08/general-election.html The year that was 2014. <p>2013 <a href="https://youshottheinvisibleswordsman.co.uk/2014/02/07/that-was-the-year-that-was-2013.html">ended on a really nice uptick for me</a>. 2014 continued that trend.</p> <p>On reflection, 2012 and 2013 featured instability, and probably more than I was willing to admit at the time. After defending, graduating, taking a job, and moving country in 2012, then riding through a job hunt, a visa transfer, and starting a new job in 2013, 2014 featured none of those things. 
Ending the year employed by a large company lent stability entering 2014. I’ve been encouraging IPv6 adoption at Yahoo for over a year; I’ve been in my apartment for over two. I feel like, for the first time in a while, I’m stable. It’s interesting to consider, in hindsight.</p> <p>The new role has let me attend a few conferences I enjoy: I attended IETF89 to keep up with the industry, then SIGCOMM and the Internet Measurement Conference to keep up with the research. But my personal highlight was perhaps delivering a guest lecture <a href="http://inst.eecs.berkeley.edu/~cs168/fa14/class.html">at Berkeley</a> on IPv6. It was a year short on publications, however, despite still paying attention to some of the same trends as I have been for years now. Professionally, a good year.</p> <p>Travel wasn’t sparse, but a little uninteresting, featuring only anglosphere cities. London, Dublin, Chicago (for <a href="http://conferences.sigcomm.org/sigcomm/2014/">SIGCOMM</a>), Vancouver (for <a href="http://conferences2.sigcomm.org/imc/2014/">IMC</a>), and basically <a href="https://youshottheinvisibleswordsman.co.uk//images/2014-scotland-trip.png">a tour of Scotland</a>. Not that any of these places are bad; in fact, the North American cities were new to me, and I’m always happy to go home. They’re just places steeped in the familiarity of the English-speaking world, and I enjoy time away from all of that.</p> <p>Running: 870km throughout the year. This featured the <a href="http://www.strava.com/activities/142753189">Bay to Breakers 12k</a>, and <a href="http://www.strava.com/activities/171630529">one of the San Francisco half marathons</a>, both completed in times I’m happy with. I’m planning on doing Bay to Breakers again, and completing the other half marathon, in 2015.</p> <p>Peering into the future, I reckon I find myself in a good place. I’ve started planning short trips around upcoming three day weekends. 
I’m planning on a similar conference rota to 2014, and similar goals for my running. Stability has allowed me to get back into being pretty healthy. I’m going to put a little more focus on health and fitness, and with luck I’ll find time to take a vacation someplace interesting. In all ways, I’m looking forward to 2015; it’s not likely to be dramatic, but it <em>is</em> likely to be good.</p> Sat, 10 Jan 2015 21:30:00 +0000 https://youshottheinvisibleswordsman.co.uk//2015/01/10/that-was-the-year-that-was-2014.html https://youshottheinvisibleswordsman.co.uk//2015/01/10/that-was-the-year-that-was-2014.html IMC 2014 Notes <p>These are my scribbled notes from <a href="http://conferences2.sigcomm.org/imc/2014/index.html">IMC 2014</a>. They’re <em>very</em> incomplete and probably inaccurate at points, but they’re what I caught from papers interesting to me when I was in the conference hall. I don’t note down questions that don’t contribute much, but I try and note questions/answers that contribute beyond what was already covered in the talk. 
If I’ve misrepresented your work, please get in touch and I’ll fix things up!</p> <h2 id="session-1-interdomain-routing-and-traffic">Session 1: Interdomain Routing and Traffic</h2> <p><strong>Inter-Domain Traffic Estimation for the Outsider</strong></p> <ul> <li>aim to shift focus from <em>connectivity</em> to <em>traffic</em></li> <li>traffic is all that matters for network engineering, anomaly detection, economics</li> <li>but few available traffic datasets exist</li> <li>analogy: popularity of paths from multiple connectivity measurements implies traffic volume</li> <li>urban planning: some streets are more central than others; predict path traffic based on road structure</li> <li>large traceroute datasets -&gt; AS level connectivity -&gt; apply structural analysis</li> <li>ground-truth checks against real traffic: one global tier-1, and one large IXP</li> <li>“new and adapted metrics from Space Syntax”</li> <li>ranking of AS links by traffic</li> </ul> <p>Q&amp;A:</p> <ul> <li>Q: traceroutes are collected at different dates/times; how does that affect traffic estimation? 
A: effectively used a two-month sample, albeit two years apart for comparison</li> <li>comment: you could go back in time and correlate traffic dynamics with particular events</li> </ul> <p><strong>Challenges in Inferring Internet Interdomain Congestion</strong></p> <ul> <li>explores the challenges in developing a system to characterise the extent of interdomain congestion</li> <li>prompted by public noise around peering disputes</li> <li>method: TSLP, Time Sequence Latency Probes; build a time series of latency probes</li> <li>want to avoid incorrectly inferring that a link is congested or uncongested given current interest</li> <li>happen to have a good view of level3-Dallas indicating congestion on AT&amp;T and verizon</li> <li>challenge: AQM and WFQ</li> <li>challenge: inferring interdomain links</li> <li>challenge: asymmetric reverse paths; record-route options generally not supported</li> <li>congestion trends indicate cogent and level3 congestion through 2013–March 2014, after which congestion dropped to zero</li> </ul> <p>Q&amp;A:</p> <ul> <li>Assertion: 64MB queue necessary to satisfy the ~50ms inflation on a 10Gbit link (rough calc: correct; 10Gbit/s × 50ms = 0.5Gbit ≈ 62.5MB)</li> </ul> <p><strong>Inferring Complex AS Relationships</strong></p> <ul> <li>Looking at more complex AS relationships than the standard simple model of p2c, p2p, s2s</li> <li>new types: <ul> <li>partial transit</li> <li>hybrid (dual transit/peering)</li> </ul> </li> <li>both can be inferred with a high level of confidence</li> <li>partial transit is p2c with restricted scope, implying hierarchy of providers</li> <li>hybrid implies ASes that establish different relationship types at different points of presence</li> <li>requires inference of prefix export policies to evaluate based on how providers propagate prefixes</li> <li>limitations: topology incompleteness (we can only model what we see); city-level geoloc (hybrid links within a city region may be hidden); difficult to neatly categorise more complex 
relationships</li> <li>model indicates 3.3% of links inferred to partial transit, and 1.2% inferred to hybrid relationships</li> <li>some data on the size of customer cones/traffic levels for hybrids</li> <li>hybrid relationships can be unintentional</li> </ul> <p>Q&amp;A:</p> <ul> <li>European ASes are over-represented in the results, partially explained by the different ecosystem</li> </ul> <p><strong>Peering at Peerings: On the Role of IXP Route Servers</strong></p> <ul> <li>questions: what are IXP route servers, how do they work, what peering opportunities do they offer</li> <li>more peering leads to greater benefit for each member, but peerings require effort, coordination, push low-end routers</li> <li>IXPs offer route servers as a solution; ISP establishes a single session with the route server, making peering easy</li> <li>route server filters prefix import (avoid hijacking) and filters on export by each peer</li> <li>route server prefix distribution is bimodal: lots of prefixes advertised to all members using the RS, and lots that are advertised to very few</li> <li>in terms of traffic and coverage, the data indicates the majority of ASes use the RS but still have bilateral peering agreements with other networks to exchange data</li> <li>using the RS can mask who is at fault when something fails (the RS or the other peer)</li> <li>thus, bi-lateral peering is used for traffic-intensive peering arrangements</li> </ul> <p>Q&amp;A:</p> <ul> <li>Q: are route servers built to be highly available? do peers have fallbacks? A: the IXPs they spoke to have 100% uptime, so … we don’t know.</li> <li>Q: What do RSes mean for net neutrality? 
A: the RS will expose peering policies the peers apply to the RS, but not if they have bilateral agreement; bilateral agreements more likely if you want to violate net neutrality</li> <li>Q: are IXPs in other regions trying to catch up with europe?</li> </ul> <h2 id="session-2-understanding-mobile-broadband-networks">Session 2: Understanding (Mobile) Broadband Networks</h2> <p><strong>Measuring the Reliability of Mobile Broadband Networks</strong></p> <ul> <li>measure the <em>experienced</em> reliability on network, data, and application layers; think about the <em>user</em></li> <li>how do you define reliable? <ul> <li>ability to register on the network and establish a session, and how long it is available</li> <li>data: ability to send/recv packets</li> <li>useful connection; example, can use voip</li> <li>performance: throughput/goodput, to a reasonable bitrate</li> <li>reliability through multihoming</li> </ul> </li> <li>operators often share radio access networks, permitting greater visibility into where failures happen</li> <li>interesting: 25% of connections are down for more than 10 minutes per day; RAN (radio access network) is dominant factor in downtime; high correlation between downtime and SnR</li> <li>“higher than expected” loss rates; loss runs indicate most (~60%) runs are 1 packet in length. Spike around 5/6 packets, where session is erroneously marked as “idle” despite sending packets</li> <li>downloads fail most often because of inability to establish a TCP connection</li> <li>multihoming can give 99.999% availability</li> </ul> <p>Q&amp;A</p> <ul> <li>are your results affected by the volume of traffic you’re sending? A: we’re aware operators may apply different policies for high volume users, but we maintained a low bitrate. That’s why we were demoted to “idle”, because we weren’t transferring a high enough bitrate</li> <li>have you looked at latency? 
A: we’re looking at it, as ongoing work</li> </ul> <p><strong>Behind the Curtain - Cellular DNS and Content Replica Selection</strong></p> <ul> <li>DNS for CDN selection is usually the same as static networks, but</li> <li>client IPs are dynamically assigned, have no geographic anchor, and unstable anycast routing</li> <li>cellular networks have fewer egress points than traditional ISPs, though increasing in tandem with the deployment of 4G and its low latencies</li> <li>measured 6 cell networks (4 in US, 2 in South Korea); app: namehelp mobile; 350 devices, 280K experiments; samples every 30 minutes for five months</li> <li>assertion: cellular DNS is a poor location signal</li> <li>cellular DNS is highly dynamic, leading to CDNs returning different sets of replicas on a regular basis</li> <li>anycast routed public DNS resolvers also suffer from unstable mappings</li> </ul> <p>Q&amp;A</p> <ul> <li>when did you collect the data? A: march 2014 through october 2014</li> <li>does edns(0) modify the result? A: <em>probably</em> not given how dynamic client IPs are, but this is ongoing work</li> </ul> <p><strong>When the Internet Sleeps: Correlating Diurnal Networks With External Factors</strong></p> <ul> <li>we know traffic is diurnal (seen locally everywhere)</li> <li>what about ipv4 address usage, and can we see the global view?</li> <li>direct observation: count active addresses over time; find diurnal patterns; draw correlations on location, link type</li> <li>why study this? 
<ul> <li>sleep reflects policy</li> <li>sleep correlates with things such as GDP</li> <li>sleep affects outage detection; must not confuse “sleep” with “down”</li> <li>… how big is the internet?</li> </ul> </li> <li>contributions: new methods for analysis, and the application of those methods</li> <li>correlating diurnal with many factors: ANOVA (analysis of variance) <ul> <li>factors: GDP (strong correlation), electricity consumption (weak correlation), number of internet users per host, time of first block allocation, mean age of allocation (weak correlation; stricter policies on newer allocations may be enforcing recycling)</li> <li>link type: inferred from DNS (unexpected correlation: seemingly DSL lines correlate with diurnal patterns)</li> </ul> </li> </ul> <p>Q&amp;A</p> <ul> <li>this work hasn’t been compared to the questionable 2012 internet census</li> <li>why is the US so stable? Perhaps DSL policies, folks don’t care so much about energy consumption, etc</li> </ul> <p><strong>Need, Want, Can Afford - Broadband Markets and the Behavior of Users</strong></p> <ul> <li>goal: explore the impact of capacity, price, cost of upgrading, and connection quality on broadband users’ behaviour</li> <li>challenges: requires a large dataset across a range of broadband markets, and requires scale to isolate confounding factors</li> <li>dataset includes aqualab’s Dasu (worldwide), and FCC/SamKnows (US), covering 53,000 users in 160 countries</li> <li>monthly cost translated into $ using purchasing power parity (PPP)</li> </ul> <p>Q&amp;A</p> <ul> <li>did you look at usage-based or variable pricing? A: we weren’t focussing on this, but it’d be an interesting direction</li> </ul> <h2 id="session-4-mobile-systems-and-networks">Session 4: Mobile Systems and Networks</h2> <p><strong>WiFi, LTE, or Both? 
Measuring Multi-homed Wireless Internet Performance</strong></p> <ul> <li>IP ID monotonicity; windows and iOS have distinct patterns</li> <li>TCP Timestamp Option: huh, windows phone has TCP timestamp option disabled by default</li> <li>Clock frequency stability</li> </ul> <h2 id="session-5-theory-underpinnings">Session 5: Theory Underpinnings</h2> <p><strong>Node Failure Localization via Network Tomography</strong></p> <ul> <li>under what conditions can this work uniquely localise failed nodes?</li> <li>how many failed nodes can be uniquely localised?</li> </ul> <p><strong>Efficient Large Flow Detection over Arbitrary Windows: An Algorithm Exact Outside An Ambiguity Region</strong></p> <ul> <li>large flow detection: flows that consume more than some threshold; e.g., dos attacks</li> <li>“arbitrary window model” checks “every possible time window in the past”; general solution, impossible for large flows to evade</li> </ul> <p><strong>Crossroads: A Practical Data Sketching Solution for Mining Intersection of Streams</strong></p> <ul> <li>identify significant performance anomaly events in real-time in a large cell network</li> <li>this can be viewed as a conventional association-rule mining problem, iff it was possible to record</li> </ul> <p><strong>OFSS: Skampling for the Flow Size Distribution</strong></p> <ul> <li>sampling &amp; sketching</li> <li>consider flow size distribution</li> <li>state of the art of netflow flow sampling is the great destroyer of the flow size distribution</li> <li>simple sketch onto a counter array</li> <li>flow sampling requires a flow table, impacting performance; sketching is very fast, but may have collisions</li> </ul> <h2 id="session-6-shedding-light-on-the-web">Session 6: Shedding Light on the Web</h2> <p><strong>Dissecting Web Latency in Ghana</strong></p> <ul> <li>the web in developing countries is slow</li> <li>connection speeds are increasing as average page sizes are increasing; server locations, routing configuration, and submarine cable layouts do not help</li> <li>in the 
example presented, DNS resolution in 2012 was a large contributor, but this has halved by 2014; connect() time has doubled between 2012 and 2014</li> <li>DNS lookup is dominant factor; 15-40% contribution to average page load time</li> <li>Redirects account for 80% of websites; 20-25% contribution</li> <li>TLS/SSL has an increasingly larger impact; 8-15% of requests required TLS/SSL</li> <li>require better caching schemes and/or new CDN architectures and/or redesigning web pages for better caching</li> </ul> <p>Q&amp;A</p> <ul> <li>how generalisable is this? A: this probably carries for other countries</li> </ul> <h2 id="session-7-internet-censorship">Session 7: Internet Censorship</h2> <p><strong>A Look at the Consequences of Internet Censorship Through an ISP Lens</strong></p> <ul> <li>require data snapshots before and after censorship events</li> <li>examines consequences of internet censorship in the context of a medium-sized ISP in pakistan</li> <li>data between October 2011 and August 2013</li> <li>nov’11: thousands of porn domains blocked; sep’12: youtube blocked</li> <li>entire analysis based on Bro protocol logs</li> <li>traces split into SOHO and residential</li> <li>network dumps captured in ISP’s core network</li> <li>example: consistent no response to domains on DNS implies censorship</li> <li>observation: no shift to public DNS resolvers for residential users</li> <li>observation: noticeable shift to public DNS resolvers for SOHO users</li> <li>collateral damage: after the youtube block, google docs traffic also dropped noticeably</li> </ul> <p>Q&amp;A</p> <ul> <li>they did not have the ability to associate traffic to particular users</li> </ul> <p><strong>Censorship in the Wild: Analyzing Internet Filtering in Syria</strong></p> <ul> <li>measuring censorship usually entails probing (generate requests, see what gets blocked)</li> <li>inherently limited by scale of measurement possible</li> <li>600GB of logs from 7 blue coat SG-9000 proxies leaked from 
syria in summer 2011 by telecomix</li> <li>data has flow-level identifiers (with source IP removed or hashed), plus HTTP details, plus results of filtering decision on device</li> <li>broadly: 93.2% of requests allowed; 6.3% denied (5.3% network error, and 1% “policy denied” (7M) or “policy redirect” (2K)); 0.5% proxied, response is cached someplace</li> <li>observation: false positives on keyword filtering: “proxy”, for example, is a common word</li> <li>observation: metacafe, skype, wikimedia, *.il, amazon.com, for example, blocked</li> <li>observation: social media: facebook.com often allowed, but not always; particular <em>pages</em> that may be politically sensitive on facebook are blocked</li> <li>observation: entire subnets filtered, representing Israel, Kuwait, Russia, etc</li> <li>anti-censorship tech: tor was not filtered during study (it is now); google cache was still being used to access censored content</li> <li>ethical considerations: this is sensitive data; encrypted at rest; aggregated stats only; IRB approval</li> </ul> <p>Q&amp;A</p> <ul> <li>tor traffic is identified as traffic traveling to known-public tor entry relays</li> </ul> <p><strong>Capturing Ghosts: Predicting the Used IPv4 Space by Inferring Unobserved Addresses</strong></p> <ul> <li>how much space is <em>actively</em> used?</li> <li>data collection -&gt; capture-recapture -&gt; population estimates</li> <li>collects IPv4 addresses from multiple (9?) 
different data sources</li> <li>regions that will run out first are LACNIC and APNIC, then AfriNIC, then RIPE, then ARIN</li> <li>estimates 1.2G IPv4 addresses used (45% of publicly routed space)</li> <li>6.2M /24 subnets used (60% publicly routed space)</li> <li>Significant unused space (especially legacy)</li> </ul> <h2 id="session-9-illuminating-malicious-behavior">Session 9: Illuminating Malicious Behavior</h2> <p><strong>Handcrafted Fraud and Extortion: Manual Account Hijacking in the Wild</strong></p> <ul> <li>20% of folks in the US believe their online accounts have been broken into</li> <li>Google’s hijack taxonomy: targeted (today’s focus) / manual (low volume, manual work) / automated (high volume, not much damage) hijacking</li> <li>focus: credential theft, account exploitation, and remission</li> <li>manual hijackers mainly use phishing to steal credentials</li> <li>phishing page efficiency: <em>average</em> success rate, 13.78%</li> <li>victims are lured to phishing pages via email; 99% of the http requests to phishing pages have no referer</li> <li>20% of decoy accounts accessed in less than 30 minutes; 50% within 7 hours</li> <li>number of accounts attempted per IP is really low, and really stable</li> </ul> <p>Q&amp;A</p> <ul> <li>IPs come from tor, public proxies, VPNs; all the places you might expect</li> </ul> <h2 id="session-12-ssl-and-heartbleed">Session 12: SSL and Heartbleed</h2> <p><strong>The Matter of Heartbleed</strong></p> <ul> <li>experiment did not <em>exploit</em> the vulnerability</li> <li>45% of all sites support HTTPS; 60% of those support the heartbeat extension</li> <li>this doesn’t mean all those 60% were vulnerable, but estimate that 24-55% likely were</li> <li>11% of HTTP hosts on IPv4 supported heartbeat, and 6% of those hosts were vulnerable</li> <li>attack scene: no evidence of attack prior to disclosure; first scan traffic 22 hours after disclosure from university of latvia; observed 6000 probe attempts from 692 hosts</li> 
<li>only saw 11 hosts that hit all measurement points, therefore few hosts doing full internet scans</li> <li>two weeks after disclosure, 600,000 hosts remained vulnerable</li> <li>only 10.1% of sites that were vulnerable replaced their certs</li> <li>14% of those who replaced their certs re-used their old private key</li> <li>4% revoked their vulnerable certs</li> </ul> <p><strong>Forced Perspectives: Evaluating an SSL Trust Enhancement at Scale</strong></p> <ul> <li>many SSL trust enhancements have been proposed to fix the CA trust model (DANE, Google’s cert transparency, network probes: convergence, perspectives)</li> <li>how do we evaluate performance of these trust alternatives when they have few users?</li> <li>performed a university-scale case study of convergence, with workloads synthesised from anonymised university-wide traces</li> <li>results on convergence notary performance; generated a workload by mapping one SSL handshake to one call to Convergence</li> <li>0.06% increase in traffic relative to SSL; low-cost</li> <li>one server supports entire university’s traffic</li> <li>client overhead is minimal (~250ms)</li> </ul> Sun, 09 Nov 2014 17:00:00 +0000 https://youshottheinvisibleswordsman.co.uk//2014/11/09/imc-2014.html https://youshottheinvisibleswordsman.co.uk//2014/11/09/imc-2014.html