IMC 2014 Notes
These are my scribbled notes from IMC 2014. They’re very incomplete and probably inaccurate in places, but they’re what I caught from papers interesting to me while I was in the conference hall. I don’t note down questions that contribute little, but I try to note questions/answers that add something beyond what was already covered in the talk. If I’ve misrepresented your work, please get in touch and I’ll fix things up!
Session 1: Interdomain Routing and Traffic
Inter-Domain Traffic Estimation for the Outsider
- aim to shift focus from connectivity to traffic
- traffic is all that matters for network engineering, anomaly detection, economics
- but few traffic datasets are publicly available
- analogy: popularity of paths from multiple connectivity measurements implies traffic volume
- urban planning: some streets are more central than others; predict path traffic based on road structure
- large traceroute datasets -> AS level connectivity -> apply structural analysis (rough sketch below)
- ground-truth checks against real traffic: one global tier-1, and one large IXP
- “new and adapted metrics from Space Syntax”
- ranking of AS links by traffic
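As a rough illustration of that structural-analysis step (my sketch, not the paper’s adapted Space Syntax metrics): rank AS links by edge betweenness centrality, so links on many shortest paths are predicted to carry more traffic. The graph below is invented.

```python
# Invented AS-level graph; edge betweenness centrality stands in for the
# paper's adapted Space Syntax metrics: links on many shortest paths are
# predicted to carry more traffic.
import networkx as nx

g = nx.Graph()
g.add_edges_from([("AS1", "AS2"), ("AS2", "AS3"), ("AS2", "AS4"),
                  ("AS3", "AS5"), ("AS4", "AS5"), ("AS1", "AS6")])

ranking = sorted(nx.edge_betweenness_centrality(g).items(),
                 key=lambda kv: kv[1], reverse=True)
for link, score in ranking:
    print(link, round(score, 3))
```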
Q&A:
- Q: traceroutes are collected at different dates/times; how does that affect traffic estimation? A: effectively used a two-month sample, albeit two years apart for comparison
- comment: you could go back in time and correlate traffic dynamics with particular events
Challenges in Inferring Internet Interdomain Congestion
- explores the challenges in developing a system to characterise the extent of interdomain congestion
- prompted by public noise around peering disputes
- method: TSLP, Time Sequence Latency Probes; build a time series of latency probes (see the sketch after this list)
- want to avoid incorrectly inferring that a link is congested or uncongested given current interest
- happen to have a good view of Level3 (Dallas) indicating congestion on AT&T and Verizon
- challenge: AQM (active queue management) and WFQ (weighted fair queueing)
- challenge: inferring interdomain links
- challenge: asymmetric reverse paths; record-route options generally not supported
- congestion trends indicate Cogent and Level3 congestion through 2013–March 2014, after which congestion dropped to zero
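A minimal sketch of the TSLP idea as I understood it (not the authors’ tooling; thresholds and samples are invented): track per-interval minimum RTTs to the near and far ends of a suspected interdomain link, and flag sustained far-minus-near inflation as a standing queue.

```python
# Sketch, not the authors' code: decide whether a link shows sustained
# queueing from a series of (near_min_rtt_ms, far_min_rtt_ms) samples,
# one pair per measurement interval. Thresholds are invented.
def congested(rtt_samples, baseline_ms=3.0, inflation_ms=20.0, fraction=0.8):
    diffs = [far - near for near, far in rtt_samples]
    elevated = sum(1 for d in diffs if d > baseline_ms + inflation_ms)
    # Call the link congested if most intervals show the far side inflated.
    return elevated / len(diffs) >= fraction

# Toy series: eight intervals with ~50ms of far-side inflation, two without.
samples = [(5.0, 55.0)] * 8 + [(5.0, 8.0)] * 2
print(congested(samples))   # True
```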
Q&A:
- Assertion: 64MB queue necessary to satisfy the ~50ms inflation on a 10Gbit link (rough calc: correct)
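- sanity check of that rough calc: 10 Gbit/s × 50 ms = 0.5 Gbit ≈ 62.5 MB, so a ~64 MB buffer does indeed correspond to ~50 ms of queueing delay at 10 Gbit/s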
Inferring Complex AS Relationships
- Looking at more complex AS relationships than the standard simple model of p2c, p2p, s2s
- new types:
- partial transit
- hybrid (dual transit/peering)
- both can be inferred with a high level of confidence
- partial transit is p2c with restricted scope, implying hierarchy of providers
- hybrid implies ASes that establish different relationship types at different points of presence
- inference requires evaluating prefix export policies, based on how providers propagate prefixes (toy classification below)
- limitations: topology incompleteness (we can only model what we see); city-level geoloc (hybrid links within a city region may be hidden); difficult to neatly categorise more complex relationships
- model indicates 3.3% of links inferred to be partial transit, and 1.2% inferred to be hybrid relationships
- some data on the size of customer cones/traffic levels for hybrids
- hybrid relationships can be unintentional
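A toy version of that inference, under my reading of the talk (not the authors’ algorithm): label a link by which neighbour classes one side re-exports the other’s prefixes to.

```python
# Hedged sketch: partial transit as "p2c with restricted export scope".
def classify(export_classes):
    """export_classes: subset of {"providers", "peers", "customers"} that
    receive this neighbour's prefixes."""
    if export_classes == {"providers", "peers", "customers"}:
        return "p2c (conventional transit)"   # customer routes go everywhere
    if export_classes == {"customers"}:
        return "p2p (peering)"                # peer routes reach customers only
    if "customers" in export_classes:
        return "partial transit"              # restricted scope, e.g. no providers
    return "unclassified"

print(classify({"peers", "customers"}))       # partial transit
```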
Q&A:
- European ASes are over-represented in the results, partially explained by the different ecosystem
Peering at Peerings: On the Role of IXP Route Servers
- questions: what are IXP route servers, how do they work, what peering opportunities do they offer
- more peering leads to greater benefit for each member, but peerings require effort and coordination, and strain low-end routers
- IXPs offer route servers as a solution; an ISP establishes a single session with the route server, making peering easy (toy model below)
- route server filters prefixes on import (to avoid hijacking) and applies per-peer export filters
- route server prefix distribution is bimodal: lots of prefixes advertised to all members using the RS, and lots that are advertised to very few
- in terms of traffic and coverage, the data indicates the majority of ASes use the RS but still have bilateral peering agreements with other networks to exchange data
- using the RS can mask who is at fault when something fails (the RS or the other peer)
- thus, bilateral peering is used for traffic-intensive peering arrangements
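A toy model of the mechanics above (my sketch, not any IXP’s implementation): one multilateral session per member, with per-peer export filters applied when the RS re-advertises.

```python
# Toy route server (my sketch, not an IXP implementation): members announce
# prefixes over a single multilateral session; the RS applies per-peer
# export filters when re-advertising.
class RouteServer:
    def __init__(self):
        self.routes = {}     # prefix -> announcing member
        self.exports = {}    # member -> set of members it exports to

    def announce(self, member, prefix, export_to):
        self.routes[prefix] = member
        self.exports[member] = export_to

    def rib_for(self, member):
        # A member sees only prefixes whose announcer exports to it.
        return {p for p, m in self.routes.items()
                if m != member and member in self.exports[m]}

rs = RouteServer()
rs.announce("AS1", "10.0.0.0/24", export_to={"AS2", "AS3"})
rs.announce("AS2", "10.0.1.0/24", export_to={"AS1"})
print(rs.rib_for("AS3"))   # {'10.0.0.0/24'}: AS2 doesn't export to AS3
```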
Q&A:
- Q: are route servers built to be highly available? do peers have fallbacks? A: the IXPs they spoke to have 100% uptime, so … we don’t know.
- Q: What do RSes mean for net neutrality? A: the RS will expose peering policies the peers apply to the RS, but not if they have bilateral agreement; bilateral agreements more likely if you want to violate net neutrality
- Q: are IXPs in other regions trying to catch up with Europe?
Session 2: Understanding (Mobile) Broadband Networks
Measuring the Reliability of Mobile Broadband Networks
- measure the experienced reliability on network, data, and application layers; think about the user
- how do you define reliable?
- ability to register on the network and establish a session, and how long it is available
- data: ability to send/recv packets
- useful connection; for example, can it carry VoIP
- performance: throughput/goodput, to a reasonable bitrate
- reliability through multihoming
- operators often share radio access networks, permitting greater visibility into where failures happen
- interesting: 25% of connections are down for more than 10 minutes per day; the RAN (radio access network) is the dominant factor in downtime; high correlation between downtime and SNR
- “higher than expected” loss rates; loss runs indicate most (~60%) runs are 1 packet in length, with a spike around 5–6 packets, where the session is erroneously marked as “idle” despite sending packets
- downloads fail most often because of inability to establish a TCP connection
- multihoming can give 99.999% availability
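- (intuition for that figure: with independent failures, combined unavailability is the product of the individual ones, so two networks each ~99.7% available give roughly 1 − 0.003² ≈ 99.999%)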
Q&A
- are your results affected by the volume of traffic you’re sending? A: we’re aware operators may apply different policies for high volume users, but we maintained a low bitrate. That’s why we were demoted to “idle”, because we weren’t transferring a high enough bitrate
- have you looked at latency? A: we’re looking at it, as ongoing work
Behind the Curtain - Cellular DNS and Content Replica Selection
- DNS for CDN selection is usually the same as static networks, but
- client IPs are dynamically assigned, have no geographic anchor, and anycast routing is unstable
- cellular networks have fewer egress points than traditional ISPs, though increasing in tandem with the deployment of 4G and its low latencies
- measured 6 cell networks (4 in US, 2 in South Korea); app: namehelp mobile; 350 devices, 280K experiments; samples every 30 minutes for five months
- assertion: cellular DNS is a poor location signal
- cellular DNS is highly dynamic, leading to CDNs returning different sets of replicas on a regular basis
- anycast routed public DNS resolvers also suffer from unstable mappings
Q&A
- when did you collect the data? A: march 2014 through october 2014
- does EDNS(0) modify the result? A: probably not given how dynamic client IPs are, but this is ongoing work
When the Internet Sleeps: Correlating Diurnal Networks With External Factors
- we know traffic is diurnal (seen locally everywhere)
- what about IPv4 address usage, and can we see the global view?
- direct observation: count active addresses over time; find diurnal patterns; draw correlations on location, link type
- why study this?
- sleep reflects policy
- sleep correlates with things such as GDP
- sleep affects outage detection; must not confuse “sleep” with “down”
- … how big is the internet?
- contributions: new methods for analysis, and the application of those methods
- correlating diurnal patterns with many factors: ANOVA (analysis of variance; illustrated below)
- factors: GDP (strong correlation), electricity consumption (weak correlation), number of internet users per host, time of first block allocation, mean age of allocation (weak correlation; stricter policies on newer allocations may be enforcing recycling)
- link type: inferred from DNS (unexpected correlation: seemingly DSL lines correlate with diurnal patterns)
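To illustrate the ANOVA step (using SciPy’s one-way ANOVA, not necessarily the authors’ exact pipeline; the diurnal-amplitude numbers and GDP groupings are made up):

```python
# Illustration only: one-way ANOVA over made-up diurnal amplitudes,
# with countries binned into hypothetical GDP groups.
from scipy.stats import f_oneway

low_gdp = [0.82, 0.91, 0.78, 0.88]    # strong day/night swing
mid_gdp = [0.55, 0.62, 0.49, 0.58]
high_gdp = [0.21, 0.18, 0.30, 0.25]   # flat, always-on usage

f_stat, p_value = f_oneway(low_gdp, mid_gdp, high_gdp)
print(f"F={f_stat:.1f}, p={p_value:.4g}")   # small p: groups differ
```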
Q&A
- this work hasn’t been compared to the questionable 2012 internet census
- why is the US so stable? Perhaps DSL policies, folks don’t care so much about energy consumption, etc
Need, Want, Can Afford - Broadband Markets and the Behavior of Users
- goal: explore the impact of capacity, price, cost of upgrading, and connection quality on broadband users’ behaviour
- challenges: requires a large dataset across a range of broadband markets, and requires scale to isolate confounding factors
- dataset includes aqualab’s Dasu (worldwide), and FCC/SamKnows (US), covering 53,000 users in 160 countries
- monthly cost translated into $ using purchasing power parity (PPP)
Q&A
- did you look at usage-based or variable pricing? A: we weren’t focussing on this, but it’d be an interesting direction
Session 4: Mobile Systems and Networks
WiFi, LTE, or Both? Measuring Multi-homed Wireless Internet Performance
- IP ID monotonicity; Windows and iOS have distinct patterns (sketch below)
- TCP Timestamp Option: huh, Windows Phone has the TCP timestamp option disabled by default
- Clock frequency stability
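My sketch of why IP ID monotonicity fingerprints a stack: some OSes increment a global IP ID counter, so consecutive IDs from one host rise monotonically modulo 2^16, while others randomise. The ID sequences below are invented.

```python
# Invented ID sequences; some stacks use a global incrementing IP ID
# counter, so consecutive IDs rise monotonically modulo 2^16.
def mostly_monotonic(ids, tolerance=0.9):
    # Count increments, treating 16-bit wraparound as an increment.
    ups = sum(1 for a, b in zip(ids, ids[1:]) if (b - a) % 2**16 < 2**15)
    return ups / (len(ids) - 1) >= tolerance

print(mostly_monotonic([65530, 65534, 2, 8, 15]))    # True (wraps past 2^16)
print(mostly_monotonic([100, 60000, 5, 40000, 3]))   # False (randomised)
```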
Session 5: Theory Underpinnings
Node Failure Localization via Network Tomography
- under what conditions can this work uniquely localise failed nodes?
- how many failed nodes can be uniquely localised?
Efficient Large Flow Detection over Arbitrary Windows: An Algorithm Exact Outside An Ambiguity Region
- large flow detection: flows that consume more than some threshold; e.g., DoS attacks
- “arbitrary window model” checks “every possible time window in the past”; general solution, impossible for large flows to evade
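A hedged sketch of the problem setting rather than the paper’s algorithm: “large in every possible window” is naturally expressed as a leaky-bucket test, which a bursty flow cannot evade by timing its bursts around fixed window boundaries.

```python
# Sketch: a flow violates a "rate * window + burst" bound in *some*
# window iff a leaky bucket draining at `rate` ever exceeds `burst`.
def violates(packets, rate, burst):
    """packets: time-sorted (timestamp_s, size_bytes) for one flow."""
    level, last_t = 0.0, packets[0][0]
    for t, size in packets:
        level = max(0.0, level - rate * (t - last_t)) + size   # drain, then fill
        last_t = t
        if level > burst:
            return True    # some window ending at t exceeded the allowance
    return False

# 10 kB/s allowance with a 5 kB burst; a 20 kB spike in 0.1s violates it.
print(violates([(0.0, 4000), (0.1, 20000)], rate=10000, burst=5000))   # True
```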
Crossroads: A Practical Data Sketching Solution for Mining Intersection of Streams
- identify significant performance anomaly events in real-time in a large cell network
- this can be viewed as a conventional association-rule mining problem, iff it were possible to record everything
OFSS: Skampling for the Flow Size Distribution
- sampling & sketching
- consider flow size distribution
- state of the art of NetFlow flow sampling is the great destroyer of the flow size distribution
- simple sketch onto a counter array
- flow sampling requires a flow table, impacting performance; sketching is very fast, but may have collisions
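A minimal counter-array sketch in the generic count-min style (not OFSS itself), showing why sketching is fast and fixed-memory but collision-prone:

```python
# Generic counter-array sketch, not OFSS: hash each flow into a small
# array of counters; fast and fixed memory, but colliding flows share a
# counter, which skews the flow size distribution.
import hashlib

class CounterSketch:
    def __init__(self, width=1024):
        self.counters = [0] * width

    def _slot(self, flow_key):
        digest = hashlib.sha1(flow_key.encode()).digest()
        return int.from_bytes(digest[:4], "big") % len(self.counters)

    def add(self, flow_key, packets=1):
        self.counters[self._slot(flow_key)] += packets

    def estimate(self, flow_key):
        return self.counters[self._slot(flow_key)]  # overestimates on collision

cs = CounterSketch()
for _ in range(42):
    cs.add("10.0.0.1->10.0.0.2:443")
print(cs.estimate("10.0.0.1->10.0.0.2:443"))   # 42 (barring collisions)
```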
Session 6: Shedding Light on the Web
Dissecting Web Latency in Ghana
- the web in developing countries is slow
- connection speeds are increasing as average page sizes are increasing; server locations, routing configuration, and submarine cable layouts do not help
- in the example presented, DNS resolution in 2012 was a large contributor, but this had halved by 2014; connect() time doubled between 2012 and 2014
- DNS lookup is the dominant factor; 15–40% contribution to average page load time
- redirects occur on 80% of websites; 20–25% contribution
- TLS/SSL has an increasingly large impact; 8–15% of requests required TLS/SSL
- require better caching schemes and/or new CDN architectures and/or redesigning web pages for better caching
Q&A
- how generalisable is this? A: this probably carries for other countries
Session 7: Internet Censorship
A Look at the Consequences of Internet Censorship Through an ISP Lens
- require data snapshots before and after censorship events
- examines consequences of internet censorship in the context of a medium-sized ISP in Pakistan
- data between October ’11 and August ’13
- Nov ’11: thousands of porn domains blocked; Sep ’12: YouTube blocked
- entire analysis based on Bro protocol logs
- traces split into SOHO and residential
- network dumps captured in ISP’s core network
- example: consistently receiving no DNS response for a domain implies censorship
- observation: no shift to public DNS resolvers for residential users
- observation: noticeable shift to public DNS resolvers for SOHO users
- collateral damage: after the YouTube block, Google Docs traffic also dropped noticeably
Q&A
- they did not have the ability to associate traffic to particular users
Censorship in the Wild: Analyzing Internet Filtering in Syria
- measuring censorship usually entails probing (generate requests, see what gets blocked)
- inherently limited by scale of measurement possible
- 600GB of logs from 7 Blue Coat SG-9000 proxies leaked from Syria in summer 2011 by Telecomix
- data has flow-level identifiers (with source IP removed or hashed), plus HTTP details, plus results of filtering decision on device
- broadly: 93.2% of requests allowed; 6.3% denied (5.3% network error, and 1% “policy denied” (7M) or “policy redirect” (2K)); 0.5% proxied, response is cached someplace
- observation: false positives on keyword filtering: “proxy”, for example, is a common word
- observation: Metacafe, Skype, Wikimedia, *.il, amazon.com, for example, blocked
- observation: social media: facebook.com often allowed, but not always; particular pages that may be politically sensitive on facebook are blocked
- observation: entire subnets filtered, representing Israel, Kuwait, Russia, etc
- anti-censorship tech: Tor was not filtered during the study (it is now); Google cache was still being used to access censored content
- ethical considerations: this is sensitive data; encrypted at rest; aggregated stats only; IRB approval
Q&A
- Tor traffic is identified as traffic traveling to known public Tor entry relays
Capturing Ghosts: Predicting the Used IPv4 Space by Inferring Unobserved Addresses
- how much space is actively used?
- data collection -> capture-recapture -> population estimates (estimator sketch below)
- collects IPv4 addresses from multiple (9?) different data sources
- regions that will run out first are LACNIC and APNIC, then AfriNIC, then RIPE, then ARIN
- estimates 1.2G IPv4 addresses used (45% of publicly routed space)
- 6.2M /24 subnets used (60% of publicly routed space)
- significant unused space (especially legacy)
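A sketch of the capture-recapture step using the classic two-sample Lincoln-Petersen estimator (the paper combines many more sources; the IPs below are toy data):

```python
# Toy two-sample capture-recapture estimate; the paper fuses many sources.
def lincoln_petersen(seen_a, seen_b):
    overlap = len(seen_a & seen_b)
    return len(seen_a) * len(seen_b) / overlap    # N ~= n1 * n2 / m

a = {"1.1.1.1", "2.2.2.2", "3.3.3.3", "4.4.4.4"}  # addresses seen by source A
b = {"3.3.3.3", "4.4.4.4", "5.5.5.5", "6.6.6.6"}  # addresses seen by source B
print(lincoln_petersen(a, b))   # 8.0: estimate of the total used population
```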
Session 9: Illuminating Malicious Behavior
Handcrafted Fraud and Extortion: Manual Account Hijacking in the Wild
- 20% of folks in the US believe their online accounts have been broken into
- Google’s hijack taxonomy: targeted / manual (low volume, manual work; today’s focus) / automated (high volume, not much damage) hijacking
- focus: credential theft, account exploitation, and remediation
- manual hijackers mainly use phishing to steal credentials
- phishing page efficiency: average success rate, 13.78%
- victims are lured to phishing pages via email; 99% of the HTTP requests to phishing pages have no Referer header
- 20% of decoy accounts accessed in less than 30 minutes; 50% within 7 hours
- number of accounts attempted per IP is really low, and really stable
Q&A
- IPs come from tor, public proxies, VPNs; all the places you might expect
Session 12: SSL and Heartbleed
The Matter of Heartbleed
- experiment did not exploit the vulnerability
- 45% of all sites support HTTPS; 60% of those support the heartbeat extension
- this doesn’t mean all those 60% were vulnerable, but estimate that 24-55% likely were
- 11% of HTTP hosts on IPv4 supported heartbeat, and 6% of those hosts were vulnerable
- attack scene: no evidence of attack prior to disclosure; first scan traffic 22 hours after disclosure, from the University of Latvia; observed 6000 probe attempts from 692 hosts
- only saw 11 hosts that hit all measurement points, therefore few hosts doing full internet scans
- two weeks after disclosure, 600,000 hosts remained vulnerable
- only 10.1% of vulnerable sites replaced their certs
- 14% of those who replaced their certs re-used their old private key
- 4% revoked their vulnerable certs
Forced Perspectives: Evaluating an SSL Trust Enhancement at Scale
- many SSL trust enhancements have been proposed to fix the CA trust model (DANE, Google’s Certificate Transparency, network probes: Convergence, Perspectives)
- how do we evaluate performance of these trust alternatives when they have few users?
- performed a university-scale case study of Convergence, with workloads synthesised from anonymised university-wide traces
- results on convergence notary performance; generated a workload by mapping one SSL handshake to one call to Convergence
- 0.06% increase in traffic relative to SSL; low-cost
- one server supports entire university’s traffic
- client overhead is minimal (~250ms)