Passive and Active Measurement Conference (PAM) 2013, day 1
Preamble
The notes that follow are a mixture of what each speaker said, or bullets listed on slides, or thoughts of my own. If you were at PAM and you spot errors (likely), feel free to point them out, and I’ll fix them. Not all papers are covered, but a good bunch of them are.
Conference Website: http://pam2013.comp.polyu.edu.hk/
Dates: 18 – 19 March, 2013. 74 papers were submitted, of which 24 were accepted.
Session 1
Measurement Artifacts in NetFlow Data
Speaker: Rick Hofstede
Towards Fast and Efficient IP-level Network Topology Capture
Speaker: Thomas Bourgeau
Detecting Third-party Addresses in Traceroute Traces with IP Timestamp Option
Speaker: Pietro Marchetta
- Motivation: IP-level Internet topology is essential for emulation, simulation, management, resource allocation, etc.; BGP-derived AS-level topologies are incomplete, and traceroute is inaccurate.
- Third party addresses: an address which does not belong to any interface on the actual IP path toward the destination. (Origin: ICMP response source address is set to the IP address for the interface on the router on which the router chooses to emit its response; no guarantee of being the interface the original traffic arrived on.)
- Problem: third-party addresses may cause the inference of false AS-level links.
- Question: is an IP address discovered by traceroute a third-party address or is it part of the actual traversed path?
- Technique: send an ICMP echo request to Y with the IP timestamp option requesting four timestamps from Y (YYYY); Y is considered classifiable only when the ping reply carries at least 1 but fewer than 4 timestamps. If Y is classifiable, send a UDP packet toward the destination D with timestamps again requested from YYYY. If the response to the UDP packet contains at least one timestamp from Y, then Y is considered on-path; otherwise, it is a third-party address.
- Hop classifiability: 51% of IPs are considered classifiable; 47.6% are non-classifiable.
- Most classifiable hops appear in several paths from multiple vantage points toward multiple destinations. Paper considers one source with many destinations; many sources with one destination; many sources with many destinations.
- AS loops: third party addresses appear to be the cause of 37% of AS loops.
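The decision rule in the technique above can be sketched in a few lines of Python. The function name and its two inputs are assumptions for illustration, not taken from the paper; the sketch only encodes the thresholds listed in these notes and does no packet I/O.

```python
from typing import Optional


def classify_hop(ping_stamps: int, udp_stamps_from_hop: Optional[int]) -> str:
    """Classify a traceroute hop Y from timestamp counts (hypothetical helper).

    ping_stamps: timestamps Y placed in its reply to the direct ICMP echo.
    udp_stamps_from_hop: timestamps from Y seen in the response to the UDP
        probe sent toward D, or None if that probe was never sent.
    """
    # Classifiable only with at least 1 but fewer than 4 timestamps.
    if not 1 <= ping_stamps < 4:
        return "unclassifiable"
    if udp_stamps_from_hop is None:
        raise ValueError("a classifiable hop needs the UDP probe result")
    # At least one timestamp from Y on the way to D means Y was traversed.
    return "on-path" if udp_stamps_from_hop >= 1 else "third-party"


print(classify_hop(2, 1))  # on-path
print(classify_hop(1, 0))  # third-party
```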
FlowSense: Monitoring Network Utilization with Zero Measurement Cost
Speaker: Curtis Yu
- SDNs allow centralised policy and reactive control of network. Reroute around congested links. Need to know when links are congested.
- Active measurement: for example, injection of SNMP probes
- Passive measurements: expensive instrumentation and infrastructure setup
- SDN measurements, switch polling; additional control traffic
- FlowSense: leverage existing control traffic to measure the network. No additional traffic; the network informs the system of changes. As accurate as switch polling.
- OpenFlow messages carry utilisation information: PacketIn is sent on the first packet in a flow; FlowRemoved contains the duration of the entry in the flow table and the amount of traffic matched. From these, the utilisation contributed by a flow on a link can be inferred.
- Post-hoc link utilisation. Log incoming utilisations from FlowRemoved notifications, and update checkpoints created at previous FlowRemoved timestamps.
- In the median case, total utilisation is known after around 100 seconds.
- However, their data indicates that 90% of the total utilisation can be reported after 10 seconds for 70% of the checkpoints.
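As a rough illustration of the checkpoint idea above, here is a minimal Python sketch. The class and method names are hypothetical, and the model is simplified: each flow is assumed to send at its average rate for its whole lifetime, and timeouts that inflate the reported duration are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    time: float
    utilisation_bps: float = 0.0


@dataclass
class LinkMonitor:
    """Hypothetical per-link state fed by FlowRemoved notifications."""
    checkpoints: list = field(default_factory=list)

    def on_flow_removed(self, removed_at: float, duration_s: float,
                        byte_count: int) -> float:
        if duration_s <= 0:
            raise ValueError("duration must be positive")
        start = removed_at - duration_s
        rate_bps = byte_count * 8 / duration_s  # average rate of this flow
        # Every FlowRemoved creates a checkpoint at its own timestamp...
        self.checkpoints.append(Checkpoint(removed_at))
        # ...and updates all checkpoints the flow was active at.
        for cp in self.checkpoints:
            if start < cp.time <= removed_at:
                cp.utilisation_bps += rate_bps
        return rate_bps


monitor = LinkMonitor()
monitor.on_flow_removed(removed_at=10.0, duration_s=10.0, byte_count=1250)
monitor.on_flow_removed(removed_at=15.0, duration_s=10.0, byte_count=2500)
for cp in monitor.checkpoints:
    print(cp.time, cp.utilisation_bps)
```

The checkpoint at t=10 is only complete once the second flow (active from t=5 to t=15) has been removed, which is why the utilisation is known post hoc rather than immediately.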
Session 2
How to Reduce Smartphone Traffic Volume by 30%?
Speaker: Subhabrata Sen
- What is the effectiveness/feasibility of redundancy elimination techniques for smartphone data traffic?
- Study off-the-shelf RE techniques: their effectiveness if individually applied, when jointly applied, and their computational overhead.
- Techniques:
  - Caching (HTTP): 17% reduction in traffic volume if caching is fully utilised.
  - Delta encoding (HTTP)
  - File compression (HTTP)
  - Packet stream compression (application agnostic)
- Effectiveness metric: compression ratio (CR) = traffic volume AFTER applying RE / traffic volume BEFORE applying RE
- Result: file compression results do not matter much
- Result: delta encoding is slightly better than caching; caching handles zero delta, so delta encoding brings limited additional benefits
- On under-utilisation of compression: many HTTP requests do not contain Accept-Encoding; some servers do not compress even when Accept-Encoding has been sent by a client.
- Result: additional reduction in smartphone traffic by more than 30% with reasonable smartphone utilisation.
- Under-utilisation of compression is a key culprit; gzip compression brings good traffic reduction with the lowest overhead. Decompression on the phone is slow for 7-zip, reasonably slow for bzip2, and fast for the other algorithms. Packet stream compression (MODP) is very useful.
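To make the compression ratio metric concrete: a CR of 0.7 means the traffic volume after RE is 70% of the original, i.e. a 30% reduction. The Python sketch below computes CR for a made-up, highly repetitive payload using gzip from the standard library; the payload and function name are assumptions for illustration and say nothing about the ratios seen in the study.

```python
import gzip


def compression_ratio(bytes_before: int, bytes_after: int) -> float:
    """CR = volume AFTER applying RE / volume BEFORE applying RE."""
    if bytes_before <= 0:
        raise ValueError("bytes_before must be positive")
    return bytes_after / bytes_before


# Hypothetical response body; real smartphone traffic is far less repetitive.
body = b"<li class='item'>hello</li>\n" * 200
cr = compression_ratio(len(body), len(gzip.compress(body)))
print(f"CR = {cr:.3f}, reduction = {(1 - cr) * 100:.1f}%")
```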
Modeling Cellular User Mobility Using a Leap Graph
Speaker: Seungjoon Lee
- Short-term user mobility prediction allows mobile network providers to optimise resources (handover; pre-fetching).
- Existing approaches include GPS and Wi-Fi; issues exist with coverage, privacy, and energy consumption. They need an extra layer of mapping to get base-station level data.
- Challenges in handover detection: the active set of base stations is not unique for a single location, and a handset can exist in any combination of tens of sectors in densely covered regions. Not all handovers are due to mobility: load balancing, radio signal fluctuation.
- Significant noise == incorrect mobility prediction. How to extract actual user mobility?
- Mobility prediction using “leap graph”; adjacent base stations are potentially non-mobility induced; focus on non-adjacent base stations, “leap edges”.
- Determining leap traces: identify overlapping sectors (via knowledge of configuration, empirical data from training period); create leap traces.
- Mobility prediction on leap graphs:
  - higher prediction accuracy with higher-order Markov models
  - benefit from destination information is marginal
- The data analysis is not able to identify genuine mobility in handovers between adjacent sectors
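A minimal Python sketch of the leap-trace idea, under my reading of these notes: handovers between overlapping sectors are dropped as potentially non-mobility induced, and a Markov model is trained on what remains. The function names, the overlap set, and the first-order model are assumptions for illustration; the paper reports better accuracy with higher-order models.

```python
from collections import Counter, defaultdict


def leap_trace(handovers, overlapping):
    """Keep only handovers that 'leap' to a non-overlapping sector."""
    trace = []
    for sector in handovers:
        if not trace:
            trace.append(sector)
        elif (sector != trace[-1]
              and frozenset((trace[-1], sector)) not in overlapping):
            trace.append(sector)
    return trace


def train_markov(traces):
    """Count first-order transitions between consecutive leap-trace sectors."""
    counts = defaultdict(Counter)
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, current):
    """Return the most frequently observed next sector, or None if unseen."""
    followers = counts.get(current)
    return followers.most_common(1)[0][0] if followers else None


# Hypothetical sector IDs and overlap knowledge.
overlapping = {frozenset(("A", "B")), frozenset(("C", "D"))}
print(leap_trace(["A", "B", "A", "C", "D", "C", "E"], overlapping))
# ['A', 'C', 'E']
```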
Keynote: Endace and DAG Technology, 1995 – 2013
Speaker: Ian Graham, Endace
Session 3
Estimating TCP Latency Approximately with Passive Measurements
Speaker: Sriharsha Gangam
- Passive measurements in the middle of the network. Decompose path latency of TCP flows.
- Existing methods are accurate but expensive. SEQ/ACK matching.
- ALE: Approximate Latency Estimator. Goal: configurable tradeoff between accuracy and overhead.
- Sliding window of buckets (time intervals); each bucket is a counting Bloom filter (CBF)
- Controlling error with ALE parameters: increasing the window span W gives higher coverage; decreasing the bucket width w gives higher accuracy.
- Process large and small latency flows simultaneously; absolute error is proportional to the latency.
- ALE-E: ALE-Exponential: variable buckets of width w; larger, older buckets shift slower.
- Error sources: Bloom filters are probabilistic structures, with false positives and negatives. Artifacts from TCP: retransmits and out-of-sequence packets; ACK numbers not on SEQ boundaries (cumulative ACKs)
- Evaluation: backbone link traces from CAIDA; ground truth/baseline comparison by emulating TCP state machine. Compare latencies measured by ALE and tcptrace. Compare overhead (memory, computation) introduced by ALE and tcptrace.
- Memory overhead is interesting: tcptrace can take up to 468MB CSS, ALE-U(96) consumes 9.8MB regardless of sampling rate.
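The sliding window of buckets can be sketched as follows. This is a simplified illustration, not the authors' implementation: plain Python sets stand in for the counting Bloom filters (so there are no false positives here), times are integers in milliseconds, and the class name, method names, and midpoint estimate are assumptions.

```python
from collections import deque


class ApproxLatencyEstimator:
    """Uniform buckets of width w covering a window of span W (both in ms)."""

    def __init__(self, span, bucket_width):
        self.w = bucket_width
        n = span // bucket_width
        # buckets[0] is the newest interval; maxlen drops the oldest one.
        self.buckets = deque((set() for _ in range(n)), maxlen=n)
        self.head = 0  # index of the interval that buckets[0] covers

    def _advance(self, now):
        target = now // self.w
        while self.head < target:
            self.buckets.appendleft(set())
            self.head += 1

    def on_data(self, now, flow, expected_ack):
        """Record the ACK number that will acknowledge this data segment."""
        self._advance(now)
        self.buckets[0].add((flow, expected_ack))

    def on_ack(self, now, flow, ack):
        """Return an approximate latency in ms, or None if nothing matches."""
        self._advance(now)
        for age, bucket in enumerate(self.buckets):
            if (flow, ack) in bucket:
                bucket.discard((flow, ack))
                return age * self.w + self.w / 2  # midpoint of the bucket age
        return None


ale = ApproxLatencyEstimator(span=80, bucket_width=10)
ale.on_data(3, "flow-1", 1001)
print(ale.on_ack(47, "flow-1", 1001))  # 45.0 (true latency is 44 ms)
```

The estimate is off by at most the bucket width, which is the accuracy side of the tradeoff; anything older than the span is simply forgotten, which is the coverage side.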
Effect of Competing TCP Traffic on Interactive Real-Time Communication
Speaker: Ilpo Jarvinen
- How well does VoIP work in the presence of competing TCP traffic? Especially interesting is web traffic, with transient and parallel TCP connections.
- Tested a variety of workloads: CBR-16kbps isolated; audio + bulk transfer; audio + web workload. Testing against a real HSPA network and a fixed server; multiple test iterations with wireless issues causing duplicates, reordering, consecutive losses, and long delay spikes.
- Results for the isolated audio with no other traffic are good; audio + bulk transfer with deep buffering causes a delay increase, and interactivity is destroyed (delays of over a second); audio with one or two HTTP flows is acceptable with the initial window set to 3, but there is more delay with higher initial congestion windows or more flows.
- Jitter filter: the jitter filter “drops” late-arriving audio packets, mimicking time-bound playback of media. These packets are not lost physically, only delayed too much to be useful.
- Loss period level: based on the loss periods (RFC 3357) the codec encounters due to consecutive packets being “dropped”.
- IP packet delay variation confirms that worst-case delay spikes occur during initial window.
- Larger initial windows (of up to 10) are much worse for the competing media flow.
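The jitter filter and loss periods described above can be sketched in Python as below. The playout budget, the packet timings, and the function names are hypothetical; the notes do not record the threshold the authors used.

```python
def jitter_filter(send_times, arrival_times, playout_delay):
    """Mark each packet as 'dropped' if it is lost or arrives too late.

    arrival_times uses None for packets that never arrived.
    """
    return [
        arrival is None or arrival - sent > playout_delay
        for sent, arrival in zip(send_times, arrival_times)
    ]


def loss_periods(dropped):
    """Lengths of runs of consecutive dropped packets."""
    periods, run = [], 0
    for is_dropped in dropped:
        if is_dropped:
            run += 1
        elif run:
            periods.append(run)
            run = 0
    if run:
        periods.append(run)
    return periods


# One packet every 20 ms; a delay spike hits packets 3 and 4 (times in ms).
send = [0, 20, 40, 60, 80, 100]
arrive = [50, 75, 300, 310, 135, None]
dropped = jitter_filter(send, arrive, playout_delay=100)
print(dropped)                # [False, False, True, True, False, True]
print(loss_periods(dropped))  # [2, 1]
```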
Performance Implications of Unilateral Enabling of IPv6
Speaker: Michael Rabinovich
- Question: what are the implications of unilateral IPv6 deployment?
- Plausible scenario: parallel v6 and v4 attempts, described in RFC 6555.
- Plausible scenario: sequential v6 then v4 attempts; inherent delay penalty.
- Macro-behaviour the result of complex interactions: browser, OS, DNS resolvers
- Experimental setup describes a sequence of DNS interactions, custom URLs to associate a DNS query with the resulting http request, and non-existent v6 addresses, to match the locations of DNS resolvers for a client, and measure the time to v4 failover. Ran a 28 day measurement.
- Conclusion: no evidence of a performance penalty for unilateral IPv6 enabling
- Small increase in failure rate (from 0.0038% to 0.0064%)
- Study limitation: one-second time measurement granularity.
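To illustrate why the sequential scenario carries an inherent delay penalty when the v6 address is unreachable, here is a toy Python model of the two scenarios. It is not the measurement method from the talk; the function names, the timeout, and the v4 head start are hypothetical parameters, with times in milliseconds.

```python
from typing import Optional


def sequential_delay(v6_connect: Optional[float], v4_connect: float,
                     v6_timeout: float) -> float:
    """Try v6 first; fall back to v4 only after the v6 attempt times out."""
    if v6_connect is not None and v6_connect <= v6_timeout:
        return v6_connect
    return v6_timeout + v4_connect


def parallel_delay(v6_connect: Optional[float], v4_connect: float,
                   v4_head_start: float) -> float:
    """Start v6, start v4 shortly after, and use whichever connects first."""
    v4_done = v4_head_start + v4_connect
    if v6_connect is None:
        return v4_done
    return min(v6_connect, v4_done)


# Non-existent v6 address (None) with a 50 ms v4 connect time.
print(sequential_delay(None, 50, v6_timeout=1000))   # 1050
print(parallel_delay(None, 50, v4_head_start=200))   # 250
```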