ACM Internet Measurement Conference (IMC) 2010: Day 1

Preamble

The notes that follow are a mixture of what each speaker said, or bullets listed on slides (verbatim, with minor US->UK deltas because I was touch-typing), or thoughts of my own. If you were at IMC and you spot errors (likely), do feel free to point them out, and I’ll fix them.

The amount of text written for each is proportional to certain variables: How tired I was during the talk, how interested I was in the talk, how much I knew about the subject material, and how good the speaker was. Note also that these are pretty raw. Questions are omitted if the question was not clear; many responses from the speaker have been shortened for brevity.

Conference Website: http://conferences.sigcomm.org/imc/2010/

Location: BMW Edge Amphitheatre, Melbourne, Australia

Dates: 1 – 3 November, 2010

This is the 10th IMC.
This IMC has the largest attendance of any IMC “in an overseas venue”.

Submissions breakdown:

211 submissions (largest ever, 15% increase from last year)
- 110 full (previous, 115)
- 101 short (previous, 68)
accepted: 24 long, 23 short
short:long ratio seen in submissions preserved to selection, not by design
11 papers were accepted before the TPC met
Best paper award went to: “network traffic characteristics of data centres in the wild”

Session 1: Services

Long paper: CloudCmp: Comparing Public Cloud Providers

Speaker: Ang Li

CloudCmp stands for “Cloud compare”
Framework to compare public cloud providers
Systematic comparator of cloud providers
Expecting massive growth
Many players brings problem of choice
Choosing the best cloud is hard.
Requirements for comparison:
- relevant to application performance
- comprehensive along multiple dimensions (e.g., locations, times of day)
- fair (independent from underlying differences between cloud providers)
- lightweight
Covers four common providers: Amazon, Rackspace Cloud, Windows Azure, Google App Engine
Focus on nine end-to-end metrics
- each is simple
- each has predictive value
- abstracts away implementation detail, can show variance
Snapshots from March to September
Results are inherently dynamic
Providers are anonymised :-(
Four common services:
- Compute cluster (elastic: virtual instances)
- Storage
- Intra-cloud network
- Wide-area network
Comparison:
- Instance performance, Java benchmarks
- Cost effectiveness, monetary cost per benchmark
- Scaling performance, scaling latency (time taken to allocate new instance)
- Good performance results may be affected by how heavily loaded the cloud was by other work
- Larger instances not cost effective if code cannot utilise those cores (self evident)
Scaling latency: Linux instances win, < 100 seconds.
Storage comparison:
- Covers blob, table, and queue storage services
- Compare read and write, and query on tables
- Metrics: latency, cost, time to consistency
Wide area:
- Network latency to closest data centre
- Used 260 PlanetLab vantage points
Q: How do you collect the data? How is your comparison fair across all four platforms?
A: Have a bunch of public tools and custom tools to get read data. Fairness, deliberately choose metrics independent from the providers. Metric is not dependent on platform.
Q: Follow up: How many samples? Different times?
A: Tried many different samples, up to 100, at different times without much visible variation.
Q: 10 minutes scaling time is acceptable? Flash crowds?
A: Good compared to, say, setting up a new physical node. Obviously better launch times would be preferable.
Q: Cost effectiveness: Are those results for single-threaded or multi-threaded benchmarks?
A: We didn’t show multi-threaded benchmarks, but our results show they are still not competitive. But we think they are still not cost effective. They are costed by core count. They are also bounded by memory and I/O.

Short paper: Comparing DNS resolvers in the wild

Speaker: Bernhard Ager

DNS performance is critical
DNS (mis)uses:
- Locality aware replies
- Dynamic load balancing
- NXDOMAIN catching (introduces advertising revenue for ISPs)
- Use as directory service
- Use of third-party resolvers (Google Public DNS, OpenDNS)
- [Vixie ‘09]
Study: Compare DNS servers across content, locations, and resolvers
Compare DNS deployment of different ISPs and different resolvers
Metrics: Responsiveness and quality of reply.
Each tested 3 DNS resolvers: Google DNS, OpenDNS, and local resolvers
10k+ hostnames
- Popular (top 5000 from alexa)
- less popular (bottom 2000 from alexa)
- Many objects on websites (3000+ “embedded” hostnames)
Two back-to-back queries for each hostname for each resolver.
60 traces from all around the globe, >50 different ISPs
Result: First vs. second query: small variance for second query due to caching
Local DNS apparently better than OpenDNS and Google DNS
Patterns change depending on vantage point. Very different result. Load balancing on local resolvers? Google and OpenDNS appear similar, though baseline RTT has changed.
Google and OpenDNS often outperform local DNS. For content locality, you have to use local DNS.
Impact of redirection: How many replies are in the same AS as the vantage point?
http://www.fg-inet.de/ to take part.
Q: Re: Server caching: Have you tested to determine how large the load balancing cluster is?
A: No, but it’s interesting. But is it useful? Does it matter to test?
Q: There are minimal thresholds for google and open. What establishes minimal threshold?
A: This is the minimum. Related to RTT to resolver.

Long paper: Improving content delivery using provider-aided distance information

Speaker: Ingmar Poese

Web and streaming dominate net traffic, both run over HTTP
CDN Caches: Inside ISPs
DNS based cache selection: Client queries -> Redirect to CDN -> CDN chooses cache(s) -> Return via resolver -> Connect to cache
Known metrics: Cache load, Content availability
Unknown: Exact position, Path properties
Data: 14 day trace from a POP in a large European ISP; 1.2 billion HTTP requests (89 million/day)
Examine top 10,000 hostnames
- Exposed location diversity
- Potential for content delivery
Very little diversity in responses:
- Most provide 1 IP address in response
- 7 or 8% offer two IP addresses in DNS response
Ubiquitous caches
- Serve anything
- Fetch missing
- Provides much location variation (?)
CDNs currently do not expose location diversity. Can content delivery be improved with location diversity?
PaDIS: Provider-aided Distance Information System
Can utilise diversity in paths to locations to reduce page load delay, reduce download time for large files
Steps: DNS query; Find auth. DNS server; Receive auth. DNS answer; Send answer to PaDIS; PaDIS aggregates and reorders known IPs; DNS resolver sends top ranked IPs back to client
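The "aggregates and reorders" step might look something like the sketch below. The talk did not say how PaDIS ranks candidates, so the `path_cost` table, the function name, and the example addresses are all assumptions made for illustration.

```python
def rerank_answers(candidate_ips, path_cost, top_n=2):
    """Return the top_n candidate IPs ordered by ISP-internal path cost.

    path_cost is a hypothetical table of per-IP costs (e.g. delay in ms)
    that only the ISP knows. IPs missing from the table get infinite cost,
    so they sort last rather than being dropped before ranking.
    """
    ranked = sorted(candidate_ips, key=lambda ip: path_cost.get(ip, float("inf")))
    return ranked[:top_n]


# Made-up costs and DNS answers using documentation address ranges.
costs = {"192.0.2.10": 35.0, "198.51.100.7": 8.0, "203.0.113.5": 19.0}
answers = ["192.0.2.10", "198.51.100.7", "203.0.113.5", "192.0.2.99"]
print(rerank_answers(answers, costs))
```

Because only the order of the answer changes, the client and the CDN see an ordinary DNS response.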
PaDIS would be operated by the ISP. No architecture changes needed. Transparent to consumer and CDN.
Used large CDN with 124 global locations, and 11 different files of varying sizes.
Method: Repeatedly download files from all locations, compare CDN results.
Summary: PaDIS can expose and utilise diversity in cache locations. Localise traffic, decrease delay and download times, give power back to the ISP.
Experiments show significant reduction in download time.
Q: CDN may be using multiple locations to load for specific large ISPs (?)
A: Yes, we can infer the caching strategy using this scheme. Need more data to do this, though.
Q: Is it for small files or big files?
A: Both. Depends on goal for content provider. If just trying to speed up downloads, only interested in delays. Small files, only care about latency.
Q: But is the maximum gain had by small or large files?
A: We see on one file, but we’re optimising for delay.
Q: Congestion seen on graph could be congestion in the last mile?
A: Suspect it’s the peering link being overloaded, based on consistent results throughout the week.
Q: The ISP does not know what to optimise for; it is not the content provider, as such.
A: But the ISP can know when certain links are overloaded. (This didn’t really answer question…)

Session 2: Security

Long paper: Detecting and characterizing social spam campaigns

Speaker: Hongyu Gao

Large scale experiment to confirm and quantify spam campaigns on online social networks
Uncover the characteristics of the campaign:
- They mainly use compromised accounts
- They mostly conduct phishing attacks
Model each wall post as a (description, URL) pair.
Build post-similarity graph. Edges connect wall posts that are “similar” in their model.
This reduces the problem of identifying potential campaigns to identifying connected subgraphs and therefore clusters.
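As a rough illustration of the connected-subgraph idea, here is a small union-find sketch. The similarity rule used (same URL or identical description) is a stand-in I chose for the example, not the rule from the paper.

```python
from collections import defaultdict


def cluster_posts(posts):
    """Group (description, url) posts into connected components.

    Two posts are linked if they share a URL or have an identical
    description. This similarity rule is an assumption for the sketch.
    Returns a sorted list of clusters, each a list of post indices.
    """
    parent = list(range(len(posts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    first_seen = {}  # maps a description or URL to the first post that used it
    for idx, (description, url) in enumerate(posts):
        for key in (("desc", description), ("url", url)):
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    clusters = defaultdict(list)
    for idx in range(len(posts)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


example_posts = [
    ("win an ipad", "http://a.example/"),
    ("free ipad now", "http://a.example/"),
    ("free ipad now", "http://b.example/"),
    ("happy birthday", "http://c.example/"),
]
print(cluster_posts(example_posts))
```

Posts 0, 1 and 2 end up in one cluster even though posts 0 and 2 share nothing directly, which is the point of using connected components rather than pairwise matching.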
Diurnal patterns: malicious posts follow a different pattern to “benign” posts, but the authors plot by percentages. The benign posts are at their most active around 3am, and late into the active portion of the diurnal cycle around 9pm.
Q: Resiliency of technique: How long after campaign starts do you do your analysis?
A: This is all offline. Done afterward.
Q: Hashing dependencies. They can quite easily construct things to generate different hashes.
Q: NLP along with URL. Have you tried using the URL alone?
A: We haven’t. Don’t know answer.
Q: bit.ly logs who clicks. Did you log this?
A: We didn’t, but it’d be interesting. Don’t have enough data.
Q: People delete spam posts. Does this affect the result?
A: Yes, users might delete these posts. Our measurement sets a lower bound for the spam posts. There could be many more. This is what is still there.

Long paper: Detecting Algorithmically generated malicious domain names

Speaker: Sandeep Yadav

Botnets may exhibit domain fluxing for evasion from the fastest C&C server detection techniques.
Automated domain fluxing requires algorithmic generation of domain names
Such domain names are composed of alphanumeric characters chosen randomly.
Thus, compare such a randomised distribution with non-malicious character distribution.
Can we exploit the randomness present in the domain names to determine whether they are malicious or not?
Perform a comparison on a random distribution, and non-random distributions, of characters. Non-malicious distribution different from malicious distribution.
K-L divergence: Compute the distance between test data and known distribution.
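A minimal sketch of the K-L comparison, assuming unigram character distributions over a 36-character alphabet with add-one smoothing. The alphabet, the smoothing, and the function names are my assumptions; the paper's exact setup may differ.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"


def char_distribution(names, smoothing=1.0):
    """Smoothed unigram character distribution over ALPHABET."""
    counts = Counter(ch for name in names for ch in name.lower() if ch in ALPHABET)
    total = sum(counts.values()) + smoothing * len(ALPHABET)
    return {ch: (counts[ch] + smoothing) / total for ch in ALPHABET}


def kl_divergence(p, q):
    """D_KL(P || Q) in bits. Both distributions must cover the same alphabet."""
    return sum(p[ch] * math.log2(p[ch] / q[ch]) for ch in p)


# Tiny made-up samples, only to show the calling convention.
known_benign = char_distribution(["google", "facebook", "wikipedia"])
test_group = char_distribution(["xq7zk2vw9j", "p0q8z7x6w5"])
print(kl_divergence(test_group, known_benign))
```

Smoothing keeps every probability non-zero, so the logarithm is always defined even when a character never appears in one of the samples.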
Conclusions: Domain fluxing requires randomised domain names which are detectable by statistical methods we use here.
Jaccard index proves to be the best measure.
Q: Internationalised domain names, could offer a “hiding place?”
A: Certainly these look random. ??
Q: You developed a few good methods to detect generated domain names. Presumably these methods can be used to ‘crack’ your algorithm.
A: Of course.
Q: Surely distribution across a URL, which is short, is extremely sparse. So presumably it’s difficult to accurately match?
A: ??

Long paper: Internet Background Radiation Revisited

Speaker: Eric Wustrow

Only 12 /8’s left for IPv4 allocation
Who gets 1.2.3.4? 1.0.0.0/24?
Are the remaining /8’s different?
Internet pollution: Volume of traffic, protocol and port distribution
Announced and collected: 1/8, 50/8, 107/8, 35/8.
Spatial study: Advertise the address blocks independently for periods of ~1 week.
Temporal study: 35/8 for about 4 years.
1/8 receives order of magnitude more traffic than the other /8’s.
Pretty much all /8’s see the same amount of TCP. 1/8 sees a lot more UDP.
Distribution across destinations is not uniform. Especially in 1/8, which attracts a lot of traffic under 1.0.0.0/8.
Pollution traffic increases at a factor of ~2 each year.
Destination distribution: less uniform now. “Emergence of conficker”.
Recent trend toward larger number of sources.
Similar amount of backscatter and scanning across all /8’s, but 1/8 attracts lots of “Other”
Prefix 1.1.1.0/24 attracts 44.5% of pollution (to 1/8?)
0x8000 identifies RTP version 2; payload type 00 == PCMU unencrypted audio; Converted one stream to .au using Wireshark; SIP INVITE attack
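For reference, this is how those two bytes decode under the fixed RTP header layout. The function name is mine; the bit positions are the standard ones (2-bit version, padding, extension, 4-bit CSRC count, then marker and 7-bit payload type).

```python
import struct


def parse_rtp_prefix(payload):
    """Decode the first two bytes of an RTP fixed header."""
    first, second = struct.unpack("!BB", payload[:2])
    return {
        "version": first >> 6,            # top two bits
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,    # 0 is PCMU in the static table
    }


print(parse_rtp_prefix(bytes.fromhex("8000")))
```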
Prefix 1.4.0.0 catches 17.5% of all 1/8; Interpreted as DNS queries for A, AAAA, MX; Potentially misconfigured secondary DNS
Prefix 1.x.168.192… Someone not doing proper htonl(ipaddr). (But source IP looks correct…) “RFC 33263”?
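To see why a missing byte-order conversion lands traffic in 1/8, here is a small sketch that reverses the four bytes of an address. The helper name is made up for the example.

```python
import socket
import struct


def swap_byte_order(dotted):
    """Return the IPv4 address obtained by reading the bytes in reverse order.

    This mimics code that treats a network-order (big-endian) address as a
    little-endian integer, or vice versa, without calling htonl()/ntohl().
    """
    packed = socket.inet_aton(dotted)                   # 4 bytes, network order
    (misread,) = struct.unpack("<I", packed)            # wrongly read as little-endian
    return socket.inet_ntoa(struct.pack(">I", misread))


print(swap_byte_order("192.168.0.1"))   # 1.0.168.192
```

Any private 192.168.x.1 gateway address turns into 1.x.168.192 this way, which matches the prefix pattern in the notes.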
Prefix 1.0.0.0/24; 1.1.1.0/24; 1.2.3.0/24; 1.4.0.0/24; 1.10.10.0/24 … these should be quarantined.
There is potential to clean up.
Q: Was this entirely passive?
A: Yes
Q: Clipping on the traffic slides for the 35/8?
A: We’re not sure what’s causing the clipping. Something upstream might be filtering? Unable to track source of clipping down.
Q: Most 1/8 traffic sent by Linux machines?
A: Yes
Q: Could you automate some of these things? Like fixing the byte ordering problems? Taxonomy of top-100 errors?
A: Yes. Interesting future work.

Session 3: Economics

Short paper: On Economic Heavy Hitters: Shapley Value Analysis of the 95th-Percentile Pricing

Speaker: Rade Stanojevic

Every year, ISPs carry 40% more traffic.
Revenues are not increasing; flat rate pricing.
Access ISPs need to become more efficient
- Use locality hashing
- Use CDNs
Content creators need to chip in
Customer pricing needs to be revised.
Need to formalise usage in a way that is intuitive; we need to quantify using real data.
Shapley value: Can use this to formalise the amount of unfairness induced by flat and other simple pricing schemes
Shapley value is fair but hard to use in practice. Use same $ per MB for all customers but change $ depending on time of day.
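To make "fair but hard to use" concrete, here is a brute-force sketch of the Shapley value where the cost of a group of customers is the 95th percentile of their aggregate traffic. The cost function, the nearest-rank percentile, and the traffic numbers are assumptions for illustration, and enumerating every ordering only works for a handful of users.

```python
import math
from itertools import permutations


def percentile_95(samples):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]


def shapley_values(traffic):
    """traffic: dict user -> list of per-slot volumes (equal lengths).

    Returns each user's average marginal contribution to the 95th-percentile
    cost, taken over all orderings in which users could join.
    """
    users = list(traffic)
    n_slots = len(next(iter(traffic.values())))

    def cost(coalition):
        if not coalition:
            return 0.0
        aggregate = [sum(traffic[u][t] for u in coalition) for t in range(n_slots)]
        return percentile_95(aggregate)

    totals = {u: 0.0 for u in users}
    orderings = list(permutations(users))
    for order in orderings:
        members = []
        for u in order:
            before = cost(members)
            members.append(u)
            totals[u] += cost(members) - before
    return {u: totals[u] / len(orderings) for u in users}


# With only four slots the nearest-rank 95th percentile is simply the peak.
demo_traffic = {"a": [10, 0, 0, 0], "b": [0, 10, 0, 0], "c": [5, 5, 5, 5]}
print(shapley_values(demo_traffic))
```

The number of orderings grows factorially with the number of customers, which is one way to read the "hard to use in practice" remark.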
Q: Cost of providing bandwidth apparently drops by 30 – 40% each year, in every field but wireless. Is this the best approach for wireless?
A: I think your numbers are wrong.
Q: Calculations based on when the network will be free are based on current access models; introducing new models affects when the network is used.
A: Yes. The algorithm here still works. We then use it to change the pricing.
Q: Most of the ISPs in Aus already do this. What’s different is that the pricing isn’t per-byte, it’s simply capped. Have you looked at Shapley values for that sort of pricing also?
A: The main contribution is that you can use the Shapley value to formalise this process.

Short paper: Challenges in Measuring Online Advertising Systems

Speaker: Saikat Guha

Privacy-preserving advertising systems
Problem statement: Figure out what information is actually used to target ads today.
- Make assumption: X is used today.
- Make two profiles. One with X, one without.
- See if ads differ. Sounds simple; anything but!
Problem: All ads that could be shown are not always shown: Limits on number of ads per page; frequency capping
Solution: For search ads, reload ~10 times (50 seconds) to capture a snapshot of the ads google could be showing you.
Comparison: Set overlap; Jaccard index
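The Jaccard index is just the size of the intersection over the size of the union. A minimal sketch, with made-up ad identifiers:

```python
def jaccard_index(ads_a, ads_b):
    """Jaccard similarity of two collections of ad identifiers."""
    a, b = set(ads_a), set(ads_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty snapshots are identical
    return len(a & b) / len(a | b)


profile_with_x = {"ad1", "ad2", "ad3"}
profile_without_x = {"ad2", "ad3", "ad4"}
print(jaccard_index(profile_with_x, profile_without_x))  # 0.5
```

A value near 1 for two identical control profiles, and a clearly lower value for the pair that differs in X, is the signal the methodology is looking for.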
Problem: Systemic artifacts. Three identical browsers, two get same ads one gets different.
Solution: Use static hosts file; failsafe: Noise-level control (two identical profiles should always experience same set of ads if your experiment is running correctly.)
Google doesn’t do behavioural targeting: Watching for previous browsing habits to influence future search results.
Summary: This is a methodology paper. Methodology to measure what information is actually used in online ad targeting today.
Q: Facebook seems to miss the mark more often than Google. Do you have a preference?
A: Facebook is a much younger company, collecting certain sorts of data only since 2009.
Q: Is there any way to know Google didn’t game our study?
A: We did use a bunch of different IP addresses in a short space of time, played around with proxy headers, etc. But no, we can’t say for sure.
Q: What does facebook infer from social graph?
A: All our test users either had no friends or just other test users. Again, this is a methodology paper, can be used to test your specific hypothesis.

Session 4: Methodology I

Long paper: Measurement of Loss Pairs in Network Paths

Speaker: Edmond Chan

Measurement with packet pairs
Loss pair: a packet pair with one lost packet and one residual packet.
Use residual packet’s delay to infer lost packet’s delay
Correlate lost packet’s delay and packet loss event
Until now, no loss pair measurement has ever been reported.
Contribution:
- Characterise first and second residual packet’s delay for inferring congested router’s queueing delay.
- Propose an active method for measuring loss pairs from a single endpoint.
- Conduct loss-pair measurement for 88 Internet paths.
Conclusions:
- Revisit the loss-pair measurement:
- characterising the properties of LP_01 and LP_10
- Exploiting additional path properties
- (Congested) path fingerprinting
Develop a non-cooperative method for measuring forward/reverse-path loss pairs.
Link: http://www.oneprobe.org/
Q: Probably measuring the properties of one of the subqueues in a router.
A: ??

Short paper: Measuring Path MTU Discovery Behaviour

Speaker: Matthew Luckie

Common perception that PMTUD is unreliable.
Similar measurement technique (TBIT)
Take home points:
Systems that advertise an MSS of 1380 (10.8% of population) fail at PMTUD disproportionately (27.1%)

Long paper: Demystifying Service Discovery: Implementing an Internet-Wide Scanner

Speaker: Derek Leonard

Techniques for quickly discovering available services in the Internet benefit multiple areas
- Help characterise internet growth
- Distance estimation
- Understanding how worms create massive botnets
- Discovering and patching security flaws.
The paper chronicles the development of IRLScanner.
Propose:
- Maximise politeness at remote networks
- Allow scanning in minutes or hours
Definitions: Assume M local machines. In some set F there are n = |F| targets.
Service discovery: Requests from local hosts are sent to targets in F, which are marked as alive if they respond.
Formalise politeness: Formal analysis of service discovery algorithms has not previously been attempted.
Permutation goal: Spread probes to a subnet evenly throughout F
Define globally IP wide (GIW) to be a permutation that is IP-wide at all subnets.
All networks are probed at constant rate s /T
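To give a feel for what "spreading probes evenly" means, here is a sketch using a bit-reversed counter. This is not the permutation used by IRLScanner (the talk notes do not describe it in enough detail); it is just one simple ordering in which consecutive probes land in different subnets.

```python
def bit_reverse(value, width=32):
    """Reverse the lowest `width` bits of a non-negative integer."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def probe_order(count, width=32):
    """Yield the first `count` target addresses (as integers) in bit-reversed order."""
    for counter in range(count):
        yield bit_reverse(counter, width)


# The first 256 probes each fall into a different /8.
first_octets = {target >> 24 for target in probe_order(256)}
print(len(first_octets))  # 256
```

A sequential scan would instead send its first 256 probes to a single /24, which is exactly the burst that generates complaints.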
Evaluation
- Internet-wide service discovery studies are sparse in the literature
- Time and resources seem to be constraints
- Overwhelming number of complaints thwarts researchers (bad publicity, legal threats)
Each target address is classified into one of four categories:
- open set (SYN_ACK)
- Closed set (RST)
- ????
- Dead (don’t respond at all)
OS Fingerprinting: Use distinguishing characteristics of network traffic to infer interesting information
Operating system is an important metric.
Estimate the global impact of known vulnerabilities
This has not been attempted internet-wide in the literature.
Fingerprinted 39.6M servers.
General purpose hosts dominate the set (82%): Windows: 50%; Linux: 40%
Removed any network whose administrator complained.
Blocking too many would render the measurements useless.
0.23% of the routable space blocked

Session 5: Wireless

Long paper: Measurement and Analysis of Real-world 802.11 Mesh Networks

Speaker: Katrina LaCurts

We use measurements from more than 100 real-world networks to answer questions:
- Can SNR-based bit rate adaptation overcome the scaling problems of probe-based methods?
- The benefits of opportunistic routing depend on the topology of the network. As deployed today, how much benefit do networks see?
- Previous results on the frequency of hidden terminals are contradictory. How prevalent are hidden terminals in real networks?
1407 APs total
Network size of 3 – 203 APs (median 13)
[…]
SNR-based bit rate adaptation works well for static APs if it is trained per link, and can help frame-based bit rate adaptation scale in 802.11n.
Opportunistic routing: Goal, to exploit broadcast nature of wireless networks.
How much time does it take to send a packet “normally” compared to the time it takes to send a packet opportunistically?
Calculating ExOR(s -> d) = time it takes to send a packet from s to d using opportunistic routing. [ExOR SIGCOMM 2005]
This works better with:
- longer paths.
- high path diversity.
- certain types of short paths.
These types of topologies are rare in today’s networks.
Hidden terminals: Three nodes, BAC. B and C can sense A, but not each other; frames from B or C to A can clash and interfere with each other.
How frequently do hidden triples occur?
65% of APs are involved in a hidden triple (on average). More prevalent than previously believed.
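Counting hidden triples from a "who can sense whom" table is straightforward. The sketch below assumes a symmetric, boolean hearing relation; the paper presumably derives this from signal measurements and thresholds, which are not modelled here.

```python
from itertools import combinations


def hidden_triples(hears):
    """Find (B, A, C) triples where B and C both sense A but not each other.

    hears: dict mapping each node to the set of nodes it can sense.
    The relation is assumed to be symmetric.
    """
    triples = []
    for a, neighbours in hears.items():
        for b, c in combinations(sorted(neighbours), 2):
            if c not in hears.get(b, set()):
                triples.append((b, a, c))
    return triples


# Made-up four-node topology: D can sense both A and B, C can only sense A.
topology = {
    "A": {"B", "C", "D"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"A", "B"},
}
print(hidden_triples(topology))
```

The fraction of APs involved is then the number of distinct nodes appearing in any triple divided by the number of APs.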
Q: How did you measure SNR?
A: Using madwifi drivers to report actual SNR values.
Q: Why are they not using 9Mbps?
A: Don’t know.
Q: Given that you used 24 hours of data, how representative do you think your data is?
A: Think fairly representative.
Q: Isn’t opportunistic routing most useful in mesh networks? Isn’t it pointless here?
A: Don’t think so […]
Q: Any results comparing probe-based methods vs. SNR methods
A: For this, not really. Not directly.

Long paper: Characterizing Radio Resource Allocation for 3G Networks

Speaker: Alexandre Gerber

Focus on UMTS
Limited radio resources in cell networks need to be efficiently managed.
Allocation of resources triggered by user data transmission activity.
Release of resources controlled by inactivity timers. Timeout value, called “tail time”.
State promotions have promotion delay
State demotions incur tail times (waste radio resources & energy)
State occupation time and tail times
- half of time in DCH, half of time in FACH (near 100% of data transferred in DCH)
- Spend 7% of time being promoted.
Promotion overhead == promotion time / total session duration
What-if analysis for inactivity timers
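A toy version of such a what-if analysis: replay packet timestamps against a single inactivity timer and see how much radio-on time is pure tail. Real UMTS has separate DCH and FACH timers and a promotion delay, all ignored here; the function name and numbers are assumptions for illustration.

```python
def dch_time_and_tail(packet_times, tail_timer):
    """Return (radio_on_time, tail_time) in seconds for a single-timer model.

    The radio is switched on by any packet arriving while idle and switched
    off after tail_timer seconds without traffic.
    """
    total = tail = 0.0
    session_start = last = None
    for t in sorted(packet_times):
        if last is not None and t - last > tail_timer:
            # The previous session timed out before this packet arrived.
            total += (last - session_start) + tail_timer
            tail += tail_timer
            session_start = t
        if session_start is None:
            session_start = t
        last = t
    if last is not None:
        total += (last - session_start) + tail_timer
        tail += tail_timer
    return total, tail


bursts = [0, 1, 2, 20, 21]
for timer in (5, 2):
    print(timer, dch_time_and_tail(bursts, timer))
```

Shortening the timer cuts the tail, but in a real network it also causes more promotions, which is the trade-off the talk describes.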
[…]
Streaming traffic: YouTube video streaming
Under-utilisation of bandwidth leads to long DCH session, leads to poor battery use.
Use fast dormancy to eliminate the tail on each chunk: Handset explicitly asks to be put into idle mode, saving battery.
Conclusion: Most radio resource and energy is consumed when not actually transmitting data. The RRC state machines trade-off is hard to balance, as timers are globally and statically set. Hard to adapt to the diversity of traffic patterns.
Two approaches to address the problem:
- Apps alter traffic patterns based on the state machine behaviour
- Apps cooperate with network in allocating radio resource.
Q: Have you considered tuning values by users, rather than applications? i.e., tune by user behaviour?
A: The fast dormancy stuff is good for the application to use, can predict based on past usage
Q: Don’t think you can do this by application: multi-tasking. It’s the aggregate traffic that determines what you need to do.
A: Agree.
Q: Also, transmitting IP traffic during a voice-call can be cost-free. So, opportunistic or delay-tolerant applications can be used here.

Long paper: On the Feasibility of Effective Opportunistic Spectrum Access

Speaker: Vinod Kone

Spectrum scarcity is a big problem! Reasons:
- static spectrum allocation
- Most of spectrum is licensed.
- But 95% of the spectrum is idle? McHenry’s NSF report from ‘05.
Opportunistic Spectrum Access (OSA). Key idea:
- Primary users (PU) - licensed users (e.g., cellular, TV)
- Secondary users (SU) - accesses spectrum when PU doesn’t
Challenges: Unpredictable PU behaviour; Obeying PU disruption threshold
Questions: How much of the available spectrum is accessible? Can OSA support existing applications?
Results: Spectrum availability != accessibility; Accessible spectrum is very low
OSA cannot support existing applications as is
Frequent interruptions and high delay
Spectrum traces from multiple locations (4 countries)
Wide frequency covered (20MHz to 6GHz) for 1-2 weeks
15 popular service bands (TV, cellular, etc)
Results:
Availability:
- Available: occupied less than 5% of the time
- Busy: occupied more than 95% of the time
- Partially available: occupancy in [5%, 95%]
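The three classes map directly onto a threshold check. A small sketch, with occupancy expressed as a fraction between 0 and 1 (the function name is mine):

```python
def classify_channel(occupancy):
    """Classify a channel by the fraction of time the primary user occupies it."""
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy must be a fraction between 0 and 1")
    if occupancy < 0.05:
        return "available"
    if occupancy > 0.95:
        return "busy"
    return "partially available"


for value in (0.01, 0.40, 0.99):
    print(value, classify_channel(value))
```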
Extraction rate: % of available spectrum accessible by SU:
- No knowledge: 10%
- Statistical knowledge: Max of 35%
i.e., availability != accessibility
Low spectrum extraction == frequent interruptions.
Frequency bundling: Combine multiple unreliable channels into one reliable channel.
Key challenges: How to bundle the channels? How to access a bundle?
How are channels correlated? There is a high percentage of channels that show low correlation. This means we can actually bundle the channels randomly.
Can we minimise the blocking time by bundling? Yes.
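A back-of-the-envelope way to see why bundling helps: if the channels in a bundle were fully independent (a stronger assumption than the "low correlation" reported in the talk), the bundle is blocked only when every channel is busy at once. The per-channel busy probabilities below are made up.

```python
import math


def bundle_blocking_probability(busy_probs):
    """Probability that all channels in a bundle are busy simultaneously.

    Assumes the channels are statistically independent.
    """
    return math.prod(busy_probs)


for size in (1, 2, 4):
    print(size, bundle_blocking_probability([0.5] * size))
```

Under this assumption the blocking probability shrinks geometrically with bundle size, which is consistent with random bundling being good enough when correlation is low.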
Conclusion:
- Significant partially available spectrum: ~26%
- Availability != accessibility.
- Frequent interruptions and high blocking times. OSA cannot support existing applications as-is.
Q: How scalable is your bundling scheme?
A: Since it is random, it is very scalable.
Q: What if everyone uses the bundling scheme?
A: ??

// End day 1.

Footnote

Posted by Stephen Strowes on Thursday, November 4th, 2010. You can follow me on twitter.

Stephen D. Strowes